Abstract
		Contents
		
		Introduction 
		Relevance of the master's work topic 
		The scientific innovation of the work 
		Expected practical results 
		Object and subject of research 
		Point, Line and Edge Detection 
		Gradient Operators 
		The Laplacian 
		Edge Linking and Boundary Detection. Local Processing 
		Basic Global Thresholding 
		Region growing 
		Segmentation by Morphological Watersheds 
		The Use of Motion in Segmentation. Spatial Techniques 
		Conclusions 
		
		 
	
		Research of image segmentation methods
		
		
		Introduction
		
		
		Segmentation subdivides an image into its constituent regions or objects. 
		The level to which the subdivision is carried depends on the problem being 
		solved. That is, segmentation should stop when the objects of interest have 
		been isolated.
		
  
		Segmentation of nontrivial images is one of the most difficult tasks in image 
		processing. Segmentation accuracy determines the eventual success or failure 
		of computerized analysis procedures.
		 
		
		
		Relevance of the master's work topic
		
Segmentation of video frames is very important in digital video processing. This procedure is applied at the stage of frame generation. There are three types of frames used in video compression: I‑frames (intra-coded), P‑frames (predicted), and B‑frames (bidirectional). An I‑frame is an intra-coded picture, in effect a fully specified picture, like a conventional static image file. P‑frames and B‑frames hold only the changes in the image. These changes are applied not to the whole frame but to individual slices of the picture, so the intra frame needs to be segmented into slices to achieve a good compression rate.
		 
		
			
		The scientific innovation of the work
		
The scientific innovation of this work is the development of an improved method for segmenting video sequence frames.
		 
		
		
		Expected practical results
		
As a result of the research, it is expected to achieve optimal layer separation of the key frame for generating the reference frames that follow.
		 
		
		
		Object and subject of research
		
The object of research is image frame sequences containing different fragments. The subject of research is methods of segmenting images into layers or regions.
		 
		
		Main results
		
		Point, Line and Edge Detection
		
Image segmentation algorithms generally are based on one of two basic properties of intensity values: discontinuity and similarity. Let us begin with methods suitable for detecting gray-level discontinuities such as points, lines, and edges. The most common way to look for discontinuities is to run a mask through the image. The response of the mask at any point in the image is

R = w1*z1 + w2*z2 + … + w9*z9, 		(1.1)

where zi is the gray level of the pixel associated with mask coefficient wi (i = 1, 2, …, 9).
		
The detection of isolated points in an image is straightforward in principle. For example, the following 3x3 mask is used:
		 
		
			
		
| -1 | -1 | -1 |
| -1 |  8 | -1 |
| -1 | -1 | -1 |

Figure 1.1. A 3x3 mask for point detection.
		 
		
		
A point has been detected at the location on which the mask is centered if

|R| >= T, 		(1.2)

where T is a nonnegative threshold and R is given by Eq. (1.1). This formulation measures the weighted differences between the center point and its neighbors. The idea is that an isolated point will be quite different from its surroundings. The mask coefficients sum to zero, indicating that the mask response will be zero in areas of constant gray level.
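As an illustration, the following minimal Python sketch applies this point detector with NumPy and SciPy; the function name, threshold value, and border handling are illustrative choices, not part of the original method description.

```python
import numpy as np
from scipy.ndimage import correlate

# Point detection mask from Figure 1.1; its coefficients sum to zero,
# so the response R is zero in areas of constant gray level.
POINT_MASK = np.array([[-1, -1, -1],
                       [-1,  8, -1],
                       [-1, -1, -1]])

def detect_points(image, T):
    """Return True where |R| >= T (Eq. 1.2)."""
    # correlate applies the mask as written; 'nearest' replicates border pixels.
    R = correlate(image.astype(np.int32), POINT_MASK, mode='nearest')
    return np.abs(R) >= T

# Usage: a flat image with one isolated bright pixel.
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 200
print(detect_points(img, T=400))  # True only at the isolated point
```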
		
  
For line detection, the following masks are used:

| -1 | -1 | -1 |      | -1 | -1 |  2 |      | -1 |  2 | -1 |      |  2 | -1 | -1 |
|  2 |  2 |  2 |      | -1 |  2 | -1 |      | -1 |  2 | -1 |      | -1 |  2 | -1 |
| -1 | -1 | -1 |      |  2 | -1 | -1 |      | -1 |  2 | -1 |      | -1 | -1 |  2 |
  Horizontal             +45                   Vertical              -45

Figure 1.2. Masks for line detection.
		 
		
		
The response is strongest for lines one pixel thick. Each mask is tuned to a particular line direction. The coefficients in each mask sum to zero. Lines of other thicknesses can be handled by varying the mask size and coefficients.
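A short sketch applying the four masks of Figure 1.2 follows; taking the largest absolute response at each pixel as the dominant line direction is a common convention, assumed here for illustration.

```python
import numpy as np
from scipy.ndimage import correlate

# Line detection masks from Figure 1.2; each set of coefficients sums to zero.
LINE_MASKS = {
    'horizontal': np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]]),
    '+45':        np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]]),
    'vertical':   np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]]),
    '-45':        np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]]),
}

def dominant_line_direction(image):
    """Name of the mask with the strongest absolute response at each pixel."""
    img = image.astype(np.int32)
    names = list(LINE_MASKS)
    responses = np.stack([np.abs(correlate(img, LINE_MASKS[n], mode='nearest'))
                          for n in names])
    return np.take(np.array(names), np.argmax(responses, axis=0))
```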
  
Edge detection is the most common approach for detecting meaningful discontinuities in gray level. Models of ideal and ramp digital edges are shown in Figure 1.3.

Figure 1.3. (a) Model of an ideal digital edge; (b) model of a ramp digital edge.
The degree to which an edge is blurred is determined by factors such as the quality of the image acquisition system, the sampling rate, and the illumination conditions under which the image is acquired. Figure 1.4 shows the horizontal gray-level profile of an edge together with its first and second derivatives. The first derivative is positive at the points of transition into and out of the ramp as we move from left to right along the profile, and is zero in areas of constant gray level. The second derivative is positive at the transition associated with the dark side of the edge, negative at the transition associated with the light side of the edge, and zero along the ramp and in areas of constant gray level.
		
  
We can conclude that the magnitude of the first derivative can be used to detect the presence of an edge at a point in an image. Similarly, the sign of the second derivative can be used to determine whether an edge pixel lies on the dark or the light side of an edge.
		
  
		Note two additional properties of the second derivative around an edge: 
		(1) It produces two values for every edge in an image (an undesirable 
		feature); and (2) an imaginary straight line joining the extreme 
		positive and negative values of the second derivative would cross zero 
		near the midpoint of the edge. The zero-crossing property of the second 
		derivative is quite useful for locating the centers of thick edges.   
		  
Figure 1.4. Horizontal gray-level profile of an edge and its first and second derivatives.
		
		Real images are often corrupted by noise.  
		
		  
Figure 1.5. Gray-level profile of a ramp edge corrupted by noise, and the effect of noise on the first and second derivatives.
		
		 
		
		
		Gradient Operators
		
The gradient of an image f(x, y) at location (x, y) is defined as the vector

∇f = (Gx, Gy)^T = (∂f/∂x, ∂f/∂y)^T. 		(1.3)

The magnitude of this vector,

|∇f| = (Gx^2 + Gy^2)^(1/2), 		(1.4)

gives the maximum rate of increase of f(x, y) per unit distance. The direction of the gradient vector is given by the angle

α(x, y) = arctan(Gy/Gx). 		(1.5)
		
Computation of the gradient of an image is based on obtaining the partial derivatives ∂f/∂x and ∂f/∂y at every pixel location. First-order partial derivatives can be computed by:
a) Roberts cross-gradient operators: Gx = (z9 − z5); Gy = (z8 − z6);
b) Prewitt operators: Gx = (z7 + z8 + z9) − (z1 + z2 + z3); Gy = (z3 + z6 + z9) − (z1 + z4 + z7);
c) Sobel operators: Gx = (z7 + 2*z8 + z9) − (z1 + 2*z2 + z3); Gy = (z3 + 2*z6 + z9) − (z1 + 2*z4 + z7).
		 
	
		
		
			
				
				
					
					
| z1 | z2 | z3 |
| z4 | z5 | z6 |
| z7 | z8 | z9 |
  Image region

| -1 |  0 |      |  0 | -1 |
|  0 |  1 |      |  1 |  0 |
      Roberts

| -1 | -1 | -1 |      | -1 |  0 |  1 |
|  0 |  0 |  0 |      | -1 |  0 |  1 |
|  1 |  1 |  1 |      | -1 |  0 |  1 |
      Prewitt

| -1 | -2 | -1 |      | -1 |  0 |  1 |
|  0 |  0 |  0 |      | -2 |  0 |  2 |
|  1 |  2 |  1 |      | -1 |  0 |  1 |
      Sobel

Figure 1.6. Image region and different masks for computation of the gradient (in each pair, the Gx mask is on the left and the Gy mask on the right).
		
		Masks of size 2x2 are awkward to implement because they do not have a 
		clear center. The Prewitt masks are simpler to implement than the Sobel 
		masks, but the latter have slightly superior noise-suppression 
		characteristics, an important issue when dealing with derivatives.
  
		
An approach used frequently is to approximate the gradient by absolute values:

|∇f| ≈ |Gx| + |Gy|. 		(1.6)
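The following sketch estimates the gradient with the Sobel masks and the absolute-value approximation of Eq. (1.6); the edge-map threshold is an illustrative parameter.

```python
import numpy as np
from scipy.ndimage import correlate

# Sobel masks from Figure 1.6, matching Gx = (z7+2*z8+z9) - (z1+2*z2+z3)
# and Gy = (z3+2*z6+z9) - (z1+2*z4+z7).
SOBEL_GX = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
SOBEL_GY = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])

def gradient_edges(image, T=128):
    """Edge map where |Gx| + |Gy| (Eq. 1.6) exceeds the threshold T."""
    img = image.astype(np.int32)
    gx = correlate(img, SOBEL_GX, mode='nearest')
    gy = correlate(img, SOBEL_GY, mode='nearest')
    magnitude = np.abs(gx) + np.abs(gy)
    return magnitude > T
```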
		 
		
		
		The Laplacian
		
The Laplacian of a 2-D function f(x, y) is a second-order derivative defined as

∇²f = ∂²f/∂x² + ∂²f/∂y². 		(1.7)

For a 3x3 region, one of the two forms encountered most frequently in practice is

∇²f = 4*z5 − (z2 + z4 + z6 + z8). 		(1.8)

A digital approximation including the diagonal neighbors is given by

∇²f = 8*z5 − (z1 + z2 + z3 + z4 + z6 + z7 + z8 + z9). 		(1.9)
		 
		
		
			
|  0 | -1 |  0 |      | -1 | -1 | -1 |
| -1 |  4 | -1 |      | -1 |  8 | -1 |
|  0 | -1 |  0 |      | -1 | -1 | -1 |

Figure 1.7. Laplacian masks for equations (1.8) and (1.9).
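A sketch computing the Laplacian with the second mask of Figure 1.7 and then exploiting the zero-crossing property mentioned earlier; the simple sign-change test used here is one possible variant, assumed for illustration.

```python
import numpy as np
from scipy.ndimage import correlate

# Laplacian mask including the diagonal neighbors (Eq. 1.9).
LAPLACIAN_MASK = np.array([[-1, -1, -1],
                           [-1,  8, -1],
                           [-1, -1, -1]])

def laplacian_zero_crossings(image):
    """Mark pixels where the Laplacian changes sign against a neighbor."""
    lap = correlate(image.astype(np.int32), LAPLACIAN_MASK, mode='nearest')
    zc = np.zeros(lap.shape, dtype=bool)
    zc[:, :-1] |= lap[:, :-1] * lap[:, 1:] < 0  # horizontal sign changes
    zc[:-1, :] |= lap[:-1, :] * lap[1:, :] < 0  # vertical sign changes
    return zc
```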
		
		 
		
	
	
	
	Edge Linking and Boundary Detection. Local Processing
		
		
One of the simplest approaches for linking edge points is to analyze the characteristics of pixels in a small neighborhood (3x3 or 5x5) about every point (x, y) in an image that has been labeled an edge point by one of the techniques discussed in the previous section. All points that are similar according to a set of predefined criteria are linked, forming an edge of pixels that share those criteria. An edge pixel with coordinates (x0, y0) in the neighborhood of (x, y) is similar in magnitude to the pixel at (x, y) if

|∇f(x, y) − ∇f(x0, y0)| <= E, 		(1.10)

where E is a nonnegative threshold. It has an angle similar to that of the pixel at (x, y) if

|α(x, y) − α(x0, y0)| < A, 		(1.11)

where A is a nonnegative angle threshold. The direction of the edge at (x, y) is perpendicular to the direction of the gradient vector at that point.
  
		
		A point in the predefined neighborhood of (x, y) is linked to the pixel 
		at (x, y) if both magnitude and direction criteria are satisfied. This 
		process is repeated at every location in the image.
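A sketch of this local linking under Eqs. (1.10) and (1.11), assuming the gradient magnitude and angle arrays have already been computed (for example via Eqs. (1.4) and (1.5)); the breadth-first traversal and 8-neighborhood are illustrative choices.

```python
import numpy as np
from collections import deque

def link_from(mag, ang, seed, E, A):
    """Grow a set of linked edge points from `seed`: a neighbor is linked
    when its gradient magnitude differs by at most E (Eq. 1.10) and its
    gradient angle differs by less than A (Eq. 1.11)."""
    h, w = mag.shape
    linked = np.zeros((h, w), dtype=bool)
    linked[seed] = True
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and not linked[ny, nx]
                        and abs(mag[ny, nx] - mag[y, x]) <= E
                        and abs(ang[ny, nx] - ang[y, x]) < A):
                    linked[ny, nx] = True
                    frontier.append((ny, nx))
    return linked
```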
		 
		
					
		
		Basic Global Thresholding
		
		
The simplest of all thresholding techniques is to partition the image histogram by using a single global threshold T.

The following algorithm can be used to obtain T automatically:
1. Select an initial estimate for T.
2. Segment the image using T. This produces two groups of pixels: G1, consisting of all pixels with gray-level values greater than T, and G2, consisting of pixels with values less than or equal to T.
3. Compute the average gray-level values µ1 and µ2 for the pixels in regions G1 and G2.
4. Compute a new threshold value: T = (µ1 + µ2) / 2.
5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
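The algorithm above translates almost line for line into code; initializing T with the image mean is a common heuristic, assumed here.

```python
import numpy as np

def iterative_threshold(image, T0=0.5):
    """Estimate a global threshold by steps 1-5 above."""
    T = float(image.mean())                # step 1: initial estimate
    while True:
        g1 = image[image > T]              # step 2: pixels above T ...
        g2 = image[image <= T]             # ... and at or below T
        mu1 = g1.mean() if g1.size else T
        mu2 = g2.mean() if g2.size else T  # step 3: average gray levels
        T_new = (mu1 + mu2) / 2.0          # step 4: new threshold
        if abs(T_new - T) < T0:            # step 5: stop when T stabilizes
            return T_new
        T = T_new

# Usage: segmented = image > iterative_threshold(image)
```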
		
		
		Region growing
		
Region growing is a procedure that groups pixels or subregions into larger regions based on predefined criteria. The basic approach is to start with a set of “seed” points and from these grow regions by appending to each seed those neighboring pixels that have properties similar to the seed.
  
		Selecting a set of one or more starting points often can be based on the 
		nature of the problem. The selection of similarity criteria also depends 
		on the type of image data available.
  
		The problem in region growing is the formulation of a stopping rule. 
		Basically, growing a region should stop when no more pixels satisfy the 
		criteria for inclusion in that region. Additional criteria that increase 
		the power of a region-growing algorithm utilize the concept of size, 
		likeness between a candidate pixel and the pixels grown so far, and the 
		shape of the region being grown.
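A minimal region-growing sketch from a single seed follows; comparing candidates against the seed's gray level with a fixed tolerance, and using 4-connectivity, are illustrative choices of similarity criterion and neighborhood.

```python
import numpy as np
from collections import deque

def grow_region(image, seed, tol):
    """Grow a region from `seed`, appending neighbors whose gray level
    differs from the seed's by at most `tol`; growing stops when no
    more pixels satisfy the criterion."""
    h, w = image.shape
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    seed_val = int(image[seed])
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and abs(int(image[ny, nx]) - seed_val) <= tol):
                region[ny, nx] = True
                frontier.append((ny, nx))
    return region
```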
		 
		
		
		Segmentation by Morphological Watersheds
		
The concept of watersheds is based on visualizing an image in three dimensions: two spatial coordinates versus gray levels. In such a “topographic” interpretation, we consider three types of points: (a) points belonging to a regional minimum; (b) points at which a drop of water, if placed at the location of any of those points, would fall with certainty to a single minimum; and (c) points at which water would be equally likely to fall to more than one such minimum. For a particular regional minimum, the set of points satisfying condition (b) is called the catchment basin or watershed of that minimum. The points satisfying condition (c) form crest lines on the topographic surface and are termed divide lines or watershed lines.

The basic idea is simple: suppose that a hole is punched in each regional minimum and that the entire topography is flooded from below by letting water rise through the holes at a uniform rate. When the rising water in distinct catchment basins is about to merge, a dam is built to prevent the merging. These dam boundaries correspond to the divide lines of the watersheds.
		 
		  
		Figure 1.8. Image segmentation by morphological 
		watersheds.
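In practice the flooding is rarely coded by hand. A sketch using scikit-image's watershed implementation (assuming that library is available), flooding the gradient of the image from marker regions; the marker-selection heuristic here is purely illustrative.

```python
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.segmentation import watershed

def watershed_segment(image, markers=None):
    """Segment by flooding the gradient 'topography' from markers."""
    gradient = sobel(image)  # surface whose regional minima are flooded
    if markers is None:
        # Crude markers from low-gradient plateaus; illustrative only.
        markers, _ = ndi.label(gradient < 0.5 * gradient.mean())
    # Boundaries between the returned labels correspond to watershed lines.
    return watershed(gradient, markers)
```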
		
			
		
		
		The Use of Motion in Segmentation. Spatial Techniques
		
		
One of the simplest approaches for detecting changes between two frames f(x, y, ti) and f(x, y, tj) is to compare the two images pixel by pixel. One procedure for doing this is to form a difference image. Such an image includes only the nonstationary components.

A difference image may be defined as

dij(x, y) = 1 if |f(x, y, ti) − f(x, y, tj)| > T, and dij(x, y) = 0 otherwise, 		(1.18)

where T is a specified threshold. Thus dij(x, y) has a value of 1 at spatial coordinates (x, y) only if the gray levels of the two images differ appreciably at those coordinates, as determined by T.
		
  
		In dynamic image processing, all pixels in dij(x, y) with value 1 are 
		considered the result of object motion. This approach is applicable only 
		if the two images are registered spatially and if the illumination is 
		constant. In practice, 1-valued entries in dij(x, y) often arise as a 
		result of noise. Such isolated points may be ignored.
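A sketch of the difference image of Eq. (1.18), followed by removal of small isolated 1-valued regions as suggested above; the connectivity and the minimum region size are illustrative choices.

```python
import numpy as np
from scipy import ndimage as ndi

def difference_image(f_i, f_j, T, min_size=4):
    """d_ij = 1 where |f_i - f_j| > T (Eq. 1.18); isolated specks,
    likely caused by noise, are then dropped."""
    d = np.abs(f_i.astype(np.int32) - f_j.astype(np.int32)) > T
    labels, n = ndi.label(d)  # connected regions of 1-valued pixels
    sizes = ndi.sum_labels(d, labels, index=np.arange(1, n + 1))
    keep_labels = np.flatnonzero(sizes >= min_size) + 1
    return np.isin(labels, keep_labels).astype(np.uint8)
```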
  
		Consider a sequence of image frames f(x, y, t1), f(x, y, t2),…, f(x, y, 
		tn) and let f(x, y, t1) be the reference image. An accumulative 
		difference image (ADI) is formed by comparing this reference image with 
		every subsequent image in the sequence. A counter for each pixel 
		location in the accumulative image is incremented every time a 
		difference occurs.
  
It is often useful to consider three types of accumulative difference images: absolute, positive, and negative ADIs. Let R(x, y) denote the reference image and let k denote tk, so that f(x, y, k) = f(x, y, tk). We assume that R(x, y) = f(x, y, 1). Then, for any k > 1, and keeping in mind that the values of the ADIs are counts, we define the following for all relevant values of (x, y):
  
		 
Ak(x, y) = Ak−1(x, y) + 1 if |R(x, y) − f(x, y, k)| > T; otherwise Ak(x, y) = Ak−1(x, y). 		(1.19)

Pk(x, y) = Pk−1(x, y) + 1 if R(x, y) − f(x, y, k) > T; otherwise Pk(x, y) = Pk−1(x, y). 		(1.20)

Nk(x, y) = Nk−1(x, y) + 1 if R(x, y) − f(x, y, k) < −T; otherwise Nk(x, y) = Nk−1(x, y). 		(1.21)

Here A, P, and N denote the absolute, positive, and negative ADIs, respectively.
		
These ADIs start out with all zero values (counts) and are the same size as the images in the sequence.
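A sketch of one update step for the three ADIs of Eqs. (1.19)-(1.21); the frames are assumed to be grayscale arrays of identical shape.

```python
import numpy as np

def update_adis(adi_abs, adi_pos, adi_neg, R, frame, T):
    """Increment the absolute, positive, and negative ADI counters
    for one frame f(x, y, k) against the reference R(x, y)."""
    diff = R.astype(np.int32) - frame.astype(np.int32)
    adi_abs += np.abs(diff) > T   # Eq. (1.19)
    adi_pos += diff > T           # Eq. (1.20)
    adi_neg += diff < -T          # Eq. (1.21)

# Usage: the counters start at zero and match the frame size, e.g.
# adi_abs = np.zeros(R.shape, dtype=np.int32), likewise for pos/neg.
```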
		
  
		  
		Figure 1.9. Absolute, positive, and negative ADIs. 
		
		In practice, obtaining a reference image with only stationary elements 
		is not always possible, and building a reference from a set of images 
		containing one or more moving objects becomes necessary.
		 
		
		
		Conclusions
		
The material above shows that there are many methods for image segmentation. While the list of described methods is not exhaustive, these methods are typical representatives of segmentation techniques. One can also conclude that different techniques can be combined; for example, it is possible to combine spatial segmentation with the segmentation of moving objects.
		 
		
		