One Day Technical Meeting held on 16th October 1996 at the British Institute of Radiology, 36 Portland Place, London.
The traditional approach to labelling low-level image features is to apply a filter to the grey level image and set a hard threshold on the output of this filter. Determining an ``optimal'' threshold is problematic since there is usually variation within scenes. Furthermore, labelling with a hard threshold is sub-optimal since the grey-level pixel values are random variables; the Bayesian posterior probability is a more meaningful label.
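The contrast between a hard threshold and a posterior label can be illustrated with a minimal sketch. It assumes, purely for illustration, that the filter response follows a Gaussian distribution within each class; the class means, standard deviations, prior and threshold below are hypothetical values, and the Gaussian model stands in for the neural network classifier used in the work itself.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def edge_posterior(response, prior_edge=0.1,
                   edge_mean=40.0, edge_std=15.0,
                   bg_mean=5.0, bg_std=8.0):
    """P(edge | response) by Bayes' rule for two assumed Gaussian classes.

    All default parameters are illustrative assumptions, not measured values.
    """
    p_edge = gaussian_pdf(response, edge_mean, edge_std) * prior_edge
    p_bg = gaussian_pdf(response, bg_mean, bg_std) * (1.0 - prior_edge)
    return p_edge / (p_edge + p_bg)


def hard_label(response, threshold=20.0):
    """Traditional labelling: a binary decision at a fixed threshold."""
    return response > threshold


if __name__ == "__main__":
    for r in (19.0, 21.0, 60.0):
        print(r, hard_label(r), round(edge_posterior(r), 3))
```

Two responses just either side of the threshold receive opposite hard labels but similar posterior probabilities, which is the sense in which the posterior is the more informative label.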
This talk will describe a generalised approach to image feature detection which seeks to label pixels with the conditional probability of the feature occurring at that location and will be illustrated with results from edge and corner detection. The feature labelling problem has been recast as one in statistical pattern recognition where we have implemented the classifier with a neural network. Several pre-processing stages are employed to reduce the complexity of the pattern space to be learned.
In order to ensure sufficient density and coverage of the pattern space, we have generated the training examples for the classifier with grey-level image patch models as the use of training examples from real images will be shown to result in inadequate classifier generalisation. Performance will be described and comparisons drawn with common ``traditional'' edge and corner detectors.
Ongoing research on the labelling of general features will be discussed.
The quality of the outcome of the early stages of vision is of paramount importance if the higher levels are to be able to use a well-described set of symbols for a meaningful interpretation of a scene. This transformation of signals to symbols could be optimised to provide confident results. Following a brief review of issues of optimisation in human vision, we present and compare two approaches for optimising the low level feature extraction process for edge and line detection. One is based on a standard hill-climbing approach and the other uses genetic algorithms. The framework presented is general and can be used for any edge or line detector.
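As an indication of what the hill-climbing variant might look like, the following is a minimal coordinate-wise sketch. The function `score` is a hypothetical stand-in for whatever quality measure is applied to the detector output (the abstract does not specify one), and the step-halving schedule is an assumption.

```python
def hill_climb(score, start, step=1.0, min_step=1e-3, max_iter=1000):
    """Greedy coordinate-wise hill climbing that maximises score(params).

    score    -- hypothetical quality measure of the detector output
    start    -- initial parameter vector (e.g. smoothing scale, thresholds)
    step     -- initial perturbation size, halved when no move improves
    """
    current = list(start)
    best = score(current)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in range(len(current)):
            for delta in (step, -step):
                candidate = list(current)
                candidate[i] += delta
                s = score(candidate)
                if s > best:
                    current, best, improved = candidate, s, True
        if not improved:
            step *= 0.5  # refine the search around the current optimum
    return current, best


if __name__ == "__main__":
    # Toy score with a single peak at (2, -1); a real score would be
    # computed by running the edge or line detector on test images.
    toy = lambda p: -((p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2)
    print(hill_climb(toy, [0.0, 0.0]))
```

Hill climbing of this kind finds only a local optimum of the score, which is one reason a population-based search such as a genetic algorithm is a natural comparison.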
We use an algorithm originally introduced by Prof. W. Förstner to detect circularly symmetrical features. It works on image intensity gradients, and identifies each pixel with a line passing through it and perpendicular to the direction of the gradient. It finds the point which has the minimum sum of squared distances from those lines. Each distance is weighted by the likelihood that the pixel comes from a line. The likelihood is defined to be proportional to the magnitude of the intensity gradient.
The `corner' detector is further combined with a Canny edge detector in an object localisation system. A deformable template, which consists of a `corner' and two elliptical edges, is defined. The system starts with initial estimates (priors) and the object model. It uses this information to guide the search for focus features in the image. The information recovered from the image processing is used to refine the estimates of the object pose and position.
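The least-squares step described above has a closed-form solution. For a pixel at p with unit gradient direction n, the line through p perpendicular to the gradient lies at distance |n.(q - p)| from a point q, so minimising the weighted sum of squared distances reduces to a 2x2 linear system. The sketch below assumes the weight equals the gradient magnitude; the function name and the input layout are illustrative choices, not taken from the original algorithm.

```python
import math


def least_squares_point(pixels, gradients):
    """Point minimising the weighted sum of squared distances to the lines.

    pixels    -- list of (x, y) positions
    gradients -- list of (gx, gy) intensity gradients at those positions
    Each line passes through its pixel, perpendicular to the gradient, and
    is weighted by the gradient magnitude (an assumed weighting).
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x, y), (gx, gy) in zip(pixels, gradients):
        mag = math.hypot(gx, gy)
        if mag == 0.0:
            continue  # a zero gradient defines no line
        nx, ny = gx / mag, gy / mag  # unit normal of the line
        w = mag
        a11 += w * nx * nx
        a12 += w * nx * ny
        a22 += w * ny * ny
        proj = nx * x + ny * y  # n . p
        b1 += w * nx * proj
        b2 += w * ny * proj
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel; no unique point")
    # Solve the normal equations A q = b by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


if __name__ == "__main__":
    # Two edges meeting at (3, 4): a horizontal edge (gradient along y)
    # and a vertical edge (gradient along x).
    pix = [(5, 4), (6, 4), (3, 6), (3, 7)]
    grad = [(0, 2), (0, 2), (3, 0), (3, 0)]
    print(least_squares_point(pix, grad))  # -> (3.0, 4.0)
```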
The results of the study suggest that relatively simple low level image processing algorithms, and simple 3D models, embedded in a Bayesian statistical reasoning architecture can provide a highly effective, albeit specialised object recognition system.
The work is part of an ongoing research project `model-driven vision under variable camera geometry'.
It will be argued that image measurements should satisfy two requirements of physical plausibility (the measurements must be of non-zero scale and non-zero imprecision) and two required invariances (nothing is lost by expanding the image and nothing is lost by increasing the contrast of the image). A model of image measurements satisfying these constraints, based on blurring the graph of the incident luminance, will be described. Blurring the graph horizontally corresponds to the loss of information due to increased viewing distance; blurring vertically to the loss of information as contrast is lowered.
Blurred graphs of the incident luminance define a local histogram at each point. These histograms may be summarized by picking a representative value, e.g. the mean, median or a mode. Such a procedure produces a single-valued image such as is normally dealt with in image processing. It will be shown that mode filtering of a particular type (stable mode filtering) produces interesting results. In particular the image defined by mode filtering has discontinuities. Experiment shows that these discontinuities correspond well to perceptual edges and theoretical work shows that they behave `nicely' with changing scale and imprecision.
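The effect of the choice of representative value can be seen in a deliberately simplified one-dimensional sketch. It uses the raw values in a hard window as the local histogram, which is an assumption made for brevity: it is neither the blurred-graph model nor the stable mode filter described above.

```python
from statistics import mean, median, mode


def local_summaries(signal, radius=2):
    """Summarise the values in a sliding window by mean, median and mode.

    The window is truncated at the ends of the signal. A hard window over
    raw samples is a simplification of the local histogram in the text.
    """
    out = {"mean": [], "median": [], "mode": []}
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out["mean"].append(mean(window))
        out["median"].append(median(window))
        out["mode"].append(mode(window))
    return out


if __name__ == "__main__":
    step_edge = [10] * 5 + [50] * 5
    for name, values in local_summaries(step_edge).items():
        print(name, values)
```

On a step edge the mean takes intermediate values across the transition, whereas the mode jumps directly from one level to the other, so the mode-filtered signal retains a discontinuity.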
The Hough Transform (HT) is recognized as a powerful tool in shape analysis. It is used to extract low level features, for example straight lines, and is useful even in conditions of noise and occlusion. However, the performance of the HT can be compromised due to the discrete nature of the image. Approximations are made to the true angle of a line, as digitally it will be represented by many short segments. It therefore becomes more difficult to extract the true angle from parameter space.
Research on the HT can be separated into three fundamental categories: new theory, performance aspects, and efficient computation of the HT. As part of the developments in new theory and efficient computation, a current trend in software solutions is that of the Probabilistic Hough Transforms (PHTs). Due to the one-to-many nature of the transform, much redundant information will be accumulated. The PHTs seek to reduce the amount of redundant information in the transform space by sampling the image data in various ways. This leads to a large reduction in the run-time of the algorithm.
In addition to improving computation times, the PHT techniques offer a new understanding of the transformation process and suggest ways of potentially improving performance. The current work seeks to show that performance can be improved by inhibiting the transformation of nearest neighbour connections when using the PHTs. Preliminary results show a significant reduction in the adverse effects of correlated noise on peak detection.
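The sampling idea can be sketched for straight lines in the normal parameterisation rho = x cos(theta) + y sin(theta). The version below votes with a uniformly random subset of the edge points, which is only one of the possible sampling schemes; the function name, the one-degree angular resolution and the fixed seed are assumptions made for illustration.

```python
import math
import random


def hough_peak(points, n_theta=180, rho_step=1.0, sample_fraction=1.0, seed=0):
    """Accumulate line votes from a random subset of the edge points.

    Returns (theta, rho, votes, work), where work counts the accumulator
    updates performed. sample_fraction=1.0 gives the standard transform.
    """
    rng = random.Random(seed)
    k = max(1, int(round(sample_fraction * len(points))))
    sample = rng.sample(points, k)
    acc = {}
    for x, y in sample:
        # One-to-many mapping: each point votes once for every angle.
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (t, int(round(rho / rho_step)))
            acc[key] = acc.get(key, 0) + 1
    (t, r), votes = max(acc.items(), key=lambda kv: kv[1])
    return math.pi * t / n_theta, r * rho_step, votes, len(sample) * n_theta


if __name__ == "__main__":
    line = [(x, 7) for x in range(200)]  # horizontal line y = 7
    print(hough_peak(line, sample_fraction=1.0))
    print(hough_peak(line, sample_fraction=0.2))
```

With a fifth of the points the peak falls in the same accumulator cell while the number of accumulator updates drops in proportion, which is the source of the run-time reduction.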
This research began with the aim of providing a solution to the problem of muscle cell segmentation in stained histological images. Existing computational methods have not been able to cope with realistically complex images although the segmentation of such images is performed with apparent ease by the human eye. Solutions have been sought through the investigation of mechanisms contributing to segmentation in the human visual system, which displays an outstanding ability to rapidly group fragmented contours into closed contours which bound surface regions.
We have implemented and extended the model proposed by Heitger et al. (1992). The original model, simulating neural activities especially in the area V2 of the visual cortex, is capable of responding to both real and illusory (foreground) contours and in some instances it can carry out figure-ground assignment. Our work proposes extensions to this model, which enable it also to complete fragmented *background* contours. It does so by combining elementary features such as edge and line fragments, line ends, corners and junctions. Unlike the original model, ours takes into account the polarity of all the relevant elementary features. With these extensions the model is able to complete fragmented boundaries such as may occur in muscle cell images.
Heitger, F., Rosenthaler, L., von der Heydt, R., Peterhans, E., Kübler, O. (1992) Simulation of neural contour mechanisms: from simple to end-stopped cells. Vision Research 32, 963-981.
This paper presents an unsupervised segmentation method applicable to both 2D and 3D images. The segmentation is achieved by a bottom-up hierarchical analysis to progressively agglomerate pixels in the image into non-overlapping homogeneous regions characterised by a linear model expressed as a linear polynomial plus additive Gaussian noise. Parameters of the region model are estimated in a hierarchical way based on the least squares method. The whole hierarchy consists of several levels of adjacency graphs, each of which describes an irregular partition of the image. Beginning from the adjacency graph describing the original input image, a clustering operation is iteratively applied to the current highest level of the hierarchy to agglomerate connected sub-graphs into the nodes in the graph at the higher level until no further agglomeration can be made. The clustering operation is performed based on the nearest neighbour criterion, in which the nearest neighbour of a region is determined under the framework of statistical inference through fitting the linear model to regions. The classification obtained at each level of the hierarchy is optimal with respect to the implied definition of the nearest neighbour. The adjacency graph at the top of the hierarchy then describes the result of the segmentation.
Experiments have been carried out on both 2D and 3D real images, and satisfactory results have been obtained.
Texture is a scale dependent property. However, the way a texture appears at different scales cannot necessarily be deduced by blurring a single image of the texture obtained at a certain distance. This issue is particularly acute for colour textures since the human visual system blurs colours using different point spread functions for different distances. In the work we shall present, we experiment with colour textures which are captured by the camera at different distances and compare them with the ways they would look if they were blurred by the human eye. We use the perceptual blurring approach to segment colour textured images by using the most salient colour of each texture as a feature.
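The agglomeration principle can be illustrated on a one-dimensional signal, where each region is modelled as a line v = a + b*x fitted by least squares. Several simplifications here are assumptions and not part of the method as described: the signal is 1-D, adjacency means neighbouring position in a list, the nearest neighbour is taken to be the adjacent region whose merger increases the residual sum of squares least, the starting regions are short segments rather than single pixels, and merging stops at a hypothetical threshold `max_cost`.

```python
def fit_rss(samples):
    """Residual sum of squares of the least-squares line through (x, v) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    mv = sum(v for _, v in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    sxv = sum((x - mx) * (v - mv) for x, v in samples)
    b = sxv / sxx if sxx > 0 else 0.0  # a single sample gets a constant fit
    a = mv - b * mx
    return sum((v - (a + b * x)) ** 2 for x, v in samples)


def merge_cost(r1, r2):
    """Increase in residual error when two regions share one linear model."""
    return fit_rss(r1 + r2) - fit_rss(r1) - fit_rss(r2)


def agglomerate(regions, max_cost):
    """Repeatedly merge the cheapest pair of adjacent regions."""
    regions = [list(r) for r in regions]
    while len(regions) > 1:
        costs = [merge_cost(regions[i], regions[i + 1])
                 for i in range(len(regions) - 1)]
        i = min(range(len(costs)), key=costs.__getitem__)
        if costs[i] > max_cost:
            break  # no remaining pair is well described by one model
        regions[i:i + 2] = [regions[i] + regions[i + 1]]
    return regions


if __name__ == "__main__":
    # Two ramps with different slopes and offsets, joined at x = 6.
    signal = [(x, 2 * x) for x in range(6)] + [(x, 100 - x) for x in range(6, 12)]
    start = [signal[i:i + 3] for i in range(0, 12, 3)]
    for region in agglomerate(start, max_cost=1.0):
        print([x for x, _ in region])
```

In two or three dimensions the same scheme would fit a plane or hyperplane per region and take adjacency from the region adjacency graph.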