Image parsing is the problem of assigning an object label to each pixel. It unifies the image segmentation and object recognition problems. For instance, for a database of horse images, image parsing can be thought of as the task of classifying each pixel as part of a horse or nonhorse. In more complicated problems, image parsing might require multiple labels, e.g. roads, cars, houses etc. in outdoors scenes. Clearly, pixels can not be classified in this manner based only on their intensities or even local feature descriptors. Contextual information plays a critical role in resolving ambiguities. Image parsing can be posed as a supervised learning problem where a classifier is learnt from training data consisting of images and corresponding label maps. Autocontext and convolutional networks are two promising approaches that apply context to image parsing in the supervised learning setting.
Cite this article:
Pramod Kumar Pandey, P.K. Singh. A Comprehensive Approach for Image to Text Description. Int. J. Tech. 1(2): July-Dec. 2011; Page 117-120