Clustering and Searching WWW Images Using

Link and Page Layout Analysis

Xiaofei He, Deng Cai, Ji-Rong Wen, Wei-Ying Ma and Hong-Jiang Zhang

Microsoft Technical Report, MSR-TR-2004-38, April 2004.

paper

¡¡

Due to the rapid growth of the number of digital images on the Web, there is an increasing demand for effective and efficient method for organizing and retrieving the images available. This paper describes iFind, a system for clustering and searching WWW images. By using a vision-based page segmentation algorithm, a web page is partitioned into blocks, and the textual and link information of an image can be accurately extracted from the block containing that image. The textual information is used for image representation. By extracting the page-to-block, block-toimage, block-to-page relationships through link structure and page layout analysis, we construct an image graph. Our method is less sensitive to noisy links than previous methods like PicASHOW, and hence the image graph can better reflect the semantic relationship between images. Using the notion of Markov Chain, we can infer the semantic degrees of the images, i.e. ImageRanks, which characterize the importance of the images. The ImageRanks are combined with the relevance scores to produce the final ranking for image search. With the graph models, we can also use techniques from spectral graph theory for image clustering and embedding, or 2-D visualization. Some experimental results on 11.6 million images downloaded from the Web are provided in the paper.

¡¡

Design of our WWW image search system


Image Blocks

By using a vision-based page segmentation algorithm, a web page is partitioned into blocks, and the textual and link information of an image can be accurately extracted from the block containing that image. The textual information is used for image indexing.


Image Graph

The block-to-page link structure. The red arrows denote the links from image block to web pages.

By extracting the page-to-block, block-to-image, block-to-page relationships through link structure and page layout analysis, we can construct an image graph. Our method is less sensitive to noisy links than previous methods like PicASHOW, and hence the image graph can better reflect the semantic relationship between images.


ImageRank

With the graph models, we use techniques from spectral graph theory and Markov Chain theory for image ranking, clustering and embedding. By embedding we mean that each image can be endowed with a vector representation. A Markov Chain can be induced from the image-to-image graph. The stationary distribution of the Markov Chain gives a quantitative evaluation of the importance of each image. We call it ImageRank. This measurement is useful for ranking the image search result similar to what google does for web page search. That is, in addition to content and textual similarities, the image search could incorporate other measurements such as semantic richness, degree of representative or importance, to produce the final ranking. It is also useful for browsing purpose.