P.h.D Thesis
The large volume of images shared on websites and personal archives has presented challenges in massive multimedia management. Due to the well-known semantic gap between human-understandable high-level semantics and machine-generated low-level features, recent years have seen significant research effort in multimedia content understanding and indexing. Computer vision algorithms for individual tasks such as object recognition, detection, and segmentation have achieved impressive results. The next challenge is to integrate these algorithms and address the problem of complete scene understanding, which involves recognizing all the objects of interest and their spatial extent or shape. True semantic understanding of an image primarily involves scene classification and semantic segmentation. The former aims to determine the categories to which an image belongs, while the latter provides a semantic label for each pixel, describing the category of object it appears in. Solutions for semantic interpretation and understanding of images will enable and enhance a wide variety of computer vision applications. While humans can perform these tasks easily, the sheer quantity of data involved can make it prohibitively laborious for a computer. This thesis proposes novel approaches for semantic scene categorization, segmentation, and retrieval that enable a device with limited resources to understand images automatically. The proposed computer vision solutions use machine-learning algorithms to build robust and reusable systems. Since learning is a key component of biological vision systems, the design of automatic artificial systems that can learn is one of the most important trends in modern computer vision research.
Download my PhD thesis here!
Comments