Department of Computer Science and Engineering
As digital cameras with Global Positioning System (GPS) capability become available and people geotag their photos by other means, it is of great interest to annotate semantic events (e.g., hiking, skiing, party) that are characterized by a collection of geotagged photos, each carrying a timestamp and GPS information recorded at capture. We address this emerging event classification problem by mining informative features derived from two sources: image content, and the spatio-temporal traces of GPS coordinates that characterize the underlying movement patterns of various event types. Both are computed over the entire collection rather than individual photos. Because events are better described by the co-occurrence of objects and scenes, we bundle primitive features, such as color and texture histograms or GPS features, to form discriminative compositional features. We propose a data mining method that efficiently discovers discriminative compositional features with small classification error, and we present a theoretical analysis to guide the selection of the data mining parameters. After compositional feature mining, we apply multiclass AdaBoost to further integrate the mined compositional features. Finally, the GPS and visual modalities are combined through confidence-based fusion. On a dataset of more than 3000 geotagged images, experimental results show the synergy of all of the components in our proposed approach to event classification.
Yuan, Junsong, Jiebo Luo, and Ying Wu. "Mining compositional features from GPS and visual cues for event recognition in photo collections." IEEE Transactions on Multimedia 12.7 (2010): 705-716.
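The abstract describes GPS features computed over a whole photo collection to capture movement patterns, but does not list them. As an illustration only, the following minimal sketch shows what collection-level trace features might look like. The function names (`haversine_km`, `trace_features`) and the particular features chosen (path length, duration, average speed, net displacement) are assumptions made for this example and are not taken from the paper.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trace_features(photos):
    """Summarise one photo collection (one event) as a dict of trace features.

    `photos` is a list of (timestamp, lat, lon) tuples, with `timestamp`
    a datetime. The feature set here is hypothetical, chosen for illustration.
    """
    ordered = sorted(photos, key=lambda p: p[0])  # order photos by capture time
    if len(ordered) < 2:
        return {"path_km": 0.0, "duration_h": 0.0, "avg_speed_kmh": 0.0, "net_km": 0.0}

    # Total distance along the time-ordered trace.
    path_km = sum(
        haversine_km(a[1], a[2], b[1], b[2])
        for a, b in zip(ordered, ordered[1:])
    )
    duration_h = (ordered[-1][0] - ordered[0][0]).total_seconds() / 3600.0
    # Straight-line distance between the first and last photo.
    net_km = haversine_km(ordered[0][1], ordered[0][2], ordered[-1][1], ordered[-1][2])
    avg_speed_kmh = path_km / duration_h if duration_h > 0 else 0.0

    return {
        "path_km": path_km,
        "duration_h": duration_h,
        "avg_speed_kmh": avg_speed_kmh,
        "net_km": net_km,
    }


if __name__ == "__main__":
    # Hypothetical collection: three photos taken over two hours.
    collection = [
        (datetime(2010, 1, 1, 10, 0), 45.00, -110.00),
        (datetime(2010, 1, 1, 11, 0), 45.02, -110.01),
        (datetime(2010, 1, 1, 12, 0), 45.05, -110.03),
    ]
    print(trace_features(collection))
```

Under these assumptions, a slow trace covering a few kilometres might suggest hiking, while a collection with near-zero path length might suggest a stationary event such as a party. Features like these would be inputs to the mining and boosting stages the abstract describes, not a substitute for them.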