[Academic Talk] Computer Vision with a Myriad of Eyes - Institute of Information Engineering

Title: Computer Vision with a Myriad of Eyes

Speaker: Prof. Jiebo Luo, University of Rochester

Time: Thursday, June 6, 2013, 16:00-17:30

Venue: Room 3221, Building 3, Institute of Information Engineering, Chinese Academy of Sciences

Abstract:

A recent trend in computer vision is driven by images and video generated by heterogeneous and multi-perspective visual sensing networks. We present a few examples of research along this line. First, we will present an interesting framework for event recognition. Semantic event recognition based only on unconstrained still images available on the Internet or in personal repositories is a challenging problem. With GPS information, we obtain satellite images corresponding to picture locations and investigate their novel use to recognize the picture-taking environment. We then combine this inference with classical vision-based event detection methods and demonstrate the synergistic fusion of the two approaches. However, current GPS data identifies only the camera location, leaving the viewing direction uncertain. Second, to determine the viewing direction for geotagged photos, we utilize both Google Street View and Google Earth satellite images: 1) visual matching between a user photo and any available street views in the vicinity determines the viewing direction, and 2) when only an overhead satellite view is available, near-orthogonal view matching between the user photo and satellite imagery computes the viewing direction. Third, we explore using phone-captured images for localization, as they contain more context information than the GPS coordinates from the embedded sensor. The proposed approach is able to provide a comprehensive set of accurate geo-context based on the current image and its associated sensor GPS location. The geo-context includes the real locations of the mobile user and the scene, the viewing angle, and the distance between the user and the scene. Finally, we take advantage of the aforementioned techniques to build applications that enable people to enjoy ubiquitous location-based services (LBS) using their phones. Specifically, we first perform joint geovisual clustering in the cloud to generate scene clusters, with each scene represented by a 3D model.
The 3D scene models are then indexed using a visual vocabulary tree structure. The phone-captured image is used to retrieve the relevant scene models, is then aligned with those models, and is further registered to the real-world map. We showcase three novel applications: 1) accurate augmented reality, 2) collaborative localization for rendezvous routing, and 3) routing for photographing. Evaluations through user studies indicate that these applications are effective in facilitating the perfect rendezvous for mobile users. Furthermore, a new source of visual data comes from public webcams deployed in urban environments. We will present some ongoing work on crowd analytics using such data.

Biography:

Jiebo Luo is a professor of Computer Science at the University of Rochester. Prior to joining Rochester in Fall 2011, he was a Senior Principal Scientist with the Kodak Research Laboratories. His research interests include image processing, computer vision, machine learning, social media data mining, medical imaging, and pervasive computing. Dr. Luo has authored over 200 technical papers and holds over 70 US patents. Dr. Luo has been actively involved in numerous technical conferences, including serving as the general chair of ACM CIVR 2008, program co-chair of ACM Multimedia 2010, IEEE CVPR 2012 and IEEE ICIP 2017, area chair of IEEE ICASSP 2009-2012, ICIP 2008-2012, CVPR 2008 and ICCV 2011, and an organizer of ICME 2006/2008/2010 and ICIP 2002. Currently, he serves on several IEEE SPS Technical Committees (IMDSP, MMSP, and MLSP) and conference steering committees (ACM ICMR and IEEE ICME). He is the Editor-in-Chief of the Journal of Multimedia, and has served on the editorial boards of the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), the IEEE Transactions on Multimedia (TMM), the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), Pattern Recognition (PR), Machine Vision and Applications (MVA), and the Journal of Electronic Imaging (JEI). He is a Fellow of the SPIE, IEEE and IAPR.