[Academic Talk] Computer Vision with a Myriad of Eyes
Source: Lab 1 (一室)  |  Published: 2013-06-05

Title: Computer Vision with a Myriad of Eyes

Speaker: Prof. Jiebo Luo, University of Rochester

Time: Thursday, June 6, 2013, 16:00-17:30

Venue: Room 3221, Building 3, Institute of Information Engineering, Chinese Academy of Sciences

Abstract:

A recent trend in computer vision is driven by images and video generated by heterogeneous and multi-perspective visual sensing networks. We present a few examples of research along this line. First, we will present a framework for event recognition. Semantic event recognition based only on unconstrained still images available on the Internet or in personal repositories is a challenging problem. With GPS information, we obtain satellite images corresponding to picture locations and investigate their novel use to recognize the picture-taking environment. We then combine this inference with classical vision-based event detection methods and demonstrate the synergistic fusion of the two approaches. However, current GPS data identifies only the camera location, leaving the viewing direction uncertain. Second, to determine the viewing direction for geotagged photos, we utilize both Google Street View and Google Earth satellite images: 1) visual matching between a user photo and any available street views in the vicinity determines the viewing direction, and 2) when only an overhead satellite view is available, near-orthogonal view matching between the user photo and satellite imagery computes the viewing direction. Third, we explore using phone-captured images for localization, as they contain more context information than the GPS coordinates from the embedded sensor. The proposed approach is able to provide a comprehensive set of accurate geo-context based on the current image and its associated sensor GPS location. The geo-context includes the real locations of the mobile user and the scene, the viewing angle, and the distance between the user and the scene. Finally, we take advantage of the aforementioned techniques to build applications that enable people to enjoy ubiquitous location-based services (LBS) using their phones. Specifically, we first perform joint geovisual clustering in the cloud to generate scene clusters, with each scene represented by a 3D model. The 3D scene models are then indexed using a visual vocabulary tree structure. The phone-captured image is used to retrieve the relevant scene models, is then aligned with the models, and is further registered to the real-world map. We showcase three novel applications: 1) accurate augmented reality, 2) collaborative localization for rendezvous routing, and 3) routing for photographing. The evaluations through user studies indicate that these applications are effective in facilitating the perfect rendezvous for mobile users. Furthermore, a new source of visual data comes from public webcams deployed in urban environments. We will present some ongoing work on crowd analytics using such data.
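
As a rough illustration of the geo-context quantities mentioned in the abstract (the viewing angle and the distance between the user and the scene), the sketch below computes the great-circle distance and initial compass bearing between two GPS coordinates. It is not the speaker's method: it assumes both locations are already known, whereas the talk concerns estimating them from images. The function name `distance_and_bearing` and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (spherical approximation)


def distance_and_bearing(user, scene):
    """Return (distance in meters, initial bearing in degrees clockwise from north)
    from `user` to `scene`; each argument is a (latitude, longitude) pair in degrees.
    Hypothetical helper for illustration only."""
    lat1, lon1 = map(math.radians, user)
    lat2, lon2 = map(math.radians, scene)
    dlat, dlon = lat2 - lat1, lon2 - lon1

    # Haversine formula for the great-circle distance.
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    distance = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

    # Initial bearing from the user toward the scene, normalized to [0, 360).
    y = math.sin(dlon) * math.cos(lat2)
    x = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(lat2) * math.cos(dlon)
    bearing = math.degrees(math.atan2(y, x)) % 360.0
    return distance, bearing


if __name__ == "__main__":
    # Made-up coordinates: a user and a scene a short distance apart.
    d, b = distance_and_bearing((39.9600, 116.2400), (39.9610, 116.2415))
    print(f"distance: {d:.1f} m, bearing: {b:.1f} degrees")
```

The bearing here is only the direction from one known point to another; recovering the actual camera viewing direction from a photo, as described in the abstract, would require the visual matching steps that this sketch leaves out.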

Biography:

Jiebo Luo is a professor of Computer Science at the University of Rochester. Prior to joining Rochester in Fall 2011, he was a Senior Principal Scientist with the Kodak Research Laboratories. His research interests include image processing, computer vision, machine learning, social media data mining, medical imaging, and pervasive computing. Dr. Luo has authored over 200 technical papers and holds over 70 US patents. He has been actively involved in numerous technical conferences, including serving as the general chair of ACM CIVR 2008; program co-chair of ACM Multimedia 2010, IEEE CVPR 2012, and IEEE ICIP 2017; area chair of IEEE ICASSP 2009-2012, ICIP 2008-2012, CVPR 2008, and ICCV 2011; and an organizer of ICME 2006/2008/2010 and ICIP 2002. Currently, he serves on several IEEE SPS Technical Committees (IMDSP, MMSP, and MLSP) and conference steering committees (ACM ICMR and IEEE ICME). He is the Editor-in-Chief of the Journal of Multimedia, and has served on the editorial boards of the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), the IEEE Transactions on Multimedia (TMM), the IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), Pattern Recognition (PR), Machine Vision and Applications (MVA), and the Journal of Electronic Imaging (JEI). He is a Fellow of the SPIE, IEEE, and IAPR.

 
 
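Copyright © Institute of Information Engineering, Chinese Academy of Sciences. ICP filing number: 京ICP备11011297号-1
Address: No. 89A Minzhuang Road, Haidian District, Beijing. Postal code: 100093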