As a core component of the new generation of artificial intelligence, Embodied AI holds that cognition and intelligence do not arise from the brain computing in isolation, but from the interplay of brain, body, and environment: an agent must interact with its environment and learn from the resulting feedback to become more intelligent. The development of Embodied AI therefore depends on scene digitization, modeling, and interaction simulation grounded in image and graphics technology.
Around this theme, the China Society of Image and Graphics (CSIG) will hold the third CSIG International Online Seminar on Image and Graphics Technology (Embodied AI session) on Wednesday, March 30, 2022, 09:00-12:00. Four internationally renowned scholars from the United States, Canada, and China have been invited to present the latest research in Embodied AI and to discuss current challenges and future trends in the field. Colleagues from academia and industry are warmly invited to attend!
Tencent Meeting: 182-324-517
Live stream:
https://meeting.tencent.com/l/PTatQESfNEHf
Schedule
Speakers
Roozbeh Mottaghi
University of Washington
Roozbeh Mottaghi is the Research Manager of the Perceptual Reasoning and Interaction Research (PRIOR) group at AI2 and an Affiliate Associate Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington. Prior to joining AI2, he was a post-doctoral researcher in the Computer Science Department at Stanford University. He obtained his PhD in Computer Science in 2013 from the University of California, Los Angeles. His research is mainly focused on Computer Vision and Machine Learning. More specifically, he is interested in Embodied AI, physical reasoning via perception, and learning via interaction.
Talk title: Embodied Computer Vision: Learning via Interaction
Abstract: The re-emergence of deep neural networks has led to significant progress in computer vision over the past decade. We now have robust methods and architectures for image classification, object detection, and other core computer vision tasks. While these tasks form the foundation of computer vision, they are not the end goal. It is now time to take a step further and develop the next generation of tasks: tasks that require reasoning beyond pattern recognition. A popular class of these tasks is Embodied AI tasks that require an agent to understand the dynamics of the world around it, interact with the surrounding environment, and learn from its interactions.
I will talk about the issues associated with our current view of data, models, and tasks and how we should adapt to the new paradigm in computer vision. I will talk about the recent advances in AI2-THOR, our platform for Embodied AI, which enables a plethora of embodied tasks such as learning physics, interactive instruction following, and commonsense reasoning. I will then focus on a recent work showing that minimal interaction with the environment improves a state-of-the-art object detector by about 12 points in AP (it takes the object detection community about two years to obtain such an improvement). This is an example of how interaction with the environment is the key to further improving computer vision models.
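For readers new to the platform, here is a minimal sketch of the agent-environment interaction loop that AI2-THOR exposes. It assumes the open-source ai2thor Python package; the scene name, action strings, and metadata fields below follow the public API but may differ across versions.

```python
# Minimal AI2-THOR interaction loop (sketch; requires `pip install ai2thor`,
# which downloads a Unity build on first run).
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan1")  # a kitchen scene shipped with AI2-THOR

# Take a few navigation actions and inspect what the agent perceives.
for action in ["MoveAhead", "RotateRight", "MoveAhead"]:
    event = controller.step(action=action)
    print(action, "success:", event.metadata["lastActionSuccess"])

# Each event carries an egocentric frame plus object-level metadata,
# which is what enables interactive tasks like instruction following.
visible = [o["objectType"] for o in event.metadata["objects"] if o["visible"]]
print("Visible object types:", visible)

controller.stop()
```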
Manolis Savva
Simon Fraser University
Manolis Savva is an Assistant Professor of Computing Science at Simon Fraser University, and a Canada Research Chair in Computer Graphics. His research focuses on analysis, organization and generation of 3D content, forming a path to holistic 3D scene understanding revolving around people, and enabling applications in computer graphics, computer vision, and robotics. Prior to his current position he was a visiting researcher at Facebook AI Research, and a postdoctoral researcher at Princeton University. He received his Ph.D. from Stanford University under the supervision of Pat Hanrahan, and his B.A. in Physics and Computer Science from Cornell University. The impact of his work has been recognized by a number of awards, including the SGP 2020 Dataset Award for ScanNet, an ICCV 2019 Best Paper Award Nomination for Habitat, and the SGP 2018 Dataset Award for ShapeNet.
Talk title: 3D Simulation for Embodied AI: Three Emerging Directions
Abstract: 3D simulators are increasingly being used to develop and evaluate "embodied AI" (agents perceiving and acting in realistic environments). Much of the prior work in this space has treated simulation platforms as "black boxes" within which learning algorithms are to be deployed. However, the design choices and resulting system characteristics of the simulation platforms themselves can greatly impact both the feasibility and the outcomes of experiments involving simulation. In this talk, I will describe a number of recent projects that outline three emerging directions for 3D simulation platforms.
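To make the "black box" usage pattern concrete: in much prior work, the learning code touches a simulator only through a reset/step interface like the one sketched below, and never sees the design choices (renderer, physics fidelity, asset pipeline) that the talk argues shape experimental outcomes. NavSim and its methods are hypothetical stand-ins, not any real platform's API.

```python
import random

class NavSim:
    """Hypothetical point-goal navigation simulator; not a real platform's API."""
    ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

    def reset(self):
        # Returns the initial observation; a real simulator would render RGB-D here.
        self.steps = 0
        return {"rgb": None, "goal_vector": (5.0, 2.0)}

    def step(self, action):
        # Advances the (stubbed) physics and returns (observation, reward, done).
        self.steps += 1
        obs = {"rgb": None, "goal_vector": (max(0.0, 5.0 - self.steps), 2.0)}
        done = action == "stop" or self.steps >= 100
        reward = 1.0 if (done and action == "stop") else -0.01
        return obs, reward, done

# The learning algorithm sees only this loop -- the simulator is a black box.
sim = NavSim()
obs, done = sim.reset(), False
while not done:
    action = random.choice(NavSim.ACTIONS)  # a trained policy would act here
    obs, reward, done = sim.step(action)
```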
Jiajun Wu
Stanford University
Talk title: Benchmarking Everyday Household Activities in Virtual, Interactive, and Ecological Environments
Abstract: In this talk, I'll present BEHAVIOR, a benchmark for embodied AI with 100 activities in simulation, spanning a range of everyday household chores such as cleaning, maintenance, and food preparation. These activities are designed to be realistic, diverse, and complex, aiming to reproduce the challenges that agents must face in the real world. Building such a benchmark poses three fundamental difficulties for each activity: definition (it can differ by time, place, or person), instantiation in a simulator, and evaluation. BEHAVIOR addresses these with (1) an object-centric, predicate logic-based description language for expressing an activity's initial and goal conditions, (2) simulator-agnostic features required by an underlying environment to support BEHAVIOR, and (3) a set of metrics to measure task progress and efficiency. I will also demonstrate a realization of the BEHAVIOR benchmark in the iGibson simulator, as well as various benchmarking results.
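As a rough illustration of point (1), an activity's initial and goal conditions can be written as ground predicates over scene objects and checked against the simulator state. The sketch below uses plain Python tuples; the actual benchmark defines activities in its own description language (BDDL), and the predicate and object names here are hypothetical.

```python
# Toy rendering of a predicate logic-based activity specification
# (illustrative only; BEHAVIOR's real BDDL schema differs).

initial_conditions = [
    ("ontop", "plate_1", "table_1"),
    ("dusty", "plate_1"),
    ("inside", "sponge_1", "sink_1"),
]

goal_conditions = [
    ("not", ("dusty", "plate_1")),          # the plate must end up clean
    ("inside", "plate_1", "cabinet_1"),     # and be put away
]

def satisfied(condition, world_state):
    """Check one ground predicate against a set of facts from the simulator."""
    if condition[0] == "not":
        return not satisfied(condition[1], world_state)
    return condition in world_state

def task_success(world_state):
    return all(satisfied(c, world_state) for c in goal_conditions)

# Example final state produced by an agent:
final_state = {("inside", "plate_1", "cabinet_1"), ("inside", "sponge_1", "sink_1")}
print(task_success(final_state))  # True: plate is clean (not dusty) and put away
```

A fraction-of-goals-satisfied count over such conditions is one simple way to measure partial task progress, in the spirit of the metrics mentioned in point (3).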
Moderator
Ruizhen Hu
Shenzhen University
Source: CSIG International Cooperation and Exchange Working Committee