Lixin Yang (杨理欣)

Research Assistant Professor and Master's Supervisor
School of Artificial Intelligence, Shanghai Jiao Tong University
Member of the Machine Vision and Intelligence Group (MVIG) at SJTU
Email: siriusyang at sjtu dot edu dot cn
Office: Bldg. SAI, No. 1954 Huashan Rd., Xuhui Dist., Shanghai, 200230, China

About. I’m a Research Assistant Professor at Shanghai Jiao Tong University (SJTU), affiliated with the School of Artificial Intelligence (SAI), which I joined in September 2024. I obtained my Ph.D. degree in Computer Science from SJTU in 2023, advised by Prof. Cewu Lu at the Machine Vision and Intelligence Group, and my M.S. degree in Mechanical Engineering, also from SJTU. My research interests include 3D Vision and Robotics. Currently, I am focusing on modeling and imitating how hands manipulate objects, including 3D hand | object pose | shape estimation, grasp | motion generation, imitation learning, and dexterous manipulation.

Join Us. I am looking for Master's students at SJTU SAI and self-motivated research interns. Contact me if you are interested in the above topics. We are sincerely recruiting research interns (paid); let's do interesting research together.

Email / Google Scholar / GitHub / Twitter

News

[2026.01] Two papers have been accepted to ICRA 2026 🇦🇹
[2025.07] Generalizable Multi-view Hand Reconstruction POEMv2 is accepted by TPAMI 2025
[2025.06] Dense Policy is accepted by ICCV 2025 🇺🇸
[2025.05] Motion-before-Action is accepted by RA-L.
[2025.01] HybrIK-X is accepted by TPAMI.
[2024.12] Invited Talk at ROSCon China Workshop. Thanks to Yao Mu for hosting.
[2024.09] One paper on articulated object image manipulation got accepted by NeurIPS 2024 🇨🇦.
[2024.07] One paper: SemGrasp is accepted by ECCV 2024 🇮🇹.
[2024.02] One paper: OakInk2 is accepted by CVPR 2024 🇺🇸.
[2024.02] The Contact Potential Field is accepted by TPAMI.
[2023.12] One paper: FAVOR is accepted by AAAI 2024 🇨🇦.
[2023.10] One paper: Color-NeuS is accepted by 3DV 2024 🇨🇭.
[2023.08] I defended my doctoral thesis and earned my Ph.D.!
[2023.08] I am honored to be an invited speaker at the HANDS workshop at ICCV 2023.
[2023.07] One paper: CHORD is accepted by ICCV 2023 🇫🇷.
[2023.06] Invited Talk at 智东西 Open Course | Seminar: 3D Hand Reconstruction and Embodied Intelligent Interaction. Video (in Chinese).
[2023.02] One paper: POEM is accepted by CVPR 2023.
[2022.10] I have taken the wonderful journey of marriage alongside my cherished wife.
[2022.10] Invited Talk at International Digital Economy Academy (IDEA). Thanks to Ailing Zeng for hosting.
[2022.09] One paper: DART got accepted by NeurIPS 2022 - Datasets and Benchmarks Track.
[2022.07] Invited Talk at 智东西 Open Course | AI New Youth Lecture: Image-based Hand-Object Interaction Reconstruction and Virtual Human Hand Generation. Video (in Chinese).
[2022.04] Invited Talk at MPI-IS Perceiving Systems. Thanks to Yuliang Xiu for hosting. (info).
[2022.03] Two papers were accepted by CVPR 2022: one oral, one poster.
[2021.07] One paper got accepted by ICCV 2021.

Publications

*=equal contribution, #=corresponding author

LIDEA: Human-to-Robot Imitation Learning via Implicit Feature Distillation and Explicit Geometry Alignment

Yifu Xu, Bokai Lin, Xinyu Zhan, Hongjie Fang, Yong-Lu Li, Cewu Lu, Lixin Yang#

arXiv, 2026
project / arXiv

A human-to-robot imitation learning method that bridges the embodiment gap with transitive 2D feature distillation and explicit 3D geometry alignment.

LaMP: Learning Vision-Language-Action Policy with 3D Scene Flow as Latent Motion Prior

Xinkai Wang, Chenyi Wang, Yifu Xu, Mingzhe Ye, Fucheng Zhang, Jialin Tian, Xinyu Zhan, Lifeng Zhu, Cewu Lu, Lixin Yang#

arXiv, 2026
project / arXiv

A dual-expert Vision-Language-Action framework that injects one-step partially denoised 3D scene flow as a latent motion prior.

TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning

Jiacheng Liu*, Pengxiang Ding*, Qihang Zhou, Yuxuan Wu, Da Huang, Zimian Peng, Wei Xiao, Weinan Zhang, Lixin Yang#, Cewu Lu#, Donglin Wang#

ICRA, 2026
project / arXiv

A cross-embodiment framework that transfers wheeled-humanoid data to bipedal VLA models via morphology-agnostic 6D end-effector trajectories and a heuristic-enhanced online DAgger controller.

Multi-view Hand Reconstruction with a Point-Embedded Transformer

Lixin Yang, Licheng Zhong, Pengxiang Zhu, Xinyu Zhan, Junxiao Kong, Jian Xu, Cewu Lu#

TPAMI, 2025
paper / arXiv / code

POEM-v2: a generalizable multi-view 3D hand reconstruction model trained on large-scale multi-view datasets. It enables accurate, flexible, and occlusion-robust hand mesh recovery across arbitrary multi-view setups.

AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons

Hongjie Fang*, Chenxi Wang*, Yiming Wang*, Jingjing Chen*, Shangning Xia, Jun Lv, Zihao He, Xiyan Yi, Yunhan Guo, Xinyu Zhan, Lixin Yang, Weiming Wang, Cewu Lu#, Hao-Shu Fang#

CoRL, 2025 (Oral Presentation)
project / arXiv

AirExo-2, a low-cost exoskeleton system for large-scale in-the-wild demonstration collection. It transforms the collected in-the-wild demonstrations into pseudo-robot demonstrations. RISE-2, a generalizable imitation policy that integrates 2D and 3D perceptions.

Dense Policy: Bidirectional Autoregressive Learning of Actions

Yue Su*, Xinyu Zhan*, Hongjie Fang, Han Xue, Hao-Shu Fang, Yong-Lu Li, Cewu Lu, Lixin Yang#

ICCV, 2025
project / arXiv

A bidirectionally expanded learning approach that enhances auto-regressive policies for robotic manipulation. It employs a lightweight encoder-only architecture to iteratively unfold the action sequence from an initial single frame into the target sequence in a coarse-to-fine manner with logarithmic-time inference.

Motion Before Action: Diffusing Object Motion as Manipulation Condition

Yue Su*, Xinyu Zhan*, Hongjie Fang, Yong-Lu Li, Cewu Lu, Lixin Yang#

RA-L, 2025 & ICRA, 2026
project / arXiv / code

Two cascaded diffusion processes: one for object motion generation, and one for robot action generation under object motion guidance.

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang, Cewu Lu#, Bo Dai

ECCV, 2024 (Oral Presentation)
project / arXiv

An MLLM-based method that infuses language instructions into grasp generation; & A new language-pose dataset, CapGrasp, featuring detailed captions of grasping poses.

OakInk2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion

Xinyu Zhan*, Lixin Yang*, Yifei Zhao, Kangrui Mao, Hanlin Xu, Zenan Lin, Kailin Li, Cewu Lu#

CVPR, 2024
project / arXiv

A 4D motion dataset focusing on bimanual object manipulation tasks involved in complex daily activities; & A three-tiered task abstraction: Object Affordance, Primitive Task, and Complex Task, to systematically organize manipulation tasks.

FAVOR: Full-Body AR-Driven Virtual Object Rearrangement Guided by Instruction Text

Kailin Li*, Lixin Yang*, Zenan Lin, Jian Xu, Xinyu Zhan, Yifei Zhao, Pengxiang Zhu, Wenxiong Kang, Kejian Wu, Cewu Lu#

AAAI, 2024
project / arXiv / code / data

A full-body human motion dataset that captures text-guided desktop object rearrangement through MoCap and AR glasses; & A pipeline for generating an avatar's object-rearrangement motion driven by text instructions.

Color-NeuS: Reconstructing Neural Implicit Surfaces with Color

Licheng Zhong*, Lixin Yang*, Kailin Li, Haoyu Zhen, Mei Han, Cewu Lu#

3DV, 2024
project / arXiv / code / data

Reconstructing 3D implicit surfaces with accurate, view-independent surface color by decoupling view-dependent shading from geometry. It combines a global color network and a relighting network to preserve volume rendering performance while enabling colored mesh extraction.

CHORD: Category-level in-Hand Object Reconstruction via Shape Deformation

Kailin Li*, Lixin Yang*, Haoyu Zhen, Zenan Lin, Xinyu Zhan, Licheng Zhong, Jian Xu, Kejian Wu, Cewu Lu#

ICCV, 2023
project / arXiv / code / tool

A single-view hand-held object reconstruction method that exploits a categorical shape prior to reconstruct the shape of intra-class objects; & A new synthetic dataset, COMIC, that contains a category-level collection of objects with diverse shapes, materials, interacting poses, and viewing directions.

POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo

Lixin Yang, Jian Xu, Licheng Zhong, Xinyu Zhan, Zhicheng Wang, Kejian Wu, Cewu Lu#

CVPR, 2023
arXiv / code

A multi-view hand mesh recovery (HMR) method with Transformer. It leverages the "power of points", including Basis Points Set, point's positional encoding and point-Transformer, to unify and merge information from sparsely arranged cameras.

DART: Articulated Hand Model with Diverse Accessories and Rich Textures

Daiheng Gao*, Yuliang Xiu*, Kailin Li*, Lixin Yang*, Feng Wang, Peng Zhang, Bang Zhang, Cewu Lu, Ping Tan

NeurIPS, 2022 - Datasets and Benchmarks Track
project / arXiv / code / video

A MANO-derived hand model that contains exquisite hand-crafted texture maps, varying in appearance and covering different kinds of blemishes, make-ups, and accessories.

OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction

Lixin Yang*, Kailin Li*, Xinyu Zhan*, Fei Wu, Anran Xu, Liu Liu, Cewu Lu#

CVPR, 2022
project / paper / arXiv / code

A dataset that focuses on human grasps based on objects' affordances. It contains two knowledge bases: 1) Object affordance knowledge (Oak) and 2) Interaction knowledge (Ink). A new model, Tink, for transferring interaction poses from one object to another.

ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis

Lixin Yang*, Kailin Li*, Xinyu Zhan, Jun Lv, Wenqiang Xu, Jiefeng Li, Cewu Lu#

CVPR, 2022 (Oral Presentation)
paper / arXiv / code

An online data synthesis tool for articulated hand(-object) pose estimation. A grasp synthesis method that can generate dexterous hand grasping poses for arbitrary objects.

Learning a Contact Potential Field to Model the Hand-Object Interaction

Lixin Yang, Xinyu Zhan, Kailin Li, Wenqiang Xu, Junming Zhang, Jiefeng Li, Cewu Lu#

TPAMI, 2024
paper

A novel contact representation (CPF) that is used to improve physical hand-object interaction. A hybrid learning-fitting framework (MIHO) that aligns top-down pose estimation with bottom-up contact modeling.

CPF: Learning a Contact Potential Field to Model the Hand-Object Interaction

Lixin Yang, Xinyu Zhan, Kailin Li, Wenqiang Xu, Jiefeng Li, Cewu Lu#

ICCV, 2021
project / paper / supp / arXiv / code / 知乎

HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-body Mesh Recovery

Jiefeng Li, Siyuan Bian, Chao Xu, Zhicun Chen, Lixin Yang, Cewu Lu#

TPAMI, 2025
arXiv

A hybrid inverse kinematics method for 3D body mesh recovery, combining 3D keypoint estimation and body mesh recovery. HybrIK-X extends this to model hands and faces, offering fast, accurate whole-body pose estimation.

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu#

CVPR, 2021
project / paper / supp / arXiv / code

HandTailor: Towards High-Precision Monocular 3D Hand Recovery

Jun Lv, Wenqiang Xu, Lixin Yang, Sucheng Qian, Chongzhao Mao, Cewu Lu#

BMVC, 2021
paper / arXiv / code

BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks

Lixin Yang, Jiasen Li, Wenqiang Xu, Yiqun Diao, Cewu Lu#

BMVC, 2020
paper / arXiv / code

Other Collaborations

GaPT-DAR: Category-level Garments Pose Tracking via Integrated 2D Deformation and 3D Reconstruction

Li Zhang, Mingliang Xu, Jianan Wang, Qiaojun Yu, Lixin Yang, Yong-Lu Li, Cewu Lu, Rujing Wang, Liu Liu#

CVPR, 2025
project / paper

Category-level garment pose tracking via integrated 2D deformation learning and 3D reconstruction.

General Articulated Objects Manipulation in Real Images via Part-Aware Diffusion Process

Zhou Fang, Yong-Lu Li#, Lixin Yang, Cewu Lu#

NeurIPS, 2024
paper

Part-aware diffusion for articulated object manipulation and controlled image editing in real scenes.

Talks

Human-Robot Data Companion: Pipeline and Representation.
[2025.09] SII TechFest workshop: Embodied AI Reasoning and Scaling. Thanks to Panpan Cai for hosting.

Tutorial: Multimodal Embodied Perception and Action Prediction
[2025.07] USTC Summer School 2025: Advances in Computer Graphics. Thanks to Yumeng Liu for hosting.

Empowering Object Manipulation with Human Knowledge.
[2024.12] ROSCon China Workshop. Thanks to Yao Mu for hosting.
[2024.12] PKU-Agibot Lab. Thanks to Hongwei Fan for hosting.

Paving the Way for Understanding Human Interactions with Objects: The OakInk2 Dataset.
[2023.08] ICCV 2023 HANDS Workshop. Thanks to Linlin Yang for hosting.

3D Hand Reconstruction and Embodied Intelligent Interaction.
[2023.09] ByteDance PICO. Thanks to Chao Wen for hosting.
[2023.06] 智东西 Open Course Seminar. Co-speakers: Xinyu Chen and Daiheng Gao.

Inferring Human Hand-Object Interaction through Visual Input
[2022.10] International Digital Economy Academy. Thanks to Ailing Zeng for hosting.

Image-based Hand-Object Interaction Reconstruction and Virtual Human Hand Generation
[2022.07] 智东西 Open Course: AI New Youth Lecture

Leverage Kinematic and Contact constraints for understanding hand-object interaction
[2022.04] MPI-IS Perceiving Systems. Thanks to Yuliang Xiu for hosting.

Services

Conference reviewer:

Robotics: Science and Systems (RSS)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
IEEE International Conference on Computer Vision (ICCV)
European Conference on Computer Vision (ECCV)
ACM SIGGRAPH
Conference on Neural Information Processing Systems (NeurIPS)
International Conference on Learning Representations (ICLR)
International Conference on Machine Learning (ICML)
AAAI Conference on Artificial Intelligence (AAAI)
Conference of the European Association for Computer Graphics (Eurographics)

Journal reviewer:

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
International Journal of Computer Vision (IJCV)
Pattern Recognition (PR)

Other Academic service:

Understanding HANDS in Action Workshop Organizer (ECCV 2024, ICCV 2025, ECCV 2026)

Funding and Awards

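National Natural Science Foundation of China (NSFC) Young Scientists Fund (Category C), 2026.01-2028.12, National Natural Science Foundation of China, Principal Investigator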

Shanghai Jiao Tong University Explore X Fund, 2025.01-2027.12, Shanghai Jiao Tong University, Principal Investigator

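Shanghai Rising-Star Program, Sailing Project (上海市启明星扬帆计划), 2024.12-2027.11, Science and Technology Commission of Shanghai Municipality, Principal Investigator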

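"Science and Technology Innovation Action Plan" project of the Science and Technology Commission of Shanghai Municipality, 2023.12-2025.11, Science and Technology Commission of Shanghai Municipality, Sub-project Leader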

2023 Wu Wenjun AI Science and Technology Award, First Prize in Natural Science, 2023, fourth contributor (first student contributor)
