Generative Models for Digital Humans
The past four years have witnessed the development of very powerful generative models, such as the GPT series for language generation and diffusion models for image and video generation driven by language and other signals. In this presentation, I will discuss recent developments in contemporary machine learning models for generating photorealistic digital humans, as well as for driving them. I will also discuss potential applications and the challenges they raise.
Stefanos Zafeiriou is a Professor of Machine Learning and Computer Vision in the Department of Computing, Imperial College London. Prof. Zafeiriou was an EPSRC Early Career Research Fellow from 2019 until 2024 and currently holds a prestigious Turing AI World-leading Research Fellowship. From 2016 to 2020, he was a Distinguished Research Fellow with the University of Oulu, Finland, under the Finnish Distinguished Professor Programme. He has co-founded several start-ups, three of which have exited (Facesoft, Ariel AI & Contex AI). In industry he has held many positions, including GenAI and Perception Lead at Google AndroidXR. He has co-authored more than 250 papers, mainly on novel machine learning methodologies applied to various domains, published in the most prestigious journals in his field of research. He has 42K+ citations to his work and an h-index of 86. He was a recipient of a prestigious Junior Research Fellowship from Imperial College London in 2011, the President's Medal for Excellence in Research Supervision in 2016, the President's Medal for Entrepreneurship in 2022, a Google Faculty Research Award, and an Amazon Web Services (AWS) Machine Learning (ML) Research Award.
Sketching the Future: From 2D Control to 3D Creation in AI Systems
This keynote examines how sketch-based interfaces democratise AI-powered creative tools, progressing from 2D recognition to immersive 3D generation. Drawing from our decade-long research journey, I demonstrate why sketching represents an essential human-AI interface through its unique balance of simplicity and expressive power. Beginning with foundational work in sketch recognition, I establish core principles that enable sketch-based AI systems. These insights drove practical applications in fine-grained image retrieval, where simple drawings unlock visual searches more intuitively than text. The talk then explores how 2D sketches enable 3D capabilities: we show that tablet sketches can effectively retrieve and generate complex 3D models, bridging dimensional barriers without requiring specialised expertise. Moving to VR sketching, I present advances in 3D sketch representation learning that reveal how spatial strokes encode geometric information differently from 2D drawings. The talk concludes with our latest frameworks enabling high-resolution generation on consumer hardware, which serve our vision of accessible creative AI, in which sketch-based interfaces make advanced capabilities available to all users regardless of technical expertise.
Yi-Zhe Song is Professor of AI and Computer Vision at the Centre for Vision, Speech and Signal Processing (CVSSP) and co-director of the Surrey People-Centred AI Institute. As founder and leader of the SketchX Lab (est. 2012), he has driven groundbreaking research in sketch understanding, including the first deep neural network to surpass human performance in sketch recognition (BMVC 2015 Best Paper Award). His work spans fine-grained sketch-based image retrieval, domain generalisation, and bridging sketch with mainstream computer vision, with recent contributions in sketch-based object recognition earning a Best Paper nomination at CVPR 2023. He serves as Associate Editor for IEEE TPAMI and IJCV, and has been Area Chair for ECCV, CVPR, and ICCV. Prof. Song established and directs Surrey's MSc in AI programme, following a similar initiative he created at Queen Mary University of London.
World Models and Physical Intelligence
Humans can effortlessly construct rich mental representations of the 3D world from sparse input, such as a single image. This is a core aspect of intelligence that helps us understand and interact with our surroundings and with each other. My research aims to build similar computational models: artificial intelligence methods that can perceive properties of the 3D structured world from images and videos. Despite remarkable progress in 2D computer vision, 3D perception remains an open problem due to some unique challenges, such as limited 3D training data and uncertainties in reconstruction. In this talk, I will discuss these challenges and explain how my research addresses them by posing vision as an inverse problem and by designing generative models with physics-inspired inductive biases. I will then discuss how these efforts advance us toward scalable and generalizable visual perception, and how they advance application domains such as robotics and computer graphics.
Ayush Tewari is an assistant professor at the University of Cambridge. He was previously a postdoctoral researcher at MIT CSAIL with Bill Freeman, Josh Tenenbaum, and Vincent Sitzmann. His research interests lie in visual perception, developing methods that infer rich structured representations of the visual world from images and videos, much like the mental models humans infer to interact with and navigate their surroundings.
08:30 – 09:00
Breakfast
09:00 – 09:15
Opening
09:15 – 10:15
Keynote 1 — Generative Models for Digital Humans
by Stefanos Zafeiriou (Imperial College, UK)
10:15 – 11:00
Coffee break
11:00 – 13:00
Full Paper Session I
ColorQUICCI: Local Radial Descriptor Incorporating Shape and Color
Milan Kresović, Bart Iver van Blokland, Theoharis Theoharis, Jon Yngve Hardeberg
OrthoCAD-322K: A cross-modal approach for retrieving 3D CAD models from orthographic views using a graph-based framework on a developed large-scale dataset
Swapnil Nagnath Mahajan, Karthik Krishna M, Ramanathan Muthuganapathy
DWCNet: Denoising-While-Completing Network — Robust Point Cloud Completion against Corruptions
Keneni Worku Tesema, Lyndon Hill, Mark W. Jones, Gary K.L. Tam
ScanMove: Motion Prediction and Transfer for Unregistered Body Meshes
Thomas Besnier, Sylvain Arguillère, Mohamed Daoudi
13:00 – 14:30
Lunch
14:30 – 15:30
Keynote 2 — World Models and Physical Intelligence
by Ayush Tewari (University of Cambridge, UK)
15:30 – 16:00
Coffee break
16:00 – 17:00
Short Paper Session
PhyDeformer: High-Quality Non-Rigid Garment Registration with Physics-Awareness
Boyang Yu, Frederic Cordier, Hyewon Seo
Coupling Self-Distillation with Test Time Augmentation for effective LiDAR-Based 3D Semantic Segmentation
Dimitrios Antonarakos, Georgios Zamanakos, Ilias Papadeas, Ioannis Pratikakis
SHREC’25 track: Retrieval and segmentation of multiple relief patterns
Gabriele Paolini, Claudio Tortorici, Stefano Berretti
17:00 – 18:00
Academic-Industrial Round Table Discussion (All participants)
18:00
Social Dinner
08:30 – 09:00
Breakfast
09:00 – 10:00
Keynote 3 — Sketching the Future: From 2D Control to 3D Creation in AI Systems
by Yi-Zhe Song (CVSSP, University of Surrey, UK)
10:00 – 11:30
Full Paper Session II
PBF-FR: Partitioning Beyond Footprints for Facade Recognition in Urban Point Clouds
Chiara Romanengo, Daniela Cabiddu, Michela Mortara
Point cloud segmentation for 3D Clothed Human Layering
Pietro Musoni, Davide Garavaso, Federico Masi, Umberto Castellani
Canonical Pose Reconstruction from Single Depth Image for 3D Non-rigid Pose Recovery on Limited Datasets
Fahd Alhamazani, Paul Rosin, Yu-Kun Lai
11:30 – 13:30
SHREC Session
Track 1: Partial Retrieval Benchmark
Bart Iver van Blokland, Isaac Aguirre, Ivan Sipiran, Silvia Biasotti, Giorgio Palmieri
Track 2: Protein Surface Shape Retrieval including Electrostatic potential
Taher Yacoub et al.
Track 3: GS-3DORC: Towards the Advancements of 3D Gaussian Splatting Object Part Retrieval
Minh-Triet Tran, Thien-Phuc Tran, Minh-Quang Nguyen, Thanh-Khoi Nguyen, Nam-Quan Nguyen, Tam V. Nguyen
Track 4: Retrieval of Optimal Objects for Multi-modal Enhanced Language and Spatial Assistance (ROOMELSA)
Trong-Thuan Nguyen et al.
13:30 – 14:30
Lunch + 3DOR 2026 planning + workshop closing