
Weidi Xie (谢伟迪)

I'm joining Shanghai Jiao Tong University as an Associate Professor.

I'll keep my affiliation with the Visual Geometry Group (VGG) at Oxford, where I spent seven wonderful years, completing my DPhil and then working as a Senior Research Fellow.

I was fortunate to have been supervised by Professor Andrew Zisserman and Professor Alison Noble.

Email  /  Google Scholar  /  Twitter  /  Bilibili  /  LinkedIn


I'm generally interested in understanding how visual perception emerges. In particular, I work on the following topics:

  • learning visual representations from self-supervised training.

  • multi-modal co-training with visual appearance, motion, audio, textual description, etc.

  • open-world, object-centric representation learning.

  • learning visual representations for embodied agents.

  • To prospective students:

    If you are enthusiastic about working with me on the above topics, please drop me an email with your CV and a research proposal.


  • June 2022, Adaptive 3D Localization of 2D Freehand Ultrasound Brain Images. To appear at MICCAI2022

  • May 2022, Transforming the Interactive Segmentation for Medical Imaging. To appear at MICCAI2022   (Early Accept)

  • Apr 2022, Quantum Self-supervised Learning. Accepted by Quantum Science and Technology

  • Mar 2022, Unsupervised Salient Object Detection with Spectral Cluster Voting. CVPR2022 Workshop

  • Dec 2021, Temporal Alignment Networks for Long-term Video. CVPR2022.

  • Dec 2021, Label, Verify, Correct: A Simple Few Shot Object Detection Method. CVPR2022.

  • Dec 2021, It's About Time: Analog Clock Reading in the Wild. CVPR2022.

  • Dec 2021, Prompting Visual-Language Models for Efficient Video Understanding. Preprint.

  • Oct 2021, Segmenting Invisible Moving Objects. BMVC2021.

  • Oct 2021, Audio-Visual Synchronisation In The Wild. BMVC2021.

  • Oct 2021, All You Need Are a Few Pixels: Semantic Segmentation with PixelPick. ICCV2021, ILDAV Workshop.   (Best Paper Award)

  • Sep 2021, ImplicitVol: Sensorless 3D Ultrasound Reconstruction with Deep Implicit Representation. Preprint

  • Sep 2021, Self-supervised Tumor Segmentation through Layer Decomposition. Preprint

  • July 2021, Self-supervised Video Object Segmentation by Motion Grouping. ICCV2021.   (Best Paper Award at CVPR Workshop)