I'm a PhD student at Mila and the University of Montreal, advised by Aishwarya Agrawal. This summer I'll join Qualcomm Research in Amsterdam as an intern. I work on generative models, world models, and reinforcement learning. My research is supported by the Fonds de recherche du Québec – Nature et technologies (FRQNT).
Research
Scaling on internet text has taken us remarkably far, further than most of us expected. And so the path to physical intelligence is now vivid. The manifesto is worth reading: the era of experience is here. We need models that can simulate the world from human-data imitation and go well beyond, creating new experiences in imagination.
The generative paradigm, diffusion and its family, is one of the most beautiful things in modern ML, and it actually works. The tools we already use to generate images and video are, with the right training signal, world models. To get them to work for an agent in a big world, we need better representations and better policy gradient algorithms for scaling experience.
A few deep-learning "recipes" I keep coming back to: the target network trick from RL, which later echoed through SSL and flow-based models; and asymmetric views in self-supervised learning.
Selected publications
See Google Scholar for the full list. * denotes equal contribution.
- One Flow-Transformer for Imagination and Control
- Grounding Computer-Use Agents from Demonstrations
- The Promise of RL for Autoregressive Image Editing
- Rendering-Aware RL for Vector Graphics
- CTRL-O: Language-Controllable Object-Centric Representations
- VisMin: Visual Minimal-Change Understanding
- Hard Negatives to Enhance Visio-Linguistic Compositional Understanding
Misc
Talks
Slides from a few talks I put extra care into, mostly for our group at Mila.
- Few-step diffusion modeling
- Score-based generative models and diffusion models
Academic Service
- Co-organizer, Mila, 2026
- IFT 6765 – Links between Computer Vision and Language, Graduate Student Instructor, Mila, 2025