I’m a PhD student at Mila - Quebec AI Institute and the University of Montreal, working with Aishwarya Agrawal. I’m also a visiting researcher on the Multimodal Foundation Models Team at ServiceNow Research, working with Sai Rajeswar.
Prior to Mila, I received an MSc in CS from the University of Saskatchewan and a BSc in CS from Noakhali Science and Technology University.
Research
I’m interested in building AI systems that understand the physical world. My research focuses on two key areas: (1) learning rich visual representations that capture the underlying structure of the world, and (2) developing generative world models, especially diffusion- and flow-based approaches, that can simulate and predict dynamics. I’m also excited about scaling foundation models and RL to superhuman performance. All of this ties back to bigger questions around alignment and economic impact.
Highlighted Work
See my Google Scholar for a full list of publications.
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi*, Rabiul Awal*, Ankur Sikarwar*, Amirhossein Kazemnejad*, Ge Ya Luo, Juan Rodriguez, Sai Rajeswar, Siva Reddy, Chris Pal, Benno Krojer, Aishwarya Agrawal
Preprint
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
Juan A. Rodriguez*, Haotian Zhang*, Abhay Puri, Aarash Feizi, Rishav Pramanik, Pascal Wichmann, Arnab Mondal, Mohammad Reza, Rabiul Awal + 6 others
Preprint
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation
Rabiul Awal, Mahsa Massoud, Aarash Feizi, Zichao Li + 9 others
EMNLP'25 (Main Conference) Spotlight at MAR Workshop @ CVPR'25
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Rajiv Didolkar*, Andrii Zadaianchuk*, Rabiul Awal*, Maximilian Seitzer, Efstratios Gavves, Aishwarya Agrawal
CVPR'25 Spotlight at MAR Workshop @ CVPR'25
VisMin: Visual Minimal-Change Understanding
Rabiul Awal*, Saba Ahmadi*, Le Zhang*, Aishwarya Agrawal
NeurIPS'24
Contrasting intra-modal and ranking cross-modal hard negatives to enhance visio-linguistic fine-grained understanding
Le Zhang, Rabiul Awal, Aishwarya Agrawal
CVPR'24 Spotlight at O-DRUM Workshop @ CVPR'23
CulturalVQA: Benchmarking Vision Language Models for Cultural Knowledge
Shravan Nayak, Kanishk Jain, Rabiul Awal + 5 others
EMNLP'24 Oral
Blog
My writings on how-to-cs-grad, AI research, and systemic issues in Bangladesh.