Rabiul Awal

I’m a PhD student at Mila - Quebec AI Institute and the University of Montreal, working with Aishwarya Agrawal. I’m also a visiting researcher on the Multimodal Foundation Models Team at ServiceNow Research, working with Sai Rajeswar.

Prior to Mila, I received an MSc in CS from the University of Saskatchewan and a BSc in CS from Noakhali Science and Technology University.

Research

Building AI systems that truly understand the physical world is an exciting frontier. My research focuses on two directions: (1) learning rich visual representations that capture the true structure of the world, and (2) developing controllable generative (diffusion) world models. Both connect naturally to my interest in alignment and economic utility.

Highlighted Work

See my Google Scholar for a full list of publications.

WebMMU

WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation

Rabiul Awal et al.

DL4C Workshop @ ICLR'25

CTRL-O

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Aniket Rajiv Didolkar*, Andrii Zadaianchuk*, Rabiul Awal*, Maximilian Seitzer, Efstratios Gavves, Aishwarya Agrawal

CVPR'25

VisMin

VisMin: Visual Minimal-Change Understanding

Rabiul Awal*, Saba Ahmadi*, Le Zhang*, Aishwarya Agrawal

NeurIPS'24

Compositional CLIP

Contrasting intra-modal and ranking cross-modal hard negatives to enhance visio-linguistic fine-grained understanding

Le Zhang, Rabiul Awal, Aishwarya Agrawal

CVPR'24

CulturalVQA

CulturalVQA: Benchmarking Vision Language Models for Cultural Knowledge

S. Nayak, K. Jain, R. Awal, et al.

EMNLP'24 (Oral)

Blog

My writing on how-to-cs-grad, AI research, and systemic issues in Bangladesh.