Maitreya Patel

Research Scientist, Adobe.


I am a Research Scientist at Adobe, working on generative models for visual content creation. Previously, I completed my Ph.D. at Arizona State University, advised by Yezhou Yang and Chitta Baral.

My research is focused on building unified generative models (omni models) — single architectures that natively understand and generate across modalities — and on the large-scale training infrastructure that makes them possible. This spans architecture design for unified tokenization and multimodal generation, pre-training and RL alignment at scale, and inference-time steering for controllability and reliability. I believe true World Models must be generalizable, efficient, controllable, responsible, and grounded in physical laws.

Past Affiliations

Adobe, Sony AI, Arizona State University, ZS Associates

News

Mar 1, 2026 🎉 VibeToken accepted at CVPR 2026. 🔥 🔥
Sep 18, 2025 EraseFlow accepted at NeurIPS’25 as Spotlight. 🔥 🔥
Jul 25, 2025 🚀🚀 FlowChef and RefEdit are accepted at ICCV 2025! We’ll also host a tutorial. See you in Hawaii!
Jun 5, 2025 📝 Released RefEdit, a referring-expression-based image editing framework. Check out our paper and page! ✨
Jan 22, 2025 Voilà has been accepted at ICLR’25. 🔥

Selected Publications

  1. VibeToken: Scaling 1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations
    Maitreya Patel, Jingtao Li, Weiming Zhuang, Yezhou Yang, and Lingjuan Lv

    In CVPR 2026

  2. EraseFlow: Learning Concept Erasure Policies via GFlowNet-Driven Alignment
    Abhiram Kusumba*, Maitreya Patel*, Kyle Min, Changhoon Kim, Chitta Baral, and Yezhou Yang

    In NeurIPS (Spotlight) 2025

  3. RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model for Referring Expression
    Bimsara Pathiraja*, Maitreya Patel*, Shivam Singh, Yezhou Yang, and Chitta Baral

    In ICCV 2025

  4. Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
    Maitreya Patel, Song Wen, Dimitris N. Metaxas, and Yezhou Yang

    In ICCV 2025

  5. Voilà: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
    Nilay Yilmaz, Maitreya Patel, Yiran Luo, Tejas Gokhale, Chitta Baral, Suren Jayasuriya, and 1 more author

    In ICLR (Main Conference) – 2025

  6. TripletCLIP: Improving Compositional Reasoning of CLIP via Vision-Language Negatives
    Maitreya Patel, Abhiram Kusumba, Sheng Cheng, Changhoon Kim, Chitta Baral, and Yezhou Yang

    In NeurIPS (Main Conference) – 2024

  7. λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
    Maitreya Patel, Sangmin Jung, Chitta Baral, and Yezhou Yang
    Media Coverage: AK, MarkTechPost

    In Transactions on Machine Learning Research (TMLR) 2024

  8. Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model
    Sheng Cheng, Maitreya Patel, and Yezhou Yang

    In EMNLP (Findings) – 2024

  9. ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
    Maitreya Patel, Changhoon Kim, Sheng Cheng, Chitta Baral, and Yezhou Yang

    In CVPR – 2024

  10. WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
    Changhoon Kim*, Kyle Min*, Maitreya Patel, Sheng Cheng, and Yezhou Yang
    Media Coverage: AK

    In CVPR – 2024

  11. ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
    Maitreya Patel, Tejas Gokhale, Chitta Baral, and Yezhou Yang

    In AAAI’24 | Diffusion Workshop at NeurIPS – 2023

  12. Benchmarking generalization via in-context instructions on 1,600+ language tasks
    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, and others

    In EMNLP, Main Conference – 2022