Maitreya Patel

Ph.D. Student, School of Computing & AI, Arizona State University.


I am a senior Ph.D. student at Arizona State University (ASU). I am working alongside Yezhou Yang and Chitta Baral. I closely collaborate with Tejas Gokhale and Changhoon Kim.

My research focuses on the theoretical foundations of visual generative models and their applications in conditional sampling, including image/video editing, inverse problems, and personalization. I am also interested in representation learning, large-scale multimodal foundational models, and inference-time steering to enhance the controllability and reliability of generative models. I believe true World Models must be generalizable, efficient, controllable, responsible, and grounded in physical laws.

Alongside my research, I am writing The Stochastic Journey — a blog series that delves into the mathematical foundations of generative models, tracing their roots in stochastic calculus, probability theory, and differential equations.

I am always looking for self-motivated students who want to work on either fundamental problems or the responsibility aspects of Generative AI. If you have prior experience in related fields and are interested, feel free to reach out to me.

News

Jan 22, 2025 Voilà has been accepted at ICLR’25. ✨
Nov 29, 2024 🚀 🚀 Releasing FlowChef, Steering Rectified Flow Models in the Vector Field for Controlled Image Generation, for training-, inversion-, and gradient-free controlled image generation. 🔥 ✨
Oct 31, 2024 λ-ECLIPSE, the resource-efficient Multi-Subject Text-to-Image Model, has been accepted at TMLR. 🔥 🔥
Sep 20, 2024 One paper (lead author) accepted at NeurIPS (main conference). 🔥 ✨
Sep 20, 2024 One paper accepted at EMNLP (findings). ✨

Selected Publications

  1. Steering Rectified Flow Models in the Vector Field for Controlled Image Generation
    Maitreya Patel, Song Wen, Dimitris N. Metaxas, and Yezhou Yang

    In arXiv 2024

  2. Voilà: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
    Nilay Yilmaz, Maitreya Patel, Yiran Luo, Tejas Gokhale, Chitta Baral, Suren Jayasuriya, and 1 more author

    In ICLR (Main Conference) – 2025

  3. TripletCLIP: Improving Compositional Reasoning of CLIP via Vision-Language Negatives
    Maitreya Patel, Abhiram Kusumba, Sheng Cheng, Changhoon Kim, Chitta Baral, and Yezhou Yang

    In NeurIPS (Main Conference) – 2024

  4. λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
    Maitreya Patel, Sangmin Jung, Chitta Baral, and Yezhou Yang
    Media Coverage: AK, MarkTechPost

    In Transactions on Machine Learning Research (TMLR) 2024

  5. Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model
    Sheng Cheng, Maitreya Patel, and Yezhou Yang

    In EMNLP (findings) – 2024

  6. ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
    Maitreya Patel, Changhoon Kim, Sheng Cheng, Chitta Baral, and Yezhou Yang

    In CVPR – 2024

  7. WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
    Changhoon Kim*, Kyle Min*, Maitreya Patel, Sheng Cheng, and Yezhou Yang
    Media Coverage: AK

    In CVPR – 2024

  8. ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
    Maitreya Patel, Tejas Gokhale, Chitta Baral, and Yezhou Yang

    In AAAI’24 | Diffusion Workshop at NeurIPS – 2023

  9. CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
    Maitreya Patel, Tejas Gokhale, Chitta Baral, and Yezhou Yang

    In EMNLP, Main Conference – 2022

  10. Benchmarking generalization via in-context instructions on 1,600+ language tasks
    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, and others

    In EMNLP, Main Conference – 2022

  11. MSpeC-Net: Multi-Domain Speech Conversion Network
    Harshit Malaviya, Jui Shah, Maitreya Patel, Jalansh Munshi, and Hemant A Patil

    In 45th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

  12. CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion
    Maitreya Patel, Mirali Purohit, Jui Shah, and Hemant A Patil

    In 28th European Signal Processing Conference (EUSIPCO) 2020

  13. Weak Speech Supervision: A case study of Dysarthria Severity Classification
    Mirali Purohit, Mihir Parmar, Maitreya Patel, Harshit Malaviya, and Hemant A Patil

    In 28th European Signal Processing Conference (EUSIPCO) 2020

  14. Novel adaptive generative adversarial network for voice conversion
    Maitreya Patel, Mihir Parmar, Savan Doshi, Nirmesh J Shah, and Hemant A Patil

    In 11th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2019

  15. Effectiveness of cross-domain architectures for whisper-to-normal speech conversion
    Mihir Parmar, Savan Doshi, Nirmesh J Shah, Maitreya Patel, and Hemant A Patil

    In 27th European Signal Processing Conference (EUSIPCO) 2019