Maitreya Patel

Ph.D. Student, School of Computing & AI, Arizona State University.

I am a Ph.D. student at Arizona State University (ASU), where I work with Yezhou Yang and Chitta Baral. I also collaborate closely with Tejas Gokhale and Changhoon Kim.

My research focuses on robustness and reliability in vision-language models. I currently specialize in computer vision, particularly generative and diffusion models, concept algebra, model attribution, and few-shot learning. I firmly believe that advancing representation learning is essential to improving the compositionality and reliability of machine learning systems in the long run.

I am always looking for self-motivated students who want to work on either fundamental problems or the responsibility aspects of Generative AI. If you are interested and have prior experience in related fields, feel free to reach out.

News

Oct 31, 2024 λ-ECLIPSE, the resource-efficient multi-subject text-to-image model, was accepted at TMLR. 🔥 🔥
Sep 20, 2024 One paper (lead author) accepted at NeurIPS (main conference). 🔥 ✨
Sep 20, 2024 One paper accepted at EMNLP (findings). ✨
Jul 13, 2024 Organizing the NeurIPS'24 workshop on Responsibly Building the Next Generation of Multimodal Foundational Models.
May 28, 2024 🎬 Started as a Research Intern at Adobe for Summer 2024.

Selected Publications

  1. TripletCLIP: Improving Compositional Reasoning of CLIP via Vision-Language Negatives
    Maitreya Patel, Abhiram Kusumba, Sheng Cheng, Changhoon Kim, Chitta Baral, and Yezhou Yang

    In NeurIPS (Main Conference) – 2024

  2. λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
    Maitreya Patel, Sangmin Jung, Chitta Baral, and Yezhou Yang
    Media Coverage: AK, MarkTechPost

    In Transactions on Machine Learning Research (TMLR) 2024

  3. Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model
    Sheng Cheng, Maitreya Patel, and Yezhou Yang

    In EMNLP (findings) – 2024

  4. ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations
    Maitreya Patel, Changhoon Kim, Sheng Cheng, Chitta Baral, and Yezhou Yang

    In CVPR – 2024

  5. WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
    Changhoon Kim*, Kyle Min*, Maitreya Patel, Sheng Cheng, and Yezhou Yang
    Media Coverage: AK

    In CVPR – 2024

  6. ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
    Maitreya Patel, Tejas Gokhale, Chitta Baral, and Yezhou Yang

    In AAAI’24 | Diffusion Workshop at NeurIPS – 2023

  7. CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
    Maitreya Patel, Tejas Gokhale, Chitta Baral, and Yezhou Yang

    In EMNLP, Main Conference – 2022

  8. Benchmarking generalization via in-context instructions on 1,600+ language tasks
    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, and others

    In EMNLP, Main Conference – 2022

  9. MSpeC-Net: Multi-Domain Speech Conversion Network
    Harshit Malaviya, Jui Shah, Maitreya Patel, Jalansh Munshi, and Hemant A Patil

    In 45th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020

  10. CinC-GAN for Effective F0 prediction for Whisper-to-Normal Speech Conversion
    Maitreya Patel, Mirali Purohit, Jui Shah, and Hemant A Patil

    In 28th European Signal Processing Conference (EUSIPCO) 2020

  11. Weak Speech Supervision: A case study of Dysarthria Severity Classification
    Mirali Purohit, Mihir Parmar, Maitreya Patel, Harshit Malaviya, and Hemant A Patil

    In 28th European Signal Processing Conference (EUSIPCO) 2020

  12. Novel adaptive generative adversarial network for voice conversion
    Maitreya Patel, Mihir Parmar, Savan Doshi, Nirmesh J Shah, and Hemant A Patil

    In 11th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2019

  13. Effectiveness of cross-domain architectures for whisper-to-normal speech conversion
    Mihir Parmar, Savan Doshi, Nirmesh J Shah, Maitreya Patel, and Hemant A Patil

    In 27th European Signal Processing Conference (EUSIPCO) 2019