Leading AI Research at Scale

I'm an Applied Research Scientist Lead at Meta AI, where I spearhead the development of multimodal foundation models including Chameleon, Llama 3/4, and DINOv2. My work spans trillion-token-scale pretraining, mixture-of-experts architectures, and next-generation conversational AI agents.

With a Master's from Carnegie Mellon (Department Rank #1) and experience across Meta, Amazon Alexa, and Citadel, I bridge cutting-edge research with real-world applications.

100+ Publications
11k+ Citations
6 Major Models
3 Top Companies
🧠

Multimodal Foundation Models

Leading development of next-generation models that understand text, images, and audio simultaneously
💬

Large Language Models (LLMs)

Architecting trillion-parameter models with advanced training and optimization techniques
👁️

Computer Vision & NLP

Bridging visual understanding with natural language processing for comprehensive AI systems
🎵

Speech & Audio Processing

Advanced audio AI systems, from Amazon Alexa to next-generation conversational agents
🎯

Reinforcement Learning

Training AI agents to make optimal decisions through reward-based learning systems
🛡️

AI Safety & Evaluation

Ensuring responsible AI deployment through rigorous testing and safety protocols
Professional Experience

Research & Industry Trajectory

A focused journey through leading tech institutions, specializing in multimodal foundation models, large-scale AI infrastructure, and quantitative research.

M Meta AI
2022 - Present
Applied Research Scientist Lead

Spearheading multimodal foundation model development. Instrumental in creating the Llama 3/4 and Chameleon models. Conducting pioneering research in trillion-token pretraining and advanced conversational AI agents.

A Amazon Alexa
2021 - 2022
Applied Scientist

Developed comprehensive visual-language navigation benchmarks. Engineered efficient multimodal transformers and optimized video processing applications for Alexa's core AI systems.

C Citadel LLC
2019 - 2021
Quantitative Research Analyst

Architected automated ML pipelines for high-frequency trading strategies. Optimized distributed computing frameworks to handle large-scale financial data processing.

Research Impact & Publications

Advancing the frontiers of AI and machine learning through impactful research contributions

100+ Publications
11,000+ Citations
16 H-index
20 Top-tier Venues
NeurIPS 2023

MAViL: Masked Audio-Video Learners

Novel multi-modal learning framework that jointly processes audio and video through masked reconstruction, enabling robust cross-modal understanding.

ArXiv 2024

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Pioneering foundation model architecture that seamlessly integrates multiple modalities through early fusion, enabling unified reasoning across text, images, and code.

COLM 2024

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Innovative approach to creating mixture-of-experts models by combining specialized language models, achieving superior performance with efficient parameter utilization.

ICLR 2024

Demystifying CLIP Data

Comprehensive analysis of CLIP training data, providing crucial insights into data quality and bias, and into their impact on model performance and fairness.

CVPR 2024

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models

Thorough evaluation framework for vision-language models, revealing important limitations and proposing improvements for better multi-modal understanding.

Let's Build the Future of AI Together

Interested in collaborating on cutting-edge AI research? Looking for expertise in multimodal foundation models or large-scale AI systems? I'm always excited to discuss innovative projects and research opportunities.

Areas of Collaboration

Foundation Model Research
Multimodal AI Systems
AI Safety & Evaluation
Speaking Engagements
Technical Consulting