Vasu Sharma
Applied Research Scientist Lead at Meta AI | Building the Future of Multimodal Foundation Models
Leading groundbreaking research in multimodal AI, working on the Llama 3/4 and Chameleon models. Graduate degree in Multimodal Machine Learning from Carnegie Mellon University (Dept. Rank #1) and an undergraduate degree in Computer Science and Engineering from IIT Kanpur (JEE rank 165, GPA 9.9/10). Passionate about pushing the boundaries of AI.
Leading AI Research at Scale
I'm an Applied Research Scientist Lead at Meta AI, where I spearhead the development of multimodal foundation models including Chameleon, Llama 3/4, and DINOv2. My work spans trillion-token-scale pretraining, mixture-of-experts architectures, and next-generation conversational AI agents.
With a Master's from Carnegie Mellon (Department Rank #1) and experience across Meta, Amazon Alexa, and Citadel, I bridge cutting-edge research with real-world applications.
Multimodal Foundation Models
Large Language Models (LLMs)
Computer Vision & NLP
Speech & Audio Processing
Reinforcement Learning
AI Safety & Evaluation
Research & Industry Experience
Education
Key Research Internships
Research Impact & Publications
Advancing the frontiers of AI and machine learning through impactful research contributions
DINOv2: Learning Robust Visual Features without Supervision
A breakthrough self-supervised learning approach that learns powerful visual representations without requiring labeled data, achieving state-of-the-art performance across multiple computer vision tasks.
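As a rough illustration of the idea (not the DINOv2 training code), the sketch below trains a student network to match an exponential-moving-average teacher on two augmentations of the same unlabeled batch; the networks, dimensions, and hyperparameters are toy values invented for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch only (not the DINOv2 implementation): a student network
# learns to match an EMA teacher's output on a different augmentation of the
# same unlabeled image, so no labels are required.

def toy_backbone(out_dim=128):
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.GELU(),
                         nn.Linear(256, out_dim))

student, teacher = toy_backbone(), toy_backbone()
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

def self_distillation_step(view_a, view_b, temp_s=0.1, temp_t=0.04, ema=0.996):
    with torch.no_grad():
        target = F.softmax(teacher(view_a) / temp_t, dim=-1)   # soft teacher targets
    log_pred = F.log_softmax(student(view_b) / temp_s, dim=-1)
    loss = -(target * log_pred).sum(dim=-1).mean()             # cross-view cross-entropy
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                                      # EMA update of the teacher
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(ema).add_(ps, alpha=1 - ema)
    return loss.item()

# Two random "augmented views" of a toy unlabeled batch.
images = torch.randn(8, 3, 32, 32)
loss = self_distillation_step(images + 0.1 * torch.randn_like(images),
                              images + 0.1 * torch.randn_like(images))
```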
MAViL: Masked Audio-Video Learners
Novel multi-modal learning framework that jointly processes audio and video through masked reconstruction, enabling robust cross-modal understanding.
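The sketch below illustrates the general masked audio-video idea rather than the MAViL implementation: a fraction of audio and video patch tokens is masked, the visible tokens are encoded jointly, and the model is trained to reconstruct the masked content. All module names and dimensions here are invented for the example.

```python
import torch
import torch.nn as nn

# Illustrative sketch only (not the MAViL code): joint masked reconstruction
# over concatenated audio and video patch tokens.

class ToyMaskedAVLearner(nn.Module):
    def __init__(self, dim=256, n_heads=4, n_layers=2):
        super().__init__()
        self.audio_proj = nn.Linear(128, dim)   # toy audio patch features
        self.video_proj = nn.Linear(768, dim)   # toy video patch features
        layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.audio_decoder = nn.Linear(dim, 128)
        self.video_decoder = nn.Linear(dim, 768)

    def forward(self, audio_patches, video_patches, mask_ratio=0.75):
        a = self.audio_proj(audio_patches)
        v = self.video_proj(video_patches)
        tokens = torch.cat([a, v], dim=1)
        # Randomly mask tokens by zeroing them (real MAE-style pipelines drop them).
        mask = torch.rand(tokens.shape[:2], device=tokens.device) < mask_ratio
        visible = tokens.masked_fill(mask.unsqueeze(-1), 0.0)
        encoded = self.encoder(visible)
        a_enc, v_enc = encoded.split([a.shape[1], v.shape[1]], dim=1)
        a_mask, v_mask = mask.split([a.shape[1], v.shape[1]], dim=1)
        # Reconstruction loss computed on masked positions only.
        loss = (
            ((self.audio_decoder(a_enc) - audio_patches) ** 2).mean(-1)[a_mask].mean()
            + ((self.video_decoder(v_enc) - video_patches) ** 2).mean(-1)[v_mask].mean()
        )
        return loss

model = ToyMaskedAVLearner()
loss = model(torch.randn(2, 32, 128), torch.randn(2, 64, 768))
```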
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Pioneering foundation model architecture that seamlessly integrates multiple modalities through early fusion, enabling unified reasoning across text, images, and code.
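The sketch below is a minimal illustration of the early-fusion idea, not the Chameleon architecture itself: images are assumed to be quantized into discrete codebook tokens, interleaved with text tokens in one sequence, and modeled autoregressively by a single transformer over a shared vocabulary. All vocabulary and model sizes are toy values.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: one token sequence, one embedding table, and one
# autoregressive transformer shared across text and image tokens.

TEXT_VOCAB = 32000        # toy sizes, not the real ones
IMAGE_CODEBOOK = 8192
VOCAB = TEXT_VOCAB + IMAGE_CODEBOOK

class ToyEarlyFusionLM(nn.Module):
    def __init__(self, dim=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)   # one table for both modalities
        layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(dim, VOCAB)

    def forward(self, token_ids):
        x = self.embed(token_ids)
        # Causal mask so each position attends only to earlier tokens.
        n = token_ids.shape[1]
        causal = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
        h = self.blocks(x, mask=causal)
        return self.lm_head(h)            # next-token logits over the joint vocab

# Text token ids and (hypothetical) image codebook ids share one sequence.
text_ids = torch.randint(0, TEXT_VOCAB, (1, 16))
image_ids = torch.randint(TEXT_VOCAB, VOCAB, (1, 64))   # offset into the image range
sequence = torch.cat([text_ids, image_ids], dim=1)
logits = ToyEarlyFusionLM()(sequence)
```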
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Innovative approach to creating mixture-of-experts models by combining specialized language models, achieving superior performance with efficient parameter utilization.
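As a rough illustration (not the Branch-Train-MiX code), the sketch below treats the feed-forward blocks of several separately trained expert models as the experts of a single mixture-of-experts layer, with a small learned router selecting the top-k experts per token; every name and dimension here is hypothetical.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: combine pre-trained expert FFNs behind a learned
# token-level router to form one mixture-of-experts layer.

class ToyMoEFromExperts(nn.Module):
    def __init__(self, expert_ffns, dim, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(expert_ffns)   # e.g. math, code, general FFNs
        self.router = nn.Linear(dim, len(expert_ffns))
        self.top_k = top_k

    def forward(self, x):                           # x: (batch, seq, dim)
        scores = self.router(x)                     # (batch, seq, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        # Route each token through its selected experts and mix the outputs.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                hit = chosen[..., slot] == e        # tokens assigned to expert e
                if hit.any():
                    out[hit] += weights[..., slot][hit].unsqueeze(-1) * expert(x[hit])
        return out

dim = 64
# Stand-ins for FFN blocks lifted from separately trained expert models.
experts = [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
           for _ in range(3)]
moe = ToyMoEFromExperts(experts, dim)
y = moe(torch.randn(2, 10, dim))
```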
Demystifying CLIP Data
Comprehensive analysis of CLIP training data, providing crucial insights into data quality, bias, and their impact on model performance and fairness.
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models
Thorough evaluation framework for vision-language models, revealing important limitations and proposing improvements for better multi-modal understanding.
Let's Build the Future of AI Together
Interested in collaborating on cutting-edge AI research? Looking for expertise in multimodal foundation models or large-scale AI systems? I'm always excited to discuss innovative projects and research opportunities.