xAI, 18.04.2026

Member of Technical Staff - Multimodal Understanding

Palo Alto

Responsibilities

  • Design, build, and optimize large-scale distributed systems for multimodal pre-training, post-training, inference, data processing, and tokenization at web/petabyte scale
  • Develop high-throughput pipelines for data acquisition, preprocessing, filtering, generation, decoding, loading, crawling, visualization, and management (images, videos, audio + text)
  • Advance multimodal capabilities including spatial-temporal compression, cross-modal alignment, world modeling, reasoning, emergent abilities, audio/image/video understanding & generation, real-time video processing, and noisy data handling
  • Drive data quality and studies: curation (human/synthetic), filtering techniques, analysis, and scalable pipelines to support trillion-parameter models
  • Create evaluation frameworks, internal benchmarks, reward models, and metrics that capture real-world usage, failure modes, interactive dynamics, and human-AI synergy
  • Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling paradigms for state-of-the-art performance
  • Build research tooling, user-friendly interfaces, prototypes/demos, full-stack applications, and enable rapid iteration based on feedback
  • Work across the stack (pre-training → SFT/RL/post-training) to enable reasoning, tool calling, agentic behaviors, orchestration, and seamless real-time interactions

Requirements

  • Hands-on experience with multimodal pre-training, post-training, or fine-tuning (vision, audio, video, or cross-modal)
  • Expert-level proficiency in Python (core language), with strong experience in at least one of: JAX / PyTorch / XLA
  • Proven track record building or optimizing large-scale distributed ML systems (training/inference optimization, GPU utilization, multi-GPU/TPU setups, hardware co-design)
  • Deep experience designing and running data pipelines at scale: curation, filtering, generation, quality studies, especially for noisy/real-world multimodal data
  • Strong fundamentals in evaluation design, benchmarks, reward modeling, or RL techniques (particularly for interactive/agentic behaviors)
  • Proactive self-starter who thrives in high-intensity environments and is passionate about pushing multimodal AI frontiers
  • Willingness to own end-to-end initiatives and do whatever it takes to deliver breakthrough user experiences

Compensation and Benefits

  • Base salary: $180,000 - $440,000 USD
  • Equity
  • Comprehensive medical, vision, and dental coverage
  • Access to a 401(k) retirement plan
  • Short & long-term disability insurance
  • Life insurance
  • Various other discounts and perks