Poolside · 28.04.2026

Member of Engineering (Reinforcement Learning)

Full-time · Remote

Responsibilities

  • Research and experiment with ways to improve reasoning and code generation in LLMs
  • Own the full experiment life cycle, from idea to experimentation and integration
  • Keep up with the latest research and stay familiar with the state of the art in LLMs, RL, and code generation
  • Translate research ideas into clean, reusable codebases that other researchers can build on
  • Design, analyze, and iterate on data generation and training of LLMs
  • Implement and iterate on RL training pipelines that scale reliably across domains
  • Diagnose training instabilities and failures, debug RL runs, and propose mitigations
  • Write high-quality, reproducible, and maintainable code

Requirements

  • Experience with Large Language Models (LLMs), including: understanding of the Transformer architecture and scaling laws; mid-training and post-training techniques; experience training reasoning and/or agentic models; hands-on use of LLMs, with a sense of their capabilities and limitations
  • Reinforcement Learning experience: solid grasp of Reinforcement Learning concepts and familiarity with modern algorithms; experience developing distributed, large-scale RL pipelines, from data creation to evaluations
  • Research experience: scientific publications in any of the following topics: Reinforcement Learning, LLMs, and reasoning models; ability to discuss the latest research in sufficient detail; reasonably opinionated
  • Engineering skills: strong machine learning and algorithm skills and an engineering background; experience with distributed training; excellent programming skills in Python; familiarity with a deep learning framework (PyTorch or JAX)

Benefits

  • Fully remote work & flexible hours
  • 37 days/year of vacation & holidays
  • Health insurance allowance for you & dependents
  • 16 weeks of flexible, full-pay parental leave
  • Well-being, always-be-learning & home office allowances
  • Company-provided equipment
  • Frequent team get-togethers
  • Diverse & inclusive people-first culture