Anthropic, 14.04.2026
Full-Stack Software Engineer, Reinforcement Learning
San Francisco
Responsibilities
- Build and extend web platforms for RL environment creation, management, and quality review, including environment configuration, versioning, and validation workflows
- Develop vendor-facing interfaces and tooling that let external partners create, submit, and iterate on training environments with minimal friction
- Design and implement platforms for human data collection at scale, including labeling workflows, quality assurance systems, and feedback mechanisms that surface reward signal integrity issues early
- Build evaluation dashboards and observability UIs that give researchers real-time insight into environment quality, training run health, and reward hacking
- Create backend services and APIs that connect environment authoring tools, data collection systems, and RL training infrastructure
- Build and expand scalable code data generation pipelines, producing diverse programming tasks with robust reward signals across languages and difficulty levels
- Develop onboarding automation and documentation tooling so new vendors and internal users ramp up in hours, not weeks
- Partner closely with RL researchers, data operations, and vendor management to translate ambiguous requirements into well-scoped, well-designed products
Requirements
- Strong software engineering fundamentals and real full-stack range: comfortable owning a surface from database schema to frontend
- Proficient in Python and a modern web stack (React, TypeScript, or similar)
- Track record of shipping systems that solved a hard problem, not just shipped on time
- Operate with high agency: identify what needs to be done and drive it forward without waiting for a ticket
- Thrive in a fast-moving environment where priorities shift and the next problem is often one nobody has solved before
- Care about Anthropic's mission to build safe, beneficial AI and want your work to contribute directly to it
- Communicate clearly with researchers, operations teams, and engineers, and turn vague asks into well-scoped work