Anthropic, 29.04.2026
Senior Staff+ Software Engineer, Node Infra
San Francisco
Responsibilities
- Own the technical strategy and roadmap for node lifecycle management: ingestion, bring-up, health checking, and automated repair
- Drive cross-team initiatives to build and scale AI clusters across multiple clouds and accelerator families
- Design and operate the systems that detect, isolate, and remediate unhealthy hardware automatically, driving up fleet MTBI and minimizing stranded capacity
- Define infrastructure architecture, ensuring the hardest problems get solved, whether by you directly or by working through others
- Work closely with cloud providers and internal research/inference/product teams to shape long-term compute, data, and infrastructure strategy
- Establish and evolve operational excellence practices (incident response, postmortem culture, on-call)
- Support the growth of engineers around you through technical mentorship and coaching
Requirements
- Deep expertise in distributed systems, reliability, and cloud platforms (e.g., Kubernetes, IaC, AWS/GCP/Azure)
- Strong proficiency in at least one systems language (e.g., Rust, Go, or Python), plus IaC proficiency with Terraform
- Hands-on experience with machine learning accelerators (GPUs, TPUs, or Trainium)
- Track record of leading complex, multi-quarter technical initiatives that span multiple teams or systems
- Ability to build alignment across senior stakeholders and communicate effectively at all levels
- Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
- Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
Conditions
- Annual Salary: $320,000–$405,000 USD
- Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time
- Visa sponsorship: We do sponsor visas