Baseten · 25 days ago

Engineering Manager, Runtime Fabric

Full-time · Remote

Responsibilities

  • 01 Recruit, hire, and develop a high-performing team of systems engineers with deep container and Linux expertise.
  • 02 Foster a culture of technical rigor, open-source contribution, and continuous improvement.
  • 03 Provide regular coaching, feedback, and career development support to your direct reports.
  • 04 Partner with engineering leadership to define the long-term vision and roadmap for container runtime and storage infrastructure.
  • 05 Guide the team in extending and hardening containerd, runc, and related OCI ecosystem projects to meet the GPU-specific requirements of production AI inference, including startup performance, GPU device access, and multi-tenant isolation.
  • 06 Oversee the architecture and evolution of the Baseten Delivery Network (BDN): the tiered caching and weight delivery system that makes cold starts 2–3x faster and eliminates thundering-herd failures during burst scaling events.
  • 07 Drive the expansion of BDN's architecture, currently focused on model weights, to container images, training checkpoints, and deployment artifacts.
  • 08 Provide technical oversight on GPU-aware isolation mechanisms for multi-tenant inference, including secure container runtimes, Linux namespace hardening, and longer-term micro-VM integration.
  • 09 Ensure the team maintains end-to-end ownership of the container startup performance path, from snapshotter initialization through weight delivery to first inference request.
  • 10 Champion the team's contributions back to the open-source containerd ecosystem, working alongside a team of core maintainers.
  • 11 Act as the primary advocate for Runtime Fabric across the organization, ensuring upstream and downstream teams have the integration support they need.
  • 12 Collaborate with product and engineering stakeholders to prioritize investments based on business impact and infrastructure reliability.
  • 13 Communicate team progress, technical trade-offs, and architectural decisions clearly to leadership.

Requirements

  • 01 Proven experience managing and growing engineering teams in a systems, infrastructure, or low-level runtime context.
  • 02 Deep familiarity with the Linux container ecosystem: containerd, runc, OCI Runtime Spec, Linux namespaces, and cgroups, with the ability to engage credibly in code reviews and architectural discussions.
  • 03 Contributions to containerd/containerd, opencontainers/runc, google/gvisor, kata-containers/kata-containers, or closely related open-source projects.
  • 04 Strong systems programming background in Go and/or C/C++.
  • 05 Experience with distributed storage systems, content-addressable storage, or large-scale caching infrastructure.
  • 06 Understanding of how container images are structured, stored, and delivered at scale.
  • 07 Strong written and verbal communication skills, with the ability to influence without authority across teams.

Benefits

  • 01 Competitive compensation, including meaningful equity.
  • 02 100% coverage of medical, dental, and vision insurance for employees and dependents.
  • 03 Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day).
  • 04 Paid parental leave.
  • 05 Fertility and family-building stipend through Carrot.
  • 06 Company-facilitated 401(k).
  • 07 Exposure to a variety of projects.