Baseten · 25 days ago
Engineering Manager, Runtime Fabric
Full-time · Remote
Responsibilities
- Recruit, hire, and develop a high-performing team of systems engineers with deep container and Linux expertise.
- Foster a culture of technical rigor, open-source contribution, and continuous improvement.
- Provide regular coaching, feedback, and career development support to your direct reports.
- Partner with engineering leadership to define the long-term vision and roadmap for container runtime and storage infrastructure.
- Guide the team in extending and hardening containerd, runc, and related OCI ecosystem projects to meet the GPU-specific requirements of production AI inference, including startup performance, GPU device access, and multi-tenant isolation.
- Oversee the architecture and evolution of the Baseten Delivery Network (BDN): the tiered caching and weight delivery system that makes cold starts 2–3x faster and eliminates thundering herd failures during burst scaling events.
- Drive the expansion of BDN's architecture, currently focused on model weights, to container images, training checkpoints, and deployment artifacts.
- Provide technical oversight on GPU-aware isolation mechanisms for multi-tenant inference, including secure container runtimes, Linux namespace hardening, and longer-term micro-VM integration.
- Ensure the team maintains end-to-end ownership of the container startup performance path, from snapshotter initialization through weight delivery to first inference request.
- Champion the team's contributions back to the open-source containerd ecosystem alongside a team of core maintainers.
- Act as the primary advocate for Runtime Fabric across the organization, ensuring upstream and downstream teams have the integration support they need.
- Collaborate with product and engineering stakeholders to prioritize investments based on business impact and infrastructure reliability.
- Communicate team progress, technical trade-offs, and architectural decisions clearly to leadership.
Requirements
- Proven experience managing and growing engineering teams in a systems, infrastructure, or low-level runtime context.
- Deep familiarity with the Linux container ecosystem: containerd, runc, OCI Runtime Spec, Linux namespaces, and cgroups, with the ability to engage credibly in code reviews and architectural discussions.
- Contributions to containerd/containerd, opencontainers/runc, google/gvisor, kata-containers/kata-containers, or closely related open-source projects.
- Strong systems programming background in Go and/or C/C++.
- Experience with distributed storage systems, content-addressable storage, or large-scale caching infrastructure.
- Understanding of how container images are structured, stored, and delivered at scale.
- Strong written and verbal communication skills, with the ability to influence without authority across teams.
Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
- Paid parental leave.
- Fertility and family-building stipend through Carrot.
- Company-facilitated 401(k).
- Exposure to a variety of projects.