Crusoe, 11.03.2026

Production Engineer (Kubernetes)

Full-time, office-based

Responsibilities

  • 01 Building Kubernetes Platform: Focus on scaling tooling and features dedicated to Crusoe's Managed Kubernetes and Managed VM platforms for external customers
  • 02 Collaboration and Planning: Collaborate with the team in morning stand-up meetings to discuss ongoing projects, recent incidents, and priorities for the day
  • 03 Collaborate on action plans for deploying new data centers or retrofitting existing ones
  • 04 Work closely with software engineers, advising on best practices for resilient code and reviewing changes before deployment
  • 05 System Monitoring and Alerting: Review overnight alerts and system performance metrics to ensure everything is running smoothly
  • 06 Analyze system logs and develop tools to enhance monitoring capabilities
  • 07 Incident Response and Problem Solving: Engage in incident response drills, post-mortems, and root cause analysis sessions to learn from past issues and prevent future ones
  • 08 Resolve common errors through automation and proactive remediation
  • 09 Performance Monitoring and Optimization: Stay focused on maintaining high SLIs and SLOs, ensuring that infrastructure remains robust and reliable for customers
  • 10 Documentation and Knowledge Sharing: Document work, share insights with the team, and plan for the next day's challenges with a customer-centric mindset

Requirements

  • 01 Production Engineering Experience: 3-6 years of professional experience as a Production Engineer
  • 02 Kubernetes: Experience building Kubernetes platforms or Kubernetes controllers
  • 03 Server Hardware and Provisioning: Exposure to server-class hardware and provisioning
  • 04 Distributed Systems Architecture: Understanding of distributed system architecture; exposure to common design patterns, reliability, and scaling
  • 05 Infrastructure Design: Basic understanding of infrastructure design; familiarity with operational trade-offs of network, storage, and RPC serving designs
  • 06 Programming Proficiency: Proficiency with at least one programming language (Python, Go, or similar)
  • 07 Observability Tooling: Exposure to observability tooling and philosophy: logging, monitoring, and alerting tools
  • 08 Operating Systems: Experience with Unix/Linux environments
  • 09 Networking Fundamentals: Understanding of network fundamentals: basics of TCP/IP and network programming
  • 10 Information Security Awareness: Awareness of basic information security best practices
  • 11 Education: Bachelor's degree in Computer Science or a related field, or self-educated in computer science fundamentals

Conditions

  • 01 Competitive benefits package including pension contributions, private health and dental insurance, income protection, and life assurance
  • 02 Compensation paid as a salary or an hourly rate
  • 03 Compensation determined by education, experience, knowledge, skills, abilities, internal equity, and market data alignment
  • 04 Equal Opportunity Employer