Grafana Labs | 13.04.2026

Staff Backend Engineer - Application Core Services, Stacks | USA | Remote

United States (Remote)

Responsibilities

  • Design, build, and operate reconciliation systems, including the SSS backend, that track desired stack state and detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration
  • Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient
  • Improve operational efficiency by reducing deployment complexity (e.g., aiming for single-PR regional SSS deployment) and contributing to the Stack Config Reconciliation project
  • Manage rollout mechanisms for provisioned plugins, dashboards, data sources, Grafana versions, release channels, and stack-level configuration
  • Support new region and cluster rollouts, including the operational paths required to bring stacks online safely in new Grafana Cloud regions
  • Improve incident response and recovery paths for stack misalignment, reconciliation failures, plugin rollout issues, and Hosted Grafana integration failures
  • Partner with Product, Hosted Grafana, Infrastructure, Support, and adjacent AppCore squads on customer-impacting stack lifecycle work
  • Contribute to roadmap planning, technical design, OnCall improvements, and long-term simplification of stack operations
  • Improve runbooks, dashboards, alerts, reconciliation safety, rollout controls, and recovery procedures for production systems
  • Debug across service boundaries and make careful changes in systems that affect customer stacks

Requirements

  • Experience designing, building, and operating scalable backend systems
  • Proficiency in solving complex workflow and systems problems
  • Experience improving reliability and developer experience
  • Ability to build software that directly supports both customers and internal stakeholders
  • Experience with cloud platforms (AWS, Azure, GCP) and cloud marketplaces
  • Knowledge of billing systems and customer usage calculation
  • Experience with provisioning automation and user portal development
  • Strong debugging skills across service boundaries
  • Experience with incident response and recovery procedures
  • Familiarity with the Grafana stack (Mimir, Loki, Tempo) is a plus
  • Experience with reconciliation systems and drift detection
  • Ability to work in a remote-first, global team environment
  • Passion for open-source contributions and collaboration

Conditions

  • Remote position in US time zones (EST and CST only at this time)
  • Opportunity to contribute to open-source projects
  • Support for modern AI coding assistants, with a company-funded usage budget
  • Global, remote-first company culture
  • Opportunity for career growth in an innovation-driven environment