Grafana Labs, 13.04.2026
Staff Backend Engineer - Application Core Services, Stacks | USA | Remote
United States (Remote)
Responsibilities
- Design, build, and operate reconciliation systems, including the SSS backend, to track desired stack state and to detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration
- Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient
- Improve operational efficiency by reducing deployment complexity (e.g., aiming for single-PR regional SSS deployment) and contributing to the Stack Config Reconciliation project
- Manage rollout mechanisms for provisioned plugins, dashboards, data sources, Grafana versions, release channels, and stack-level configuration
- Support new region and cluster rollouts, including the operational paths required to bring stacks online safely in new Grafana Cloud regions
- Improve incident response and recovery paths for stack misalignment, reconciliation failures, plugin rollout issues, and Hosted Grafana integration failures
- Partner with Product, Hosted Grafana, Infrastructure, Support, and adjacent AppCore squads on customer-impacting stack lifecycle work
- Contribute to roadmap planning, technical design, OnCall improvements, and long-term simplification of stack operations
- Improve runbooks, dashboards, alerts, reconciliation safety, rollout controls, and recovery procedures for production systems
- Debug across service boundaries and make careful changes in systems that affect customer stacks
Requirements
- Experience designing, building, and operating scalable backend systems
- Proficiency in solving complex workflow and systems problems
- Experience improving reliability and developer experience
- Ability to build software that directly supports both customers and internal stakeholders
- Experience with cloud platforms (AWS, Azure, GCP) and cloud marketplaces
- Knowledge of billing systems and customer usage calculation
- Experience with provisioning automation and user portal development
- Strong debugging skills across service boundaries
- Experience with incident response and recovery procedures
- Familiarity with the Grafana stack (Mimir, Loki, Tempo) is a plus
- Experience with reconciliation systems and drift detection
- Ability to work in a remote-first, global team environment
- Passion for open-source contributions and collaboration
Conditions
- Remote position within US time zones (EST and CST only at this time)
- Opportunity to contribute to open-source projects
- Support for modern AI coding assistants, with a company-funded usage budget
- Global, remote-first company culture
- Opportunity for career growth in an innovation-driven environment