Sumo Logic, 18.04.2026
Site Reliability Engineer I
San Jose
Responsibilities
- Own availability, the most important product feature, by continually striving for sustained operational excellence of Sumo’s planet-scale observability and security products
- Work with your global SRE team to optimize operations, increase efficiency in our use of cloud resources and our developers’ time, harden security posture, and increase the feature velocity of our developers
- Work closely with multiple teams on assisted engagements to optimize the operations of their microservices
- Continually improve the lifecycle of microservices and architectural components, from inception and design through deployment, operation, and refinement
- Participate in defining, evolving, and managing SLOs
- Write code and automation to reduce operational workload, increase efficiency, improve security posture, eliminate toil, and enable Sumo’s developers to deliver features more rapidly
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
- Facilitate blame-free root cause analysis meetings for incidents to learn and drive improvement
- Participate in and continually improve our global IRC (incident response coordination) for all products
- Drive root cause identification and issue resolution across teams
- Work in a fast-paced, iterative environment
Requirements
- Cloud-native application development experience leveraging best practices and design patterns
- Strong debugging and troubleshooting skills across the entire technology stack
- Understanding of AWS networking, compute, storage, and managed services
- Experience with modern CI/CD tooling like Kubernetes, Terraform, Ansible, and Jenkins
- Versed in Infrastructure as Code practices using technologies like Terraform or CloudFormation
- Experience with full life cycle support of services, from creation to production support
- Ability to author production-ready code in at least one of the following: Java, Scala, or Go
- Experience with Linux systems and comfort on the command line
- Ability to understand and apply modern approaches to cloud-native software security
- Experience with agile frameworks, such as Scrum and Kanban, and with operating within them to continually deliver value
- Flexible and willing to step into new roles and responsibilities
- Willingness to learn and use Sumo Logic products for solving reliability and security issues
- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or another scientific or technical discipline
- 1+ years of industry experience
Conditions
- Location: San Jose, Costa Rica (remote)
- Employees will be responsible for complying with applicable federal privacy laws and regulations, as well as organizational policies related to data protection