SRE
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
Here are 665 public repositories matching this topic...
The open-source tool built for simplifying the deployment, monitoring, and scaling of data pipelines.
Updated Jun 30, 2024 - Go
Terraform Pull Request Automation
Updated Jun 30, 2024 - Go
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Updated Jun 29, 2024 - Python
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Updated Jun 29, 2024 - JavaScript
A Prometheus exporter for pg-promise
Updated Jun 29, 2024 - TypeScript
A Prometheus exporter exposing metrics for KafkaJS
Updated Jun 30, 2024 - TypeScript
A curated list of amazingly awesome open-source sysadmin resources.
Updated Jun 29, 2024
A Prometheus exporter for node-postgres
Updated Jun 29, 2024 - TypeScript
A Prometheus exporter exposing metrics for the official MongoDB Node.js driver.
Updated Jun 29, 2024 - TypeScript
On-Call/DevOps Assistant - Get a head start on fixing alerts with AI investigation
Updated Jun 30, 2024 - Python
Curated Self Study Guide for Computer Science, DevOps, SRE & SysAdmin
Updated Jun 28, 2024 - HTML
This repository documents my journey through the Google IT Automation with Python Professional Certificate on Coursera. It includes Python scripts, exercises, and projects covering automation tasks like file management, image processing, regular expressions, and system administration.
Updated Jun 28, 2024 - Python
A blazing fast tool for building data pipelines: read, process and output events. Our community: https://t.me/file_d_community
Updated Jun 28, 2024 - Go
Kaytu's AI platform boosts cloud efficiency by analyzing historical usage and delivering intelligent recommendations—such as optimizing instance sizes—that maintain reliability. Pay for what you need, without compromising your apps.
Updated Jun 30, 2024 - Go
🌳 A sustainable Terraform package that manages everything on GitHub
Updated Jun 28, 2024 - HCL
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
Updated Jun 28, 2024 - Groovy
🐒 🔥 Datadog Failure Injection System for Kubernetes
Updated Jun 29, 2024 - C
A collection of git utilities, useful extra git scripts, tutorials and other useful articles.
Updated Jun 28, 2024 - Shell
- 116 followers
- Wikipedia