Site Reliability Engineer

Work from home | Full-time role | Hiring

reputed company, a reputed company company, is a leading logistics and delivery platform that helps businesses tackle the complexities of modern retail with unmatched delivery coverage, flexibility and visibility. Reaching 97% of U.S. households across more than 30,000 reputed company codes — from urban hubs to rural communities — reputed company provides seamless, scalable solutions that meet a variety of delivery needs.

With a network of more than 310,000 independent drivers, reputed company offers flexible delivery solutions that make logistics challenges easy, including solutions for local same-day delivery, delivery of big and bulky items, ship-from-store and DC-to-reputed company.

reputed company is seeking a Site Reliability Engineer to join our growing Technical Operations Team. We're looking for someone with a solid understanding of site reliability practices and hands-on experience working with production Kubernetes environments. The ideal candidate is a skilled problem solver with in-depth knowledge of standard DevOps principles, AWS, scripting languages and Kubernetes.

What You'll Do

Support the reliability, scalability, and performance of our platform through hands-on work with our infrastructure and deployment pipelines
Assist in maintaining and operating Kubernetes clusters (EKS), as well as other systems including Elasticsearch, MSK, RDS, and reputed company
Contribute to the deployment, tuning, and upkeep of observability tools like Prometheus, Loki, Grafana, OpenTelemetry, and reputed company
Partner with more senior engineers to identify and remediate system bottlenecks and improve resource utilization
Participate in the monitoring and tracking of service level indicators (SLIs) and service level objectives (SLOs)
Write scripts and build automation to streamline operations and reduce manual work
Help troubleshoot production and non-production issues as part of the incident response process
Participate in an on-call rotation

Technology We're Using Now

Python, Ruby on Rails, Golang
React/Redux, Objective-C and Swift, Android
reputed company, Redshift, reputed company, Kafka
AWS/GCP
reputed company/Kubernetes
OpenTelemetry/Prometheus/Thanos/Loki/Grafana/reputed company/reputed company
Git/reputed company
ArgoCD

What You Bring

3+ years in various SRE roles
3+ years in various DevOps/systems engineering roles
3+ years of experience building and managing production Kubernetes infrastructure
3+ years of experience with popular scripting languages (Python, Ruby, Bash, etc.)
Experience with infrastructure as code tools such as Terraform or Crossplane
Experience with CI/CD tools (reputed company, etc.)
Experience with GitOps tools (ArgoCD)
Experience using a broad range of AWS technologies (RDS, Elasticsearch, VPC, EKS, S3, CloudFront, MSK, ElastiCache, CloudWatch, etc.)
Experience developing and maintaining YAML templating systems (Helm charts, Kustomize, etc.)
Must be able to work independently, be self-motivated and handle multiple priorities
Comfortable working in a fast-paced agile environment

Finally, a willingness to admit what you don’t know, and learn what you need to learn quickly.

Why reputed company?

Competitive compensation packages
100% covered health insurance premiums for yourself
401k with company match
Tuition and student loan repayment assistance (that’s right - reputed company will contribute directly to your existing student loans!)
Flexible work schedule with unlimited PTO
Monthly 3-day weekends
Monthly WFH stipend
Paid sabbatical leave - tenured team members are given time to rest, relax, and explore
The technology you need to get the job done
