Site Reliability Engineer 3

Remote · USA Full-time New today

About the position

The Site Reliability Engineering team designs and builds the global infrastructure on which MongoDB deploys its services, focusing on the flagship MongoDB Atlas platform. As customers grow and globalize, services must satisfy demands for low-latency requests around the globe and comply with various data sovereignty requirements. The SRE Team’s mission is to build this increasingly complex infrastructure, while continually lowering the operational burden associated with it and increasing internal visibility into the health of the system. They are strong believers in infrastructure-as-code and self-healing systems. The SRE Team is fully integrated with all other engineering teams, working closely together with a soft and traversable boundary between their areas of responsibility. Candidates based in New York City are sought for a hybrid working model.

MongoDB is built for change, empowering customers and people to innovate at the speed of the market. They have redefined the database for the AI era, enabling innovators to create, transform, and disrupt industries with software. MongoDB’s unified database platform, the most widely available, globally distributed database on the market, helps organizations modernize legacy workloads, embrace innovation, and unleash AI. Their cloud-native platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available across AWS, Google Cloud, and Microsoft Azure. With offices worldwide and nearly 60,000 customers (including 75% of the Fortune 100 and AI-native startups) relying on MongoDB for their most important applications, they are powering the next era of software.

MongoDB's Leadership Commitment guides decisions, interactions, and success. To drive personal growth and business impact, they are committed to developing a supportive and enriching culture for everyone, offering employee affinity groups, fertility assistance, and a generous parental leave policy to support employee wellbeing and professional and personal journeys. MongoDB is committed to providing necessary accommodations for individuals with disabilities within their application and interview process and provides equal employment opportunities to all employees and applicants.

Responsibilities

  • Design and build the infrastructure for a global cloud service that comprises hundreds of thousands of MongoDB clusters, processes a billion metrics per day, and replicates tens of billions of database writes to our backup service
  • Design, implement, and troubleshoot the automation and monitoring of services that seamlessly span the globe - including several cloud providers
  • Become an expert in infrastructure performance, helping us optimize from the application level all the way through the firmware
  • Build for resilience. Our goal is that nobody’s pager goes off, ever. Are we there yet? No. Are we really close? Very. While we work on that - participate in a weekly on-call rotation
  • Improve our infrastructure capabilities, optimizing for cost, simplicity, and maintainability

Requirements

  • 3+ years of experience running a mission critical service at scale in a Linux environment
  • Firm grasp of at least one modern programming language, beyond basic scripting
  • Familiarity with web and network protocols and standards (HTTP, TLS, DNS, etc.)
  • Bachelor’s degree in Computer Science or equivalent experience
  • Experience writing automation tools & eagerness to "automate all the things"

Nice-to-haves

  • Experience building large applications from scratch, complete with CI/CD infrastructure
  • Experience in networking, security, hardware or OS performance tuning
  • Experience with at least one of the major cloud providers (Amazon Web Services, Google Compute, Microsoft Azure)
  • Experience managing Kubernetes clusters or some other container orchestration infrastructure
  • Experience with observability of large scale distributed systems

Benefits

  • fertility assistance
  • generous parental leave policy
  • employee affinity groups

