[Remote] Cloud Operations Engineer

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. O'Reilly Media is dedicated to sharing the knowledge of innovators and helping professionals develop expertise. As a Cloud Operations Engineer, you will work on the systems and tooling that power the learning platform, focusing on infrastructure-as-code and Kubernetes maintenance while collaborating with product engineering teams.

Responsibilities

  • Maintain and update our Kubernetes cluster to ensure steady-state operations
  • Write or extend Terraform modules to provision and manage cloud infrastructure
  • Contribute features to the Python CLI tooling we use to manage infrastructure workflows
  • Design, build, and maintain cloud infrastructure using infrastructure-as-code (Terraform) on GCP
  • Manage and evolve our Kubernetes platform, including cluster operations, workload configuration, and service mesh (Istio)
  • Develop and improve internal tooling that abstracts cloud complexity and improves the developer experience
  • Collaborate with product engineering teams to understand service deployment needs and deliver infrastructure solutions
  • Monitor platform health using Datadog; proactively identify and resolve performance, availability, and security issues
  • Participate in on-call rotation and incident response; drive blameless post-mortems and eliminate recurring issues at their root cause
  • Define and track service-level indicators and objectives (SLIs/SLOs) for critical platform components
  • Implement and refine alerting, dashboards, and runbooks that reduce mean time to resolution
  • Embed security best practices into infrastructure workflows (DevSecOps) — not as an afterthought, but as a design principle
  • Help maintain cloud security posture, IAM hygiene, and policy guardrails across our cloud environment
  • Stay current with cloud security developments and proactively surface risks to the team
  • Execute and maintain our automated disaster recovery processes
  • Work closely with product engineering teams to understand their needs and remove infrastructure friction
  • Document systems, processes, and architectural decisions clearly so knowledge is shared, not siloed
  • Recommend improvements to tooling, architecture, and processes — and help drive them to completion
  • Keep current with the evolving cloud-native ecosystem and bring relevant knowledge back to the team

Skills

  • Bachelor's degree in Computer Science or a related field; equivalent education and/or experience may be considered in lieu of a degree
  • 5+ years of experience working in cloud infrastructure, platform engineering, or a related discipline
  • Hands-on experience with Kubernetes in production environments (cluster management, workloads, networking)
  • Proficiency with infrastructure-as-code tools, particularly Terraform
  • Experience with at least one major cloud provider (GCP, AWS, or Azure)
  • Solid scripting and automation skills in Python, Bash, or a comparable language
  • Experience with modern observability platforms (Datadog, Grafana, or similar)
  • Strong understanding of Linux systems administration
  • Working knowledge of CI/CD concepts and tools (GitHub Actions, ArgoCD, Jenkins, or similar)
  • Excellent communication skills — you write clearly, ask good questions, and explain complex systems accessibly
  • AI-augmented development: demonstrated ability to use AI-enabled development tools (e.g., Claude Code, Cursor) to streamline coding, debugging, and infrastructure-as-code authoring
  • Experience with service mesh technologies such as Istio or Linkerd
  • Familiarity with GitOps workflows and tools (ArgoCD, Flux)
  • Experience with DevSecOps practices and tooling (Snyk, Trivy, OPA, or similar)
  • Working knowledge of SQL databases (PostgreSQL or MySQL)
  • Familiarity with FinOps practices and cloud cost optimization
  • Experience building or consuming internal developer platforms (IDPs)
  • Configuration management experience (Ansible, Chef, or similar)
  • Relevant certifications (CKA, CKAD, AWS/GCP Professional, or similar)

Company Overview

  • Inspiring the future for more than 45 years. We share the knowledge and teach the skills people need to change their world. O'Reilly Media was founded in 1978 and is headquartered in Seattle, Washington, USA, with a workforce of 201-500 employees. Its website is http://dankaminsky.com.