[Remote] Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability | Avaya

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections. The company is seeking a Site Reliability Engineer (SRE) to drive stability, reliability, and performance across its Azure and GCP-based platforms, with a focus on operational excellence and proactive incident management.

Responsibilities

  • Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments
  • Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements
  • Maintain clear communication with cross-functional teams and leadership during major incidents
  • Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics)
  • Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure
  • Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact
  • Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation
  • Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery
  • Analyze trends to prevent recurring issues and support teams in resilience engineering

Skills

  • 5+ years of experience in site reliability, DevOps, cloud operations, or customer support roles
  • Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions
  • Expertise in Azure and GCP cloud operations and distributed system reliability
  • Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions)
  • Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.)
  • Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations)
  • Excellent analytical, troubleshooting, and communication skills

Benefits

  • Performance-related bonus
  • Annual bonus that aligns with individual and company performance

Company Overview

  • Avaya is a global leader in enterprise communications, offering hybrid cloud CCaaS and UC solutions for mission-critical, AI-agnostic workflows. Founded in 2000, the company is headquartered in Morristown, New Jersey, USA, and has a workforce of 5,001-10,000 employees. Its website is http://www.avaya.com.