[Remote] Senior Site Reliability Engineer

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. Attain Finance is a leading consumer credit lender with over 50 years of experience providing credit solutions across the U.S. The company is seeking a Senior Site Reliability Engineer to improve the reliability and operational excellence of its software delivery systems. The role involves hands-on work across a range of technologies to keep production applications stable and efficient.

Responsibilities

  • Build and operate the delivery platform. Work across AWS, EKS, ArgoCD, Helm, GitHub Actions, Azure DevOps, Terraform, and Python
  • Fix the problems you own. Find root cause across the AWS and Kubernetes stack, fix it, and harden it so it stays fixed
  • Respond to incidents. Help stabilize during outages, drive root-cause analysis, and ship corrective actions for your systems
  • Standardize how we build and ship. Define reproducible container builds and GitOps paths on ArgoCD and Helm that replace manual deployment
  • Help consolidate the CI estate. Standardize pipelines across GitHub Actions and Azure DevOps for your services, removing brittle steps and silent failures and improving visibility
  • Support platform adoption. Build golden-path templates and tooling and help teams move services onto the platform
  • Use progressive delivery. Canary and blue-green deploys (Argo Rollouts) and automated rollback for the services you operate
  • Build observability in. Wire golden-signal metrics, logs, and traces (Prometheus/Mimir, Loki, Tempo, OpenTelemetry) into your services, surfaced in Grafana with SLOs for your domain
  • Operate production systems. Troubleshoot failed deployments, respond to alerts, and improve behavior based on real incidents
  • Help meet SLOs and carry on-call. Track reliability metrics for the services you operate and share the rotation
  • Build across environments. Design dev, test, and prod for safe promotion, recovery from failed deployments, and zero-downtime upgrades
  • Help set the standard. Build reference implementations for build, deploy, GitOps, promotion gates, and observability
  • Uphold compliance in the pipeline. Support deployment traceability, approval trails, and segregation of duties for PCI DSS, SOC 2, SOX, and GLBA
  • Cut toil and cost. Automate repetitive ops work and help tune EKS compute, CI runners, and observability cardinality
  • Unblock across teams. Get hands-on with Cloud, Security, Application Engineering, Data, and Product to keep delivery moving
  • Kill knowledge silos. Write docs, runbooks, and incident learnings so engineers can operate independently

Skills

  • Kubernetes, ArgoCD, Helm, Terraform, Python. Deep hands-on production experience
  • Hands-on AWS. Operate and debug EKS, ECS, EC2, ECR, IAM/IRSA, VPC networking, ALB/NLB, CloudWatch, Secrets Manager, and KMS
  • GitHub Actions and/or Azure DevOps. Build and operate CI/CD at scale
  • Grafana and the observability stack. Hands-on with Grafana dashboards and alerting, and the metrics, logs, and traces stack (Prometheus/Mimir, Loki, Tempo, OpenTelemetry)
  • Strong scripting. Python and Bash, with the ability to grow into systems-level coding
  • Production troubleshooting. Comfortable getting into a system under load, finding root cause, and fixing it
  • Production ownership. Uptime and reliability accountability
  • Incident response. You respond and help drive postmortems that yield real improvements
  • Standards contribution. You contribute to engineering standards and best practices
  • Compliance awareness. Experience in regulated or high-rigor environments or implementing audit and access controls in pipelines
  • Mentorship. Through code review, examples, and pairing
  • 5+ years in site reliability, platform, DevOps, or software engineering, with production ownership of systems or pipelines
  • Advanced GitOps. ArgoCD (or Flux), reusable Helm patterns, Argo Rollouts
  • CI consolidation or migration. Moving between CI systems, such as Azure DevOps to GitHub Actions
  • Self-hosted observability at scale. Running Grafana, Mimir, Loki, and Tempo in production
  • Supply chain security. SBOMs, artifact signing (Sigstore/cosign), SLSA provenance
  • Platform migrations. Contributing to modernization with minimal disruption
  • .NET / C#. Enough to containerize and reason about application workloads
  • Low-level Kubernetes. Cilium/eBPF, Karpenter, or self-hosted networking and autoscaling
  • Resilience testing. Chaos/failure injection or disaster recovery drills
  • AI-assisted tooling. Responsible use with output validation
  • Certification. AWS Solutions Architect, AWS DevOps Engineer, or CKA/CKAD
  • Degree in computer science or equivalent practical experience

Benefits

  • Flexible Paid Time Off Program
  • Medical
  • Dental
  • Vision
  • Life Insurance
  • Disability
  • Other voluntary coverages
  • 401k program, starting on the first of the month following 30 days of employment with a company match

Company Overview

  • Attain Finance offers consumer credit lending and personal loan services through multiple brands in the U.S. and Canada. It was founded in 1997, and is headquartered in California, Kentucky, USA, with a workforce of 1001-5000 employees. Its website is https://attainfinance.com.