[Remote] Platform Engineer (GPU)

Remote · USA · Full-time

Note: This is a remote role open to candidates in the USA. Vero is an AI infrastructure startup that works closely with NVIDIA and other key organizations to shape the future of data centers. The Platform Engineer (GPU) is responsible for the operation, optimization, and reliability of large-scale GPU clusters supporting AI/ML and HPC workloads, with a focus on performance tuning and systems management.

Responsibilities

  • Support the reliability, performance, and day-to-day operations of large-scale GPU infrastructure running AI/ML and HPC workloads
  • Optimize Kubernetes platforms to maximize efficiency, utilization, and stability in production
  • Develop reusable Terraform and Ansible modules to enable scalable, low-drift deployments
  • Maintain high availability through strong observability, SLO/SLI ownership, and incident response practices
  • Troubleshoot complex cross-layer issues and manage platform lifecycle (upgrades, scaling, security, multi-tenancy) in production environments

Skills

  • 3+ years of experience in Platform Engineering, SRE, DevOps or infrastructure roles
  • Robust experience with GPU infrastructure & HPC clusters
  • Proven experience operating and scaling large distributed systems in high-availability environments
  • Kubernetes
  • Terraform & Ansible
  • Strong background in monitoring, observability and incident response (Prometheus, Grafana, etc.)
  • Slurm (or similar workload schedulers)

Benefits

  • Equity scheme with significant upside
  • Medical, dental, and vision insurance for the employee and family
  • Bonus
  • 401(k) with a generous employer match
  • Company-paid Life Insurance
  • Flexible Spending Account
  • Mental Wellness Benefits
  • Flexible PTO

Company Overview

  • The company helps founders and leaders build high-impact teams by connecting them with exceptional talent globally, with a focus on the US. It was founded in 2019, is headquartered in London (City of London), GB, and has 11-50 employees. Website: https://www.wearevero.io/