Staff Software Engineer - Supernal

Remote · USA · Full-time

About Supernal

Supernal helps small and medium-sized businesses hire their first AI employee. Our AI teammates are built using intelligent, agentic workflows deployed on a proprietary platform. We deliver working, value-generating AI employees, not tools, that handle real business processes alongside human teams.

The Role

We're looking for a Staff/Principal Software Engineer to own and evolve the core platform that powers our AI employees. This is a technical leadership position responsible for the systems that enable our agents to scale reliably: the Django backend, distributed task infrastructure, event-driven architecture, Kubernetes deployments, and observability stack.

You'll work across the full system—from database query optimization to Helm chart tuning to designing new platform abstractions. You'll be a force multiplier for the engineering team, driving architectural decisions, eliminating scaling bottlenecks, and establishing patterns that make the platform more robust and developer-friendly.

This role reports to the Director of Engineering and involves significant autonomy in shaping technical direction.

What You'll Own

  • Drive platform architecture decisions and align the team on scalable patterns and long-term maintainability

  • Review a high volume of code, design docs, and architectural proposals for scalability, reliability, security, and operability

  • Be a technical mentor and force multiplier: unblock engineers, raise the bar on production readiness, and establish platform best practices

  • Own and evolve the core backend platform (Django/DRF/ASGI), including its performance and correctness

  • Scale async execution across Celery + Dramatiq + Temporal/Cortex; implement resilient workflow patterns (retries, circuit breakers, graceful degradation)

  • Optimize PostgreSQL/pgvector (query tuning, connection pooling) and caching strategies

  • Maintain and improve Kubernetes deployment infrastructure (GKE, Helm, Terraform/OpenTofu) and CI/CD + rollout strategies. Own KEDA autoscaling policies and resource allocation across worker pools.

  • Own reliability of RabbitMQ, Redis, and PostgreSQL infrastructure; lead incident response and post-mortems

  • Extend OpenTelemetry + Datadog instrumentation, dashboards, alerts, and SLOs; profile and reduce latency/memory bottlenecks

What We're Looking For

Required

  • 10+ years building and operating production backend systems at scale

  • Deep expertise in Python (Django preferred) and relational databases (PostgreSQL)

  • Hands-on experience with Kubernetes, Helm, and cloud infrastructure (GCP preferred)

  • Strong background in distributed systems: message queues, event sourcing, workflow orchestration

  • Production experience with async task systems (Celery, Dramatiq, or similar)

  • Track record of debugging complex production issues across multiple services

  • Ability to work autonomously and drive technical initiatives without close supervision

  • Clear technical communication—able to explain tradeoffs and build consensus

Preferred

  • Experience with Temporal or similar workflow engines

  • Background in LLM infrastructure, RAG systems, or AI/ML platforms

  • Familiarity with OpenTelemetry, Datadog, or similar observability stacks

  • Experience with KEDA or other Kubernetes autoscaling solutions

  • Contributions to multi-tenant SaaS platform architecture

  • History of improving developer experience and platform abstractions

What Success Looks Like

  • Platform services maintain high availability with predictable performance under load

  • Scaling bottlenecks are identified and resolved proactively

  • New features ship faster because platform primitives are well-designed and documented

  • Incidents are rare, quickly detected, and thoroughly addressed

  • Engineers across the team adopt platform patterns and best practices

  • Technical debt is systematically identified and paid down

  • You're a trusted technical voice in architectural discussions

Compensation & Logistics

  • Compensation: Competitive salary commensurate with experience (Staff/Principal level)

  • Location: Remote

  • Type: Full-time

  • Requirements: Overlap with Americas timezones for collaboration; reliable high-speed internet
