[Remote] DevOps Engineer (Agentic AI)
Note: This is a remote position open to candidates in the USA. Saviance is seeking a skilled DevOps Engineer with a strong interest in Agentic AI and Generative AI systems. In this role, you will design, automate, and manage the cloud-native infrastructure that powers AI applications and large-scale data processing workloads, ensuring reliable and secure platforms for deploying AI-powered products.
Responsibilities
- Design, implement, and maintain cloud infrastructure across AWS, Azure, or GCP environments
- Build and manage CI/CD pipelines for rapid and reliable software delivery
- Automate infrastructure provisioning and configuration using Infrastructure as Code (IaC) tools
- Deploy, monitor, and optimize AI/ML and Agentic AI workloads in production environments
- Manage containerized applications using Docker and Kubernetes
- Implement observability solutions including logging, monitoring, alerting, and performance tracking
- Ensure platform reliability, security, scalability, and cost optimization
- Collaborate closely with software engineers, AI engineers, and product teams to streamline deployment workflows
- Support MLOps and LLMOps practices for model deployment, evaluation, and lifecycle management
- Troubleshoot infrastructure, networking, and deployment issues across distributed systems
Skills
- Strong experience in DevOps, Platform Engineering, or Site Reliability Engineering
- Hands-on experience with cloud platforms such as AWS, Azure, or GCP
- Proficiency with Infrastructure as Code tools such as Terraform or CloudFormation
- Experience building and managing CI/CD pipelines
- Strong knowledge of Docker, Kubernetes, and container orchestration
- Proficiency in Python, Shell scripting, or similar automation languages
- Understanding of networking, cloud security, load balancing, DNS, VPNs, and firewalls
- Experience with monitoring and observability tools
- Strong troubleshooting and problem-solving skills
- Ability to thrive in a fast-paced startup environment
- Experience with MLOps and AI infrastructure
- Familiarity with Vertex AI, SageMaker, or similar AI/ML deployment platforms
- Knowledge of Large Language Models (LLMs), Agentic AI systems, and AI orchestration frameworks
- Experience deploying Retrieval-Augmented Generation (RAG) pipelines and AI-powered services
- Familiarity with vector databases, distributed systems, and scalable data platforms
- Exposure to security automation, compliance, and cloud governance practices
Benefits
- Remote-first work environment with flexibility and autonomy
- Collaborative culture focused on innovation, ownership, and continuous learning
- Significant opportunities for technical growth and leadership as AI adoption scales
Company Overview