[Remote] AI Software Engineer, Legal Prompting & LLM Dev.
Note: This is a remote position open to candidates in the USA. Nixon Peabody LLP, a law firm that values innovation and collective thinking, is seeking an AI Software Engineer specializing in Legal Prompting and LLM Development. The role involves building production-grade applications that use Large Language Models to enhance legal workflows, and it requires expertise in software engineering, prompt engineering, and AI infrastructure.
Responsibilities
- Design, develop, and deploy LLM-integrated applications that enhance legal workflows across transactional, litigation, regulatory, and advisory practice areas
- Develop backend services, across the Microsoft stack and in languages such as TypeScript/JavaScript, Python, and C# (and others as needed), that interact with LLM providers (OpenAI, Anthropic, etc.), external APIs, SQL and NoSQL databases, and document management systems
- Build and maintain RESTful and event-driven APIs that expose AI capabilities to internal applications and downstream consumers
- Write and refine persona-based prompts, system instructions, and few-shot examples to guide LLMs in delivering accurate, defensible, and legally appropriate responses
- Build prompt evaluation harnesses, regression test suites, and offline/online evaluation pipelines (e.g., LLM-as-judge, golden datasets) to measure quality, hallucination rates, and latency
- Continuously test and iterate on prompts and code to optimize model performance, cost, and user experience
- Design, build, and operate Model Context Protocol (MCP) servers that expose firm systems — document management (e.g., iManage, NetDocuments), time and billing, CRM, research platforms, and internal knowledge bases — as secure, governed tools for AI agents
- Define tool schemas, authentication flows, rate limiting, and audit logging for MCP endpoints, ensuring outputs are scoped to user permissions and ethical walls
- Maintain a catalog of reusable MCP tools and resources that can be composed across multiple AI products at the firm
- Build and tune retrieval-augmented generation pipelines, including chunking strategies, embedding model selection, hybrid search (lexical + semantic), and reranking
- Work with vector databases (e.g., Pinecone, Weaviate, pgvector, Azure AI Search) and orchestration frameworks (e.g., LangChain, LlamaIndex, Semantic Kernel) to ground LLM outputs in firm and client data
- Develop multi-step and multi-agent workflows that combine planning, tool use, and human-in-the-loop checkpoints for sensitive legal tasks
- Implement guardrails, content filters, PII redaction, and citation/verification layers to ensure responsible use
- Containerize services (Docker) and deploy via CI/CD pipelines to cloud environments (Azure preferred; AWS/GCP a plus), using infrastructure-as-code (Terraform, Bicep) where appropriate
- Instrument applications with logging, tracing, and LLM-specific observability tools (e.g., LangSmith, Arize, Weights & Biases, OpenTelemetry) to monitor quality, cost, and drift in production
- Partner with Information Security and the Office of the General Counsel to ensure solutions meet client outside counsel guidelines, data residency requirements, and confidentiality obligations
- Collaborate with attorneys, legal professionals, and product teams to understand domain-specific needs and translate them into technical solutions
- Assess the integration of LLMs into existing legal workflow systems and recommend improvements
- Perform other duties as assigned
Skills
- 4-6 years of production-level software engineering experience on a commercial or internal product team
- Bachelor's degree in Computer Science, Engineering, or a related technical field
- Strong programming skills in modern object-oriented languages such as TypeScript/JavaScript, C#, Python, or Java (typing, async, packaging, testing), with the ability to work fluently across the Microsoft technology stack
- Experience designing and consuming RESTful APIs and working with SQL databases; familiarity with NoSQL and vector stores a plus
- Hands-on experience with LLM APIs (OpenAI, Anthropic, Cohere, Azure OpenAI) and/or open-source models (e.g., LLaMA, Mistral)
- Proficiency with prompt engineering techniques (chain-of-thought, structured outputs/JSON mode, function/tool calling, few-shot design)
- Experience building or integrating with Model Context Protocol (MCP) servers, custom tools, or function-calling endpoints for agentic systems
- Familiarity with orchestration frameworks such as LangChain, LlamaIndex, LangGraph, Semantic Kernel, or Pydantic AI
- Experience implementing RAG pipelines with embeddings, vector databases, and reranking models
- Experience with evaluation frameworks (Ragas, DeepEval, promptfoo) and LLM observability platforms
- Familiarity with containerization (Docker), CI/CD, and cloud deployment (Azure preferred)
- Excellent written communication skills — especially in crafting clear and effective LLM prompts and technical documentation
- Ability to translate legal context and goals into prompt instructions, tool definitions, and system requirements
- Strong analytical and problem-solving capabilities, with sound judgment about when to use deterministic code versus probabilistic models
- Ability to thrive both independently and as part of a collaborative team
- Prior experience developing software solutions in the legal industry strongly preferred
- Legal background highly preferred (e.g., J.D., paralegal, legal tech industry experience, or work with legal software vendors)
- Demonstrated experience in legal practice or support roles is a plus
Benefits
- In addition to a standard benefits package, this role may be eligible for contingent compensation based on factors including, but not limited to, work performance, geographic location, work experience, education, and qualifications.
Company Overview