[Remote] Senior AI Agent & Evaluations Engineer

Remote · Kenya · Full-time

Note: This is a remote role open to candidates in the USA. Vacatia is building the future of vacation ownership, with a focus on transforming the industry through AI. The company is seeking a Senior AI Agent & Evaluations Engineer to design and improve AI agents that directly affect customer experience and operational efficiency, and to own the intelligence layer behind these systems.

Responsibilities

  • Design, refine, and optimize prompts, tool definitions, routing logic, and decision-making behavior across Vacatia's AI agent ecosystem
  • Build and maintain evaluation frameworks, golden datasets, grading systems, and regression testing pipelines that measure agent quality and reliability
  • Develop guardrails and safe-failure mechanisms that ensure agents operate responsibly in customer-facing and financially sensitive workflows
  • Monitor production performance, investigate failures, identify edge cases, and continuously improve agent outcomes through data-driven iteration
  • Partner with business stakeholders to translate policies, operational requirements, and domain expertise into measurable agent behavior
  • Collaborate with engineering teams to define context requirements, tool contracts, and integration specifications that support agent success
  • Create scalable frameworks and reusable patterns for deploying AI agents across new business workflows and use cases
  • Establish best practices for prompt engineering, evaluation methodologies, observability, and agent operations

Skills

  • Proven experience shipping and owning production AI agents or LLM-powered systems beyond proof-of-concept environments
  • Deep expertise in prompt engineering, including system prompts, tool usage, context management, output constraints, and agent behavior design
  • Hands-on experience building evaluation frameworks using golden datasets, scoring rubrics, LLM-as-judge methodologies, and regression testing
  • Strong familiarity with modern AI development tools such as Claude Code, Codex, or similar coding agents
  • Experience with agent observability and evaluation platforms such as LangSmith, Langfuse, Arize, Galileo, or comparable solutions
  • Ability to distinguish prompt issues from data, tooling, model, or evaluation failures and systematically improve agent performance
  • Strong written and verbal communication skills with the ability to work effectively across engineering and business teams
  • Demonstrated ownership mindset with a passion for building reliable, measurable, and continuously improving AI systems
  • Experience building agents that process communication-based workflows including emails, support tickets, chat interactions, or transcripts
  • Experience with multiple agent frameworks and a practical understanding of their tradeoffs
  • Familiarity with the evolving LLM landscape and model selection strategies
  • Experience designing and implementing end-to-end evaluation pipelines and agent operations workflows
  • Production experience with online evaluation systems and automated scoring of live traffic
  • Experience integrating AI systems with Salesforce, AWS Connect, or customer engagement platforms
  • Background in customer-facing industries where accuracy, compliance, and communication quality are critical
  • Contributions to open-source projects, technical writing, or public thought leadership in AI, prompt engineering, or agent development

Company Overview

  • Vacatia is the resort marketplace for vacationing families, whose mission is to make family vacations better. It was founded in 2013 and is headquartered in Mill Valley, California, USA, with a workforce of 1001-5000 employees. Its website is https://vacatia.com.

Company H1B Sponsorship

  • Vacatia has a track record of offering H1B sponsorships, with 2 in 2025 and 1 in 2022. Please note that this does not guarantee sponsorship for this specific role.