QA AI Engineer – LLM Evaluation Specialist

QAT Global | Custom Software Development & IT Staffing Solutions - US, Brazil & Costa Rica Nearshore · Heredia

🇬🇧 English
API testing, REST, Postman, SQL, Entity Framework, Azure DevOps CI/CD, C#, .NET, Java, Anthropic evals, OpenAI evals, Promptfoo, LangSmith, Langfuse, Playwright, Selenium, Prompt regression testing, LLM-as-judge

Job description

About the role

We are looking for a QA AI Engineer who will own the quality assurance process for large‑language‑model (LLM) agents. You will design, implement and maintain evaluation pipelines that turn fuzzy quality goals into measurable metrics, working closely with prompt engineers and developers.

Key responsibilities

  • Design test plans and test‑case libraries for APIs and for .NET and Java code generated by AI agents.
  • Build and run golden‑dataset evaluations, scoring rubrics and regression suites using tools such as Anthropic evals, OpenAI evals, Promptfoo, LangSmith or custom harnesses.
  • Integrate automated tests into Azure DevOps CI/CD pipelines and track bugs through the lifecycle.
  • Analyze agent traces, tool‑call sequences and reasoning steps to identify hallucinations, edge‑case failures and pattern violations.
  • Perform statistical analysis (pass@k, variance, success rate) across multiple runs to surface systematic issues.
  • Collaborate with prompt authors to iterate on prompts based on quantitative feedback.
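
As an illustration of the multi-run statistics mentioned above, here is a minimal Python sketch. It assumes the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of runs and c the number that passed. The function names, the test-case names and the sample outcomes are hypothetical and are not part of the posting.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n runs of which c passed (unbiased estimator)."""
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Probability that a random subset of k runs contains at least one pass.
    return 1.0 - comb(n - c, k) / comb(n, k)


def summarize(runs: dict[str, list[bool]], k: int) -> dict[str, dict[str, float]]:
    """Compute success rate, Bernoulli variance and pass@k per test case."""
    summary = {}
    for name, outcomes in runs.items():
        n = len(outcomes)
        c = sum(1 for passed in outcomes if passed)
        rate = c / n
        summary[name] = {
            "success_rate": rate,
            # Variance of a pass/fail outcome with success probability `rate`.
            "variance": rate * (1.0 - rate),
            "pass_at_k": pass_at_k(n, c, k),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical pass/fail outcomes from five repeated runs per test case.
    example_runs = {
        "generate_rest_controller": [True, True, False, True, True],
        "generate_ef_migration": [False, False, True, False, False],
    }
    for case, stats in summarize(example_runs, k=2).items():
        print(case, stats)
```

A test case with a middling success rate and high variance is usually a better candidate for trace review than one that fails every time, since it points to non-deterministic agent behavior rather than a plainly missing capability.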

Required profile

  • Strong curiosity about AI agent failure modes and ability to read long execution traces.
  • Excellent English communication skills for cross‑functional collaboration.
  • Comfort with fuzzy quality definitions and translating them into concrete test criteria.

Required skills

  • API testing, REST, Postman
  • SQL and Entity Framework
  • Azure DevOps CI/CD and test pipelines
  • .NET (C#) and Java code validation
  • Evaluation tooling: Anthropic evals, OpenAI evals, Promptfoo, LangSmith, Langfuse
  • Test automation frameworks: Playwright, Selenium
  • Statistical analysis of LLM outputs (pass@k, variance)
  • Prompt regression testing and LLM‑as‑judge patterns

Frequently asked questions

The salary is not publicly disclosed by the recruiter. You can apply and negotiate directly with QAT Global | Custom Software Development & IT Staffing Solutions - US, Brazil & Costa Rica Nearshore.

Posted 9 hours ago

Expires in 1 month

10 views · 0 applications
