JetBrains, 02.04.2026

Senior Research Engineer (Agentic Behavior)

Amsterdam

Responsibilities

01 Build tools for agentic error analysis
02 Design and implement tooling to systematically capture, classify, and analyse errors that AI coding agents make when generating Kotlin code
03 Build observability pipelines over agentic traces – mining patterns from agent sessions in JetBrains IDEs, Junie, Claude Code, Cursor, and other coding agents
04 Build evaluation pipelines
05 Design, implement, and maintain evaluation pipelines that measure Kotlin code generation quality across dimensions, including correctness, idiomaticity, build success, framework usage, and test coverage
06 Build simulation environments where coding agents can be measured on realistic Kotlin developer tasks – from greenfield KMP projects and Gradle dependency management to migrating Spring applications from Java to Kotlin
07 Own evaluation infrastructure: metrics, experiment tracking, automated regression checks, and reproducible benchmarking
08 Research methods for improving agent and model behavior on Kotlin
09 Experiment with post-training techniques (SFT, DPO, GRPO) to improve how models handle Kotlin-specific patterns, idioms, and frameworks
10 Investigate context engineering approaches: CLAUDE.md/AGENTS.md files, compiler-as-verifier feedback loops, Kotlin LSP integration, and MCP-based tooling
11 Run experiments to measure impact: A/B comparisons, benchmark suites, and before/after analyses on real codebases
12 Collaborate with model providers (Anthropic, OpenAI, and Google) to translate Kotlin-specific findings into model improvements
13 Build public Kotlin benchmarks
14 Design and build open-source benchmarks that measure AI coding agent performance on Kotlin tasks and eventually become the standard reference for the ecosystem
15 Create task datasets covering the breadth of Kotlin usage: the server side (Spring, Ktor), multiplatform projects (KMP), build systems (Gradle), Android, library development, and others
16 Include both mined real-world tasks and carefully designed synthetic tasks that test specific Kotlin capabilities
17 Maintain and evolve benchmarks as models improve, ensuring they remain challenging, relevant, and contamination-resistant

Requirements

01 Hands-on experience building evaluation or analysis pipelines for LLMs or AI coding agents in a research or production setting
02 Strong Python engineering skills (at least three years), with the ability to write clean, maintainable code in data-heavy and ML-adjacent codebases
03 Experience with data analysis at scale: querying large datasets (SQL/Athena), building data pipelines, and performing statistical analysis of experimental results
04 The ability to own projects end to end – from identifying a problem in agent traces to designing an eval, running experiments, and shipping a fix
05 A product-aware mindset: you care about how agents are actually used by developers and can translate real failure modes into evaluation and training work
06 Familiarity with Kotlin or a strong willingness to develop deep Kotlin expertise (you'll be living in Kotlin codebases daily)

Benefits

01 Strong base salary
02 Flexible work location
03 Remote work
04 Extra time off
05 Medical insurance allowance
06 Learning and development opportunities
07 Relocation support
08 Language classes
09 Fuel your day
10 Mental health support
11 Sports benefit
12 Internal events
