Cognition, 01.05.2026

Research, Post-Training

Full-time, On-site

Responsibilities

01 Post-Training Recipe Development: Iterate on the full stack of datasets, training stages, and hyperparameters that determine model behavior
02 Evaluation Design and Integrity: Build evals that actually capture what matters
03 Deep Understanding: When training produces results that don't make sense, you dig until you understand why
04 Alignment and Agent Behavior: Apply and advance techniques like RLHF, RLAIF, and constitutional approaches to shape how agents reason, act, and collaborate with humans in long-horizon tasks
05 Scaling and Exploration: Measure how performance scales with data and compute, and develop new methodologies when existing ones hit ceilings

01 A track record of advancing ML systems through post-training, alignment, or related methods: RLHF, RLAIF, preference modeling, reward learning, or equivalent
02 Strong fundamentals in probability, statistics, and ML theory, and the ability to look at experimental data and distinguish real effects from noise and bugs
03 Evidence of original contributions: publications at top venues, open-source impact, or equivalent industry results
04 Experience with large-scale distributed training and the debugging that comes with it
05 Systems-level thinking: not just model optimization, but understanding how training pipelines, data, and evaluation interact
06 Comfort with ambiguity and fast-moving research environments where priorities shift quickly

01 Small, highly selective team where research and product move together; prototypes reach real deployment quickly
02 Compute is not a constraint: large allocations with training jobs routinely running across thousands of GPUs from day one
03 The environment rewards speed, autonomy, and technical depth with minimal process overhead; this is one of the most competitive and fast-moving problems in AI
04 Everything needed to operate at frontier scale from day one