Poolside, 19.05.2026

Member of Engineering (Pre-training / Data Research)

Full-time, remote

Responsibilities

  • Follow the latest research on LLMs, and on data quality in particular. Be familiar with the most relevant open-source datasets and models.
  • Design and implement complex pipelines that generate large amounts of data while maintaining high diversity and making efficient use of the available resources.
  • Work closely with other teams such as Pretraining, Posttraining, Evals and Product to ensure short feedback loops on the quality of the models delivered.
  • Propose, conduct and analyze data ablations and training experiments that use quantitative insights to improve the quality of the generated datasets.

Requirements

  • Strong machine learning and engineering background
  • Experience with Large Language Models (LLMs)
  • Understanding of transformer architectures and how LLMs learn
  • Data ablations and scaling laws
  • Mid-training and post-training techniques
  • Training reasoning and agentic models
  • Experience with evals tracking model capabilities (general knowledge, reasoning, math, coding, long-context, etc.)
  • Experience in building trillion-scale pretraining datasets, and familiarity with concepts like data curation, deduplication, data mixing, tokenization, curriculum, impact of data repetition, etc.
  • Excellent programming skills in Python
  • Strong prompt engineering skills
  • Experience working with large-scale GPU clusters and distributed data pipelines
  • Strong obsession with data quality
  • Research experience
  • Authorship of scientific papers on topics such as applied deep learning, LLMs, source code generation, etc. is a nice-to-have
  • Can freely discuss the latest papers and dig into the fine details
  • Is reasonably opinionated

Benefits

  • Fully remote work & flexible hours
  • 37 days/year of vacation & holidays
  • Health insurance allowance for you & dependents
  • Company-provided equipment
  • Well-being, always-be-learning & home office allowances
  • Frequent team get-togethers
  • Diverse & inclusive people-first culture