Company
MLOps
3 000 – 10 000 ₽
France, Uzbekistan
Skills
TensorZero, vLLM, SGLang, TRT, Kubernetes, CUDA, Grafana, Vector storages
- Own inference infrastructure end-to-end: optimize latency, throughput, and cost across our model fleet.
- Build and scale model serving with TensorZero, vLLM/SGLang/TRT, and Kubernetes.
- Design and maintain vector search pipelines on top of vector stores.
- Know support metrics (SLAs, FCR, deflection) and define service health KPIs.
- Turn research into product: grab experimental models from the research team, figure out what's production-ready, and ship it (formatting, sampling parameters, deployment, the whole thing).