Description
[ML engineer (LLM optimization & inference acceleration)](Доступно в источнике)
**220 000 – 650 000 ₽/month**
On-site, full-time
We are looking for an ML Engineer to focus on developing and optimizing algorithms that accelerate large language model (LLM) inference. Your work will directly impact latency, cost efficiency, and scalability of production-grade AI systems. You’ll explore and implement cutting-edge techniques such as speculative decoding, prompt compression, quantization, and generation optimization...