kefirski.me 2025

Daniil Gavrilov

Role: Head of AI Research
Organization: T-Tech
Focus: Alignment & RL, Interpretability, Efficient Computation
Venues: ICLR, ICML, NeurIPS, ACL, EMNLP

Effective autism | ∃x : (x ∉ x) ∧ (x ∈ x) | Invest up to $100 at a time

Forbes 30 Under 30 · Science & Tech · 2025

B.Sc. Applied Math · SPbU '19

About

Background & Philosophy

I see no fundamental reason AI can't eventually match human ability across every domain, but we're not there yet, and the bottleneck isn't scale. It's understanding. I want to push AI forward through a deep grasp of what's actually happening inside these models, whether that means building better training methods, figuring out how to solve novel tasks, or making systems we can interpret and trust.

I run AI Research at T-Tech with a flat team of researchers and students who publish at ICLR, ICML, NeurIPS, and ACL. There is no credential gatekeeping; capability is what matters. I chose industry research over the traditional academic path for the freedom to build methods that didn't exist before, and that's still what drives the work.

Research

Focus Areas · ICLR, ICML, NeurIPS, ACL, EMNLP, EACL

Area 01

Alignment & RL

Direct alignment, RL training signals, controllable and safe generation for language and vision-language models.

Area 02

Interpretability

Sparse autoencoders, feature flow, representation matching, and mechanistic steering of language models.

Area 03

Efficient Computation

Adaptive depth and pondering, learnable kernels for efficient in-context modeling.

Experience

Career Timeline

T-Tech

Head of AI Research

2021 — Present

Replika

Senior Research Engineer

2021

MIPT

Head of Lab

2020 — 2021

Research Engineer

2018 — 2021

Publications

Selected Papers

ICLR 2026 · WS: Scaling Post-training

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Addresses reward hacking in group relative policy optimization by filtering out trivially solved samples, preventing the policy from over-fitting to easy patterns and forgetting rare but important behaviors.

D. Plyusov, A. Gorbatovski, B. Shaposhnikov, V. Sinii, A. Malakhov, D. Gavrilov

ICLR 2026 · WS: Constrained Flow & Diffusion

Guided Star-Shaped Mask Diffusion

Introduces a star-shaped masking scheme for guided diffusion that enables flexible constraint satisfaction while preserving sample quality in flow-based generative models.

V. Meshchaninov, E. Shibaev, A. Makoian, I. Klimov, N. Balagansky, D. Gavrilov, A. Alanov, D. Vetrov

ICLR 2026 · WS: World Models

Next Embedding Prediction Makes World Models Stronger

Proposes next embedding prediction as a training objective that improves world model accuracy and downstream task performance over standard next-token approaches.

G. Bredis, N. Balagansky, D. Gavrilov, R. Rakhimov

AAMAS 2026 · Oral

Enhancing VLM Training with RL in Synthetic Worlds

Introduces VL-DAC, a reinforcement learning algorithm that trains vision-language models in inexpensive synthetic environments while achieving strong real-world generalization.

G. Bredis, S. Dereka, V. Sinii, R. Rakhimov, D. Gavrilov

ICML 2025

Analyze Feature Flow to Enhance Interpretation and Steering

Maps sparse autoencoder features across consecutive layers using a data-free cosine similarity technique for mechanistic interpretability.

D. Laptev, N. Balagansky, Y. Aksenov, D. Gavrilov

ICLR 2025

Mechanistic Permutability: Match Features Across Layers

Introduces SAE Match, a data-free method for aligning sparse autoencoder features across different layers by minimizing error between folded autoencoder parameters.

N. Balagansky, I. Maksimov, D. Gavrilov

ICLR 2025

Learn Your Reference Model for Real Good Alignment

Proposes Trust Region alignment methods that dynamically adjust the reference policy during offline LLM training to prevent overoptimization.

A. Gorbatovski, B. Shaposhnikov, A. Malakhov, N. Surnachev, Y. Aksenov, I. Maksimov, N. Balagansky, D. Gavrilov

EMNLP 2025

Train One SAE Across Multiple Sparsity Budgets

HierarchicalTopK trains a single sparse autoencoder across multiple sparsity levels simultaneously.

N. Balagansky, Y. Aksenov, D. Laptev, V. Kurochkin, G. Gerasimov, N. Koryagin, D. Gavrilov

EMNLP 2025

Steering LLM Reasoning Through Bias-Only Adaptation

Training a single steering vector per layer with RL matches the performance of fully RL-tuned reasoning models while adding only ~0.0016% extra parameters.

V. Sinii, A. Gorbatovski, A. Cherepanov, B. Shaposhnikov, N. Balagansky, D. Gavrilov

COLM 2025

Teach Old SAEs New Domain Tricks with Boosting

Applies residual learning: a secondary SAE models the reconstruction error of an existing SAE on specialized texts.

N. Koriagin, Y. Aksenov, D. Laptev, G. Gerasimov, N. Balagansky, D. Gavrilov

Preprint

Train SAEs Efficiently by Utilizing Features Correlation

KronSAE factorizes latent representations via Kronecker product decomposition, drastically reducing memory overhead.

V. Kurochkin, Y. Aksenov, D. Laptev, D. Gavrilov, N. Balagansky

ICLR 2025 · WS

The Differences Between Direct Alignment Algorithms are a Blur

Examines how direct alignment algorithms differ across SFT stages, scalar scores, and ranking objectives.

A. Gorbatovski, B. Shaposhnikov, V. Sinii, A. Malakhov, D. Gavrilov

ACL 2024

Linear Transformers with Learnable Kernel Functions

Modifies the kernel of the Based linear transformer to amplify its in-context learning abilities.

Y. Aksenov, N. Balagansky, S. M. Lo Cicero Vaina, B. Shaposhnikov, A. Gorbatovski, D. Gavrilov

NeurIPS 2022 · Spotlight

PALBERT: Teaching ALBERT to Ponder

Introduces a deterministic Q-exit criterion and a revised architecture for adaptive computation time in pre-trained models.

N. Balagansky, D. Gavrilov

NeurIPS 2022 · WS

Classifiers are Better Experts for Controllable Text Generation

CAIF sampling directs text generation by using classifiers to modify language model logits at inference time.

A. Sitdikov, N. Balagansky, D. Gavrilov, A. Markov

EACL 2021

Implicit Unlikelihood Training: Improving Neural Text Generation with RL

Fine-tunes language models with policy-gradient RL to directly optimize generation quality.

E. Lagutin, D. Gavrilov, P. Kalaidin

Contact

Get in Touch

Email · Google Scholar · GitHub · X · Hugging Face

B.Sc. Applied Mathematics — Saint Petersburg State University, 2019
Forbes 30 Under 30 (Science & Tech), 2025 · Setters Media A-List, 2025