Daniil Gavrilov
Effective autism | ∃x : (x ∉ x) ∧ (x ∈ x) | Invest up to $100 at a time
About
I see no fundamental reason AI can't eventually match human ability across every domain, but we're not there yet, and the bottleneck isn't scale. It's understanding. I want to push AI forward through a deep grasp of what's actually happening inside these models, whether that means building better training methods, figuring out how to solve novel tasks, or making systems we can actually interpret and trust.
I run AI Research at T-Tech with a flat team of researchers and students who publish at ICLR, ICML, NeurIPS, and ACL. No credential gatekeeping: capability is what matters. I chose industry research over the traditional academic path for the freedom to build methods that didn't exist before, and that's still what drives the work.
Research
Experience
Publications
Addresses reward hacking in Group Relative Policy Optimization (GRPO) by filtering out trivially solved samples, preventing the policy from overfitting to easy patterns and forgetting rare but important behaviors.
Introduces a star-shaped masking scheme for guided diffusion that enables flexible constraint satisfaction while preserving sample quality in flow-based generative models.
Proposes next embedding prediction as a training objective that improves world model accuracy and downstream task performance over standard next-token approaches.
Introduces VL-DAC, a reinforcement learning algorithm that trains vision-language models in inexpensive synthetic environments while achieving strong real-world generalization.
Maps sparse autoencoder features across consecutive layers using a data-free cosine similarity technique for mechanistic interpretability.
Introduces SAE Match, a data-free method for aligning sparse autoencoder features across different layers by minimizing error between folded autoencoder parameters.
Proposes Trust Region alignment methods that dynamically adjust the reference policy during offline LLM training to prevent overoptimization.
HierarchicalTopK trains a single sparse autoencoder that is jointly optimized across multiple sparsity levels.
Training a single steering vector per layer with RL matches fully RL-tuned reasoning models, adding only ~0.0016% parameters.
Residual learning where a secondary SAE models reconstruction error of an existing SAE on specialized texts.
KronSAE factorizes latent representations via Kronecker product decomposition, drastically reducing memory overhead.
Examines how direct alignment algorithms differ across SFT stages, scalar scores, and ranking objectives.
Modification to the Based linear transformer kernel that amplifies in-context learning abilities.
Deterministic Q-exit criterion and revised architecture for adaptive computation time in pre-trained models.
CAIF sampling directs text generation by using classifiers to modify language model logits at inference time.
Fine-tuning language models with policy gradient RL to directly optimize generation quality.
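The data-free cross-layer matching idea behind SAE Match can be illustrated in a few lines: treat each decoder column as a feature direction and pair features between layers by cosine similarity alone, with no activations needed. A minimal sketch with toy random matrices standing in for trained SAE decoders (shapes and names are illustrative, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_feats = 64, 256

# Decoder weights of SAEs trained on two consecutive layers
# (columns are feature directions); toy random stand-ins here.
W_a = rng.normal(size=(d_model, n_feats))
W_b = rng.normal(size=(d_model, n_feats))

# Normalize columns so a plain dot product is cosine similarity.
A = W_a / np.linalg.norm(W_a, axis=0, keepdims=True)
B = W_b / np.linalg.norm(W_b, axis=0, keepdims=True)

# Data-free matching: each layer-A feature pairs with its most
# cosine-similar layer-B feature.
sim = A.T @ B                        # (n_feats, n_feats) similarity matrix
match = sim.argmax(axis=1)           # best layer-B partner per A feature
scores = sim[np.arange(n_feats), match]
```

The full method additionally folds encoder/decoder parameters before comparing, but the pairing step reduces to this argmax over a similarity matrix.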
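The steering-vector result above hinges on how few parameters are trained: one additive vector per layer, injected into the residual stream of a frozen model. A minimal sketch with a toy frozen network (the layer maps, sizes, and `forward` helper are illustrative assumptions; the exact ~0.0016% figure depends on the real model's scale):

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, d_model = 24, 1024

# A frozen toy "transformer": each layer is a fixed linear map.
layers = [rng.normal(scale=0.02, size=(d_model, d_model)) for _ in range(n_layers)]

# The only trainable parameters: one steering vector per layer.
steering = np.zeros((n_layers, d_model))

def forward(x, steering):
    # Add the layer's steering vector to the residual stream after each block.
    for W, v in zip(layers, steering):
        x = x + x @ W + v
    return x

x = rng.normal(size=d_model)
y = forward(x, steering)   # with zero vectors this is the frozen model itself

# Parameter overhead of the steering vectors vs. the frozen weights.
overhead = steering.size / sum(W.size for W in layers)
```

RL then updates only `steering`, leaving every frozen weight untouched; in this toy the overhead is 1/1024 of the frozen parameter count, already under 0.1%.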
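The CAIF-style inference-time intervention is likewise compact: at each decoding step, add a scaled classifier score per candidate token to the language model's logits before sampling. A minimal sketch with toy values (the `caif_step` helper, `alpha` weight, and stand-in classifier scores are illustrative assumptions):

```python
import numpy as np

def caif_step(lm_logits, classifier_scores, alpha=4.0):
    """One guided decoding step: shift next-token logits by a
    classifier's per-token attribute scores, scaled by alpha."""
    return lm_logits + alpha * classifier_scores

rng = np.random.default_rng(0)
vocab = 8
lm_logits = rng.normal(size=vocab)
# Toy classifier: log-probability that each continuation token
# keeps the desired attribute (stand-in values).
cls_scores = np.log(rng.uniform(0.1, 1.0, size=vocab))

guided = caif_step(lm_logits, cls_scores, alpha=2.0)
probs = np.exp(guided - guided.max())
probs /= probs.sum()        # sample the next token from these
```

With `alpha=0` the step reduces to ordinary LM sampling, so the classifier's influence can be dialed in without touching model weights.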
Contact
Forbes 30 Under 30 (Science & Tech), 2025 · Setters Media A-List, 2025