Daniil Gavrilov
Effective autism | ∃x : (x ∉ x) ∧ (x ∈ x) | Invest up to $100 at a time
Google Scholar
My Papers
Teach Old SAEs New Domain Tricks with Boosting
Nikita Koriagin, Yaroslav Aksenov, Daniil Laptev, Gleb Gerasimov, Nikita Balagansky, Daniil Gavrilov
COLM 2025
Steering LLM Reasoning Through Bias-Only Adaptation
Viacheslav Sinii, Alexey Gorbatovski, Artem Cherepanov, Boris Shaposhnikov, Nikita Balagansky, Daniil Gavrilov
ICML 2025, Workshop on Efficient Systems for Foundation Models
The Differences Between Direct Alignment Algorithms are a Blur
Alexey Gorbatovski, Boris Shaposhnikov, Viacheslav Sinii, Alexey Malakhov, Daniil Gavrilov
ICLR 2025, Workshop on Building Trust in Language Models and Applications
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models
Daniil Laptev, Nikita Balagansky, Yaroslav Aksenov, Daniil Gavrilov
ICML 2025
Mechanistic Permutability: Match Features Across Layers
Nikita Balagansky, Ian Maksimov, Daniil Gavrilov
ICLR 2025
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov, Nikita Surnachev, Yaroslav Aksenov, Ian Maksimov, Nikita Balagansky, Daniil Gavrilov
ICLR 2025
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov
ACL 2024
PALBERT: Teaching ALBERT to Ponder
Nikita Balagansky, Daniil Gavrilov
NeurIPS 2022, Spotlight
Classifiers are Better Experts for Controllable Text Generation
Askhat Sitdikov, Nikita Balagansky, Daniil Gavrilov, Alexander Markov
NeurIPS 2022, TL4NLP Workshop
Implicit Unlikelihood Training: Improving Neural Text Generation with Reinforcement Learning
Evgeny Lagutin, Daniil Gavrilov, Pavel Kalaidin
EACL 2021
Self-Attentive Model for Headline Generation
Daniil Gavrilov, Pavel Kalaidin, Valentin Malykh
ECIR 2019