ICML 2025 Outstanding Papers Announced: 8 Winners Including Researchers from Nanjing University
ICML 2025 honored 8 papers, with 6 best paper awards and 2 outstanding position paper awards; researchers from Nanjing University were among the winners.

On Monday, ICML 2025 announced its award-winning papers.
This year, 8 papers received awards: 6 best paper awards and 2 outstanding position paper awards. Notably, researchers from Nanjing University were among the winners.
The International Conference on Machine Learning (ICML) is one of the top global AI conferences, alongside NeurIPS and ICLR, and is organized by the International Machine Learning Society (IMLS). The 42nd ICML was held from July 13 to 19 in Vancouver, Canada.

ICML 2025 received 12,107 valid submissions and accepted 3,260 papers, an acceptance rate of 26.9%. Submissions rose by about 25% from 9,653 in 2024, reflecting the rapid growth of the AI field.
Below are the awarded papers with brief summaries.
Best Paper Awards
Paper 1: Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions

- Link: https://arxiv.org/pdf/2502.06768
- Authors: Jaeyeon Kim, Kulin Shah, Vasilis Kontonis, Sham Kakade, Sitan Chen
- Institutions: Harvard University, University of Texas at Austin
Abstract: Masked diffusion models (MDMs) are emerging as a promising alternative to autoregressive models (ARMs), trading added complexity at training time for flexibility at inference time. This paper explores the theoretical and empirical effects of token ordering strategies, showing that adaptive decoding significantly improves performance, even surpassing larger autoregressive models on logic puzzles like Sudoku.
Paper 2: The Value of Prediction in Identifying the Worst-Off

- Link: https://arxiv.org/pdf/2501.19334
- Authors: Unai Fischer Abaigar, Christoph Kern, Juan Perdomo
- Institutions: University of Munich, Harvard University
Abstract: This paper examines how predictive techniques can identify the most vulnerable populations in social welfare contexts, using a case study on long-term unemployment in Germany, providing policy-relevant insights for equitable resource allocation.
Paper 3: CollabLLM: From Passive Responders to Active Collaborators

- Link: https://arxiv.org/pdf/2502.00640
- Homepage: https://wuyxin.github.io/collabllm/
- Authors: Shirley Wu, Michel Galley, Baolin Peng, Hao Cheng, Gavin Li, Yao Dou, Weixin Cai, James Zou, Jure Leskovec, Jianfeng Gao
- Institutions: Stanford University, Microsoft, Georgia Tech
Abstract: CollabLLM enhances multi-turn human-AI collaboration by estimating the long-term contribution of each response and optimizing for it through reinforcement learning, significantly improving task performance and user satisfaction in complex dialogues.
Paper 4: Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

- Link: https://arxiv.org/pdf/2504.15266
- Authors: Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan
- Institutions: Google Research, Carnegie Mellon University
Abstract: This work designs a suite of minimal algorithmic tasks to quantify the creative limits of language models, emphasizing the importance of stochasticity and implicit planning in open-ended tasks and challenging the traditional next-token prediction paradigm.
Paper 5: Conformal Prediction as Bayesian Quadrature

- Link: https://arxiv.org/abs/2502.13228
- Authors: Jake C. Snell, Thomas L. Griffiths
- Institution: Princeton University
Abstract: This paper reinterprets conformal prediction from a Bayesian perspective, proposing a Bayesian quadrature approach that offers interpretable guarantees and a more comprehensive uncertainty quantification framework.
Paper 6: Score Matching with Missing Data

- Link: https://arxiv.org/abs/2506.00557
- Authors: Josh Givens, Song Liu, Henry W J Reeve
- Institutions: University of Bristol, Nanjing University
Abstract: The paper extends score matching techniques to handle missing data, providing new methods for density estimation in incomplete datasets, with theoretical guarantees and practical algorithms.