By Insights Team in AI — 15 Jul 2025

ICML 2025 Outstanding Papers Announced: 8 Winners Including Researchers from Nanjing University

ICML 2025 gave awards to 8 papers: 6 best paper awards and 2 outstanding position paper awards. Researchers from Nanjing University are among the winners.

On Monday, ICML 2025 announced its best paper awards.

This year, 8 papers received awards, comprising 6 best paper awards and 2 outstanding position papers. Notably, researchers from Nanjing University were among the winners.

The International Conference on Machine Learning (ICML), organized by the International Machine Learning Society (IMLS), is one of the top global AI conferences, alongside NeurIPS and ICLR. The 42nd ICML runs from July 13 to 19 in Vancouver, Canada.

ICML 2025 received 12,107 valid submissions and accepted 3,260 papers, an acceptance rate of 26.9%. Submissions rose by roughly 25% from 9,653 in 2024, reflecting the rapid growth of the AI field.
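For readers who want to check the arithmetic, here is a minimal sketch that recomputes the acceptance rate and the year-over-year change in submissions from the figures quoted above. The variable names and the `percent` helper are illustrative only and are not taken from any ICML source.

```python
# Figures as quoted in this article.
submissions_2025 = 12_107
accepted_2025 = 3_260
submissions_2024 = 9_653


def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of valid submissions that were accepted.
acceptance_rate = percent(accepted_2025, submissions_2025)

# Relative increase in submissions from 2024 to 2025.
submission_growth = percent(submissions_2025 - submissions_2024, submissions_2024)

print(f"Acceptance rate: {acceptance_rate:.1f}%")
print(f"Submission growth vs. 2024: {submission_growth:.1f}%")
```

The acceptance rate rounds to the 26.9% stated above, and the submission count grew by about a quarter over 2024.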

Below are the awarded papers with brief summaries.

Best Paper Awards

Paper 1: Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions

Link: https://arxiv.org/pdf/2502.06768
Authors: Jaeyeon Kim, Kulin Shah, Vasilis Kontonis, Sham Kakade, Sitan Chen
Institutions: Harvard University, University of Texas at Austin

Abstract: Masked diffusion models (MDMs) are emerging as a promising alternative to autoregressive models (ARMs), trading added complexity at training time for flexibility at inference time. This paper explores the theoretical and empirical effects of token ordering strategies, showing that adaptively choosing the decoding order significantly improves performance, even surpassing larger autoregressive models on logic puzzles like Sudoku.

Paper 2: The Value of Prediction in Identifying the Worst-Off

Link: https://arxiv.org/pdf/2501.19334
Authors: Unai Fischer Abaigar, Christoph Kern, Juan Perdomo
Institutions: University of Munich, Harvard University

Abstract: This paper examines how predictive techniques can identify the most vulnerable populations in social welfare contexts, using a case study on long-term unemployment in Germany, providing policy-relevant insights for equitable resource allocation.

Paper 3: CollabLLM: From Passive Responders to Active Collaborators

Link: https://arxiv.org/pdf/2502.00640
Homepage: https://wuyxin.github.io/collabllm/
Authors: Shirley Wu, Michel Galley, Baolin Peng, Hao Cheng, Gavin Li, Yao Dou, Weixin Cai, James Zou, Jure Leskovec, Jianfeng Gao
Institutions: Stanford University, Microsoft, Georgia Tech

Abstract: CollabLLM enhances multi-turn human-AI collaboration by estimating long-term contributions through reinforcement learning, significantly improving task performance and user satisfaction in complex dialogues.

Paper 4: Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Link: https://arxiv.org/pdf/2504.15266
Authors: Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan
Institutions: Google Research, Carnegie Mellon University

Abstract: This work proposes a suite of minimal algorithmic tasks to quantify the creative limits of language models, emphasizing the importance of stochasticity and implicit reasoning in open-ended tasks and challenging the traditional next-token prediction paradigm.

Paper 5: Conformal Prediction as Bayesian Quadrature

Link: https://arxiv.org/abs/2502.13228
Authors: Jake C. Snell, Thomas L. Griffiths
Institution: Princeton University

Abstract: This paper reinterprets conformal prediction from a Bayesian perspective, proposing a Bayesian quadrature approach that offers interpretable guarantees and a more comprehensive uncertainty quantification framework.

Paper 6: Score Matching with Missing Data

Link: https://arxiv.org/abs/2506.00557
Authors: Josh Givens, Song Liu, Henry W J Reeve
Institutions: University of Bristol, Nanjing University

Abstract: The paper extends score matching techniques to handle missing data, providing new methods for density estimation in incomplete datasets, with theoretical guarantees and practical algorithms.