DeepMind Wins Official IMO Gold Medal as OpenAI’s Early Announcement Becomes a Major Public Relations Blunder

DeepMind's new Gemini model achieves IMO gold, solving five of six difficult problems, highlighting AI's potential, while OpenAI's premature announcement sparks controversy and criticism.

Google DeepMind recently announced that an advanced version of its Gemini model has officially reached gold-medal level at the International Mathematical Olympiad (IMO). The system solved five of the six problems and scored 35 of a possible 42 points, making it the first AI system whose gold-medal result has been officially recognized by the IMO organizing committee.

More importantly, the result shows for the first time that an AI system can solve complex mathematical problems by working in natural language, without relying on specialized programming languages.

DeepMind CEO Demis Hassabis emphasized on social media platform X: “This is the official result!”

The result is a significant step up from 2024, when DeepMind’s AlphaProof and AlphaGeometry systems together solved four problems and reached silver-medal level.

This year’s achievement comes from Gemini Deep Think, an enhanced reasoning system that employs what researchers call parallel thinking. Unlike traditional AI models that follow a single reasoning chain, Deep Think explores multiple solutions simultaneously to arrive at the answer.
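The contrast between a single reasoning chain and several explored at once can be illustrated with a toy sketch. This is not Deep Think's implementation, which is not described here. The function names (`path_closed_form`, `path_direct_sum`, `path_flawed`, `parallel_think`) are hypothetical, and majority voting is only an assumed way of choosing among candidate answers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "reasoning paths" for one toy problem: the sum 1 + 2 + ... + n.
def path_closed_form(n):
    return n * (n + 1) // 2

def path_direct_sum(n):
    return sum(range(1, n + 1))

def path_flawed(n):
    # Deliberately wrong, standing in for a line of reasoning that goes astray.
    return n * n

def parallel_think(n, paths):
    # Run every path concurrently instead of following a single chain.
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        candidates = list(pool.map(lambda path: path(n), paths))
    # Assumed selection rule: keep the answer that the most paths agree on.
    answer, _votes = Counter(candidates).most_common(1)[0]
    return answer, candidates

answer, candidates = parallel_think(
    10, [path_closed_form, path_direct_sum, path_flawed]
)
```

Here two of the three paths agree, so the flawed path is outvoted. A real system would need a far stronger way to judge candidate proofs than counting matching answers.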

Hassabis explained in subsequent posts that Google’s model runs end-to-end in natural language, directly generating rigorous mathematical proofs from official problem descriptions. He also emphasized that the system completed the tasks within the official 4.5-hour time limit.

The official announcement puts OpenAI in an awkward position: it had announced its own results prematurely and was criticized for doing so. For details, see “OpenAI’s IMO Gold Win Sparks Backlash: Accusations of Hype and Student Stealing.”

DeepMind’s cautious approach has been widely praised in the AI community, especially in contrast with OpenAI’s handling of a similar achievement.

Hassabis stated, “We did not announce this on Friday because we respect the IMO council’s initial request that all AI labs only share results after independent verification and proper recognition of students.”

Many critics, by contrast, have called OpenAI’s conduct unethical, disrespectful, and impolite, while describing DeepMind’s approach as honest and humane.

The criticism stems from OpenAI’s decision to publish its results without taking part in the official IMO evaluation process. Instead, it relied on a team of former IMO participants to assess its model’s performance, an arrangement some community members do not find trustworthy.

OpenAI Responds Again

OpenAI researcher Noam Brown congratulated Google, though his post mainly addressed the doubts raised about OpenAI’s result. He said that OpenAI’s approach differs from Google’s and that many research directions are worth exploring.

According to Brown, the IMO committee invited OpenAI about two months earlier to take part in a formal, Lean-based version of the competition. OpenAI declined because it was focused on unrestricted natural-language reasoning, and he said the committee never approached the company about a natural-language option.

Over recent months, OpenAI made significant progress in general reasoning. That work included collecting, curating, and training on high-quality mathematical data, which will also be used in future models. Brown said that OpenAI did not use retrieval-augmented generation (RAG) or any other tools during its IMO evaluation.

All proofs submitted by OpenAI were graded as correct by three external IMO medalists, and the proofs are publicly available for verification in the OpenAI IMO 2025 Proofs Repository.

Before sharing its results, OpenAI spoke with an IMO board member, who asked the company to wait until after the awards ceremony. Brown said OpenAI did so, announcing the results at around 1 a.m. Pacific Time (6 p.m. AEST), and that no one requested a later announcement.

Brown added that OpenAI was glad to share its progress and results, and that these IMO results show how rapidly AI reasoning is advancing.

The episode shows that competition on the IMO stage is not just about technology but also about norms, timing, and collaboration. DeepMind’s careful, official approach earned both respect and the gold medal, while OpenAI’s premature announcement caused controversy. It is a reminder that on the path to AGI, aligning technology with societal rules and values is increasingly important.
