Terence Tao Warns: AI's Next Step Requires Cost-Effective Scaling, With the Cost of a Single Success Reaching $5,000
Terence Tao emphasizes the need for cheaper, more scalable AI, warning that today's high per-task costs hinder widespread deployment, and urges the development of more affordable models.


Artificial intelligence and mathematics are deeply interconnected.
The development of AI depends on advances in mathematics, and the ability to solve difficult mathematical problems has in turn become an important measure of AI progress.
Recently, Google's new-generation Gemini model solved five of the six notoriously difficult problems at the International Mathematical Olympiad (IMO), reaching gold-medal level with a score of 35/42 and becoming the first AI system whose gold-medal result was officially recognized by the IMO organizing committee.
Terence Tao, a tenured professor at UCLA, Fields Medalist, and renowned mathematician often called the 'Mozart of Mathematics,' attended this year's IMO awards ceremony.
He has followed AI models' performance at the IMO with keen interest, but cautioned that next year's evaluations should be conducted under more controlled conditions.

Professor Tao believes that a student or team that might struggle to win even a bronze medal under standard exam conditions could, under a modified competition format, reliably achieve gold.
Without a controlled, standardized testing methodology that is not chosen by the competing teams themselves, comparisons of different AI models' performance in competitions like the IMO should therefore be treated with caution, to avoid equating results obtained under very different conditions.

Professor Tao has long been concerned with how AI is developed and evaluated. He recently shared his views on the current state of AI and on future evaluation strategies on Mathstodon.
AI technology is rapidly transitioning from an era of qualitative firsts to one of quantitative measurement.
As a technology matures, focus shifts from who first achieved a goal to more quantitative metrics, such as resource consumption, required expertise, environmental impact, and risks.
This shift is necessary to expand technology from proof-of-concept to large-scale application.
For example, the Wright brothers' first powered flight in 1903 and Lindbergh's solo transatlantic flight in 1927 were milestones, but the real transformation was the decades of work on jet technology and aviation infrastructure from the 1950s onward, which made air travel affordable and routine. That effort is often overlooked, yet it was crucial.

Compared with aviation, space exploration, exemplified by the 1969 Apollo moon landing, has remained costly, and progress in reducing those costs has been limited.

Today, given enough resources and a dedicated team, almost any single AI concept can be realized within a few years, much like a 'moon landing' project.
However, deploying these technologies at scale requires shifting the question from 'can we do it' to 'how do we do it more cheaply, more safely, and at greater scale.'
In short, AI needs to 'reduce costs and improve efficiency.' This is closely tied to how we evaluate AI models.
When announcing that a goal has been achieved, resource consumption should be reported alongside the success metrics. It is equally important to report failures, since accurate success rates are essential for understanding true costs.
For example, if an advanced AI tool costs about $1,000 in compute per attempt and has a 20% success rate, the average cost per successful problem is $5,000. Reporting only successes would be misleading.
Similarly, if these successes depend on highly paid experts standing by to monitor or intervene, their 'standby costs' should be included in the total expense, even when no intervention actually occurs.
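The arithmetic behind the $5,000 figure is a simple expected-value calculation: each success requires on average 1 / (success rate) attempts, so the cost of every failed attempt is folded into the price of a success. The following is a minimal Python sketch of that accounting; the helper name and the $200 standby figure are hypothetical illustrations, not numbers from Tao's post.

```python
def expected_cost_per_success(cost_per_attempt: float,
                              success_rate: float,
                              standby_cost_per_attempt: float = 0.0) -> float:
    """Average total cost of one success, counting failed attempts as well."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    # On average, 1 / success_rate attempts are needed per success.
    attempts_per_success = 1.0 / success_rate
    return (cost_per_attempt + standby_cost_per_attempt) * attempts_per_success

# Tao's illustrative figures: $1,000 of compute per attempt, 20% success rate.
print(expected_cost_per_success(1_000, 0.20))        # 5000.0
# Hypothetical: add $200 of expert standby cost per attempt.
print(expected_cost_per_success(1_000, 0.20, 200))   # 6000.0
```

In this example, reporting only the $1,000 per-attempt compute, or only the successful runs, would understate the true cost per solved problem by a factor of five.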

Reporting resource usage informally may be acceptable during early development, but as AI moves into widespread deployment, more transparent, comparable, and standardized evaluation methods become essential.

Professor Tao frames the issue from a historical perspective, while many commenters also reflected on the risks that come with AI's proliferation.

For full details, see the original post: https://mathstodon.xyz/@tao/114910028356641733