WAIC 2025 Dark Horse: How 'Sheldon AI' Uses Molecular Formulas to Surpass Grok-4
At WAIC 2025, China's 'Sheldon AI' demonstrates groundbreaking scientific capabilities, surpassing Grok-4 with molecular formula understanding, multi-modal reasoning, and innovative AI tools for scientific discovery.

While Elon Musk's Grok-4 is still telling jokes in 'humor mode', Chinese scientists are quietly cracking the secrets of cancer drug targets with the bookish Intern-S1. Who says scientific research can't be cool?
Since the 2024 Nobel Prize recognized AI-driven protein structure prediction, AI for Science has drawn unprecedented attention. With the rapid progress of large models in recent years, AI tools that genuinely assist scientific research have been widely anticipated.
Now, they are here.
On July 26, Shanghai AI Laboratory released and open-sourced Intern-S1, the 'Sheldon' scientific multimodal large model and the world's first open-source multimodal scientific model. Its text capabilities rival those of top domestic and international models, and its scientific capabilities are globally leading. As a foundation model with built-in scientific expertise, Intern-S1 offers the best overall performance among current open-source models.

Building on Intern-S1, the 'Sheldon' scientific discovery platform Intern-Discovery was recently launched. It helps researchers, research tools, and research subjects improve and evolve together, moving scientific research from small-team exploration toward an era governed by a scaling law of scientific discovery.
- Intern-S1 experience page: https://chat.intern-ai.org.cn/
- GitHub link: https://github.com/InternLM/Intern-S1
- HuggingFace link: https://huggingface.co/internlm/Intern-S1-FP8
- ModelScope link: https://modelscope.cn/models/Shanghai_AI_Laboratory/Intern-S1
Chinese open-source models, through algorithm optimization (such as dynamic precision adjustment and MoE architectures) and open collaboration ecosystems, are approaching or even surpassing top international closed-source models in performance while significantly reducing computational requirements. For example, DeepSeek-R1, open-sourced to benchmark against OpenAI's closed-source o1, achieves comparable mathematical-reasoning performance at far lower training cost using innovative reinforcement learning with Group Relative Policy Optimization (GRPO). Intern-S1 exceeds Grok-4 on scientific reasoning tasks with only about 1% of Grok-4's training compute, demonstrating far higher computational efficiency.
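The core idea behind GRPO mentioned above can be sketched in a few lines: instead of training a separate value (critic) network, each prompt's sampled completions are scored by a reward model, and each completion's advantage is its reward normalized against the statistics of its own group. This is a minimal illustration of that normalization step, not code from the DeepSeek or Intern-S1 codebases; the function name is ours.

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages: normalize each sampled completion's
    reward by the mean and standard deviation of its own group,
    removing the need for a learned value (critic) network."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    if std == 0:
        # All completions scored identically: no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Example: four completions sampled for one prompt, scored by a reward model.
advs = grpo_advantages([1.0, 0.0, 1.0, 0.0])  # → [1.0, -1.0, 1.0, -1.0]
```

Completions that beat their group's average get positive advantages and are reinforced; below-average ones are penalized, so the group itself serves as the baseline that a critic would otherwise provide.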
A leading open-source scientific multimodal model, rebuilding research productivity
Intern-S1 achieves breakthroughs in scientific and general performance with lightweight training costs.
In comprehensive multimodal general ability assessments, Intern-S1 scores on par with top models domestically and internationally, demonstrating robust understanding across text and images, and adaptability to complex input combinations.
In scientific ability evaluations across disciplines such as physics, chemistry, materials, and biology, Intern-S1 outperforms recent closed-source models like Grok-4, validating its logical rigor and accuracy in research scenarios and setting a new industry benchmark.



While large models continue to make breakthroughs in chat, image generation, and code, the scientific community still awaits a truly 'scientific' AI partner. Current mainstream models excel at NLP and image recognition but fall short on complex, precise, highly specialized scientific tasks. Existing open-source models lack a deep understanding of complex scientific data and cannot meet research demands for accuracy, professionalism, and reasoning; higher-performing closed-source models suffer from deployment barriers and low transparency, making practical adoption costly and difficult.
At WAIC 2025, Shanghai AI Laboratory launched the 'Sheldon' scientific multimodal large model Intern-S1, which pioneers a 'cross-modal scientific analysis engine' capable of interpreting complex scientific data such as chemical formulas, protein structures, seismic signals, and more. It possesses advanced research capabilities like predicting synthesis pathways, assessing reaction feasibility, and identifying seismic events, transforming AI from a 'dialogue assistant' into a 'research partner' to fundamentally reshape research productivity.
Thanks to its powerful scientific analysis abilities, Intern-S1 surpasses top closed-source models like Grok-4 in chemistry, materials, and earth sciences benchmarks, demonstrating excellent scientific reasoning and understanding. In multimodal capabilities, it also outperforms models like InternVL3 and Qwen2.5-VL, becoming a 'scientific star' among versatile AI models.
Leveraging Intern-S1's cross-modal perception and integration of biological information, Shanghai AI Laboratory, together with Lingang Laboratory, Shanghai Jiao Tong University, Fudan University, MIT, and other institutions, developed the 'Yuan Sheng' (OriGene) multi-agent virtual disease-scientist system for target discovery and clinical translation evaluation. It has shown promising results in liver and colorectal cancer treatment, validated with clinical samples and animal experiments.
Systematic technological innovations support Intern-S1's capabilities. Since its initial release, Shanghai AI Laboratory has built a rich family of 'Sheldon' models, including the large language model Sheldon·Puyu InternLM, multimodal model Sheldon·Wanxiang InternVL, and advanced reasoning model Sheldon·Sike InternThinker. Intern-S1 integrates the advantages of the 'Sheldon' family, achieving a high level of language and multimodal performance within a single model, setting a new benchmark for open-source multimodal large models.
Intern-S1 has attracted attention in the international open-source community, with many influencers praising it and noting that 'almost every day, new open-source SOTA results from China are being announced—breaking records daily.'

