Kimi's open-source trillion-parameter model K2 challenges OpenAI, prompting industry-wide competition in large-scale AI models.


Unexpectedly, Kimi's first foundational large model was open-sourced so quickly.
Last night, Moonshot AI officially released and open-sourced the Kimi K2 large model. The new model's API is now live, priced at 16 RMB per million output tokens.

This release coincided with a wave of major global model launches, including xAI's Grok 4, Google's upcoming Gemini, and OpenAI's open-source models, marking a new technological milestone for large-scale models. Perhaps sensing the pressure from Kimi K2, Sam Altman recently teased OpenAI's own open-source model, though public opinion remains skeptical.

Two models have been open-sourced: the base model Kimi-K2-Base and the fine-tuned Kimi-K2-Instruct, both available for commercial use.
- Blog link: https://moonshotai.github.io/Kimi-K2/
- GitHub link: https://github.com/MoonshotAI/Kimi-K2
According to Hugging Face, Kimi K2's downloads approached 12,000 within the first 20 minutes.

On benchmarks such as LiveCodeBench, AIME 2025, and GPQA-Diamond, Kimi K2 surpasses open-source models like DeepSeek-V3-0324 and Qwen3-235B-A22B, setting a new open-source SOTA, and outperforms closed models like GPT-4.1 and Claude 4 Opus in knowledge, reasoning, and coding capabilities.


Kimi demonstrates practical applications, showing that the model can work out on its own how to use tools to complete tasks, without detailed instructions. Early user tests show promising results.


Compared to Grok 4, whose code capabilities proved inconsistent, Kimi K2's code performance has held up in initial tests, suggesting robustness in practical coding tasks.

In addition, Kimi K2 employs large-scale agentic data synthesis, simulating real-world tool use scenarios, including hundreds of tools across various domains, to generate diverse, high-quality training data.

This data is generated through multi-turn tool-interaction simulations, with a large language model acting as an evaluator that selects high-quality samples against rubrics, filling the gap left by scarce real-world data in specialized domains.
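The rubric-based filtering step described above can be sketched as follows. This is a minimal illustration, not Moonshot's actual pipeline: the judge model is stubbed out with a simple rubric-scoring function, and all criteria names and thresholds are hypothetical.

```python
# Sketch: filter synthesized tool-use trajectories with a rubric-scoring judge.
# In a real pipeline, `judge` would query an LLM evaluator; here it is a stub
# that sums the points a trajectory earns against an illustrative rubric.

RUBRIC = {
    "called_tool": 2,          # the trajectory actually invoked a tool
    "task_completed": 3,       # the final answer satisfies the task
    "no_hallucinated_args": 1, # tool arguments were well-formed
}

def judge(trajectory: dict) -> int:
    """Stand-in for the LLM evaluator: total rubric points earned."""
    return sum(points for criterion, points in RUBRIC.items()
               if trajectory.get(criterion, False))

def filter_trajectories(trajectories: list[dict], threshold: int = 5) -> list[dict]:
    """Keep only trajectories whose rubric score meets the threshold."""
    return [t for t in trajectories if judge(t) >= threshold]

synthetic = [
    {"id": 1, "called_tool": True, "task_completed": True, "no_hallucinated_args": True},
    {"id": 2, "called_tool": True, "task_completed": False, "no_hallucinated_args": True},
]
kept = filter_trajectories(synthetic)
print([t["id"] for t in kept])  # -> [1]
```

The key design point is that the judge produces a scalar score from explicit rubric criteria, so low-quality synthetic trajectories can be discarded automatically at scale.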
Furthermore, Kimi K2 introduces a general reinforcement learning (RL) approach, combining RL with self-judging mechanisms to bridge the gap between verifiable and non-verifiable tasks, enabling continuous self-improvement.
In tasks like mathematics and programming, the model can be updated based on verifiable rewards. For subjective tasks, it uses self-evaluation to provide scalable feedback, overcoming the challenge of sparse rewards in non-verifiable scenarios.
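The split between verifiable and self-judged rewards can be sketched as a single reward function that dispatches on whether a ground-truth reference exists. All function names are illustrative, and the self-evaluation here is a trivial placeholder; a real system would have the model score its own answer against a rubric.

```python
from typing import Optional

def verifiable_reward(answer: str, reference: str) -> float:
    """Exact-match check for tasks with ground truth (math answers, code tests)."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def self_judged_reward(answer: str) -> float:
    """Placeholder for model self-evaluation, returning a score in [0, 1].
    A real system would prompt the model to grade its own answer."""
    return min(len(answer) / 100.0, 1.0)

def reward(answer: str, reference: Optional[str]) -> float:
    """Use the verifiable signal when a reference exists, else self-judgment."""
    if reference is not None:
        return verifiable_reward(answer, reference)
    return self_judged_reward(answer)

print(reward("42", "42"))  # verifiable task with correct answer -> 1.0
print(reward("42", "41"))  # verifiable task with wrong answer   -> 0.0
```

The point of the dispatch is that subjective tasks, which would otherwise yield no reward signal at all, still produce a scalar the RL update can use, at the cost of relying on the model's own judgment.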
Long-term, these innovations could enable large models to continuously optimize in complex environments, potentially becoming key to future AI intelligence evolution.
So what's next for foundational models?
Kimi's release recalls the recent Grok 4 launch by xAI, where Elon Musk's team highlighted breakthroughs on "Humanity's Last Exam" (HLE), a challenging test for general AI.
OpenAI's Deep Research, Gemini 2.5 Pro, and Kimi-Researcher are listed as major breakthroughs:

Kimi-Researcher, released last month, uses end-to-end autonomous RL with result-driven algorithms, removing the need for supervised fine-tuning or rule-based workflows. More exploration steps lead to better performance.
Similarly, Kimi K2 employs large-scale tool invocation akin to Grok 4.
With domestic computing resources constrained, the new wave of large-model competition is shifting away from pure parameter scaling toward algorithmic innovation that cuts costs and improves efficiency while still pushing the SOTA frontier.