Two Versions: Grok 4 and Grok 4 Heavy
First, Grok 4 comes in two versions: the single-agent Grok 4 and the multi-agent Grok 4 Heavy.
- Grok 4: Basic version, handled by a single AI agent.
- Grok 4 Heavy: Adopts a multi-agent collaboration mode, allowing multiple agents to solve problems individually, then share and compare solutions like a "study group" to arrive at a final answer.
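The "study group" idea can be illustrated with a minimal sketch. xAI has not published how Grok 4 Heavy actually compares and merges its agents' solutions, so the majority vote used here is an assumption standing in for that step, and the names `Agent` and `solve_with_study_group` are hypothetical.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical type: an "agent" is any function that maps a question to an answer.
Agent = Callable[[str], str]


def solve_with_study_group(question: str, agents: List[Agent]) -> str:
    """Let each agent answer independently, then keep the most common answer.

    Majority voting is an assumed stand-in for the comparison step;
    it is not xAI's documented method.
    """
    if not agents:
        raise ValueError("at least one agent is required")
    # Step 1: every agent solves the problem on its own.
    proposals = [agent(question) for agent in agents]
    # Step 2: compare the proposals and pick the answer most agents agree on.
    # Ties go to the answer that was proposed first.
    best_answer, _ = Counter(proposals).most_common(1)[0]
    return best_answer


# Toy agents with fixed answers, used only to show the flow.
toy_agents: List[Agent] = [
    lambda q: "42",
    lambda q: "41",
    lambda q: "42",
]
final_answer = solve_with_study_group("What is 6 * 7?", toy_agents)
```

Here two of the three toy agents agree, so their shared answer becomes the final one. A real system would likely let the agents exchange reasoning rather than only count votes.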
xAI has also launched its most expensive subscription plan to date, "SuperGrok Heavy," at $300 per month. Subscribers get early access to Grok 4 Heavy and priority use of future features.
Doctoral-Level Intelligence: From SAT Perfect Score to Genius Across All Domains
Next, xAI claims that Grok 4's academic and logical capabilities surpass human intelligence, making it one of the models closest to Artificial General Intelligence (AGI) at this stage. According to the company, it can achieve near-perfect scores on the SAT, GRE, and other advanced US exams, demonstrating knowledge at a doctoral level or higher across all subjects.
Additionally, Grok 4 has set new highs in several benchmark tests, showcasing unprecedented capabilities, specifically:
- Ranked first among existing AI models on challenging graduate-level problems (GPQA), the American Invitational Mathematics Examination (AIME 2025), and the USA Mathematical Olympiad (USAMO).
- In the Vending-Bench simulation of running a vending machine business, it doubled asset income, demonstrating stable and consistent strategic decision-making.
- The biomedical research center Arc Institute used Grok 4 to help automate its research process and move experiments forward more efficiently.
- It has also seen real-world use in medical image examination, financial strategy formulation, and game development.
- On Humanity's Last Exam (HLE), Grok 4 solved 25.4% of the questions without assistance, while Grok 4 Heavy solved 44.4%, ranking first among existing AI models.
Training Grok 4 with Colossus Supercomputer, Significantly Improving Computational Efficiency
xAI revealed that behind Grok 4's emergence is a dual leap in hardware and training strategies: "The training volume of Grok 4 is 100 times that of Grok 2."
Grok 4 was trained on xAI's Colossus supercomputer, which has approximately 200,000 H100 GPUs. From pre-training through reinforcement learning (RLHF), the process sharpened the model's focus and precision on reasoning tasks.
The team emphasized that because human-written test questions can no longer "effectively train" Grok 4, the real world will become the ultimate testing ground. The measure of the model's actual effectiveness will be whether it can create genuinely useful inventions or technologies.
Tool Integration and Real-World Interaction: Grok 4 Moves Towards Operational AI
At the same time, Grok 4 is not only about thinking; it is also learning how to solve real-world problems. xAI stated that, unlike other models, Grok 4 incorporates tool use into its training process, which improves its practical and adaptive abilities.
Over the next few months, Grok 4 will be connected to the engineering analysis tools used by Tesla and SpaceX, bringing it into more sophisticated engineering environments. xAI plans to provide powerful enterprise-level tools and highly accurate physical simulators to major companies by the end of this year.
The team added, "Our current goal is to enable Grok to manipulate the Optimus humanoid robot and verify the authenticity and effectiveness of its logic and creativity in the physical world."
Beyond Human Reasoning: Can Grok 4 Create New Inventions?
Next comes the capability xAI is proudest of: reasoning. Grok 4 not only draws knowledge from its training data but also has logical thinking abilities cultivated through reinforcement training. It can construct solutions in unfamiliar situations and have multiple agents collectively verify the reasoning, ultimately reaching conclusions the way human scientists do.
Grok 4 is designed to think from "first principles", capable of independently discovering problems, constructing logic, and completing complex deductions - a reasoning domain difficult for previous AI to reach.
xAI expects that Grok 4 will invent truly practical new technologies as early as this year or by next year at the latest, and may discover scientific principles currently unknown to humans within the next two years.
From Market Prediction to Game Creation: Grok 4's Application Scope Expands Again
Lastly, xAI demonstrated Grok 4's practical potential across multiple domains, including voice interaction and finance. For example, Grok 4 Heavy can read the Polymarket prediction market and, using statistical calculation and reasoning, estimate the Dodgers' chances of winning the World Series at 21.6% in just a few minutes, which xAI presented as going beyond traditional quantitative analysis tools.
The future vision for Grok is also ambitious. xAI states that future versions will include video understanding and game interaction capabilities, will be able to play games and judge whether they are "fun", and may even integrate game engines to create interactive and artistic content, including TV shows, movies, and video games.
In terms of voice, Grok 4 has also seen major upgrades. The new model introduces multiple voice styles and accents, making conversations more natural and smooth. The launch event deliberately compared it with GPT, highlighting that Grok 4 not only avoids interrupting users but also significantly reduces thinking and response delay time.
Grok 4 is Not Just a Tool, But a Human Civilization Accelerator
The birth of Grok 4 marks AI's entry into a deeper stage of thinking and application. Musk hopes it will trigger an intelligence revolution spanning education, science, business, and the creative industries, with Grok truly participating rather than merely assisting as a language model or auxiliary tool.
The xAI development team's future vision is grand and radical. They emphasize: "AI is no longer just helping us think, but co-creating the world with us."