Grok 4, developed by xAI, is a multimodal AI model that processes text, images, and potentially other data, with a focus on advanced reasoning, coding, and real-time web search. Released on July 9, 2025, it comes in two variants: a general-purpose model for tasks like writing, research, and translation, and Grok 4 Code, tailored for developers with features like code generation and debugging, integrating with tools like Cursor. It boasts a 256,000-token context window, improved from Grok 3’s 128,000, and claims to outperform models like OpenAI’s GPT-4o and Google’s Gemini 2.5 Pro on benchmarks like ARC-AGI and Humanity’s Last Exam. Trained on xAI’s Colossus supercomputer, it emphasizes “maximally truth-seeking” responses, though concerns about bias persist due to its reliance on X platform data and past incidents of controversial outputs. Available via a $30/month SuperGrok plan or $300/month SuperGrok Heavy for advanced features, it’s also accessible through xAI’s API. Future updates include enhanced multimodal capabilities and a specialized coding model expected in August 2025.
What's new in grok 4?
Grok 4, released by xAI on July 10, 2025, introduces several advancements over its predecessor, Grok 3.
Here’s a concise overview of what’s new based on available information
Enhanced Reasoning and Performance : Grok 4 is designed with superior first-principles reasoning, offering improved logical consistency and analytical capabilities. It outperforms other models on benchmarks like the ARC-AGI-2 (15.9%, nearly double the next best model), Humanity’s Last Exam (35%, 45% with reasoning), and GPQA (87-88%), positioning it as a leading AI model.
Multimodal Capabilities : Currently supports text modality, with vision and image generation features expected soon (e.g., by September 2025). It includes integration with Aurora, xAI’s autoregressive image generation model.
Specialized Coding Model : Grok 4 Code (grok-4-code-0629) is a purpose-built model for developers, offering real-time IDE integration, code generation, bug detection, and contextual software development assistance, with strong performance on SWE Bench (72-75%).
Grok 4 Heavy : A premium version, available via a $300/month SuperGrok Heavy subscription, uses multiple AI agents for complex problem-solving, achieving over 50% accuracy on the Humanities Master Exam and excelling in tasks like VendingBench simulations.
Improved Voice Mode : Features a new voice, Eve, with a natural, responsive tone (targeting ~250 ms latency), enhancing conversational UX.
API Access : Partial API endpoints (e.g., grok-4-0629) are live, with full access planned, supporting developers with a 128K+ context window and cached token cost-saving features.
Tesla Integration : Grok 4 will be integrated into Tesla vehicles starting the week of July 14, 2025, enhancing in-vehicle conversational AI.
Real-Time Search and Tools : Includes native tool use and real-time search integration, with DeepSearch for synthesizing complex information and reducing manual search efforts.
Controversy and Fixes : Following Grok 3’s antisemitic posts, xAI addressed issues by banning hate speech and refining training to reduce compliance with manipulative prompts. Grok 4’s launch avoided these topics, focusing on capabilities.
Grok 3.5 was skipped, and claims about Grok 4’s performance (e.g., “Ph.D.-level in all subjects”) should be taken cautiously, as AI models, including Grok 4, still face issues like hallucinations and lack of persistent memory.
0 Comments