I Tested the First Diffusion Reasoning LLM… It’s Insanely Fast

SkillLeapAI · February 25, 2026 · 6:20 · ai_tool_reviews

Summary

This video demonstrates Mercury 2, a new diffusion reasoning LLM, focusing on its exceptional speed and code-generation capabilities by having it build complete games such as checkers and chess. It is relevant for educators and students who want to understand the current capabilities of advanced LLMs, particularly parallel token generation and fast code generation, which can inform AI curriculum development or the creation of AI-powered educational tools.

Description

You can try Mercury 2 here:
M2 Playground: https://chat.inceptionlabs.ai/
M2 API: http://platform.inceptionlabs.ai/

Inception gave me early access, and I made this video in partnership with them.

I walk through how I test a new reasoning LLM called Mercury 2 and show why it's so fast. I ask it to build a full game of checkers, and it writes working code almost instantly. Then I push it to create a full game of chess, and it generates hundreds of lines of code in seconds. I also show how it handles follow-up prompts and rewrites the code just as fast.

Mercury 2 is different from other large language models like ChatGPT and Claude Haiku. Most LLMs generate tokens one at a time. Mercury 2 uses a diffusion model, which creates and refines tokens in parallel. That's why this diffusion LLM can reach around 1,000 tokens per second. I compare Mercury 2 to Claude Haiku in a speed test, and Mercury 2 finishes much faster while still keeping strong reasoning.

This makes it a great fit for AI agents, coding, voice apps, search tools, and customer service apps where you need both speed and reasoning. If you're building with an API and need a fast reasoning LLM, Mercury 2 is worth testing in the playground or API.
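To see why parallel refinement is so much faster, it helps to count sequential model passes. Here is a minimal toy sketch of that contrast in Python. The function names, the 500-token output size, and the 8 refinement rounds are illustrative assumptions for the sketch, not Mercury 2's actual internals:

```python
# Toy illustration (not Mercury's real algorithm): compare how many
# sequential model passes each decoding strategy needs.

def autoregressive_passes(num_tokens: int) -> int:
    # Classic LLM decoding: one forward pass per generated token,
    # so latency grows linearly with output length.
    return num_tokens

def diffusion_passes(num_tokens: int, refinement_rounds: int = 8) -> int:
    # Diffusion-style decoding: each round updates every token position
    # in parallel, so the pass count depends on the number of refinement
    # rounds, not on how long the output is. (8 rounds is an assumption.)
    return refinement_rounds

if __name__ == "__main__":
    n = 500  # roughly "hundreds of lines of code" worth of tokens
    print("autoregressive passes:", autoregressive_passes(n))  # 500
    print("diffusion passes:", diffusion_passes(n))            # 8
```

With the same per-pass cost, the diffusion-style decoder in this sketch would finish in a small, fixed number of rounds regardless of output length, which is the intuition behind the roughly 1,000 tokens-per-second figure mentioned above.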