MiniMax M1 model claims Chinese LLM crown from DeepSeek - plus it's true open-source

MiniMax, an AI firm based in Shanghai, has released an open-source reasoning model that challenges Chinese rival DeepSeek and US-based Anthropic, OpenAI, and Google in terms of performance and cost.

MiniMax-M1 was released Monday under the Apache 2.0 license, and is therefore genuinely open source, unlike Meta's Llama family, which is offered under a community license that doesn't qualify as open source, and DeepSeek's models, which are only partially covered by open-source licenses.

"In complex, productivity-oriented scenarios, M1's capabilities are top-tier among open-source models, surpassing domestic closed-source models and approaching the leading overseas models, all while offering the industry's best cost-effectiveness," MiniMax boasts in a blog post.

According to the blog post, M1 is competitive with OpenAI o3, Gemini 2.5 Pro, Claude 4 Opus, DeepSeek R1, DeepSeek R1-0528, and Qwen3-235B on various benchmarks (AIME 2024, LiveCodeBench, SWE-bench Verified, Tau-bench, and MRCR), coming in behind some models and ahead of others to varying degrees. As always, take vendor-supplied benchmark results with a grain of salt, but the source code is available on GitHub should you wish to confirm its performance independently.

But MiniMax makes clear that it's trying to supplant DeepSeek as the leading industry disruptor by noting that its context window (the amount of input it can handle) is one million tokens, which rivals Google Gemini 2.5 Pro and is eight times the capacity of DeepSeek R1.

In terms of output, the model can manage 80,000 tokens, better than DeepSeek R1's 64,000-token capacity but shy of OpenAI's o3, which can spit out 100,000 tokens in response to a prompt.

Backed by Alibaba Group, Tencent, and IDG Capital, MiniMax claims its Lightning Attention mechanism, a way to calculate attention matrices that improves both training and inference efficiency, gives its M1 model an advantage when computing long context inputs and when trying to reason.

"For example, when performing deep reasoning with 80,000 tokens, it requires only about 30 percent of the computing power of DeepSeek R1," the company claims. "This feature gives us a substantial computational efficiency advantage in both training and inference."

This more efficient computation method, in conjunction with an improved reinforcement learning algorithm called CISPO (detailed in M1's technical report [PDF]), translates to lower computing costs.

"The entire reinforcement learning phase used only 512 [Nvidia] H800s for three weeks, with a rental cost of just $537,400," MiniMax claims. "This is an order of magnitude less than initially anticipated." ®
