AI Models Learn to Improve Themselves Autonomously

MiniMax’s new AI model, M2.7, shows what the company calls early signs of self-evolution, improving its own code and tooling with little human input. The model posts benchmark scores close to those of top AI systems and hints at a future of AI-driven development and operations.

1 week ago

MiniMax Unveils AI That Evolves Its Own Code

A Chinese tech company called MiniMax has introduced a new AI model, M2.7, which it says can help improve itself. The company describes this ability as “early echoes of self-evolution” and presents it as a step toward AI that can autonomously refine its own capabilities. MiniMax, founded in 2022 and backed by major investors such as Alibaba and Tencent, reports that M2.7 has already helped enhance the company’s own systems.

The Concept of Self-Evolution in AI

The idea of AI improving itself isn’t entirely new. Google DeepMind’s AlphaEvolve has previously shown how AI can help refine future versions of models like Gemini. More recently, AI researcher Andrej Karpathy released autoresearch, a tool that demonstrates the same kind of self-improvement on a smaller scale. These developments suggest we may be entering an era in which AI systems begin to conduct their own research and development.

MiniMax uses an analogy to explain the process: the AI model is like a driver, and the “harness” is the car it drives, meaning the tools, prompts, and scaffolding that surround the model. In this case, the driver is also tasked with improving the car. The AI not only performs tasks but also helps design and build the tools it needs to perform those tasks better.
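To make the analogy concrete, here is a minimal Python sketch of the split between a model and its harness. Every name in it (`Harness`, `toy_model`, the `echo` and `upper` tools) is hypothetical and chosen for illustration; it is not a description of the company’s actual system. The point is only that a harness is ordinary code and data, so whatever can edit code can, in principle, change the harness.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical types for illustration only.
Model = Callable[[str], str]  # the "driver": maps a prompt to a reply
Tool = Callable[[str], str]   # one component of the "car"


@dataclass
class Harness:
    """The scaffolding around a model: a system prompt plus named tools."""
    system_prompt: str
    tools: Dict[str, Tool] = field(default_factory=dict)

    def run(self, model: Model, task: str) -> str:
        # Ask the model which tool to use, then apply that tool to the task.
        prompt = f"{self.system_prompt}\nTools: {sorted(self.tools)}\nTask: {task}"
        choice = model(prompt).strip()
        tool = self.tools.get(choice)
        # If the model names no known tool, return its reply unchanged.
        return tool(task) if tool else choice


def toy_model(prompt: str) -> str:
    # Stand-in for a real model: prefers the 'upper' tool when it is offered.
    return "upper" if "'upper'" in prompt else "echo"


harness = Harness("Pick one tool name.", {"echo": lambda s: s})
before = harness.run(toy_model, "fix bug")

# "Improving the car": the same driver gets a new tool to work with.
harness.tools["upper"] = str.upper
after = harness.run(toy_model, "fix bug")

print(before)
print(after)
```

In this toy setup the model never changes; only the harness does, and the output changes with it. That is the sense in which improving the harness can improve results without retraining the model.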

M2.7’s Role in Building Its Own Tools

MiniMax built an internal research agent harness using an early version of M2.7. This agent’s job was to support data pipelines, manage training environments, and keep track of experiments. Initially, it acted as a research assistant, performing tasks like literature reviews, analyzing experiments, and fixing bugs. This capability is similar to what other AI coding assistants can do.

However, MiniMax claims M2.7 now handles 30% to 50% of its reinforcement learning team’s workflow. In other words, the AI is doing a substantial share of the work of the human researchers who are trying to improve it. According to the company, this level of assistance is only the first stage.

Autonomous Optimization and Performance

The second stage involves M2.7 recursively improving its own harness. The AI tracks its performance, gathers feedback, and iterates on its own architecture and skills. In effect, it rewrites its own tools to become better at its job, which goes beyond assisting human researchers.

The third and most significant stage is autonomous scaffold optimization. MiniMax ran M2.7 through over 100 rounds of self-improvement without any human intervention. In each round, the AI generated hypotheses for improvement, designed experiments, modified its code, ran tests, and compared the results to previous versions. If changes led to better performance, they were kept; if not, they were reverted.
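The keep-or-revert cycle described above resembles a simple hill-climbing loop. The Python sketch below illustrates that general pattern under stated assumptions: the `optimize`, `toy_propose`, and `toy_evaluate` functions, the configuration keys, and the scoring formula are all invented for this example, and a random tweak stands in for the model’s hypothesis generation. It is not the company’s actual procedure.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]  # hypothetical stand-in for "the scaffold"


def optimize(
    config: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    rounds: int = 100,
    seed: int = 0,
) -> Tuple[Config, List[float]]:
    """Run `rounds` of propose, test, compare, then keep or revert."""
    rng = random.Random(seed)
    best = dict(config)
    best_score = evaluate(best)
    history = [best_score]
    for _ in range(rounds):
        candidate = propose(best, rng)  # hypothesis plus modification
        score = evaluate(candidate)     # run the tests
        if score > best_score:          # compare with the previous version
            best, best_score = candidate, score  # keep the change
        # Otherwise the candidate is dropped, which acts as the revert.
        history.append(best_score)
    return best, history


def toy_propose(cfg: Config, rng: random.Random) -> Config:
    # Assumption: a "hypothesis" is a small random tweak to one setting.
    new = dict(cfg)
    key = rng.choice(sorted(new))
    new[key] += rng.uniform(-0.5, 0.5)
    return new


def toy_evaluate(cfg: Config) -> float:
    # Assumption: an invented score that peaks at temperature=0.7, retries=3.
    return -((cfg["temperature"] - 0.7) ** 2 + (cfg["retries"] - 3.0) ** 2)


start = {"temperature": 0.0, "retries": 0.0}
best, history = optimize(start, toy_propose, toy_evaluate)
print(f"start score: {history[0]:.3f}, final score: {history[-1]:.3f}")
```

Because a change is kept only when the score improves, the recorded score never goes down from one round to the next. The quality of the result then depends almost entirely on how good the proposals and the evaluation are, which is where a capable model would matter.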

Benchmark Results Show Promise

M2.7 achieved a 30% improvement on internal benchmarks, though MiniMax has not fully detailed those benchmarks. More importantly, the model was tested on OpenAI’s MLE-Bench, which measures how well an AI can perform the work of a machine learning engineer. Running on a single, relatively affordable A30 GPU, M2.7 scored 66.6% and earned nine gold medals.

This score is notable because it ties Google’s Gemini 3.1 and sits behind top-tier models such as Opus 4.6 (75.7%) and GPT-5.4 (71.2%). These results suggest that M2.7 is a highly capable model, especially considering its self-improvement capabilities and the modest hardware used for the test.

Broader Applications and Future Implications

Beyond machine learning, M2.7 has shown strong performance in other areas. It achieved 56.22% on SWE-Pro (professional software engineering tasks) and 55.6% on VIBE-Pro (end-to-end project delivery), nearing the performance of leading models. It also demonstrated advanced reasoning in production debugging scenarios, cutting recovery time to under three minutes by choosing non-blocking fixes that minimize downtime.

MiniMax believes that future AI self-evolution will become fully autonomous, handling all stages of development without human input. The company is using M2.7 to become an “AI-native organization,” integrating the model into its company structure. This points to a future where AI not only performs tasks but also helps manage and evolve entire businesses.

Open Room: An Interactive AI Experience

MiniMax also launched Open Room, an open-source project available on GitHub. It features an AI avatar within a graphical user interface that can interact with users and their files. Much of the code for Open Room was itself written by AI.

The project aims to create more personable AI interactions. Users can engage with the AI, which can access and modify files, calendars, and other data. It reflects a broader move toward AI with distinct personalities: GPT-5.4 is described as a “smart engineer,” while Claude is said to have a “great personality.” When raw capabilities are similar, users’ preferences for a particular interaction style can influence which AI they adopt.

Why This Matters

If models like M2.7 can improve themselves autonomously, AI progress might accelerate beyond human-led development cycles. This could lead to faster innovation across various fields, from scientific research to business operations. M2.7’s performance in debugging and software engineering shows AI managing and optimizing complex workflows, which points toward a much larger role for AI in day-to-day business functions. Furthermore, interactive AI agents like those in Open Room could redefine human-computer interaction, making AI a more integrated and personalized part of our digital lives. The implications range from highly efficient AI-driven companies to more intuitive and helpful personal digital assistants.


Source: M2.7 just BROKE the Entire Industry… (YouTube)

Written by

Joshua D. Ovidiu
