Google, OpenAI Boost AI Models; Microsoft Unveils Agentic AI
Google and OpenAI have unveiled significant upgrades to their AI models, while Microsoft is pushing forward with agentic AI capable of performing tasks. Robotics also sees advancements in memory and real-world application.
The artificial intelligence landscape is rapidly evolving, with major players like Google and OpenAI unveiling significant upgrades to their foundational models, while Microsoft signals a shift towards AI agents capable of performing complex tasks. This week’s developments highlight advancements in multimodal capabilities, reasoning prowess, and the emergence of AI systems designed to execute actions rather than just respond to queries.
Google Enhances Gemini and NotebookLM Capabilities
Google has rolled out several key updates, including the second iteration of its image-generation model, NanoBanana 2. The new version, available on the Gemini Pro plan, offers finer detail, broader world knowledge, precise text rendering and translation, and improved subject consistency for up to five characters and fourteen objects. While 4K upscaling was already present in earlier versions, the addition of aspect-ratio control and subject consistency marks a notable step forward.
Beyond image generation, Google has upgraded its educational tool, NotebookLM, with a ‘cinematic overview’ feature. This enhancement allows the tool to generate animated and motion-graphic-rich video summaries of information. While the underlying technology for generating on-demand motion graphics is not publicly detailed, the feature, accessible to users of Google’s top-tier Ultra plan ($200-$250/month), promises a novel way to consume and present educational content, albeit with significant generation times.
Furthermore, Google has released Gemini 3.1 Pro, its flagship model, designed as a natively multimodal, high-reasoning system. The model excels at processing diverse inputs including video, audio, and images, a capability that sets it apart from many single-modality AI models. On the MMMU-Pro benchmark it reportedly reaches an estimated score of 76.8, reflecting Google’s focus on building ‘world models’ with strong reasoning, enhanced reliability, and longer, more structured outputs. With a context window of up to 1 million tokens, Gemini 3.1 Pro can handle extensive documents, codebases, and long video or audio files, and it also supports function calling and search grounding via Google Search.
OpenAI Counters with GPT-5.4 Pro, Focusing on Edge Cases
In response, OpenAI has launched GPT-5.4 Pro, positioned as its most advanced model and reportedly outperforming rivals in highly specialized domains. While Gemini 3.1 Pro is favored for its native multimodality, GPT-5.4 Pro reportedly excels in frontier mathematics, computer use, and complex scientific problem-solving, making it a potential go-to for scientists, researchers, and professionals engaged in high-stakes technical work. GPT-5.4 Pro is available to ChatGPT Plus and Enterprise users via the API and the standard chat interface. OpenAI has also addressed earlier complaints about reasoning mistakes and inconsistent conversational flow, aiming for more consistent and reliable responses.
Microsoft’s C-Pilot Signals a New Era of AI Agents
Microsoft is making a significant strategic pivot with the introduction of ‘C-Pilot Tasks,’ a new class of AI agents designed to execute tasks autonomously. Described as a ‘to-do list that does itself,’ C-Pilot Tasks allows users to define objectives in natural language, which the AI then plans and executes using its own computing resources and browser access across various applications. This marks a departure from conversational AI, which Microsoft terms the ‘first chapter,’ ushering in a ‘second chapter’ focused on AI that actively performs work.
Potential applications include surfacing urgent emails with draft replies, automatically unsubscribing from promotional mail, tracking apartment listings and booking viewings, generating morning briefings, and creating personalized study plans. C-Pilot Tasks includes safeguards, requiring user consent for significant actions like financial transactions or sending messages on behalf of the user, with options to review, pause, or cancel tasks. Currently in a limited research preview, this initiative positions Microsoft to compete in the growing agentic AI space, targeting mainstream and consumer users.
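The consent safeguards described above amount to a simple gating rule: routine actions run autonomously, while “significant” ones pause until the user approves. The following is a minimal, purely illustrative sketch of that pattern; all class and action names are invented for this example, not taken from Microsoft’s implementation.

```python
from dataclasses import dataclass

# Action types that require explicit user approval before execution,
# mirroring the article's examples (payments, sending messages).
SIGNIFICANT = {"payment", "send_message"}

@dataclass
class Task:
    description: str
    action_type: str
    status: str = "planned"

class TaskAgent:
    """Hypothetical agent loop with a consent gate for significant actions."""

    def execute(self, task: Task, user_consents) -> Task:
        # Significant actions pause and wait for explicit approval;
        # everything else (drafting, research) proceeds autonomously.
        if task.action_type in SIGNIFICANT and not user_consents(task):
            task.status = "awaiting_consent"
        else:
            task.status = "done"
        return task

agent = TaskAgent()
draft = agent.execute(Task("Draft reply to urgent email", "draft"), lambda t: False)
payment = agent.execute(Task("Pay apartment viewing deposit", "payment"), lambda t: False)
print(draft.status)    # done
print(payment.status)  # awaiting_consent
```

The key design choice is that the gate is a property of the action type, not the task description, so a planner cannot bypass review by rephrasing its goal.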
Microsoft is also navigating public perception, with the term ‘Microslop’ gaining traction on platforms like Discord as a derogatory label for its aggressive AI integration into products, perceived by some as low-value ‘slop.’ The company’s decision to ban the term on its official Discord server has further fueled its use by critics.
Perplexity AI Launches Advanced Digital Worker
Perplexity AI has introduced a sophisticated ‘general-purpose digital worker’ that operates within user interfaces, capable of reasoning, delegating, searching, building, remembering, coding, and delivering results. This system leverages a diverse array of up to 19 AI models, including Claude Opus 4.6 as its core reasoning engine, orchestrating specialized sub-agents for tasks like deep research (Gemini), image processing (NanoBanana), video analysis (V3.1), lightweight tasks (Grok), and long-term recall (GPT-5.2). This model-agnostic approach allows Perplexity to select the optimal AI for each part of a task automatically.
The system operates in isolated compute environments with access to file systems, browsers, and tool integrations. It can manage projects end-to-end, from research and design to coding and deployment, with memory capabilities for recalling past work and connecting to hundreds of services. The Perplexity Max tier, priced at $200 per month, offers substantial usage credits, making it a powerful tool for non-technical power users seeking to automate complex workflows without deep technical setup.
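At its core, the model-agnostic orchestration described above is a routing problem: classify the sub-task, then dispatch it to a specialist model, falling back to the core reasoning engine otherwise. The sketch below is an assumption-laden illustration; the model names come from the article, but the routing table and function are invented for this example.

```python
# Hypothetical routing table: sub-task type -> specialist model.
# Names are taken from the article's description of Perplexity's setup.
ROUTING = {
    "reasoning": "claude-opus-4.6",
    "deep_research": "gemini",
    "image": "nanobanana",
    "lightweight": "grok",
    "recall": "gpt-5.2",
}

def route(task_type: str) -> str:
    """Pick the specialist model for a sub-task, falling back to the
    core reasoning engine for anything unrecognized."""
    return ROUTING.get(task_type, ROUTING["reasoning"])

print(route("image"))         # nanobanana
print(route("unknown_task"))  # claude-opus-4.6
```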
Anthropic Faces Government Scrutiny Over AI Use
Anthropic has found itself at the center of controversy following a public disagreement with the U.S. government over the use of its AI models. The company said it would not allow its Claude models to be used for mass surveillance or autonomous weapons. This stance drew a public statement from former President Trump, who criticized Anthropic as a ‘radical woke company’ and ordered federal agencies to stop using its technology, mandating a six-month phase-out and threatening consequences for non-compliance. The episode highlights the ethical tensions and governmental oversight challenges surrounding the rapid deployment of advanced AI.
‘Quit GPT’ Movement Gains Momentum Amidst Ethical Concerns
A growing ‘Quit GPT’ movement, estimated to have impacted millions of users, reflects dissatisfaction with OpenAI’s direction. Key catalysts include a $25 million donation by OpenAI President Greg Brockman to a pro-Trump PAC, the reported use of ChatGPT in an ICE resume screening tool, and OpenAI’s acceptance of a Department of Defense contract that Anthropic had refused on ethical grounds related to surveillance and autonomous weapons. These events, coupled with perceived degradation in ChatGPT’s performance, particularly with the 5.2 model, have led users, including public figures, to seek alternatives like Claude.
Robotics Advancements: Memory, Intuition, and Humanoids Enter Factories
In robotics, Stanford researchers have developed FSM (Few-Shot Memory), a system enabling robots to learn physical principles in real-time without full retraining. FSM addresses the gap between abstract AI knowledge and real-world experience by using a three-tier memory system: episodic memory for raw experiences, hypothesis generation for understanding ‘why,’ and the promotion of verified principles for future actions. This approach significantly improves success rates in real-world tasks, allowing robots to develop intuitive learning capabilities akin to humans.
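The three-tier promotion flow described above can be sketched as a small data structure: raw episodes accumulate, candidate hypotheses gather supporting evidence, and a hypothesis is promoted to a reusable principle once it has been verified often enough. This is an illustrative toy, not Stanford’s implementation; the promotion threshold and all names are assumptions.

```python
class FewShotMemory:
    """Toy three-tier memory: episodes -> hypotheses -> verified principles."""

    def __init__(self, promote_after: int = 3):
        self.episodes = []        # tier 1: raw experiences
        self.hypotheses = {}      # tier 2: candidate "why" explanations
        self.principles = set()   # tier 3: verified, reusable principles
        self.promote_after = promote_after

    def record(self, observation: str, hypothesis: str) -> None:
        self.episodes.append(observation)
        # Count supporting evidence for each hypothesis...
        self.hypotheses[hypothesis] = self.hypotheses.get(hypothesis, 0) + 1
        # ...and promote it once it has been verified often enough.
        if self.hypotheses[hypothesis] >= self.promote_after:
            self.principles.add(hypothesis)

m = FewShotMemory()
for _ in range(3):
    m.record("cup slipped when gripped at the rim", "wet ceramic needs a wider grip")
print("wet ceramic needs a wider grip" in m.principles)  # True
```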
Physical Intelligence, a well-funded robotics AI startup, has introduced MEM (Multiskill Embodied Memory), combining short-term visual tracking with long-term natural language narratives. This allows robots to maintain focus for extended periods, enabling complex tasks like cleaning a kitchen or preparing a meal. The system differentiates between dense visual memory and summarized semantic events, enabling context adaptation and improved task execution, as demonstrated by a robot successfully adapting its grip strategy after an initial failure.
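The distinction MEM draws between dense visual memory and summarized semantic events can be pictured as two stores: a bounded buffer of recent observations that drops old detail, and an append-only narrative log that persists for long-horizon tasks. The sketch below is an assumption, not Physical Intelligence’s code; capacities and names are invented.

```python
from collections import deque

class EmbodiedMemory:
    """Toy two-stream memory: bounded dense buffer + persistent event log."""

    def __init__(self, dense_capacity: int = 5):
        self.dense = deque(maxlen=dense_capacity)  # short-term visual frames
        self.semantic = []                         # long-term event narrative

    def observe(self, frame: str) -> None:
        self.dense.append(frame)  # oldest frames fall out automatically

    def summarize(self, event: str) -> None:
        # Dense detail is transient, but the natural-language summary
        # persists and can inform later decisions (e.g. a new grip strategy).
        self.semantic.append(event)

mem = EmbodiedMemory(dense_capacity=2)
for frame in ["frame1", "frame2", "frame3"]:
    mem.observe(frame)
mem.summarize("first grip on the pan handle failed; switched to a two-hand grip")
print(list(mem.dense))    # ['frame2', 'frame3']
print(len(mem.semantic))  # 1
```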
Faraday Future has launched Embodied AI robots, rebranding Chinese-made AGIBot models, including a full-size humanoid starting at $35,000 and a more athletic version at $20,000. Meanwhile, BMW has deployed its first humanoid robots in a European factory, exploring their utility in manufacturing processes. While still in early stages, these developments signal a growing trend towards integrating humanoid robots into industrial settings, promising increased automation and efficiency.
Source: AI News – New Models From Google & OpenAI , AI Drama & Humanoids In Factories (YouTube)