Jun 01 - Jun 07, 2026

AI Weekly: Anthropic Files for IPO, Gemma 4 Runs on 16GB Laptops

Models & Releases

4 stories

Gemma 4 12B: Multimodal on a 16GB Laptop

Google released Gemma 4 12B (Apache 2.0), the first mid-sized Gemma with native audio input. It uses no separate vision or audio encoders and instead projects raw signals directly into token space.
Benchmark performance sits near that of the 26B MoE at less than half the memory; the model runs locally on 16GB of VRAM or unified memory via llama.cpp, MLX, Ollama, and LM Studio.
Google also launched the official Gemma Skills Repository, an agent skill library for building with the Gemma family, now at 150M+ total downloads.
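
As a rough back-of-the-envelope check on the 16GB figure, the sketch below estimates the weights-only footprint of a 12B-parameter model at a few precisions. The bit widths are illustrative assumptions (the summary above does not say which quantization is used), and the estimate ignores KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weights-only footprint in GB (1 GB = 1e9 bytes)."""
    # Parameters in billions times bytes per weight gives GB directly.
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Bit widths are assumptions for illustration, not figures from Google.
    for bits in (16, 8, 4):
        gb = weight_memory_gb(12, bits)
        print(f"12B params at {bits}-bit: ~{gb:.0f} GB of weights")
```

Under these assumptions, 16-bit weights alone come to about 24 GB, so a run within a 16GB budget would likely depend on 8-bit or lower quantization.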

MiniMax M3: Open-Weight, 1M Context, Frontier Coding

MiniMax released M3, an open-weight model with frontier coding and agentic capability built on MSA (MiniMax Sparse Attention), supporting a context window of up to 1M with a minimum of 512K.
The model ships with native multimodality and is immediately available on Ollama and SiliconFlow; the release was this week’s top post on r/LocalLLaMA.
M3 positions MiniMax as a serious open-weight competitor at the high end, directly targeting DeepSeek V4 and Qwen 3 in the open-weight coding agent category.

Microsoft Launches 7 MAI Models with Frontier Tuning

Microsoft AI launched seven MAI models trained from scratch on clean licensed data with zero distillation, co-designed with Maia 200 silicon for a 1.4x efficiency boost.
The headline capability is Frontier Tuning: RL in real-world environments using organizational workflow traces. MAI tuned for Excel matches GPT-5.4 at 10x lower cost, and MAI tuned for McKinsey achieved the highest win rate of any tested model at roughly 10x lower cost.
Microsoft and Mayo Clinic are also co-creating a frontier healthcare AI model trained on de-identified clinical data. The model will be owned by Mayo Clinic, deployed internally first, and then made available to other health systems via Azure Foundry.

JetBrains Open-Sources Mellum2: Fast MoE for Agent Pipelines

JetBrains open-sourced Mellum2 (Apache 2.0) — a 12B total / 2.5B active MoE ‘focal model’ designed for high-frequency, low-latency tasks in multi-model agent systems such as routing, RAG summarization, and planning steps.
Inference time is cut to less than half that of comparable models while remaining competitive on code generation, math, and reasoning benchmarks. Mellum2 handles both natural language and code, a major evolution from the original Mellum (code completion only, Apr 2025).
Technical report at arXiv:2605.31268; available on HuggingFace for private and local deployment.
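
To make the 12B total / 2.5B active split concrete, here is a small sketch of the arithmetic. It uses the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token. That rule and the function names are assumptions for illustration and do not come from the Mellum2 report.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters used per token in an MoE model."""
    return active_b / total_b


def approx_flops_per_token(active_b: float) -> float:
    """Rule-of-thumb forward-pass cost: ~2 FLOPs per active parameter."""
    return 2 * active_b * 1e9


if __name__ == "__main__":
    total_b, active_b = 12.0, 2.5  # figures quoted in the story above
    frac = active_fraction(total_b, active_b)
    moe = approx_flops_per_token(active_b)
    dense = approx_flops_per_token(total_b)  # hypothetical dense 12B model
    print(f"active share per token: {frac:.1%}")
    print(f"approx compute vs dense 12B: {dense / moe:.1f}x lower")
```

The compute saving applies per token only. All 12B parameters typically still need to be resident in memory, so the memory footprint follows the total count rather than the active count.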

People & Business

3 stories

Anthropic Files Confidential S-1 for IPO

Anthropic, PBC confidentially submitted Form S-1 to the SEC for a proposed IPO of common stock, described as potentially one of the largest AI listings ever attempted.
The filing comes weeks after the Series H at a $965B valuation with $47B ARR, signalling the company is moving quickly to public markets while its financial position is at a historic high.
No pricing or timeline has been disclosed; the confidential S-1 process gives Anthropic flexibility to gauge institutional appetite before a public filing.

Meta Launches Enterprise AI Business Agent

Meta unveiled an enterprise AI agent aimed at automating day-to-day business operations, entering the enterprise AI race directly against Microsoft Copilot and Salesforce Agentforce.
The agent is positioned to handle multi-step operational workflows across departments, extending Meta’s AI reach well beyond its consumer social platforms.
The move follows Meta’s 8,000-person restructuring toward AI units in May and reinforces the company’s intent to compete at the enterprise layer, not just the consumer layer.

OpenAI Frontier Models and Codex GA on AWS

OpenAI’s frontier models and Codex are now generally available on AWS in both Commercial and GovCloud regions, removing procurement, security, and governance friction for enterprise customers.
Codex (5M+ weekly users) is now natively available inside AWS developer environments via Amazon Bedrock. Also coming to AWS is Daybreak, OpenAI’s cyber/security suite, which includes threat modeling, patch validation, and dependency risk analysis.
This is the full GA launch following the initial Bedrock announcement covered in the May 3 edition; the key additions are Codex and the GovCloud region, completing the security and compliance picture.

Policy & Ethics

2 stories

Bipartisan Bill Proposes 3-Year Preemption of State AI Laws

A 269-page bipartisan House draft from Reps. Obernolte and Trahan — dubbed ‘The Great American Artificial Intelligence Act’ — proposes a 3-year federal preemption of all state AI model development laws, including New York and California safety protocol requirements.
Frontier AI developers (OpenAI, Anthropic, Google DeepMind, xAI) would be required to implement catastrophic risk mitigation plans, providing a federal floor while blocking the emerging patchwork of state regimes.
The bill arrives days after Trump signed an EO for voluntary federal agency reviews of frontier models, and shortly after Connecticut’s SB5 passed the state Senate 32-4 in May, illustrating the accelerating tension between state and federal AI governance.

Anthropic Expands Project Glasswing to 150+ Orgs

Anthropic expanded Project Glasswing — the joint industry initiative using Claude Mythos to find and fix critical software vulnerabilities — to approximately 150 new organizations across 15+ countries, adding Claude Security for codebase scanning and patch suggestions.
Plans include sharing vulnerability-finding tools directly with trusted security teams, deepening the coalition that already reported 10,000+ high and critical-severity bugs in its first month.
The global expansion signals a shift from a US-centric coalition to a multinational defensive security infrastructure, with critical infrastructure operators now participating across Europe, Asia-Pacific, and Latin America.

Products & Hardware

4 stories

NVIDIA Cosmos 3: Open Physical AI for Robotics and AVs

Announced at COMPUTEX 2026 by Jensen Huang, NVIDIA Cosmos 3 is an open physical AI world foundation model offering four capabilities: vision-language reasoning for real-time alerts and logistics, World Action Models (WAMs) for robot policy learning, physics-grounded world simulation for closed-loop evaluation, and synthetic video data generation from text/image/video/audio/action inputs.
The open ecosystem includes Cosmos Curator (data curation), Cosmos Evaluator (scoring generative outputs), and open post-training/inference frameworks; optimised for NVIDIA RTX PRO 6000 Blackwell and GB200.
Cosmos 3 targets robotics, autonomous vehicles, and industrial vision — providing the synthetic training data pipeline that physical AI teams previously had to build themselves.

NVIDIA GTC Taipei: Nemotron 3 Ultra and OpenShell

At GTC Taipei, NVIDIA announced Nemotron 3 Ultra — a 550B MoE model with 5x faster inference and 30% lower cost vs open frontier peers, post-trained for LangChain, OpenHands, OpenClaw, Hermes Agent, and OpenCode; available on HuggingFace, OpenRouter, and NIM.
NemoClaw is an open blueprint framework connecting Nemotron models to enterprise harnesses, already deployed at Cadence (autonomous chip design/verification), Dassault Systèmes, Siemens Fuse EDA, Synopsys, and Foxconn (Nurabot clinical AI + MoMClaw factory ops).
OpenShell is a secure agent runtime with policy and privacy controls, partnering with Microsoft (Windows security primitives), Canonical (Ubuntu snaps), Red Hat, SAP (Joule Studio), and ServiceNow (Project Arc); CUDA-X libraries are available as agent skills in the Claude Code plug-in marketplace and Hermes Skills Hub.

Codex for Every Role: Sites, Annotations, and Plugins

OpenAI expanded Codex with six role-specific plugins (including ones for analysts, marketers, sales, designers, and investors), Codex Sites (deploy hosted internal apps from a prompt), and Annotations (in-place editing of documents).
The expansion broadens Codex beyond software engineering to every knowledge-worker role, following the GA launch on AWS announced the same week and building on 5M+ weekly active users.
Codex Sites in particular creates a new category — a non-technical user can deploy a functional internal web app by describing it, with no code written and no infrastructure managed.

GPT-Rosalind Gets Agentic Coding and Bioinformatics Plugins

OpenAI updated GPT-Rosalind with stronger agentic coding, drug-discovery, and genomics performance, plus new plugins for evidence retrieval and bioinformatics workflows.
The update builds on the original GPT-Rosalind launch covered in the April 18 edition, which introduced the model as OpenAI’s first domain-specific offering for life sciences; this release adds operational tooling for researchers and clinical workflows.
See also: Microsoft and Mayo Clinic’s frontier healthcare AI model announced the same week — together they signal a race to own the clinical AI layer in 2026.

Research & Resources

2 stories

OpenAI Dreaming V3: Background Memory Synthesis for ChatGPT

OpenAI launched Dreaming V3, a compute-efficient background memory synthesis system that replaces ChatGPT’s saved memories. Memories now update automatically over time, correct stale context, and can be reviewed on a memory summary page.
The system achieves approximately 5x lower compute cost than the previous approach; it is rolling out to Plus/Pro users in the US now, with rollout to Free users over the coming weeks.
Dreaming V3 shifts ChatGPT memory from an explicit user-managed list to a continuously maintained model of each user — a significant architectural change in how personalisation is delivered at scale.

Self-Evolving Agents: Updater Strength Doesn't Matter, Receiver Does

arXiv:2605.30621 disentangles two distinct capabilities in self-evolving LLM agents: harness-updating (writing improved prompts/skills/memory) is flat across capability tiers — a 9B Qwen model produces updates yielding the same gains as Claude Opus 4.6.
Harness-benefit is non-monotonic: weak models gain little from updates, mid-tier models benefit most, and strong models gain less than mid-tier ones. The practical takeaway is to invest capability budget in the task-solving agent, not the evolver.
The finding connects to SkillOpt (covered May 10), which showed a similar asymmetry in skill optimization gains; code at github.com/A-EVO-Lab/a-evolve.

Deepak Baby

Senior Data Scientist at KBC Bank & Verzekering

© 2026 Deepak Baby