Jun 01 - Jun 07, 2026

AI Weekly: Anthropic Files for IPO, Gemma 4 Runs on 16GB Laptops

Models & Releases

4 stories

MiniMax M3: Open-Weight, 1M Context, Frontier Coding

  • MiniMax released M3, an open-weight model with frontier coding and agentic capability built on MSA (MiniMax Sparse Attention), supporting a context window of up to 1M with a minimum of 512K.
  • The model ships with native multimodality and is immediately available on Ollama and SiliconFlow. The release was this week’s top post on r/LocalLLaMA.
  • M3 positions MiniMax as a serious open-weight competitor at the high end, directly targeting DeepSeek V4 and Qwen 3 in the open-weight coding agent category.

Microsoft Launches 7 MAI Models with Frontier Tuning

  • Microsoft AI launched seven MAI models trained from scratch on clean licensed data with zero distillation, co-designed with Maia 200 silicon for a 1.4x efficiency boost.
  • The headline capability is Frontier Tuning: reinforcement learning (RL) in real-world environments using organizational workflow traces. MAI tuned for Excel matches GPT-5.4 at 10x lower cost, and MAI tuned for McKinsey achieved the highest win rate of any tested model at roughly 10x lower cost.
  • Microsoft and Mayo Clinic are also co-creating a frontier healthcare AI model trained on de-identified clinical data. The model is owned by Mayo Clinic and is to be deployed internally first, then made available to other health systems via Azure Foundry.

JetBrains Open-Sources Mellum2: Fast MoE for Agent Pipelines

  • JetBrains open-sourced Mellum2 (Apache 2.0) — a 12B total / 2.5B active MoE ‘focal model’ designed for high-frequency, low-latency tasks in multi-model agent systems such as routing, RAG summarization, and planning steps.
  • Inference time is cut to less than half that of comparable models while remaining competitive on code generation, math, and reasoning benchmarks. Mellum2 handles both natural language and code, a major evolution from the original Mellum (code completion only, Apr 2025).
  • Technical report at arXiv:2605.31268; available on HuggingFace for private and local deployment.
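
As a rough illustration of the 12B total / 2.5B active figures above, the share of parameters used for each token can be computed directly. This is a back-of-envelope sketch only: the names are illustrative, and parameter share is a loose proxy for compute, not a description of how JetBrains measured inference time.

```python
# Back-of-envelope arithmetic for a sparse MoE model,
# using the parameter counts quoted in the story above.
TOTAL_PARAMS_B = 12.0   # total parameters, in billions
ACTIVE_PARAMS_B = 2.5   # parameters active per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used for each token."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {fraction:.1%}")
```

Roughly one fifth of the weights are active for any given token, which is the general reason sparse MoE models can run faster than dense models of similar total size.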

People & Business

3 stories

Meta Launches Enterprise AI Business Agent

  • Meta unveiled an enterprise AI agent aimed at automating day-to-day business operations, entering the enterprise AI race directly against Microsoft Copilot and Salesforce Agentforce.
  • The agent is positioned to handle multi-step operational workflows across departments, extending Meta’s AI reach well beyond its consumer social platforms.
  • The move follows Meta’s 8,000-person restructuring toward AI units in May and reinforces the company’s intent to compete at the enterprise layer, not just the consumer layer.

OpenAI Frontier Models and Codex GA on AWS

  • OpenAI’s frontier models and Codex are now generally available on AWS in both Commercial and GovCloud regions, removing procurement, security, and governance friction for enterprise customers.
  • Codex (5M+ weekly users) is now natively inside AWS developer environments via Amazon Bedrock. Also coming to AWS is Daybreak, OpenAI’s cyber/security suite, which includes threat modeling, patch validation, and dependency risk analysis.
  • This is the full GA launch after the initial Bedrock announcement in the May 3 edition; the key additions are Codex and the GovCloud region, completing the security and compliance picture.

Policy & Ethics

2 stories

Anthropic Expands Project Glasswing to 150+ Orgs

  • Anthropic expanded Project Glasswing — the joint industry initiative using Claude Mythos to find and fix critical software vulnerabilities — to approximately 150 new organizations across 15+ countries, adding Claude Security for codebase scanning and patch suggestions.
  • Plans include sharing vulnerability-finding tools directly with trusted security teams, deepening the coalition that already reported 10,000+ high and critical-severity bugs in its first month.
  • The global expansion signals a shift from a US-centric coalition to a multinational defensive security infrastructure, with critical infrastructure operators now participating across Europe, Asia-Pacific, and Latin America.

Products & Hardware

4 stories

NVIDIA GTC Taipei: Nemotron 3 Ultra and OpenShell

  • At GTC Taipei, NVIDIA announced Nemotron 3 Ultra — a 550B MoE model with 5x faster inference and 30% lower cost vs open frontier peers, post-trained for LangChain, OpenHands, OpenClaw, Hermes Agent, and OpenCode; available on HuggingFace, OpenRouter, and NIM.
  • NemoClaw is an open blueprint framework connecting Nemotron models to enterprise harnesses, already deployed at Cadence (autonomous chip design/verification), Dassault Systèmes, Siemens Fuse EDA, Synopsys, and Foxconn (Nurabot clinical AI + MoMClaw factory ops).
  • OpenShell is a secure agent runtime with policy and privacy controls, partnering with Microsoft (Windows security primitives), Canonical (Ubuntu snaps), Red Hat, SAP (Joule Studio), and ServiceNow (Project Arc); CUDA-X libraries are available as agent skills in the Claude Code plug-in marketplace and Hermes Skills Hub.

Codex for Every Role: Sites, Annotations, and Plugins

  • OpenAI expanded Codex with six role-specific plugins (including ones for analysts, marketers, sales, designers, and investors), Codex Sites (deploy hosted internal apps from a prompt), and Annotations (in-place editing of documents).
  • The expansion broadens Codex beyond software engineering to every knowledge-worker role, following the GA launch on AWS announced the same week and building on 5M+ weekly active users.
  • Codex Sites in particular creates a new category — a non-technical user can deploy a functional internal web app by describing it, with no code written and no infrastructure managed.

GPT-Rosalind Gets Agentic Coding and Bioinformatics Plugins

Research & Resources

2 stories

Self-Evolving Agents: Updater Strength Doesn't Matter, Receiver Does

  • arXiv:2605.30621 disentangles two distinct capabilities in self-evolving LLM agents: harness-updating (writing improved prompts/skills/memory) is flat across capability tiers — a 9B Qwen model produces updates yielding the same gains as Claude Opus 4.6.
  • Harness-benefit is non-monotonic: weak models get little benefit from updates, mid-tier models benefit most, and strong models gain less than mid-tier ones. The practical takeaway is to invest capability budget in the task-solving agent, not the evolver.
  • The finding connects to SkillOpt (covered May 10), which showed a similar asymmetry in skill optimization gains; code at github.com/A-EVO-Lab/a-evolve.
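
The two capabilities the paper separates can be pictured as a two-factor grid: vary the model that writes the harness update, vary the model that uses it, and record the score gain for each pairing. The sketch below is a hypothetical illustration of that bookkeeping only. `harness_gain`, `gain_grid`, and the `evaluate` callback are made-up names, and nothing here is taken from the a-evolve code.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical callback: evaluate(receiver, updater) returns a task score.
# updater=None means the receiver runs with its original, un-updated harness.
Evaluate = Callable[[str, Optional[str]], float]


def harness_gain(evaluate: Evaluate, receiver: str, updater: str) -> float:
    """Score gain for `receiver` when it uses a harness written by `updater`."""
    return evaluate(receiver, updater) - evaluate(receiver, None)


def gain_grid(
    evaluate: Evaluate,
    updaters: Sequence[str],
    receivers: Sequence[str],
) -> Dict[Tuple[str, str], float]:
    """Map every (updater, receiver) pair to its harness gain."""
    return {
        (updater, receiver): harness_gain(evaluate, receiver, updater)
        for updater in updaters
        for receiver in receivers
    }
```

Comparing gains across updaters for a fixed receiver isolates the harness-updating effect, while comparing across receivers for a fixed updater isolates the harness-benefit effect. The paper's claim, as summarized above, is that the first comparison is roughly flat and the second is not.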