Apr 19 - Apr 25, 2026

AI Weekly: GPT-5.5 vs DeepSeek-V4, SpaceX Bets $60B on Cursor

Models & Releases

4 stories

DeepSeek-V4-Pro: Frontier-Class Intelligence at One-Sixth the Price

  • DeepSeek released V4-Pro (1.6T parameters, 49B active) and V4-Flash (284B, 13B active), both with 1M-token context, under an MIT licence. Arriving a year after the original R1 release, the launch is being hailed as a second DeepSeek moment.
  • At $5.22 per million tokens (blended), it costs roughly one-seventh as much as GPT-5.5 ($35.00) and one-sixth as much as Opus 4.7 ($30.00) for comparable performance on most benchmarks.
  • Key architectural advance: a hybrid CSA+HCA attention mechanism that reduces 1M-context inference to just 27% of the FLOPs and 10% of the KV cache required by V3.2.
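The KV-cache reduction matters because at 1M-token context the cache, not the model weights, dominates memory. A back-of-envelope sketch of why, using a standard attention-cache formula with purely illustrative dimensions (V4-Pro's actual layer count, head count, and head size are assumptions here, not published figures):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Size of a conventional attention KV cache in bytes.

    K and V each store layers * kv_heads * head_dim values per token,
    hence the leading factor of 2.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical configuration for illustration only
full = kv_cache_bytes(layers=60, kv_heads=8, head_dim=128, seq_len=1_000_000)
print(f"conventional cache at 1M tokens: {full / 2**30:.1f} GiB")
print(f"at the quoted 10% of that:       {0.10 * full / 2**30:.1f} GiB")
```

Even under these modest assumed dimensions, a conventional cache at 1M tokens runs to hundreds of gigabytes, so a 10x cache reduction is the difference between multi-node and single-node serving.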

Kimi K2.6: Open-Source Agent Overhauled a Production Codebase Over 13 Hours

  • Moonshot AI open-sourced Kimi K2.6, featuring long-horizon execution and agent swarm capabilities built for real engineering tasks, not just benchmarks.
  • In one demo, the model autonomously overhauled an 8-year-old financial matching engine over 13 hours and 1,000+ tool calls, delivering a 185% throughput improvement without human intervention.
  • A second demo deployed and optimised Qwen3.5-0.8B inference in Zig — a niche, out-of-distribution language — over 12 hours, achieving 20% higher throughput than LM Studio.

Qwen 3.6 27B: Dense Model Outperforms Its Own 397B MoE

  • Alibaba's Qwen team released Qwen 3.6 27B, a dense model that outperforms the much larger Qwen 397B MoE variant on several key benchmarks — a striking demonstration of training efficiency over raw scale.
  • Available under Apache 2.0, it runs on consumer-grade hardware and has quickly become the top discussion thread on r/LocalLLaMA this week.
  • Continues the Qwen series' track record of punching above its weight class; follows last week's Qwen3.6-35B-A3B MoE release.

People & Business

4 stories

Cohere Acquires Aleph Alpha to Create $20B Transatlantic AI Company

  • Canadian AI lab Cohere is acquiring German startup Aleph Alpha in a deal that creates a $20 billion combined entity, with Cohere shareholders receiving ~90% and Aleph Alpha shareholders ~10%.
  • The deal is explicitly positioned to give European governments and enterprises an AI provider that is not a US tech giant; the German government is set to become an anchor customer.
  • Both governments facilitated the merger — the first major transatlantic AI consolidation, and a signal that the European AI sovereignty push is moving from policy to M&A.

Anthropic Commits $100 Billion to AWS Over 10 Years for 5 Gigawatts of Compute

  • Anthropic signed a new agreement with Amazon securing up to 5 gigawatts of compute capacity covering Trainium2 (shipping Q2 2026), Trainium3, Trainium4, and future custom silicon generations.
  • The full Claude Platform will now be available natively inside AWS — same account, same billing, no new credentials or contracts required for existing AWS customers.
  • Arriving the same week as Meta's deal for tens of millions of Graviton cores, the agreement consolidates AWS as the infrastructure backbone for multiple competing frontier AI labs simultaneously.

Anthropic Postmortem: Three Bugs, One Month of Degraded Claude Code

  • Anthropic published a detailed postmortem explaining that three separate bugs — a reasoning effort downgrade in March, a memory-clearing bug in late March, and a verbosity prompt change in April — combined to produce month-long quality complaints.
  • Because each change affected a different slice of traffic on a different timeline, the aggregate effect looked like unexplained broad degradation that was difficult to distinguish from normal feedback variation.
  • All three issues are resolved as of v2.1.116, and Anthropic reset usage limits for all subscribers as a goodwill gesture. The postmortem landed the same day GPT-5.5 launched.

Policy & Ethics

2 stories

UK Government Considers Ending Palantir's NHS Contract

  • The UK government is reportedly considering invoking the break clause on Palantir's NHS Federated Data Platform contract, following sustained pressure from MPs, unions, and digital rights campaigners over data privacy concerns.
  • The contract has been controversial since its award, with critics arguing the platform gives a US defence contractor excessive access to NHS patient data.
  • If invoked, it would be one of the largest public-sector AI contract cancellations in the UK and would significantly set back the NHS digital data agenda.

Products & Hardware

3 stories

ChatGPT Images 2.0: Reasoning Mode, 2K Resolution, Magazine-Quality Layouts

  • OpenAI's Images 2.0 adds a 'thinking' mode with built-in compositional reasoning, up to 2K resolution, and dramatically improved text rendering — handling small text, UI elements, iconography, and dense layouts that break most image models.
  • It has a December 2025 knowledge cutoff and can handle end-to-end creative workflows from copywriting through design composition, OpenAI's response to the previous week's Claude Design launch.
  • Available now to ChatGPT users with flexible aspect ratio support.

Grok Voice Think Fast 1.0 Launches, Tops Voice Benchmark, Ships on Starlink

  • xAI launched Grok Voice Think Fast 1.0, a voice model designed for complex multi-step enterprise workflows including customer support, sales, and operational automation.
  • It tops the Tau Voice Bench leaderboard and ships directly embedded in Starlink — the first frontier AI voice model deployed inside a satellite internet service.
  • Extends the Audio AI race (alongside Mistral Voxtral, Gemini Flash Live, MAI-Voice-1) into a new distribution channel: satellite connectivity infrastructure.

Research & Resources

2 stories

FairyFuse: 29x Speedup for Ternary LLMs on CPUs, Zero Multiplications

  • FairyFuse runs ternary-weight LLMs on commodity CPUs using AVX-512 fused masked add/subtract loops — eliminating floating-point multiplications entirely from the inference hot path.
  • Achieves a 29.6x kernel speedup and 32.4 tokens/second on a single Intel Xeon 8558P, outperforming llama.cpp Q4_K_M by 1.24x with near-lossless quality (WikiText-2 perplexity 5.52 vs 5.47 for FP16).
  • Key insight: ternary 16x weight compression shifts the memory-bound GEMV bottleneck toward compute — which is exactly where AVX-512 wins.
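The multiplication-free trick is simple to see in scalar form: with weights restricted to {-1, 0, +1}, every term of a dot product is either the activation itself, its negation, or nothing, so the matrix-vector product reduces to masked adds and subtracts. A minimal sketch (FairyFuse's actual kernel works on packed weights with AVX-512 masked-lane instructions; this scalar Python version only illustrates the arithmetic identity):

```python
def ternary_gemv(W, x):
    """Compute y = W @ x using only additions and subtractions.

    W: list of rows, each a list of ternary weights in {-1, 0, +1}
    x: activation vector of floats
    """
    y = []
    for row in W:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:       # corresponds to a masked-add lane
                acc += xi
            elif w == -1:    # corresponds to a masked-subtract lane
                acc -= xi
            # w == 0: no work at all, and no multiply anywhere
        y.append(acc)
    return y

W = [[1, 0, -1], [-1, 1, 1]]
x = [2.0, 3.0, 4.0]
print(ternary_gemv(W, x))  # [-2.0, 5.0]
```

The vectorised version replaces the per-element branch with two precomputed bitmasks per weight row (one for +1 positions, one for -1), which is what makes the hot path a stream of fused masked add/subtract instructions with no floating-point multiplies.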