June 5, 2026 · Friday

Anthropic Data: Over 80% of Code Written by Claude, AI Accelerates Self-Evolution

Internal data shows task durations doubling every 4 months. Over 80% of code is now authored by Claude. Recursive self-improvement — AI designing more powerful successors — may arrive faster than expected.

Anthropic released internal data indicating that AI is accelerating its own development. Quarterly code output per engineer has increased 8× compared to 2021–2025 levels. AI task horizons have expanded from 4 minutes to 12 hours, doubling every four months. Benchmarks including SWE-bench and CORE-Bench have saturated within two years and fifteen months respectively. As of May 2026, over 80 percent of the company's code is written by Claude, which reshapes the development pipeline and raises urgent questions about the pace of AI advancement. The trajectory points toward recursive self-improvement, in which AI systems autonomously design more capable successors; this has not yet been realized, but may be approaching faster than most researchers anticipated.
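As a rough consistency check on the figures above, the number of doublings needed to go from a 4-minute to a 12-hour task horizon can be computed directly. The sketch below assumes clean exponential growth with a fixed 4-month doubling period, which is a simplification; the function name is illustrative.

```python
import math

def doublings_between(start_minutes: float, end_minutes: float) -> float:
    """Number of doublings needed to grow from start to end."""
    return math.log2(end_minutes / start_minutes)

start = 4            # minutes, from the article
end = 12 * 60        # 12 hours expressed in minutes
doubling_months = 4  # doubling period from the article

n = doublings_between(start, end)
months = n * doubling_months
print(f"{n:.1f} doublings, about {months:.0f} months")
```

Under these assumptions the expansion corresponds to roughly 7.5 doublings, or about two and a half years of growth at the stated rate.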

NVIDIA Nemotron 3 Ultra — a 550B MoE frontier open model for long-running agents.

NVIDIA Launches Nemotron 3 Ultra: 550B MoE Open Model Built for Agents

A 550B total, 55B active hybrid Transformer-Mamba MoE with 1M context delivers 5× inference speedup and 30% lower cost for agentic tasks.

NVIDIA released Nemotron 3 Ultra, a frontier open model designed from the ground up for long-running autonomous agents. The hybrid Transformer-Mamba mixture-of-experts architecture packs 550 billion total parameters with only 55 billion active, achieving 5× faster inference and up to 30% lower cost on complex agentic workloads. With a 1-million-token context window and support for multi-token prediction speculative decoding, the model targets coding, research, and enterprise workflows requiring sustained reasoning, planning, and tool use over extended horizons. Weights are available under an open license on Hugging Face, with day-zero serving support from vLLM.

NVIDIA Unveils Cosmos 3: The First Universal World Foundation Model

Cosmos 3 understands and generates across text, image, video, sound, and action — a unified framework for physical AI.

NVIDIA introduced Cosmos 3, billed as the world's first open omnimodel for physical AI. It ingests and produces five modalities (text, images, video, audio, and action) within one framework. Built on a new architecture, Cosmos 3 is intended as the foundational layer for robots, autonomous vehicles, and any system that must understand the physical world before acting within it. The model is open and available for research and commercial use.

OpenAI Model Finds Counterexample to 80-Year-Old Erdős Conjecture

An OpenAI model discovered a counterexample to an eighty-year-old conjecture of Erdős. Researchers Alex Wei, Hongxun Wu, and Wojciech Zaremba discussed the result on the OpenAI Podcast, presenting it as a new paradigm for collaboration between mathematicians and AI.

ChatGPT Memory System Receives Major Upgrade, Persistent Across Conversations

OpenAI introduced an upgraded ChatGPT memory system that carries context across conversations. ChatGPT can now retain relevant information over time, so interactions are more personalized and users do not need to re-establish context.

OpenAI API Adds Inline Moderation Scores

Developers can now receive content moderation signals for both input and output within the same Responses API or Completions API request, using the omni-moderation-latest model. The feature supports text and images, is free of charge, and leaves it to applications to decide how to use the scores for logging, routing, review, or blocking.
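How an application acts on such scores is left to the developer. A minimal sketch of one possible policy follows; the category names, thresholds, and the shape of the `scores` dictionary are assumptions for illustration and are not taken from the OpenAI API response format.

```python
from typing import Dict

# Hypothetical thresholds; tune per application and per category.
BLOCK_AT = 0.90
REVIEW_AT = 0.50
LOG_AT = 0.10

def route(scores: Dict[str, float]) -> str:
    """Map per-category moderation scores (0.0 to 1.0) to an action."""
    # Act on the highest-scoring category; an empty dict counts as clean.
    worst = max(scores.values(), default=0.0)
    if worst >= BLOCK_AT:
        return "block"
    if worst >= REVIEW_AT:
        return "review"
    if worst >= LOG_AT:
        return "log"
    return "allow"

# Example with made-up scores for a single message.
example = {"violence": 0.02, "harassment": 0.61}
print(route(example))
```

Keying the decision on the single worst category keeps the policy easy to audit; a real deployment would more likely use per-category thresholds.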

Codex Introduces Personal Profile Showing Activity and Usage Data

OpenAI Codex now features a personal profile page displaying activity graphs, coding streaks, lifetime token counts, peak daily tokens, and top-used features. Profiles are private by default with an option to share a card publicly.

Codex Launches Build iOS Apps Plugin with Live Preview

The new Build iOS Apps plugin lets Codex view and test iOS applications in the in-app browser, open SwiftUI previews, and hot reload edits without leaving the Codex environment.

Jensen Huang: Agents Become a New Layer of Enterprise Software

NVIDIA CEO Jensen Huang explained that companies including Cadence, CrowdStrike, SAP, ServiceNow, Siemens, and Synopsys are building agents on NVIDIA. He emphasized that the opportunity for software partners is only beginning as agentic AI becomes enterprise infrastructure.

NVIDIA Releases Physical AI Agent Skills at CVPR 2026

At CVPR 2026, NVIDIA announced composable workflows that automate data generation, simulation, and policy training for autonomous vehicles, robots, and vision AI, aiming to speed development for teams that cannot collect sufficient real-world data on their own.

Sakana AI Plans to Build Japan's First 1T Parameter Model

Sakana AI's founder revealed plans to build Japan's first 1-trillion-parameter agent-native model through the GENIAC initiative of Japan's Ministry of Economy, Trade and Industry (METI). The model will be optimized for long-horizon deep research and autonomous reasoning.

LM Studio Releases Mobile App, Local Models on the Go

LM Studio launched a mobile application that lets users run their local models directly on their phones, bringing offline AI inference to a pocket-sized form factor.

Safety by narrow control has shown to fail many times. Need more transparency on the absolute frontier, and openness close behind.
Nathan Lambert, AI Safety Researcher

Agents & Ecosystem · 06.05

Small Business

Perplexity and SBA Launch Main Street AI Accelerator

Perplexity partnered with the U.S. Small Business Administration, committing $25 million in compute credits — $250 each for up to 100,000 companies — to accelerate AI adoption among American small businesses.

Platform

Perplexity Computer Will Integrate All Business Connectors

Perplexity's CEO announced that the Computer platform will add all the connectors needed to start and run a business, so that anyone with an idea and a small team can build a growing company faster.

Hardware

NVIDIA DGX Spark Update Boosts Inference Speed 2.6×

NVIDIA DGX Spark updates, announced at GTC Taipei during COMPUTEX, simplify local agent workflows and accelerate inference by up to 2.6× via NVIDIA NemoClaw.

E-Commerce

Replit Partners with Shopify: Build an Online Store in Minutes

Replit Agent now integrates with Shopify. Describe what you want to sell and the agent builds a custom storefront, creates your Shopify store, and helps add products — go live in minutes.

Productivity

Cursor Adds Canvas Sharing Feature

Cursor now supports publishing canvases — dashboards, reports, and internal tools — as shareable URLs, enabling team collaboration without leaving the editor.

Inference

Step 3.7 Flash Deployed on Fireworks AI at 400 Tokens/s

StepFun's Step 3.7 Flash model is now available on Fireworks AI, where multi-token-prediction (MTP) assisted decoding reaches 400 tokens per second. The model is aimed at agents running in production.

Models & Research · 06.05

Benchmarks

LlamaIndex Unveils ParseBench at CVPR 2026

ParseBench is the first document parsing benchmark designed specifically for AI agents, treating document understanding as an AGI-complete problem.

Speech AI

Nemotron Parakeet ASR Achieves 97.7% Accuracy on Indonesian

Rafiqspace.ai fine-tuned Nemotron Parakeet ASR to 97.7% accuracy (2.3% WER) for Bahasa Indonesia, cutting transcription costs by up to 90%.
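For context, word error rate (WER) is conventionally computed as the word-level edit distance between a reference transcript and the system output, divided by the number of reference words; the accuracy figure quoted here corresponds to 1 − WER. The sketch below is a generic illustration with made-up sentences, not the evaluation code used by Rafiqspace.ai.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes the reference contains at least one word.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# Made-up example: the hypothesis drops the last word of the reference.
ref = "saya pergi ke pasar pagi ini"
hyp = "saya pergi ke pasar pagi"
wer = word_error_rate(ref, hyp)
print(f"WER = {wer:.3f}, accuracy = {1 - wer:.3f}")
```

Because insertions count as errors, WER can exceed 100% on very poor output, so "accuracy" in this sense is a reporting convention and not a bounded score.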

Defense AI

Cohere Wins NATO Cognitive Warfare AI Challenge

Cohere took first place in NATO's Agentic AI for Cognitive Warfare Innovation Challenge, ahead of OpenMinds, Ipsos, and Thoughtworks.

Video AI

Runway Launches Aleph 2.0 Precise Editing

Runway's Aleph 2.0 provides finer video editing control, changing only user-specified parts while keeping the rest of the frame untouched.

Consumer AI

Pika Launches First In-App Group Chat AI Agent

Pika introduced an in-app group chat where AI agents can help with phone updates, create memes, and collaborate on creative projects.

Open Source

Ollama Supports Gemma 4 12B Across All Platforms

Ollama now runs Gemma 4 12B, launchable inside Claude Code, Hermes Agent, OpenClaw, and Codex via the ollama launch command.

Models

Jeff Dean Recommends Gemma 4 12B for Laptops

Google chief scientist Jeff Dean called Gemma 4 12B a super capable open-weight model that runs directly on a laptop.

Image Gen

Ideogram 4 Goes Open Weights, Ranks as Best Open Image Model

Ideogram 4 is Ideogram's first open-weight text-to-image model, trained from scratch with 9.3B parameters, structured JSON prompts, multilingual text rendering, and native 2K output.

Education

Free vLLM Community Course Covers Full Deployment Optimization

Red Hat and DeepLearning.AI jointly released a free vLLM course with three hands-on labs covering quantization, deployment, and benchmarking on live vLLM servers.

Education

Andrew Ng Launches New Course on Serving LLMs Efficiently

DeepLearning.AI teamed up with Red Hat to teach how to serve models to many concurrent users at low latency and reasonable cost, covering efficient memory management for large parameter models.

Growth

Runway Sees 50% Token Growth in 6 Weeks

Runway's CEO shared that token consumption grew 50% in six weeks, power users rose 140%, and enterprise net dollar retention (NDR) hit 300% as the platform becomes more embedded in daily workflows.

Open Source

Nathan Lambert: US Open Model Labs Reverse the Decline

Since last June, releases from NVIDIA, Ai2, and Arcee, along with Gemma and GPT-OSS, have put the US back on the map for open AI, a dramatic reversal from being "totally owned" a year ago.