May 8, 2026 · Friday


Anthropic trains Claude to translate its own neural activations into human-readable text

A natural language autoencoder opens a new window into how frontier models think internally.

Anthropic researchers have trained Claude to decode its own internal activations — the high-dimensional numerical vectors that encode the model's reasoning — into plain English summaries. While models like Claude communicate in words, their internal thought process has remained opaque even to their creators. The natural language autoencoder bridges this gap, offering a powerful new tool for model interpretability. By translating activations into readable descriptions of what the model is considering at each step, the technique could help researchers audit model behavior, detect unwanted reasoning patterns, and build safer AI systems.

OpenAI Codex now runs inside Chrome with background multi-tab parallel execution

The coding agent gets a Chrome extension, working across tabs without taking over the browser.

Codex can now operate directly in Chrome on macOS and Windows through a new browser extension. It runs tasks in background tabs, gathering context across multiple pages and using Chrome DevTools in parallel, all while keeping results organized without hijacking the user's browser interface. The extension supports testing web applications, collecting cross-tab context, and executing multi-step development workflows autonomously. Combined with Codex App's existing features, this makes the coding agent a practical tool for everyday browser-based development tasks.


People are really starting to use voice to interact with AI, especially when they have a lot of context to dump. GPT-Realtime-2 comes to the API today; it is a pretty big step forward.

Sam Altman, OpenAI CEO

xAI API launches Image Generation Quality Mode, over 300 million images generated on Grok

xAI introduced Image Generation Quality Mode in its API, improving photorealism and text rendering. The model has already powered more than 300 million image generations on the Grok platform.

Perplexity launches Personal Computer as a Mac app that works across local files and apps

Perplexity's Personal Computer is now available via a new Mac app, executing tasks across local files, native Mac applications, the web, and Perplexity's secure servers.

Cursor launches /orchestrate skill for recursive agent spawning on complex tasks

Cursor's new skill recursively generates agents for ambitious workflows. Internal use cut token consumption by 20% and reduced backend cold start times by 80%.

PhysForge generates physically grounded 3D assets for interactive virtual worlds

A decoupled two-stage framework using physics blueprint planning and physics-guided diffusion models produces functional, simulation-ready 3D assets. Accepted at ICML 2026.

RLDX-1 robot policy achieves 86.8% success rate on ALLEX humanoid tasks

Built on a multi-stream action Transformer architecture, RLDX-1 far surpasses π0.5 and GR00T N1.6, which scored approximately 40% on the same dexterous manipulation benchmarks.

Mozilla uses Claude Mythos preview to harden Firefox, confirms it is not hype

Mozilla's security team validated Claude Mythos by finding and reproducing real-world bugs in Firefox while filtering out false positives. The general-purpose model's exploit-finding capability proved genuine.


Briefing 05.08
OPEN SOURCE

Anthropic donates alignment tool Petri to non-profit Meridian Labs

Anthropic donated its open-source alignment testing tool Petri to Meridian Labs, releasing a major update that improves test adaptability, realism, and depth. Petri has been used to evaluate every Claude model since Sonnet 4.5.

PLATFORM

OpenAI ships three new voice models: conversation, translation, and transcription

Beyond GPT-Realtime-2, the Realtime API now includes GPT-Realtime-Translate, which supports 70 input languages, and GPT-Realtime-Whisper for faster real-time transcription.

ENTERPRISE

Anthropic deploys Claude across Microsoft 365: Excel, PowerPoint, and Word

Claude plugins for Excel, PowerPoint, and Word graduate from beta to general availability, while the Outlook plugin enters public beta. Users can now call Claude directly from within documents.

VOICE

xAI launches Grok Voice Think Fast 1.0 for real-world customer support

Grok Voice Think Fast 1.0 handles complex workflows with speed and accuracy even in noisy environments, covering multi-step troubleshooting and high-volume tool calls.

PAPER

Zhipu publishes GLM-5V-Turbo report detailing native multimodal agent foundation model

The technical report covers improvements in model design, multimodal training, reinforcement learning, toolchain expansion, and agent framework integration.

DEEPMIND

Google DeepMind AlphaEvolve accelerates quantum, biotech, and logistics research

The Gemini-powered coding agent has been accelerating progress across quantum computing, biotechnology, logistics, and Google's internal AI systems over the past year.

EDUCATION

Andrew Ng launches course on building agents that generate custom interactive UIs

A new short course with CopilotKit teaches how to build agents that respond with charts, forms, and whiteboards generated on demand and displayed directly in chat.

RESEARCH

Study shows most LLM compute goes to recipe development, not final training runs

Research led by Jacob Cares finds that the vast majority of compute in building large language models is spent on developing training recipes, not on the final training runs.

PAPER

Stream-R1 improves streaming video generation through reward distillation

An adaptive reweighting approach enhances visual quality, motion quality, and text alignment for streaming video generation without adding computational overhead.


Tools & Briefs 05.08