Codex Computer Use Lands on Windows with Mobile Remote Steering
OpenAI announced that Codex's Computer Use feature now supports Windows, allowing the coding agent to test applications, debug flows, and review work directly on Windows machines where project context lives. The ChatGPT mobile app can now connect to Windows machines, letting developers start, review, and steer tasks on the go while work continues on their desktop. The announcement drew 5,429 likes and over 546,000 views, reflecting strong developer demand for cross-platform agentic tooling.
Claude Opus 4.8 Adds Mid-Conversation System Instructions Without Breaking Prompt Cache
Anthropic released Claude Opus 4.8 with a key capability: system instructions can now be added mid-conversation without interrupting the prompt cache. This means more cache hits, lower cost, and reduced latency for API requests. The feature addresses a long-standing friction point for developers building multi-turn agentic applications where context must evolve dynamically. The update arrives alongside broader Opus 4.8 improvements, including increased honesty and a roughly 4x reduction in code defect omission rate.
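Why appending an instruction preserves cache hits while rewriting the opening prompt does not can be seen in a toy model of prefix caching. The sketch below is not Anthropic's API or cache implementation. It assumes, for illustration only, that a cache can reuse exactly the leading messages that are identical between the previous request and the new one. The `Message` shape and the `shared_prefix_len` helper are hypothetical.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, text); a simplified, hypothetical message shape


def shared_prefix_len(cached: List[Message], request: List[Message]) -> int:
    """Count the leading messages that are identical in both sequences."""
    n = 0
    for old, new in zip(cached, request):
        if old != new:
            break
        n += 1
    return n


history: List[Message] = [
    ("system", "You are a coding assistant."),
    ("user", "Refactor utils.py."),
    ("assistant", "Done. Anything else?"),
]

# Option A: rewrite the opening system prompt. The very first message differs,
# so none of the previously cached prefix matches.
rewritten = [("system", "You are a coding assistant. Prefer small diffs.")] + history[1:]

# Option B: append the new instruction as a later system message. The earlier
# messages are untouched, so the whole cached prefix still matches.
appended = history + [("system", "Prefer small diffs."), ("user", "Now fix the tests.")]

reuse_rewritten = shared_prefix_len(history, rewritten)
reuse_appended = shared_prefix_len(history, appended)
print(reuse_rewritten, reuse_appended)  # prints: 0 3
```

Under this assumption, the appended variant reuses all three earlier messages, while the rewritten variant reuses none.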
Cohere Command A+ Beats Mistral, DeepSeek, and Google Translate in Machine Translation
Cohere released Command A+, which sets a new high for the company in machine translation. The model opened a clear gap over open-source peers Mistral Medium 3.5, DeepSeek, and OpenAI's gpt-oss, as well as over Claude Opus 4.6. It also outperformed Google Translate, the long-standing specialist system, though Cohere acknowledged that RWS still performs better.
Visa Invests in AI Coding Platform Replit to Power Agentic Payments
Visa has invested in Replit, exploring how the AI coding platform can enable agentic payments. The partnership aims to help developers build payment applications more efficiently by leveraging Replit's AI-powered development environment. The move signals growing financial sector interest in agentic AI infrastructure as payment workflows become programmable and autonomous.
Step-3.7-Flash GGUF Lands on HuggingFace for Local Hardware
StepFun (Jieyue Xingchen) released a GGUF-quantized version of Step-3.7-Flash on HuggingFace, enabling users to run the model on their own hardware. The release aligns with the broader push toward local-first AI, allowing developers to download and run frontier models without API keys or cloud dependencies.
"The words or the language, as they are written or spoken, do not seem to play any role in my mechanism of thought."
Albert Einstein, cited by François Chollet — on the limits of natural language for invention
Anthropic Reports Annualized Revenue Run-Rate of $47 Billion
Simon Willison relayed Anthropic's self-reported annualized revenue growth, noting that Axios founder Jim VandeHei said he could not find any company in any industry in any era that has scaled organic revenue this quickly at this level. When Anthropic was at $30 billion, the claim already seemed extraordinary; at $47 billion, it defies comparison. The figure underscores the breakneck commercialization of frontier AI models.
GPT-5 Pro Series Remains Unbeaten on Single-Shot Hard Problems Since Last Summer
Ethan Mollick observed that GPT-5 Pro series models have consistently been the best at solving the hardest problems in a single attempt since summer 2025, with no real competition emerging in all that time. The observation highlights OpenAI's sustained lead in frontier reasoning despite an increasingly crowded model landscape.
Claude Dynamic Workflows Can Launch Hundreds of Subagents for Large-Scale Tasks
Commentator op7418 noted that Claude's newly released dynamic workflows may be more significant than the Opus 4.8 model update itself. The system extends concurrent subagent logic, potentially launching hundreds of subagents to tackle massive tasks such as researching an entire codebase or generating comprehensive reports in a single session.
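The fan-out pattern described here can be sketched generically. The example below is not Claude's dynamic workflows API. It is a minimal illustration, assuming each subagent is an independent I/O-bound task and that a cap on in-flight work is desirable. `run_subagent`, `fan_out`, and the `max_concurrent` parameter are hypothetical names.

```python
import asyncio
from typing import List


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating I/O-bound work
    return f"summary of {task}"


async def fan_out(tasks: List[str], max_concurrent: int = 50) -> List[str]:
    """Run every task, keeping at most max_concurrent subagents in flight."""
    limiter = asyncio.Semaphore(max_concurrent)

    async def guarded(task: str) -> str:
        async with limiter:
            return await run_subagent(task)

    # gather returns results in the same order as the input tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))


files = [f"src/module_{i}.py" for i in range(300)]
reports = asyncio.run(fan_out(files))
print(len(reports), reports[0])  # prints: 300 summary of src/module_0.py
```

The semaphore is the design choice that matters: hundreds of tasks can be queued at once, but only a bounded number run concurrently, which keeps rate limits and memory use predictable.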
DeepSeek's Infrastructure Engineering Is So Good the Industry Politely Pretends It Doesn't Exist
Teortaxes Tex remarked that DeepSeek is so good at infrastructure engineering that the rest of the industry has to pretend either that DeepSeek is operating at a loss or that its efficiency simply is not happening. The observation points to the competitive tension around DeepSeek's cost efficiency and its underappreciated operational excellence in serving large-scale AI workloads.
France Releases Advanced Open-Source LLM Under Apache 2.0 License
France has released an advanced large language model under the permissive Apache 2.0 license, targeting both personal and enterprise use cases. The move represents a significant European contribution to the open-weight AI ecosystem, offering a sovereign alternative to American and Chinese frontier models.
Cursor Introduces Auto-Review Mode
Cursor released Auto-review mode, allowing agents to run tool calls with fewer approval prompts while executing more safely. The feature reduces friction in agentic coding loops.
Surya OCR 2 Released with 650M Parameters
VikParuchuri announced Surya OCR 2, scoring 83.3% on the olmocr benchmark and 87% on an internal 91-language benchmark, positioning it as the top sub-3B OCR model.
OpenAI Launches Rosalind Biodefense Program
OpenAI announced the Rosalind biodefense project to accelerate AI-driven biosafety and pandemic preparedness, expanding GPT-Rosalind access to U.S. government and allied partners.
Stanford OpenJarvis Runs Locally via Ollama
OpenJarvis, a local-first personal AI developed by Stanford HazyResearch and Scaling Intelligence Lab, can now run via Ollama as part of the Intelligence Per Watt research initiative.
Runway Aleph 2.0 Exclusive to Adobe Firefly
Adobe Firefly has exclusive access to the Runway Aleph 2.0 video generation model, which lets users generate new clips by editing existing videos. It is available through June 1.
Qwen-VLA Unifies Vision-Language-Action Across Robots
Qwen-VLA proposes unified vision-language-action modeling across tasks, environments, and robot embodiments, advancing general-purpose robotic AI.
Cartesia Ink-2 Tops Streaming Speech-to-Text Leaderboard
Cartesia released Ink-2, ranking first on the streaming speech-to-text leaderboard, optimized for low-latency transcription.
Luma Agents Auto-Generate Promotion Graphics
Luma released Luma Agents, which automatically generate full promotion graphics from input content and marketing hooks, described as a creative team multiplier.
llama.cpp Launches Official Website llama.app
llama.cpp launched its official site, llama.app, which promotes running frontier models locally without API keys on hardware ranging from phones to clusters.
vLLM Integrates Open-Source Rust Tokenizer fastokens
vLLM now includes fastokens, an open-source Rust BPE tokenizer built by CrusoeAI and NVIDIA Dynamo, compatible with DeepSeek, Qwen, Kimi, MiniMax, and Nemotron models.
vLLM Rolls Out Two Major RL Upgrades
vLLM released a native weight synchronization API and an improved pause/resume feature for asynchronous RL training, standardizing weight transfer with optimized NCCL and CUDA IPC support.
Opus 4.8 ParseBench: Tables Up, Charts Down
LlamaIndex published ParseBench results for Opus 4.8, showing gains in tables and semantic formatting but slight drops in chart parsing and content faithfulness, along with a slightly higher per-page price.
GPIC Dataset: 100M VLM-Annotated Image-Text Pairs
Keshi Geyan released the GPIC dataset containing 100 million VLM-captioned image-text pairs for visual generation benchmarking.
NVIDIA Blackwell Ultra Delivers 50x Throughput Per Megawatt
NVIDIA promoted its AI factory vision, with Blackwell Ultra achieving 50x higher throughput per megawatt, converting energy into continuous intelligence.
Simon Willison Reviews Claude Opus 4.8
Simon Willison's review finds modest but real improvements in Anthropic's Opus 4.8: increased honesty, the lowest hallucination rate, unchanged pricing, and a minimum cacheable prompt length reduced from 4096 to 1024 tokens.
Step 3.7 Flash Gets Day-0 NVIDIA NIM and NeMo Support
StepFun (Jieyue Xingchen) confirmed that NVIDIA NIM, NeMo, and GPU-accelerated endpoints are ready for Step 3.7 Flash on launch day.
Terence Tao: AI Frees Researchers to Pursue Bolder Ideas
OpenAI shared mathematician Terence Tao's view that AI creates more room for experimentation, enabling researchers to test unexpected paths and discover what might otherwise stay out of reach.
Red Hat Speculators v0.5.0 Adds DFlash Training Support
Red Hat AI released Speculators v0.5.0, adding DFlash training support for drafting all tokens in a single pass via block diffusion, alongside two other major updates.
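For readers unfamiliar with how a drafted block is used, the sketch below shows standard greedy verification in speculative decoding. It is not the Speculators API and not DFlash itself. It assumes a drafter has already proposed a block of tokens and a target model then checks them. `verify_block` and `toy_target` are hypothetical names, and the per-position loop is for clarity only; real systems typically score every drafted position in one target forward pass.

```python
from typing import Callable, List


def verify_block(
    context: List[int],
    draft: List[int],
    target_next: Callable[[List[int]], int],
) -> List[int]:
    """Greedy verification: keep drafted tokens while the target agrees.

    On the first disagreement, emit the target's token instead and stop.
    If every drafted token is accepted, append one bonus target token.
    """
    accepted: List[int] = []
    for token in draft:
        expected = target_next(context + accepted)
        if token != expected:
            accepted.append(expected)
            return accepted
        accepted.append(token)
    accepted.append(target_next(context + accepted))
    return accepted


def toy_target(tokens: List[int]) -> int:
    """Hypothetical target model: always predicts the last token plus one."""
    return tokens[-1] + 1


print(verify_block([1, 2, 3], [4, 5, 9, 10], toy_target))  # prints: [4, 5, 6]
print(verify_block([1, 2, 3], [4, 5, 6], toy_target))      # prints: [4, 5, 6, 7]
```

The output always matches what the target model would have produced on its own under greedy decoding. The drafter only changes how many tokens are confirmed per verification step.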