May 19, 2026 · Tuesday

NVIDIA Delivers First Custom Vera CPU for the Agentic AI Era

NVIDIA ships its inaugural self-developed processor to Anthropic, OpenAI, SpaceX, and Oracle Cloud, marking a new chapter in custom AI silicon.

Ian Buck hand-delivers the first Vera CPUs — NVIDIA's first custom processor purpose-built for agentic AI workloads.

NVIDIA has shipped the first batch of its Vera CPU, the company's inaugural custom processor designed specifically for agentic AI workloads. The chips were hand-delivered by NVIDIA VP Ian Buck to partners Anthropic, OpenAI, SpaceX, and Oracle Cloud. Vera represents NVIDIA's strategic move beyond GPUs into vertically integrated compute for the next generation of AI agents. The Vera platform promises to power a new class of AI systems optimized for autonomous reasoning and tool use, with a full hardware roadmap still ahead.


Composer 2.5 outperforms models at similar parameter counts through RL breakthroughs.

Cursor Launches Composer 2.5, Its Most Powerful Proprietary Model

Cursor's new coding model delivers improved intelligence on long-running tasks, doubled included usage for the first week, and RL-driven performance well above its weight class.

Cursor has unveiled Composer 2.5, the company's most capable proprietary model to date. Early benchmarks show the model performing far above its parameter count on coding tasks, driven by significant advances in reinforcement learning. For the first week, Cursor is doubling included usage for all users. Developers report that the model handles sustained, complex multi-step workflows with markedly improved reliability. The release has drawn praise from across the industry, with Hugging Face CEO Clement Delangue noting that serious AI companies are increasingly training their own models rather than relying on external APIs.

xAI employees who tested the model internally report that it performs well above its weight class, a sign that reinforcement learning advances are reshaping competition among code-generation models. The release underscores a broader industry trend: frontier AI companies are investing heavily in proprietary training.


Anthropic Acquires SDK Platform Stainless

The company behind every official Anthropic SDK now joins the lab to strengthen the developer ecosystem.

Anthropic announced the acquisition of Stainless, the SDK and MCP server platform that has powered every official Anthropic SDK since the API's earliest days. The move signals a deepened commitment to developer tooling, with Stainless's infrastructure poised to accelerate the next generation of integrations around Claude and the Model Context Protocol.

Qwen3.7 Preview Lands on Arena Leaderboards

Alibaba's Qwen3.7-Max and Qwen3.7-Plus previews debut, pushing the lab to #6 in text and #5 in vision rankings on LMSys.

Alibaba's Qwen (Tongyi Qianwen) team released preview versions of Qwen3.7 on the LMSys Chatbot Arena, where Alibaba now ranks sixth among labs in text and fifth in vision. The previews show significant capability gains across both modalities, and the team says the full Qwen3.7 series release is imminent.

Claude Code at Scale: Best Practices for Large Codebases

Anthropic's new Claude blog series shares lessons from teams running AI-assisted development across enormous codebases.

Anthropic published the first post in a series on running Claude Code at scale, drawing on the experience of teams operating across million-line monorepos, decades-old legacy systems, and distributed microservices. The post identifies common patterns in configuration, tooling, and organizational structure that lead to successful deployments, offering a practical starting guide for engineering teams.

Tether Fine-Tunes a 13B Model on an iPhone 16

No data center, no enterprise GPU — a 13-billion-parameter model is customized entirely on a consumer smartphone.

TechCrunch reports that Tether successfully fine-tuned a 13-billion-parameter AI model entirely on an iPhone 16, with no data center or enterprise GPU required. The approach keeps all data on the device while demonstrating that meaningful model customization is now possible on consumer hardware, potentially reshaping the economics of on-device AI deployment.


A mental model for working with coding agents is that they're blind squirrels running into a maze and bumping into walls. You must place the walls — verifiable constraints — strategically so that they end up in the general region you want them in.

François Chollet

Codex now supports remote connections from the ChatGPT mobile app.

Codex Desktop Gains Remote Connection via ChatGPT Mobile

Codex added remote connectivity, letting users control a Codex session running on their Mac from the ChatGPT mobile app. With the "Keep this Mac awake" option enabled, Codex keeps running on a plugged-in Mac while users work remotely from their phone.

Google I/O 2026 kicks off today at 10 a.m. PT with live coverage on X.

Google I/O 2026 Kicks Off With AI Breakthroughs Teased

Google DeepMind and CEO Sundar Pichai previewed I/O 2026, promising breakthroughs in AI tools and innovations. The livestream starts today at 10 a.m. PT from Mountain View.

Claude Code Fast Mode Now Defaults to Opus 4.7

The /fast mode in Claude Code now uses the Opus 4.7 model by default, boosting efficiency for rapid coding tasks and shorter iteration cycles.

Claude Console Adds Prompt Cache Diagnostics

Developers can now see exactly which part of their prompt changed when a cache miss occurs, with a detailed token-cost breakdown for each cache miss event.
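
For readers who want a feel for what such a diagnostic computes, here is a minimal sketch, assuming prefix-based caching in which everything from the first changed segment onward must be reprocessed. It is not the Console's implementation: the names first_divergence and cache_report are hypothetical, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
from typing import Callable, Optional, Sequence


def first_divergence(previous: Sequence[str], current: Sequence[str]) -> Optional[int]:
    """Index of the first prompt segment that differs, or None if the prompts match."""
    for i, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            return i
    # One prompt is a strict prefix of the other: they diverge where the shorter ends.
    if len(previous) != len(current):
        return min(len(previous), len(current))
    return None


def cache_report(
    previous: Sequence[str],
    current: Sequence[str],
    # Assumption: whitespace splitting approximates token counts for illustration only.
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> dict:
    """Estimate how much of the current prompt could reuse a cached prefix."""
    idx = first_divergence(previous, current)
    split = len(current) if idx is None else idx
    return {
        "diverged_at": idx,
        "reusable_tokens": sum(count_tokens(s) for s in current[:split]),
        "reprocessed_tokens": sum(count_tokens(s) for s in current[split:]),
    }


previous = ["You are a helpful assistant.", "Tool list v1", "Hello"]
current = ["You are a helpful assistant.", "Tool list v2", "Hello"]
print(cache_report(previous, current))
```

In this example the change sits in the second segment, so only the system prompt ahead of it counts as reusable. That is why stable content is usually placed first and frequently changing content last.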

Sam Altman Hails Major ChatGPT Performance Update

OpenAI CEO praises the latest update for dramatically improving ChatGPT, calling it the team's best work yet.

Sam Altman publicly praised the latest ChatGPT update, stating the model has "gotten so much better" and expressing deep pride in the OpenAI team. The update appears to deliver broad performance gains, though specific technical details remain undisclosed. Separately, Altman noted that ChatGPT Images 2.0 has already generated over one billion images in India alone.

Musk's Lawsuit Against OpenAI Dismissed

A California federal jury rules the lawsuit was filed too late, resulting in the dismissal of all claims after less than two hours of deliberation.

A federal jury in Oakland unanimously found that Elon Musk's lawsuit against OpenAI and Sam Altman was filed after the statute of limitations had expired, resulting in the dismissal of all claims after less than two hours of deliberation. Judge Yvonne Gonzalez Rogers subsequently denied an appeal request, upholding the jury's ruling. Musk's lawyer Marc Toberoff said they will appeal.

Jensen Huang joins Michael Dell on stage at Dell TechWorld in Las Vegas.

Jensen Huang and Michael Dell Co-Keynote at Dell TechWorld

NVIDIA and Dell CEOs take the stage together, showcasing AI factory innovations and enterprise AI partnerships.

NVIDIA CEO Jensen Huang joined Dell Chairman and CEO Michael Dell on stage at the "Unleash the Future" keynote at Dell TechWorld in Las Vegas. The joint appearance underscored the deepening collaboration between the two companies on enterprise AI infrastructure. Hugging Face also participated: CEO Clement Delangue promoted on-premise AI built on open-source models, positioning the combination of Dell hardware, NVIDIA acceleration, and Hugging Face models as a cheaper, faster, and more secure alternative to cloud APIs.


Grok Agent Mode Is a Major Capability Unlock

Elon Musk described Grok's new agent mode as a significant breakthrough, enabling autonomous task execution and tool use. Musk also publicly solicited community feedback to improve Grok Build.

Musk Teases New Model Trained on Colossus 2

A new model from xAI, partially trained on the Colossus 2 supercomputing cluster, is now available for users to try. Musk encouraged the community to test it.

Comprehensive Guide Published on Evaluating AI Agents

A detailed guide covers agent fundamentals, common evaluation patterns and frameworks, and case studies of popular agent benchmarks.

Runway Characters Can Now Execute Tool Calls

Runway's real-time video agents gained the ability to call external tools based on user instructions, moving beyond conversation into autonomous action.

Black Forest Labs Releases Official FLUX MCP

Generate images directly inside Claude, Cursor, and other MCP-compatible tools, with automatic model routing from sub-second drafts to production-quality assets.

On-Policy Distillation May Become Enduring Training Method

Nathan Lambert argues on-policy distillation is on track to join instruction tuning, RLHF, DPO, and RLVR as a lasting post-training paradigm.

GLM-5.1 Goes Live on OrcaRouter, Tops SWE-Bench Pro

Zhipu's GLM-5.1 is now available via OrcaRouter, ranking as the #1 open-source model on SWE-Bench Pro and beating several closed-source models.

NVIDIA Releases Nemotron CLIMB Proxy Models

NVIDIA published 62M and 350M parameter Nemotron CLIMB Proxy models on Hugging Face — small decoder-only models for agent tasks.

vLLM Now Installs Directly on GH200/GB200/GB300

The vLLM project announced that "pip install vllm" now works out of the box on NVIDIA's latest hardware, thanks to collaboration between the PyTorch and NVIDIA teams.

Former Employee Compares Anthropic and DeepMind Research Cultures

A researcher who worked at both labs said Anthropic offers compute for pure research without requiring permissions or meetings, contrasting with the more bureaucratic environment at DeepMind.

Abridge Processes Over 100M Medical Conversations

AI healthcare company Abridge has processed over 100 million doctor-patient conversations, enabling real-time prior authorization and building a clinical intelligence layer that saves doctors 10 to 20 hours per week.

Recraft V4.1 vs GPT Image 2 High: Side-by-Side Comparison

The same prompts produce dramatically different creative outcomes, with visible differences in atmosphere, composition, color handling, and detail between the two models.


Codex Adds /goal Command for Persistent Objectives

Greg Brockman introduces a new mode that keeps Codex working on a single goal until it is fully solved.

The new /goal command in Codex enables users to set a persistent objective that the model will continue working on until completion. Unlike single-shot prompts, /goal maintains context across multiple turns, making it suitable for complex, multi-step engineering tasks that require sustained effort over time.

PapersWithCode Platform Announces Revival

The research-to-code bridge is being rebuilt, with Niels Rogge leading the restart of the iconic platform.

Niels Rogge announced the revival of PapersWithCode, the platform that connects machine learning research papers with their implementations. Citing Ilya Sutskever's observation that we are back in an "age of research," the relaunch aims to restore the critical link between published results and reproducible code.

MIT Introduces Pedagogical Reinforcement Learning

Key finding: even correct reasoning traces can produce bad training signals — a new method rethinks how models learn to reason.

MIT researchers released Pedagogical RL, a new reinforcement learning method built on the insight that correct reasoning trajectories can still yield poor training data. The method reframes how reward signals are structured, challenging fundamental assumptions in current RLHF and reasoning-model training pipelines.

Richard Sutton's Bitter Lesson, Distilled to 26 Words

The father of reinforcement learning restates AI's most enduring principle in a single penetrating sentence.

"Don't be distracted by human knowledge, as AI has been historically. Instead focus on general methods that leverage computation." The restatement by Richard Sutton, one of the founding figures of reinforcement learning, has resonated widely as AI labs increasingly shift from hand-crafted architectures toward scaling compute-driven approaches.

Vercel Firewall Achieves Near-Instant Global Propagation

Rule updates propagate worldwide in approximately 300ms — optimized for AI agent workflows.

Vercel's Firewall, designed with AI agents in mind, updates security rules globally in about 300 milliseconds, compared to the minutes-long propagation typical of traditional CDN and WAF providers. The speed enables real-time security adjustments for autonomous agent operations.

Hugging Face and Dell Partner on Local AI Infrastructure

Clement Delangue promotes on-prem AI as cheaper, faster, and safer than cloud APIs, with Dell providing the hardware backbone.

At Dell TechWorld, Hugging Face CEO Clement Delangue reinforced the case for local and on-premise AI based on open-source models. In collaboration with Dell, the initiative positions locally hosted models as a compelling alternative to cloud APIs, addressing GPU shortages while offering cost, speed, and security advantages.



Product Briefs May 19

Short Takes May 19
Industry

Thom Wolf Analyzes AI-Era Software Architecture Shifts

An early reflection on structural changes in software driven by AI, with a TL;DR summary.

Dell TechWorld

Hugging Face Showcases Multi-Model Selection

Kimi K2.6, DeepSeek, and other models were featured, with an emphasis on offering model choice without infrastructure chaos.

Codex

Codex Plugin Builds macOS Apps from Voice Dictation

The "Build macOS App" plugin enables voice-driven macOS application development through Codex.

ChatGPT

ChatGPT Images 2.0 Surpasses 1 Billion in India

Sam Altman announced the milestone, calling it an inspiring adoption signal in one of the world's largest markets.

Google

Sundar Pichai Previews Google I/O 2026

Google's CEO posted from the road to I/O 2026, confirming the event starts today at 10 a.m. PT.

Opinion

Clement Delangue: Serious AI Firms Will Train Their Own Models

All serious AI companies will ultimately train proprietary models based on open source rather than relying on external APIs.

Hugging Face

Local AI Seen as Answer to GPU Shortages

Hugging Face CEO argues on-prem and local AI based on open-source models is cheaper, faster, and safer than cloud.

Luma Labs

UNI-1.1 Excels at Cinematic Composition and Style Editing

Strong at reference-based generation, cinematic composition, and style-consistent editing across multiple subjects.

xAI

xAI Internal Test Model Trained on Cursor Chat Logs

Nearly all of the company's Cursor chat logs were used to train an internal test model, reportedly with excellent results.

RL

Composer 2.5 Outperforms Weight Class via RL Breakthroughs

xAI employees note the model fights far above its weight class thanks to reinforcement learning advances.


© 2026 FAV0 · AI Daily · Automated editorial compilation