May 17, 2026 · Sunday

Industry Briefs · 05.17
ARENA

Anthropic and ZAI Models Lead Slides Arena Soft-Verification Tests

In slide-generation arena results, Anthropic and ZAI models performed best on soft-verification benchmarks.

INFERENCE

TOKENSPEED_MLA Integrated into vLLM Blackwell Backend for DeepSeek-R1 and Kimi K2.5

Lightseek announced that TOKENSPEED_MLA has been integrated into vLLM, optimizing DeepSeek-R1 and Kimi K2.5 inference on Blackwell GPUs.

PLUGINS

Grok CLI Gains Vercel Plugin for Cloud Deployment Superpowers

Grok CLI now supports Plugins and Skills. Installing the Vercel Plugin gives Grok seamless cloud deployment on Vercel from the command line.

SECURITY

Vercel Protects Agent Deployments Behind SSO for Production Environments

Vercel now lets teams place AI-generated app deployments from v0, Codex, Claude, and others behind Okta SSO, creating a secure intranet for agent-built applications.

MODELS

Open Model Roundup: Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5 and More

Interconnects.ai released its 21st newsletter covering the latest wave of open model releases.

POLICY

US CAISI and Epoch AI Diverge in Assessments of Open vs. Closed Model Gap

Two leading AI research organizations now disagree on how wide the performance gap between open and closed models really is.

PRODUCT

Codex Can Control Multiple Computers Across Devices in a Single Project

op7418 discovered that, from the ChatGPT app, Codex can hand control across multiple computers without the user needing to change devices.

PROMPT ENGINEERING

Codex Side Chat System Prompt Revealed: Lightweight Q&A Without Disturbing Main Thread

Dotey shared the system prompt behind Codex's side chat feature, designed for non-disruptive exploration.

ENGINEERING

Key to AI Long Tasks: Small-Stage Planning and Verification at Each Step

Dotey argues that breaking tasks into small phases, each gated by an explicit verification method such as unit tests, is essential for keeping AI productive over extended sessions.
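The workflow described above, small phases each gated by an explicit check, can be sketched as plain unit-style assertions. A minimal illustration (the stage functions and data here are hypothetical, not from the original post):

```python
def stage_parse(raw: str) -> list[int]:
    """Stage 1: parse a comma-separated string into integers."""
    return [int(x) for x in raw.split(",") if x.strip()]

def stage_transform(values: list[int]) -> list[int]:
    """Stage 2: keep even values, doubled."""
    return [v * 2 for v in values if v % 2 == 0]

def verify_stage_parse() -> None:
    assert stage_parse("1, 2, 3") == [1, 2, 3]

def verify_stage_transform() -> None:
    assert stage_transform([1, 2, 3, 4]) == [4, 8]

# Run each stage's verification before moving to the next, so an AI
# session (or a human reviewer) gets a concrete pass/fail signal at
# every step instead of only at the end of a long task.
verify_stage_parse()
verify_stage_transform()
print("all stages verified")
```

The point is the structure, not the content: each phase is small enough to verify in isolation before the next phase builds on it.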

If you're not obsessed with the research problem you're working on, for its own sake, you're unlikely to succeed. Intrinsic motivation is far more powerful than external rewards.

François Chollet
INSIGHT

Tokens Are Rapidly Becoming the Universal Input for Solving Problems

Sam Altman observed that tokens are replacing traditional structured inputs across an expanding range of problem domains.

ACCESS

ChatGPT Plus Now Available Nationwide Across Malta

OpenAI announced that the ChatGPT Plus subscription service is now available throughout Malta.

MOBILE

Using Codex from the ChatGPT App Is a Freeing Experience

Altman remarked that Codex on mobile makes you realize how tethered you normally are to your desktop computer.

SECURITY

Using GPT for Defensive Security

Altman highlighted the growing role of GPT and LLMs in defensive cybersecurity applications.

PRODUCT

Codex Is in a Category of Its Own: Agentic Excel on Mac

Altman described Codex as being in a distinct product category, calling it agentic Excel on Mac.

CODE

Codex for Improving Computational Complexity

Altman highlighted Codex's usefulness for analyzing algorithms and reducing their computational complexity.
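The thread gives no concrete example, but a classic instance of this kind of complexity improvement is replacing a pairwise scan with a hash-map lookup (the functions below are illustrative, not from the original post):

```python
def pair_sum_quadratic(nums: list[int], target: int):
    """O(n^2): check every pair of indices."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None

def pair_sum_linear(nums: list[int], target: int):
    """O(n): one pass, remembering seen values in a hash map."""
    seen: dict[int, int] = {}  # value -> index
    for j, v in enumerate(nums):
        if target - v in seen:
            return (seen[target - v], j)
        seen[v] = j
    return None

nums = [3, 8, 5, 11]
# Both versions find the same pair; only the asymptotic cost differs.
assert pair_sum_quadratic(nums, 13) == pair_sum_linear(nums, 13) == (1, 2)
```

Spotting and applying rewrites like this, trading a nested loop for an auxiliary data structure, is the sort of analysis the item refers to.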

PEOPLE

KempeLab Chief to Leave Meta FAIR After Leading LLM Reasoning Research

The head of KempeLab, which focused on LLM reasoning research at Meta FAIR, announced their departure after nearly two years.

RESEARCH

New Benchmark Tests Models' Ability to Pinpoint Reasoning Errors Mid-Trace

A new benchmark frames error detection as a regression task: models must predict where a long reasoning trace first goes wrong, testing fine-grained error localization.

PAPER

On-Policy RL and Distillation Found to Rely on Labeled Final-Answer Data

Research shows that on-policy reinforcement learning and distillation algorithms typically depend on privileged supervision: labeled final answers and process rewards.

BCI

Neuralink Helps Paralyzed Patient Regain Ability to Paint

Neuralink shared the story of a patient paralyzed in a car accident who, after implant surgery, regained the ability to paint through a brain-computer interface.

USAGE

Claude Resets Five-Hour and Weekly Usage Limits for the Weekend

Claude users noted the reset of five-hour and weekly quota limits, renewing access for weekend experimentation.

MODELS

Compressed 120B Models Shown to Retain Substantial General Capability

Research indicates that compressed models at the 120B-parameter scale retain strong general performance, though how well they retain knowledge remains an active area of evaluation.


© 2026 FAV0 · AI Daily