May 23, 2026 · Saturday

Codex Adds Remote Mac Control, Works on Lock Screen

OpenAI launches a Computer Use feature that lets Codex remotely control Mac GUI apps, including on a locked screen, for software that has no command-line interface or plugin support.

OpenAI has released a significant update to its Codex application, introducing a new Computer Use feature that fundamentally changes how AI agents interact with desktop software. macOS users who install the plugin and grant screen recording and accessibility permissions can now let Codex control graphical interface applications directly — clicking buttons, filling forms, and navigating menus — even when the Mac is locked. The locked-screen mode temporarily unlocks the Mac while blocking local user input, allowing only explicitly authorized tasks to proceed. This capability targets scenarios where no command-line interface or third-party plugin support exists, such as inspecting desktop applications, performing complex browser operations, and reproducing GUI bugs that are otherwise difficult to automate. The feature does not support automated terminal access or self-referential Codex operations, and the locked-use mode requires separate opt-in activation for each session.

DeepSeek V4-Pro Price Permanently Lowered, Price War Continues

DeepSeek converts promotional pricing into a permanent discount.

DeepSeek has converted its previous limited-time promotional pricing into a permanent discount, significantly reducing the API cost of V4-Pro access. The move intensifies the ongoing price war among frontier model providers and signals DeepSeek’s commitment to making high-end inference affordable for developers at scale. With over 12,000 likes and 1.8 million views on the announcement, developer enthusiasm for cheaper frontier access is unmistakable.

Runway Releases Aleph 2.0, Full Video Editor at 30s 1080p

Aleph 2.0 edits specific elements while preserving the rest of the clip.

Runway has introduced Aleph 2.0, an upgraded video editing model that allows precise changes to specific elements while preserving everything else in the clip. Available inside the new Edit Studio, the model supports multi-shot sequences up to 30 seconds long at 1080p resolution. Users can edit dialogue, replace backgrounds, alter objects, and adjust lighting — all through natural language prompts — without affecting the surrounding footage. The release marks a significant step forward in AI-assisted professional video production.

Anthropic Finds Over 10K Critical Vulnerabilities in Essential Software

Anthropic’s Project Glasswing, a collaborative AI cybersecurity initiative launched last month, has already uncovered more than ten thousand high- or critical-severity vulnerabilities in essential software alongside its partners. The project demonstrates how AI-driven security auditing can operate at a scale and speed unattainable by traditional manual review processes, potentially reshaping how organizations approach vulnerability discovery in their software supply chains.

Cursor Ships SDK for Custom Agents in Python & TypeScript

Cursor has released its SDK, enabling developers to build custom agents using Composer 2.5 with support for both Python and TypeScript. The SDK exposes the full Composer reasoning engine programmatically, allowing teams to embed agentic workflows into their own tools and pipelines. To encourage experimentation over the long weekend, Cursor is offering Composer usage at 90% off through the SDK.

Perplexity Open-Sources Bumblebee Security Scanner

Perplexity AI has open-sourced Bumblebee, a read-only scanner for macOS and Linux developer machines. The tool inventories risky packages, browser extensions, and AI tool configurations to surface supply-chain exposure. When connected to Perplexity’s Computer platform, Bumblebee can automatically trigger deep scans whenever new supply-chain risks emerge, enabling continuous security posture monitoring across development fleets.

“After some mathematical rewrite, turns out all of transformer is a series of GEMM + epilogue. Given a few optimized primitives, LLMs — and novice humans — can write speed-of-light kernels for all transformer ops.”
— Tri Dao, creator of FlashAttention
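For readers unfamiliar with the term, here is a minimal sketch of what "GEMM + epilogue" means: a matrix multiply whose output elements pass through a small function (the epilogue) before being stored. This is plain Python for illustration only. It is not the rewrite Tri Dao refers to and not a fast kernel, and the names `gemm_with_epilogue`, `gelu`, and the toy matrices are made up for this example.

```python
import math


def gemm_with_epilogue(a, b, epilogue):
    """Compute epilogue(A @ B), applying the epilogue to each output element.

    a is an M x K matrix, b is a K x N matrix (lists of lists).
    epilogue(acc, i, j) receives the accumulated dot product and its position.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            # The epilogue runs while the result is still "in hand",
            # instead of in a second pass over the output matrix.
            out[i][j] = epilogue(acc, i, j)
    return out


def gelu(x):
    """Exact GELU activation."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Toy feed-forward layer: y = GELU(x @ w + bias), written as one GEMM + epilogue.
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0], [0.0, -1.0]]
bias = [0.5, 0.5]

y = gemm_with_epilogue(x, w, lambda acc, i, j: gelu(acc + bias[j]))
plain = gemm_with_epilogue(x, w, lambda acc, i, j: acc)  # identity epilogue
```

With the identity epilogue the function is an ordinary matrix multiply. Swapping in bias-plus-activation turns the same loop into a feed-forward layer, which is the sense in which one optimized primitive can cover many ops.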

Microsoft Revokes Internal Claude Code Licenses, Pushes GitHub Copilot

Microsoft began promoting Claude Code internally last December before reversing course.

According to The Verge, Microsoft is pulling Claude Code permissions from its developers, citing token billing costs, and requiring teams to switch to the company’s own GitHub Copilot CLI. Microsoft had begun promoting Claude Code internally in December, encouraging project managers and designers without traditional coding backgrounds to try building software with AI. Half a year later, the experiment appears to have collided with corporate cost discipline and platform loyalty.

Luma Debuts Seedance 2.0 for Cinematic-Quality Video Generation

Seedance 2.0 renders portraits, landscapes, sci-fi, and fantasy scenes at cinematic quality.

Luma Agents has launched Seedance 2.0, generating cinematic-quality imagery spanning portraits, landscapes, sci-fi, and fantasy worlds with what the company calls “instant cinematic reality.” Luma Agents manages the full creative pipeline (planning, generation, iteration, and optimization), which the company pitches as a force multiplier for creative teams.

Gemini Omni Proves Full Multimodality with Native Video Editing

Ethan Mollick demonstrates Gemini Omni’s native video editing capabilities.

Ethan Mollick highlights a key differentiator for Gemini Omni that many overlook: it is fully multimodal, meaning it can natively edit video rather than merely generating it. Taking the famous 1896 film of a train arriving at a station, he demonstrated transformations into a bullet train, a LEGO version, and additions of a time traveler, a centipede, and muppets — all while preserving the original motion and spatial coherence. This native editing capability sets Omni apart from generation-only video models and points toward a future where AI modifies existing media with surgical precision rather than starting from scratch.

Product & Research Roundup · May 23

PRODUCT

Ideogram MCP Brings Image Generation to Claude, ChatGPT, Cursor

Ideogram MCP allows direct image generation, design work, and custom model training without leaving the chat interface in Claude, ChatGPT, Cursor, and other MCP-compatible environments.

DEEPMIND

SynthID Watermark Expands, Queryable via Gemini and Search

Google DeepMind’s SynthID, the imperceptible watermark for AI-generated content, is expanding to more partners. Users can now query whether content is AI-generated directly through the Gemini app or Google Search.

INDUSTRY

Cloudflare CEO: How AI Decides Who Gets Laid Off

Cloudflare laid off approximately 1,100 employees — 20 percent of its workforce — while simultaneously hiring 1,111 interns. CEO Matthew Prince wrote a Wall Street Journal column explaining his framework for using AI to determine which roles to eliminate.

Model Releases & Benchmarks · May 23

OPEN-SOURCE

llama.cpp Gets Full WebGPU Backend

llama.cpp and ggml now support a complete WebGPU backend, enabling large language models to run directly in the browser with GPU acceleration.

RESEARCH

Allen AI’s ArtifactLinker Predicts Model Benchmarks

ArtifactLinker automatically predicts which benchmarks a model will perform well on, addressing the problem that most models are only evaluated on a fraction of available tests.

SPEED

GLM-5.1-HighSpeed Hits 400 Tokens per Second

GLM-5.1-HighSpeed achieves 400 tokens per second on a flagship-tier LLM API, setting a new speed benchmark without trading off model size for throughput.

MODEL

Qwen3.7-Max Launches with 1M Context for the Agent Era

Together AI introduces Qwen3.7-Max, Alibaba Qwen’s flagship model designed for the agent era, featuring 1-million-token context and leading benchmark performance.

BENCHMARK

ARC-AGI-3 Sees First Meaningful Score Jump

tufalabs raised its score from 0.68% to 1.17% in the ARC-AGI-3 competition, a roughly 1.7x gain and the first significant improvement on this reasoning benchmark.

TRAINING

Adaption AutoScientist Reaches Frontier Model in Two Days

Adaption’s AutoScientist tool lets users train a frontier-level model within two days, with free compute resources offered throughout the next month.

TOOLING

Kakuna Automates Code Hardening via Subagent Checklists

Kakuna uses skill checklists with parallel sub-agents to harden codebases automatically, returning an audited version with all the boring work done.

CAPABILITIES

Modern LLMs Solve 100-Digit Multiplication Without Tools

teortaxesTex demonstrates that modern language models can perform 100-digit multiplication through chain-of-thought scaling alone, challenging the “embers of autoregression” thesis.

DATA

2 Billion Web Pages Now Queryable on Hugging Face via SQL

CommonCrawl’s April 2026 crawl data and URL index are now available on Hugging Face, enabling SQL-based analysis of over two billion web pages with zero download required.

PLATFORM

Xiaohongshu Adds AI Skill Upload Feature

Xiaohongshu now enables users to upload custom AI skills directly to the platform, opening a new distribution channel for AI capabilities within the popular social commerce app.

DEEPMIND

Project Genie Turns Google Street View into Interactive Worlds

Real U.S. locations transformed into interactive virtual worlds.

Google DeepMind’s Project Genie now integrates with Google Maps Street View, allowing users to take real U.S. locations and transform them into new, interactive virtual worlds through generative AI.

PRODUCT

Claude Pro Gets Auto Mode with Sonnet 4.6 and Opus 4.7

Claude Devs extends auto mode to Pro plan subscribers and adds support for Sonnet 4.6 alongside Opus 4.7, allowing users to press Shift+Tab and let Claude run autonomously.

PLATFORM

Microsoft Foundry Partners with Hugging Face on Open-Source Image Models

Three open-source image models are now available on Microsoft Foundry, bringing developers the largest catalog for AI innovation through the collaboration with Hugging Face.

PLATFORM

Cohere Command A+ Now Available as Managed Compute on Microsoft Foundry

Cohere’s latest open-source model Command A+ is now offered as a managed compute service on Microsoft Foundry, expanding enterprise deployment options.

RETAIL

Zalando Achieves 48-Hour 3D Scan to Storefront with NVIDIA AI

Zalando integrates NVIDIA Cosmos, Gen-3C, and DiffusionRenderer for rapid 3D production.

Zalando is scaling high-fidelity 3D product production by integrating ALLSIDES’ AI-powered photometric 3D scanning platform, powered by NVIDIA Cosmos, Gen-3C, and DiffusionRenderer. The pipeline goes from scan to storefront in just 48 hours.

TOOLING

LM Studio 0.4.14 Adds Multi-Token Prediction Support

LM Studio releases version 0.4.14 with Multi-Token Prediction (MTP) functionality, enabling faster inference by predicting multiple tokens simultaneously.

MOBILE

PixVerse App Adds On-Device Image Generation

The PixVerse mobile app now supports image generation from prompts or reference images, with three free generations for all users during the May 24–31 launch window.

INFRA

CommonCrawl Recommends Hugging Face Buckets for Large Training Datasets

CommonCrawl recommends Hugging Face Buckets for storing large, constantly evolving training datasets. The content-defined chunking technology reportedly reduces upload volumes by 75% by only transmitting changed portions of datasets.
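To show why content-defined chunking saves upload volume, here is a minimal sketch in plain Python. It cuts chunks wherever a rolling hash of the last few bytes matches a pattern, so boundaries follow the content rather than fixed offsets, and an edit only disturbs the chunks around it. This is not Hugging Face's implementation: the window size, mask, hash constants, and all function names are assumptions chosen for illustration, and the sketch makes no attempt to reproduce the reported 75% figure.

```python
import hashlib


def chunk_boundaries(data, window=16, mask=0xFF):
    """Return cut positions chosen by a rolling hash over the last `window` bytes."""
    base, mod = 257, (1 << 61) - 1
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window
    h = 0
    cuts = []
    for i, byte in enumerate(data):
        if i >= window:
            h = (h - data[i - window] * top) % mod  # drop the oldest byte
        h = (h * base + byte) % mod                 # add the newest byte
        # Cut when the low bits of the window hash are all ones.
        if i >= window - 1 and (h & mask) == mask:
            cuts.append(i + 1)
    return cuts


def split_chunks(data, window=16, mask=0xFF):
    """Split data into chunks at the content-defined cut positions."""
    cuts = chunk_boundaries(data, window, mask)
    starts = [0] + cuts
    ends = cuts + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends) if e > s]


def pseudo_random_bytes(n, seed=b"demo"):
    """Deterministic stand-in for a dataset file (hypothetical test data)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


original = pseudo_random_bytes(64 * 1024)
# Insert a few bytes in the middle, shifting everything after them.
edited = original[:30000] + b"new data" + original[30000:]

already_stored = {hashlib.sha256(c).digest() for c in split_chunks(original)}
new_chunks = split_chunks(edited)
to_upload = [c for c in new_chunks
             if hashlib.sha256(c).digest() not in already_stored]

print(f"{len(to_upload)} of {len(new_chunks)} chunks need uploading")
```

With fixed-size blocks, the eight inserted bytes would shift every later block and force a re-upload of the rest of the file. Here only the chunks touching the edit change, because each cut depends solely on the bytes immediately before it.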

INSIGHT

“The Model Alone Is No Longer the Product”

OpenAI’s Greg Brockman observes that models in isolation are insufficient as products, pointing to a future where full application ecosystems define value in AI rather than raw model capabilities alone.

INSIGHT

Developers Reflect on Coding Before Codex

Greg Brockman captures the generational shift in software development, noting how quickly AI-assisted coding has become the default, making the pre-Codex era feel distant even to veteran engineers.