DeepSeek Permanently Cuts V4-Pro Pricing, Price War Continues
DeepSeek has converted its previous limited-time promotional pricing into a permanent discount, significantly reducing the API cost of V4-Pro access. The move intensifies the ongoing price war among frontier model providers and signals DeepSeek’s commitment to making high-end inference affordable for developers at scale. With over 12,000 likes and 1.8 million views on the announcement, developer enthusiasm for cheaper frontier access is unmistakable.
Runway Releases Aleph 2.0, Full Video Editor at 30s 1080p
Runway has introduced Aleph 2.0, an upgraded video editing model that allows precise changes to specific elements while preserving everything else in the clip. Available inside the new Edit Studio, the model supports multi-shot sequences up to 30 seconds long at 1080p resolution. Users can edit dialogue, replace backgrounds, alter objects, and adjust lighting — all through natural language prompts — without affecting the surrounding footage. The release marks a significant step forward in AI-assisted professional video production.
Anthropic Finds Over 10K Critical Vulnerabilities in Essential Software
Anthropic’s Project Glasswing, a collaborative AI cybersecurity initiative launched last month, has already uncovered more than ten thousand high- or critical-severity vulnerabilities in essential software alongside its partners. The project demonstrates how AI-driven security auditing can operate at a scale and speed unattainable by traditional manual review processes, potentially reshaping how organizations approach vulnerability discovery in their software supply chains.
Cursor Ships SDK for Custom Agents in Python & TypeScript
Cursor has released its SDK, enabling developers to build custom agents using Composer 2.5 with support for both Python and TypeScript. The SDK exposes the full Composer reasoning engine programmatically, allowing teams to embed agentic workflows into their own tools and pipelines. To encourage experimentation over the long weekend, Cursor is offering Composer usage at 90% off through the SDK.
Perplexity Open-Sources Bumblebee Security Scanner
Perplexity AI has open-sourced Bumblebee, a read-only scanner for macOS and Linux developer machines. The tool inventories risky packages, browser extensions, and AI tool configurations to surface supply-chain exposure. When connected to Perplexity’s Computer platform, Bumblebee can automatically trigger deep scans whenever new supply-chain risks emerge, enabling continuous security posture monitoring across development fleets.
“After some mathematical rewrite, turns out all of transformer is a series of GEMM + epilogue. Given a few optimized primitives, LLMs — and novice humans — can write speed-of-light kernels for all transformer ops.”
— Tri Dao, creator of FlashAttention
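As a rough illustration of what "GEMM + epilogue" means, a feed-forward layer can be written as one matrix multiply followed by an elementwise step on its output. This is a minimal sketch, not Tri Dao's code or kernels: the function names are hypothetical, and pure Python is used for readability rather than speed.

```python
def gemm(a, b):
    """Plain matrix multiply: (m x k) times (k x n) gives (m x n)."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def bias_relu_epilogue(acc, bias):
    """Epilogue: elementwise work on the GEMM output (add bias, then ReLU)."""
    return [[max(0.0, v + bias[j]) for j, v in enumerate(row)] for row in acc]


def linear_relu(x, w, bias):
    """A linear layer with activation, expressed as GEMM + epilogue."""
    return bias_relu_epilogue(gemm(x, w), bias)


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, -1.0], [0.5, 2.0]]
bias = [0.0, -4.0]
out = linear_relu(x, w, bias)
print(out)  # [[2.0, 0.0], [5.0, 1.0]]
```

In an optimized GPU kernel the epilogue is typically fused into the GEMM, so the intermediate result never makes a round trip through memory. Here the two steps are kept separate only to make the split visible.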
Microsoft Revokes Internal Claude Code Licenses, Pushes GitHub Copilot
According to The Verge, Microsoft is pulling Claude Code permissions from its developers, citing token billing costs, and requiring teams to switch to the company’s own GitHub Copilot CLI. Microsoft had begun promoting Claude Code internally in December, encouraging project managers and designers without traditional coding backgrounds to try building software with AI. Half a year later, the experiment appears to have collided with corporate cost discipline and platform loyalty.
Luma Debuts Seedance 2.0 for Cinematic-Quality Video Generation
Luma Agents has launched Seedance 2.0, generating cinematic-quality imagery spanning portraits, landscapes, sci-fi, and fantasy worlds with what the company calls “instant cinematic reality.” The full creative pipeline (planning, generation, iteration, and optimization) is managed by Luma Agents, which the company positions as a force multiplier for creative teams.
Gemini Omni Proves Full Multimodality with Native Video Editing
Ethan Mollick highlights a key differentiator for Gemini Omni that many overlook: it is fully multimodal, meaning it can natively edit video rather than merely generating it. Taking the famous 1896 film of a train arriving at a station, he demonstrated transformations into a bullet train, a LEGO version, and additions of a time traveler, a centipede, and muppets — all while preserving the original motion and spatial coherence. This native editing capability sets Omni apart from generation-only video models and points toward a future where AI modifies existing media with surgical precision rather than starting from scratch.
Ideogram MCP Brings Image Generation to Claude, ChatGPT, Cursor
Ideogram MCP allows direct image generation, design work, and custom model training without leaving the chat interface in Claude, ChatGPT, Cursor, and other MCP-compatible environments.
SynthID Watermark Expands, Queryable via Gemini and Search
Google DeepMind’s SynthID, the imperceptible watermark for AI-generated content, is expanding to more partners. Users can now query whether content is AI-generated directly through the Gemini app or Google Search.
Cloudflare CEO: How AI Decides Who Gets Laid Off
Cloudflare laid off approximately 1,100 employees — 20 percent of its workforce — while simultaneously hiring 1,111 interns. CEO Matthew Prince wrote a Wall Street Journal column explaining his framework for using AI to determine which roles to eliminate.
llama.cpp Gets Full WebGPU Backend
llama.cpp and ggml now support a complete WebGPU backend, enabling large language models to run directly in the browser with GPU acceleration.
Allen AI’s ArtifactLinker Predicts Model Benchmarks
ArtifactLinker automatically predicts which benchmarks a model will perform well on, addressing the problem that most models are only evaluated on a fraction of available tests.
GLM-5.1-HighSpeed Hits 400 Tokens per Second
GLM-5.1-HighSpeed achieves 400 tokens per second on a flagship-tier LLM API, setting a new speed benchmark without trading off model size for throughput.
Qwen3.7-Max Launches with 1M Context for the Agent Era
Together AI introduces Qwen3.7-Max, Alibaba Qwen’s flagship model designed for the agent era, featuring 1-million-token context and leading benchmark performance.
ARC-AGI-3 Sees First Meaningful Score Jump
tufalabs raised its score from 0.68% to 1.17% in the ARC-AGI-3 competition, a roughly 1.7x gain and the first significant improvement on this reasoning benchmark.
Adaption AutoScientist Trains Frontier-Level Model in Two Days
Adaption’s AutoScientist tool lets users train a frontier-level model within two days, with free compute resources offered throughout the next month.
Kakuna Automates Code Hardening via Subagent Checklists
Kakuna uses skill checklists with parallel sub-agents to harden codebases automatically, returning an audited version with all the boring work done.
Modern LLMs Solve 100-Digit Multiplication Without Tools
teortaxesTex demonstrates that modern language models can perform 100-digit multiplication through chain-of-thought scaling alone, challenging the “embers of autoregression” thesis.
2 Billion Web Pages Now Queryable on Hugging Face via SQL
CommonCrawl’s April 2026 crawl data and URL index are now available on Hugging Face, enabling SQL-based analysis of over two billion web pages with zero download required.
Xiaohongshu Adds AI Skill Upload Feature
Xiaohongshu now enables users to upload custom AI skills directly to the platform, opening a new distribution channel for AI capabilities within the popular social commerce app.
Project Genie Turns Google Street View into Interactive Worlds
Google DeepMind’s Project Genie now integrates with Google Maps Street View, allowing users to take real U.S. locations and transform them into new, interactive virtual worlds through generative AI.
Claude Pro Gets Auto Mode with Sonnet 4.6 and Opus 4.7
Claude Devs extends auto mode to Pro plan subscribers and adds support for Sonnet 4.6 alongside Opus 4.7, allowing users to press Shift+Tab and let Claude run autonomously.
Microsoft Foundry Partners with Hugging Face on Open-Source Image Models
Three open-source image models are now available on Microsoft Foundry, bringing developers the largest catalog for AI innovation through the collaboration with Hugging Face.
Cohere Command A+ Now Available as Managed Compute on Microsoft Foundry
Cohere’s latest open-source model Command A+ is now offered as a managed compute service on Microsoft Foundry, expanding enterprise deployment options.
Zalando Achieves 48-Hour 3D Scan to Storefront with NVIDIA AI
Zalando is scaling high-fidelity 3D product production by integrating ALLSIDES’ AI-powered photometric 3D scanning platform, powered by NVIDIA Cosmos, Gen-3C, and DiffusionRenderer. The pipeline goes from scan to storefront in just 48 hours.
LM Studio 0.4.14 Adds Multi-Token Prediction Support
LM Studio releases version 0.4.14 with Multi-Token Prediction (MTP) functionality, enabling faster inference by predicting multiple tokens simultaneously.
PixVerse App Adds On-Device Image Generation
The PixVerse mobile app now supports image generation from prompts or reference images, with three free generations for all users during the May 24–31 launch window.
CommonCrawl Recommends Hugging Face Buckets for Large Training Datasets
CommonCrawl recommends Hugging Face Buckets for storing large, constantly evolving training datasets. The content-defined chunking technology reportedly reduces upload volumes by 75% by only transmitting changed portions of datasets.
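To see why content-defined chunking saves upload volume, here is a minimal sketch of the general technique using a gear-style rolling hash. It is not Hugging Face's implementation: the table, the 8-bit mask, and all names are assumptions chosen to keep the toy small, and production systems use far larger chunks along with minimum and maximum chunk sizes.

```python
import hashlib
import random

MASK_BITS = 8                      # toy value: a cut roughly every 2**8 bytes
MASK = (1 << MASK_BITS) - 1
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # fixed per-byte random table


def chunk(data: bytes) -> list[bytes]:
    """Split data wherever the low bits of a rolling hash are all zero."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Gear-style rolling hash: older bytes shift out of the low bits,
        # so each cut decision depends only on the most recent bytes.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h & MASK == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def digest(c: bytes) -> str:
    """Content address used to recognize chunks that are already stored."""
    return hashlib.sha256(c).hexdigest()


# Simulate a dataset revision: insert a few bytes near the start.
old = bytes(_rng.getrandbits(8) for _ in range(20_000))
inserted = b"EDIT!"
new = old[:1000] + inserted + old[1000:]

old_chunks, new_chunks = chunk(old), chunk(new)
known = {digest(c) for c in old_chunks}
to_upload = [c for c in new_chunks if digest(c) not in known]
print(f"{len(to_upload)} of {len(new_chunks)} chunks need uploading")
```

Because cut points depend on nearby content rather than on byte offsets, the boundaries realign shortly after the inserted bytes, and only the chunks around the edit have to be sent. Fixed-size blocks would shift every boundary after the insertion and force a re-upload of nearly everything.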
“The Model Alone Is No Longer the Product”
OpenAI’s Greg Brockman observes that models in isolation are insufficient as products, pointing to a future where full application ecosystems define value in AI rather than raw model capabilities alone.
Developers Reflect on Coding Before Codex
Greg Brockman captures the generational shift in software development, noting how quickly AI-assisted coding has become the default, making the pre-Codex era feel distant even to veteran engineers.