May 5, 2026 · Tuesday

Runway Launches Real-Time Video Agent: 24fps HD Conversational Video from Single Image

Runway introduces Runway Characters, turning a single image into a fully expressive, conversational real-time video agent streamed at 24fps HD with an end-to-end latency of just 1.75 seconds.

Real-time video agents are here. Runway has built Runway Characters, which lets users turn one image into a fully expressive, conversational video agent streaming at 24 frames per second in HD, with 1.75 seconds of end-to-end latency. The technology combines facial animation, voice synthesis, and real-time streaming into a single pipeline that responds to natural language input. Runway positions it for interactive media, virtual assistants, and personalized content creation at scale, where the boundary between recorded and generated video begins to blur.

PRODUCT

xAI Launches Grok Voice API Voice Cloning Feature

xAI releases a voice cloning feature for the Grok Voice API, enabling cloning of natural-sounding speech from short audio recordings and voice library management via a console for personalized brand voice customization.

Two voices. One human. One AI. Voice cloning rich with natural emotion is now live on the Grok Voice API. Users can clone voices from short recordings and manage voice libraries through the xAI console, opening up personalized voice experiences for brands and developers. The feature supports natural emotional inflection, so cloned voices are meant to sound natural in conversation.

PRODUCT

Ollama Supports Claude Desktop, Enables Third-Party Inference

All models on Ollama Cloud can now be used in Claude Cowork and Claude Code through Claude Desktop's built-in third-party inference feature.

Ollama now supports Claude Desktop via built-in third-party inference. The integration allows all models from Ollama Cloud to be used across Claude Cowork and Claude Code directly from the Claude Desktop app. This connects models served through Ollama Cloud with Anthropic's coding tools, giving developers a path to use those models without leaving Claude Desktop.

We need to create a new term for the attacks some Chinese labs are doing on APIs that is different than distillation, or else we risk tarnishing a crucial technique that is fundamental to AI diffusion, academic research, and the open-source ecosystem.
Nathan Lambert, interconnects.ai

OPEN SOURCE

Vercel Launches Open-Source Agent Orchestrator deepsec for Deep Security Review

Vercel's CEO announces the open-source agent orchestrator deepsec, designed for deep security review and validated on several major OSS projects. Coding agents can now autonomously find critical vulnerabilities.

Vercel introduces deepsec, an open-source agent orchestrator purpose-built for deep security reviews. Initially developed for internal use, the tool was validated against several major open-source projects, which gave the team enough confidence to release it publicly. Coding agents powered by deepsec can autonomously probe codebases for critical vulnerabilities, misconfigurations, and supply-chain risks. The orchestrator coordinates multiple specialized agents, each focusing on a different attack surface, and synthesizes their findings into actionable reports. This represents a shift toward proactive, automated security auditing in the software development lifecycle.

PRODUCT

Perplexity Computer Integrates with Microsoft Teams

Perplexity Computer is now available within Microsoft Teams, allowing users to conduct research, analysis, and document creation directly in the Teams workspace with the same capabilities as the standalone Computer product.

PRODUCT

Luma Launches Creative Agent for Full Ad Systems

Luma Agents automates the entire process from planning and generation to iterative optimization, turning creative ideas into complete advertising systems. Users define the concept and aesthetic direction, then the agent handles the rest.

HARDWARE

GB300 Ultra NVL72 Leaks: 2.7x Faster Than GB200 on Inference

SemiAnalysis reports that the GB300 Ultra NVL72 is 2.7 times faster than the GB200 NVL72 on industry-standard inference benchmarks, a sizable generational gain in inference performance.

RESEARCH

DeepSeek-V4: Hybrid Attention Cuts KV Cache by Up to 90%, Supports 1M-Token Context

DeepSeek-V4 uses a hybrid attention and sparse MoE architecture that reduces KV cache by up to 90%, enabling support for context lengths of one million tokens while maintaining inference efficiency.
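
To put the KV cache figure in perspective, here is a rough back-of-the-envelope sketch. It uses the standard size formula for a dense attention cache (one key and one value vector per layer, per KV head, per token). The layer count, head count, head dimension, and 2-byte precision are hypothetical placeholders, not DeepSeek-V4's actual configuration, and the 90% figure is simply the upper bound quoted above.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Size in bytes of a conventional key/value cache for one sequence."""
    # Factor of 2: one key vector and one value vector are stored
    # per token, per layer, per KV head.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical dense-attention configuration (assumed for illustration only,
# NOT DeepSeek-V4's real dimensions).
baseline = kv_cache_bytes(
    seq_len=1_000_000,  # the one-million-token context from the item above
    n_layers=60,
    n_kv_heads=8,
    head_dim=128,
)

# Apply the "up to 90%" reduction quoted above, using integer arithmetic.
reduction_pct = 90
reduced = baseline * (100 - reduction_pct) // 100

print(f"baseline cache: {baseline / 1e9:.2f} GB")
print(f"after {reduction_pct}% cut: {reduced / 1e9:.2f} GB")
```

Under these assumed dimensions the cache for a single one-million-token sequence would shrink from hundreds of gigabytes to tens of gigabytes, which is the kind of difference that decides whether long-context serving fits on a given accelerator.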

INDUSTRY

NVIDIA: AI Is a Five-Layer Cake — Energy, Chips, Infrastructure, Models, Apps

NVIDIA frames AI infrastructure as five interdependent layers: energy, chips, infrastructure, models, and applications. The countries and companies that build the full stack will define the next industrial era.

MODEL RELEASE

IBM Granite 4.1-8B Released, Optimized for 8–16GB VRAM Hardware

The IBM Granite 4.1-8B model is now available as an open model on Hugging Face. It is specifically optimized for hardware with 8 to 16GB of VRAM, making it accessible to developers on consumer-grade GPUs.

Agent & Model Innovations 05.05

MODEL

nanowhale: Small DeepSeek Model Fully Pretrained by an Agent

Inspired by Karpathy's nanochat, nanowhale is a tiny DeepSeek model entirely pretrained by an AI agent, showcasing automated model training as a new paradigm. The project demonstrates that agents can handle the full pretraining pipeline autonomously.

TOOL

XGrammar-2: Structured Generation for Complex Agent Harnesses

XGrammar-2 introduces structured generation for complex agent frameworks, supporting strict tool-calling formats with built-in DeepSeek integration. It ensures reliable output formatting for multi-agent orchestration scenarios.

PRODUCT

Grok 4.3 Builds an Entire Game from a Single Prompt

Grok 4.3 demonstrated the ability to build a complete playable game from a single prompt, featuring the fastest token output speed of any model and outperforming Claude Sonnet in end-to-end generation speed.

PUBLISHING

François Chollet's "Deep Learning with Python" Now Free to Read Online

The definitive guide to deep learning, which sold 120,000 copies and helped tens of thousands launch their careers, is now available to read online for free. The book demystifies how deep learning works and how to apply it effectively.

PRODUCT

Replit: Build Full Pitch Decks by Describing What You Want

Replit now lets users generate full pitch decks without touching a single slide. Describe your idea, iterate in chat, edit visually, then export to PPTX, Google Slides, or PDF, or publish as a live URL.

PAPER

Web2BigTable: Multi-Agent LLM System for Internet-Scale Search

Web2BigTable is a bi-level multi-agent framework for internet-scale web search and table extraction. On the WideSearch benchmark, it achieves an Avg@4 success rate of 38.50, far ahead of the second-place score of 5.10.

MODEL

Qwen 3.6: High TPS on Just 12GB VRAM

Community-shared Qwen 3.6 configs deliver fast tokens-per-second even on consumer GPUs with only 12GB VRAM.

RESEARCH

Can Open-Weight Coding Agents Match Claude Code?

A new study explores whether open-weight coding agents, paired with harnesses, can rival Claude Code at training domain-specific models.

HARDWARE

Blackwell Ultra: Named for Ultra Performance

NVIDIA's Blackwell Ultra derives its name from its ultra-high GPU performance, confirmed by SemiAnalysis.

Anthropic co-founder Jack Clark puts the chance of recursive self-improvement (RSI) by the end of 2028 at 60%.
via @goodside

Community & Short Takes 05.05

EDUCATION

AI Multi-Modal Learning Platform for Deaf Students

Replit CEO Amjad Masad spotlights an AI-powered multi-modal learning platform purpose-built for deaf students.

REPLIT

Most Agentic Parallelism Anywhere Online Happens on Replit

Amjad Masad notes Replit hosts more parallel agentic development activity than any other internet platform: 10 active, 198 draft, 700+ completed.

TOOL

Hugging Face Model Visualizer Lets You Explore Any Architecture

A new community tool visualizes Hugging Face model architectures at any granularity by simply entering a model URL, supporting layer-level exploration and cross-model comparison.

PAPERS

Top Papers: Recursive Multi-Agent Systems and World Modeling

Hugging Papers highlights the week's best research on recursive multi-agent systems, agentic world modeling, and AI organizational structures.

PAPER

UniVidX: Unified Multimodal Framework for Video Generation via Diffusion Priors

UniVidX proposes a unified multimodal framework leveraging diffusion priors, achieving SOTA on RGB and RGBA layer composition tasks.

TRENDING

DeepSeek, Xiaomi, OpenAI Models Trending on Hugging Face

Current trending open models on Hugging Face include releases from DeepSeek, Xiaomi, OpenAI, Mistral AI, and AI Pool, reflecting a diverse open-source landscape.

OPINION

Software Is a Cache of Agents

A thought-provoking thesis: traditional software is essentially a cache of proven agent workflows, crystallizing reliable multi-step processes into deterministic logic that no longer requires runtime reasoning.

RESEARCH

Transformer Gradients Are Sparse — Low-Rank Exploration Justified

An investigation into Transformer gradients reveals they are sparse in certain dimensions, validating low-rank approximation methods for efficient model training and fine-tuning.

CLAUDE

Claude 4.7 Accurately Explains the Origins of Prompt Injection

A Claude 4.7 research report precisely traced the history of prompt injection attacks, accurately referencing early tweets and adversarial examples that first demonstrated the vulnerability.

PRODUCT

Luma Agents Generate Winning Client Pitch Boards

Luma Agents automatically plans, generates, and optimizes client pitch boards. Users set the brief and aesthetic direction, and the agent produces high-quality proposals designed to win.