May 5, 2026 · Tuesday

We need to create a new term for the attacks some Chinese labs are doing on APIs that is different than distillation, or else we risk tarnishing a crucial technique that is fundamental to AI diffusion, academic research, and the open-source ecosystem.

Nathan Lambert, interconnects.ai

PRODUCT

Perplexity Computer Integrates with Microsoft Teams

Perplexity Computer is now available within Microsoft Teams, allowing users to conduct research, analysis, and document creation directly in the Teams workspace with the same capabilities as the standalone Computer product.

PRODUCT

Luma Launches Creative Agent for Full Ad Systems

Luma Agents automates the entire process from planning and generation to iterative optimization, turning creative ideas into complete advertising systems. Users define the concept and aesthetic direction, then the agent handles the rest.

HARDWARE

GB300 Ultra NVL72 Leaks: 2.7x Faster Than GB200 on Inference

SemiAnalysis reports that the GB300 Ultra NVL72 is 2.7 times faster than the GB200 NVL72 on industry-standard inference benchmarks, a significant generational gain in AI inference hardware performance.

RESEARCH

DeepSeek-V4: Hybrid Attention Cuts KV Cache by Up to 90%, Supports 1M-Token Context

DeepSeek-V4 combines hybrid attention with a sparse MoE architecture to reduce KV cache size by up to 90%, which lets it support context lengths of one million tokens while keeping inference efficient.
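
To get a feel for why KV cache size matters at one million tokens, here is a rough back-of-the-envelope sketch. The model shape (60 layers, 8 KV heads, head dimension 128, 2 bytes per value) is hypothetical and chosen only for illustration; it is not DeepSeek-V4's actual configuration, and the formula is the one for standard attention, not the hybrid scheme described above.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence under standard attention."""
    # Factor of 2: one key vector and one value vector per token, per layer, per KV head.
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_value


# Hypothetical model shape (an assumption, not a published configuration).
baseline = kv_cache_bytes(tokens=1_000_000, layers=60, kv_heads=8, head_dim=128)

# Keep 10% of the cache, i.e. the "up to 90%" reduction quoted in the item above.
reduced = baseline // 10

GIB = 1024 ** 3
print(f"baseline cache: {baseline / GIB:.1f} GiB")
print(f"after a 90% reduction: {reduced / GIB:.1f} GiB")
```

Under these assumed numbers, the uncompressed cache for a single million-token sequence would run to hundreds of GiB, which is why a reduction of this size is what makes such context lengths practical.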

INDUSTRY

NVIDIA: AI Is a Five-Layer Cake — Energy, Chips, Infrastructure, Models, Apps

NVIDIA frames AI infrastructure as five interdependent layers: energy, chips, infrastructure, models, and applications. The countries and companies that build the full stack will define the next industrial era.

MODEL RELEASE

IBM Granite 4.1-8B Released, Optimized for 8–16GB VRAM Hardware

IBM has released Granite 4.1-8B as an open model on Hugging Face. It is specifically optimized for hardware with 8 to 16 GB of VRAM, with the aim of making open-source AI more accessible to developers.
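
As a rough guide to what fits in 8 to 16 GB, the sketch below estimates the memory taken by model weights alone at a few common precisions. It assumes exactly 8 billion parameters (inferred from the "8B" name) and ignores KV cache, activations, and framework overhead; it says nothing about which formats IBM actually ships.

```python
def weight_gb(params, bits_per_weight):
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# Assumption: the "8B" in the model name means 8 billion parameters.
PARAMS = 8_000_000_000

for label, bits in [("fp16/bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_gb(PARAMS, bits):.1f} GB of weights")
```

On these assumptions, 16-bit weights alone fill a 16 GB card, so the lower end of the range would in practice rely on 8-bit or 4-bit quantization.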

Agent & Model Innovations 05.05

Anthropic co-founder Jack Clark puts the chance of recursive self-improvement (RSI) at 60% by the end of 2028.

via @goodside

Community & Short Takes 05.05

EDUCATION

AI Multi-Modal Learning Platform for Deaf Students

Replit CEO Amjad Masad spotlights an AI-powered multi-modal learning platform purpose-built for deaf students.

REPLIT

Most Agentic Parallelism Anywhere Online Happens on Replit

Amjad Masad notes Replit hosts more parallel agentic development activity than any other internet platform: 10 active, 198 draft, 700+ completed.

TOOL

Hugging Face Model Visualizer Lets You Explore Any Architecture

A new community tool visualizes Hugging Face model architectures at any granularity by simply entering a model URL, supporting layer-level exploration and cross-model comparison.

PAPERS

Top Papers: Recursive Multi-Agent Systems and World Modeling

Hugging Papers highlights the week's best research on recursive multi-agent systems, agentic world modeling, and AI organizational structures.

PAPER

UniVidX: Unified Multimodal Framework for Video Generation via Diffusion Priors

UniVidX proposes a unified multimodal framework leveraging diffusion priors, achieving SOTA on RGB and RGBA layer composition tasks.

TRENDING

DeepSeek, Xiaomi, OpenAI Models Trending on Hugging Face

Current trending open models on Hugging Face include releases from DeepSeek, Xiaomi, OpenAI, Mistral AI, and AI Pool, reflecting a diverse open-source landscape.

OPINION

Software Is a Cache of Agents

A thought-provoking thesis: traditional software is essentially a cache of proven agent workflows, crystallizing reliable multi-step processes into deterministic logic that no longer requires runtime reasoning.

RESEARCH

Transformer Gradients Are Sparse — Low-Rank Exploration Justified

An investigation into Transformer gradients reveals they are sparse in certain dimensions, validating low-rank approximation methods for efficient model training and fine-tuning.
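
For a sense of what low-rank methods save, the sketch below compares the number of trainable values in a full weight update with a LoRA-style factorization into two thin matrices. The layer size (4096 by 4096) and rank (8) are illustrative assumptions, not figures from the investigation.

```python
def full_update_params(d_out, d_in):
    """Trainable values when the whole d_out x d_in weight update is learned."""
    return d_out * d_in


def low_rank_update_params(d_out, d_in, rank):
    """Trainable values when the update is factored as B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only (assumptions, not taken from the source).
d_out, d_in, rank = 4096, 4096, 8

full = full_update_params(d_out, d_in)
low = low_rank_update_params(d_out, d_in, rank)
print(f"full update: {full:,} values")
print(f"rank-{rank} update: {low:,} values ({full // low}x fewer)")
```

The saving grows with layer size and shrinks as the rank rises, so the choice of rank is the main knob trading capacity against cost.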

CLAUDE

Claude 4.7 Accurately Explains the Origins of Prompt Injection

A Claude 4.7 research report precisely traced the history of prompt injection attacks, accurately referencing early tweets and adversarial examples that first demonstrated the vulnerability.

PRODUCT

Luma Agents Generate Winning Client Pitch Boards

Luma Agents automatically plans, generates, and optimizes client pitch boards. Users set the brief and aesthetic direction, and the agent produces high-quality proposals designed to win.