May 24, 2026 · Sunday

Google Claimed AI Built an OS for $916 — The Reality Is Muddier

The alleged "single prompt" was thousands of lines long. Attempt counts, human intervention, and source code remain undisclosed. Independent analysis finds more press release than reproducible science.

This week Google claimed that a team of AI agents built an entire operating system from a single prompt, at a cost of about $916 in tokens. The announcement rippled through the AI research community and was met with a mixture of excitement and deep skepticism. NormalTech, an independent technology publication, fact-checked the claim and published findings that paint a significantly less rosy picture.

According to the analysis, what Google described publicly as a "single prompt" was in reality an elaborate sequence spanning thousands of lines. More importantly, the company disclosed neither the number of attempts that were made, nor how much human intervention was required to reach the final result, nor whether any code was copied from publicly available sources on the internet. Crucially, no prompt, no source code, and no execution logs were open-sourced, making any form of independent verification impossible. The author argues that while these "open-world evaluation" demonstrations show genuine potential, they currently lack the methodology and transparency needed to count as scientific evidence. As a path forward, the piece calls for independent evaluators to be brought into the loop for any such claims.

E-Commerce Giant Now Generates 75% of Visual Media With AI, Costs Down 99%

Projects that once cost $800,000 now go for under $10,000. Every product shot saves roughly $30,000.

One of the world's largest e-commerce companies adopted Runway's AI video and image generation tools, and the results have been staggering. Seventy-five percent of all visual media produced by the company is now AI-generated, fundamentally reshaping its entire creative production pipeline. Projects that previously carried an eight-hundred-thousand-dollar price tag can now be completed for under ten thousand dollars, while each individual product image saves an estimated thirty thousand dollars compared to traditional photo shoots. The scale of the transformation suggests that AI-native visual production is no longer experimental but a core operational capability at the highest levels of global commerce.
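As a quick check on the headline figure, the quoted project costs can be turned into a percentage. This is only a sketch of the arithmetic using the two dollar amounts from the item; the function name is illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0

# Figures quoted in the item: $800,000 projects now completed for under $10,000.
reduction = percent_reduction(800_000, 10_000)
print(f"{reduction:.2f}%")
```

Because "under $10,000" is an upper bound on the new cost, the reduction is at least 98.75 percent, which rounds to the 99 percent in the headline.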

MiniMax Launches Gizmo: Dual-LLM Architecture Eliminates Awkward Pauses

The lightning-fast M2-her model handles instant acknowledgements while M2.7 tackles complex reasoning in the background.

Gizmo, a conversational product backed by MiniMax, has officially launched with a dual large language model architecture aimed at one of voice AI's most persistent user-experience problems: awkward silence during conversations. The system pairs a lightweight, low-latency model, M2-her, which delivers instant verbal acknowledgements to keep the conversation flowing, with a more powerful M2.7 model that handles complex reasoning in parallel. The design is intended to keep users from hearing dead air while they wait for a full response. The product was co-developed with Gradium and marks MiniMax's expansion into real-time conversational agents.
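The fast-acknowledgement pattern described here can be sketched with two concurrent tasks. This is a minimal illustration, not Gizmo's implementation: both model calls are hypothetical stubs that simply sleep, and the function names and delays are assumptions made for the example.

```python
import asyncio

async def fast_ack(user_text: str) -> str:
    # Hypothetical stand-in for the lightweight model: returns almost at once.
    await asyncio.sleep(0.01)
    return "Got it, let me think about that."

async def slow_reasoning(user_text: str) -> str:
    # Hypothetical stand-in for the larger model: takes noticeably longer.
    await asyncio.sleep(0.2)
    return f"Here is a considered answer to: {user_text}"

async def respond(user_text: str) -> list[str]:
    """Start both models together; emit the acknowledgement first, then the answer."""
    spoken: list[str] = []
    # Kick off the slow model in the background before waiting on anything.
    reasoning_task = asyncio.create_task(slow_reasoning(user_text))
    spoken.append(await fast_ack(user_text))  # fills the silence right away
    spoken.append(await reasoning_task)       # follows once reasoning completes
    return spoken

transcript = asyncio.run(respond("Plan a three-day trip to Kyoto."))
for line in transcript:
    print(line)
```

The key design choice is that the slow task is created before the acknowledgement is awaited, so the reasoning time overlaps with the acknowledgement rather than following it.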

DeepSeek sparse attention (DSA) from-scratch implementation, contributed by a reader to the LLMs-from-scratch repository.

LLMs-from-Scratch Repository Adds DeepSeek Sparse Attention Implementation

Sebastian Raschka's widely used educational repository has received a new contribution: a from-scratch implementation of DeepSeek sparse attention (DSA), the mechanism that underpins DeepSeek's ability to handle extremely long contexts efficiently. The contribution, submitted by a reader, includes a full motivational overview, detailed explanations of the sparse attention mechanism, and a standalone reference implementation built in the GPT-style model format. The addition makes DSA accessible to students and practitioners who want to understand how modern sparse attention works at the implementation level without relying on opaque library calls.
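For readers new to the idea, the general shape of top-k sparse attention can be shown in a few lines of plain Python. This is a generic sketch, not the code contributed to the repository and not DeepSeek's implementation: the selection scores are supplied by hand here, whereas in a real model they would come from a learned component, and all names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sparse_attention(query, keys, values, index_scores, top_k):
    """Attend from one query to only the top_k keys ranked by index_scores."""
    # 1. Pick the positions with the highest selection scores.
    ranked = sorted(range(len(keys)), key=lambda i: index_scores[i], reverse=True)
    selected = sorted(ranked[:top_k])
    # 2. Scaled dot-product scores over the selected positions only.
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in selected]
    # 3. Numerically stable softmax over the selected logits.
    peak = max(logits)
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    # 4. Weighted sum of the selected value vectors.
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, selected)) for d in range(dim)]
    return selected, output

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
query = [1.0, 0.0]
index_scores = [0.9, 0.1, 0.8, 0.2]  # hypothetical selector output, one score per key
selected, output = sparse_attention(query, keys, values, index_scores, top_k=2)
print(selected, output)
```

The saving comes from step 2: the attention scores are computed for top_k positions instead of every position in the context, which matters most when the context is very long.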

StepAudio 2.5 Realtime captures tone, pace, pauses, sighs, and micro-emotions during voice interaction.

StepFun Releases StepAudio 2.5: Real-Time Voice Model Reads Tone, Pauses, and Micro-Emotions

StepFun has released StepAudio 2.5 Realtime, a voice model that goes far beyond basic speech recognition to achieve top-tier paralinguistic perception. The system understands not just what you say but how you say it — interpreting tone, pace, pauses, sighs, and even the half-laugh that punctuates a mid-sentence shift in mood. These micro-emotional signals, which humans process instinctively, have long been a blind spot for voice AI systems. StepAudio 2.5 Realtime also offers a bring-your-own-persona capability via API, enabling developers to define custom character voices and interaction styles for their applications.

Google DeepMind Expands AI Partnership With Singapore

Google DeepMind announced an expanded partnership with Singapore, focusing on safe AI deployment at scale. In collaboration with country-level experts, new programs will target accelerating scientific discovery, advancing pandemic preparedness, and improving healthcare outcomes through AI. The partnership signals continued institutional investment in applied AI for public-good domains across Southeast Asia.

NVIDIA GTC Week Kicks Off in Taipei, Autonomous Agents Take the Spotlight

NVIDIA's GTC conference officially opened in Taipei this week, with developers getting hands-on experience building and testing autonomous AI agents. Jensen Huang made a surprise visit to the Meet-a-Claw event, where participants explored agent development tooling and practices. The event sets the tone for a conference week heavily focused on agentic AI workloads and the infrastructure needed to support them at production scale.

Replit Agent + Squidler Enables Fully Automated QA Loop

Replit has integrated the Squidler testing tool into its platform, enabling a complete AI-powered QA cycle. Users describe what their application should do in plain English, and Replit Agent builds it. Squidler then navigates the application the way a real person would, identifying broken behavior, and the Agent automatically applies fixes. The entire loop runs inside Replit's MCP library, closing the gap between AI code generation and production-quality verification.
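The build, test, and fix cycle described above follows a simple control loop. The sketch below shows only that loop: every function is a hypothetical placeholder, the seeded defects are invented for illustration, and none of it reflects the actual Replit or Squidler interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class App:
    # Hypothetical stand-in for a generated application: a set of open defects.
    defects: set[str] = field(default_factory=set)

def build(spec: str) -> App:
    # Placeholder for the agent's build step; seeds two defects for illustration.
    return App(defects={"login button unresponsive", "cart total wrong"})

def explore(app: App) -> list[str]:
    # Placeholder for the tester walking through the app the way a user would.
    return sorted(app.defects)

def fix(app: App, issue: str) -> None:
    # Placeholder for the agent applying a patch for one reported issue.
    app.defects.discard(issue)

def qa_loop(spec: str, max_rounds: int = 5) -> tuple[App, int]:
    """Build once, then alternate testing and fixing until no issues remain."""
    app = build(spec)
    for round_no in range(1, max_rounds + 1):
        issues = explore(app)
        if not issues:
            return app, round_no - 1  # number of fix rounds that were needed
        for issue in issues:
            fix(app, issue)
    return app, max_rounds

app, rounds = qa_loop("A to-do list with login and a shopping cart")
print(rounds, explore(app))
```

The max_rounds cap is a deliberate choice in this sketch: an automated loop needs a stopping condition in case a fix keeps reintroducing the same defect.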

New Paper Argues for a Unified Science of Intelligence

Surya Ganguli has published a new article in the journal Daedalus proposing an integrated science of intelligence that combines physics, neuroscience, and artificial intelligence. The piece, retweeted by Yann LeCun, argues that tools from complex systems physics can be used to analyze how large neural networks learn, that neuroscience reveals where biological intelligence still outperforms AI by orders of magnitude, and that quantum hardware and digital-twin approaches can bridge the remaining gaps between artificial and biological cognition.

DeepSeek V4 Pro Supports Million-Token Context, 245TB Variant on the Horizon

DeepSeek V4 Pro can currently handle 24,500 concurrent instances of one-million-token context windows, and a 245 TB variant is reportedly planned. Observers note that the primary bottleneck is not model capacity but tooling: multi-turn harnesses that can make productive use of multi-day sessions remain underdeveloped. The community has expressed growing impatience with so-called "context compaction" workarounds, calling for more robust native long-context infrastructure instead.

Huawei Optimizes DeepSeek V4 Inference Within Hours of Release

Huawei demonstrated inference performance optimizations on DeepSeek V4 within hours of the model's public release, which observers read as a signal that post-training had already been conducted on Huawei hardware. They note that this marks a significant milestone for the Huawei AI chip ecosystem and suggests the company's long-rumored dedicated AI datacenter should be coming online soon. The speed of the optimization work implies deep integration between DeepSeek's training pipeline and Huawei's hardware stack.

The tragic thing is that tech is easy and money is hard. Anthropic can have the most inefficient architecture on the market, but so long as they train stronger models and rack up revenue, their lead increases.

teortaxesTex

Gemini Omni Video Editing: Seamlessly Replace Background Locations

A user uploaded a video filmed while riding in a Waymo vehicle in Menlo Park, then instructed Gemini Omni to re-shoot it in a different location using only a screenshot from Google Maps as reference. The result was seamless, with smooth transitions between the original and the new background. This demonstration showcases the growing maturity of AI-powered video editing tools that can realistically alter scene context while preserving natural motion, lighting, and perspective consistency.

Jensen Huang: AI Assistants Give Teams More Leverage

NVIDIA CEO Jensen Huang shared his perspective on how AI assistants are transforming team productivity. In a message accompanying conference week, Huang argued that AI gives individuals and teams considerably more leverage to do their best work, enabling them to move faster, think bigger, and take on challenges that were once considered out of reach. The remarks frame AI not as a replacement for human capability but as a force multiplier.

Codex Used to Build and Debug iPhone Simulator End-to-End

Developer gdb demonstrated using Codex, an AI coding tool, to build and debug a complete iPhone simulator from start to finish. The workflow covered the full development cycle including code generation, runtime debugging, and iterative fixes. The demonstration highlights how AI-assisted development is moving beyond generating isolated code snippets toward managing end-to-end software projects.

AI BRIEFS · May 24
PRODUCT

Luma Agents Turn Brand Stories Into Graphics at Every Touchpoint

Luma Labs launched Luma Agents, a tool that defines a brand's narrative and look, then generates graphics that build trust and drive connection across every customer touchpoint.

MODELS

Six-Person Team Builds AI Model 4–8x Faster Than OpenAI and Anthropic

A compact team built task-specific AI models that are four to eight times faster than anything from OpenAI or Anthropic, already reaching half a million downloads.

EVENTS

Spike Jonze to Take the Stage at Replit Vibecon

Oscar-winning filmmaker Spike Jonze, known for Her and Being John Malkovich, will join CEO Amjad Masad for a conversation on day one of the Vibecon conference in New York.

FILM

Kling AI Hosts Official Session at Cannes on AI Filmmaking

Kling AI held a conference at the Cannes Film Festival's Marché du Film, gathering global film professionals to explore how AI is entering real film production workflows.

CREATIVE

Google Omni Flash + Dreams 3D Showcases Unmatched Creative Control

A creator demonstrated generative AI workflows combining Google Omni Flash with Dreams 3D animation, achieving speeds and flexibility no other toolchain currently matches for an experienced user.

DEVELOPER

Use Codex or Claude Code to Interpret the Hermes Agent Architecture

Practitioners recommend opening the Hermes Agent codebase directly with Codex or Claude Code, letting the AI agent explain the code structure and documentation interactively.

OPEN SOURCE

Feishu–Claude Code Bridge Connects Office Suite to AI

The open-source project feishu-claude-code-bridge enables bidirectional operations between the Feishu office platform and Claude Code, allowing AI to read and edit Feishu documents.

DEBUGGING

Opus 4.6 Excels at Debugging Complex Code When Other LLMs Fail

Users report that when other large language models cannot resolve stubborn code issues, calling Opus 4.6 frequently succeeds in diagnosing and fixing the problem.

AUDIO

Stable Audio 3 One-Click Launcher Runs on Any Computer Without VRAM

Developer cocktailpeanut released a one-click launcher that lets the official Stable Audio 3 Gradio app run on any computer regardless of GPU or VRAM constraints.

SEARCH

MiniMax Connects to Perplexity Search Infrastructure

MiniMax, positioned as a leading open-source model and agent platform, is now powered by Perplexity's search infrastructure, strengthening its retrieval capabilities.

TOOLS

Diffusers Tool Adds Model Performance Profiling for Optimization

RisingSayak started a project within Hugging Face Diffusers to track and analyze model performance, noting that what cannot be profiled cannot be optimized.

VIDEO

Runway Unveils Aleph 2.0 Video Editing Model

Runway launched the upgraded Aleph 2.0 video editing model, allowing users to change exactly what they want in a video while keeping everything else untouched.

FAV0 · AI Daily · Layout by automated editorial system