June 22, 2026 · Monday

Vercel CEO: GLM-5.2's coding ability is "genuinely shocking"

Rauchg expresses deep admiration for Zhipu's latest model, believing it will change the competitive landscape. The open-weight model is now being compared to Opus 4.8, with many developers reporting it outperforms closed alternatives.

Vercel CEO Guillermo Rauch (@rauchg), one of the most respected voices in frontend infrastructure, took to X to express something between admiration and alarm. "Genuinely impressed, almost shocked, at how good GLM-5.2 by Zhipu is at coding," he wrote. "This changes things." The post garnered over 6,000 likes and more than 1.2 million views within hours, a sign that the open-weight model from the Chinese lab has crossed a critical threshold in developer credibility.

The endorsement is significant because Rauchg's company, Vercel, powers the frontends of many of the world's most demanding AI products. When the CEO of a platform that serves millions of developers publicly declares that a model "changes things," the ecosystem listens. GLM-5.2 is reportedly the first open-weight model to match or exceed Opus 4.8 on real-world coding evaluations, including Vending Bench and Claude Code environments. Multiple independent testers report that it outperforms Opus 4.8 in agentic coding scenarios, with setup taking as little as five minutes on Fireworks AI.

Vercel ships Agentic Infrastructure: sandboxed AI agent runtime

The team optimized every frame of web performance — painting, layout, WebGPU shaders — and published a blog on infrastructure purpose-built for AI agents.

Vercel's engineering team conducted a deep optimization sweep, scrutinizing blocking scripts and WebGPU shader performance, and released its findings alongside a major infrastructure announcement. The blog post introduces Vercel's new agentic infrastructure, designed for code isolation, long-running tasks with fault recovery, and workflow pause-and-resume. Use cases already include Notion handling millions of agent conversations daily and Zapier serving over five million site visits monthly through agent-driven workflows.

"Everything the light touches was optimized," Rauchg noted, referencing Vercel's homepage. The team plans to update their skills.sh platform with lessons learned from the optimization effort, which touched every frame rendered on screen.

MaineCoon: first social-interaction video model runs real-time on a single H100

22B parameters, 47.5 FPS, under $0.001 per second — François Chollet calls the inference specs "really impressive."

MaineCoon is the first video generation model to focus exclusively on social interactions: facial expressions, emotional nuance, fluid conversation pacing, and audio-lip synchronization. As François Chollet, creator of Keras, noted, the model achieves these results with striking efficiency — 22 billion parameters running at 47.5 frames per second on a single H100 GPU, at a cost below one-tenth of a cent per second of generated video. The model represents a new category: video synthesis built not for spectacle but for the subtleties of human interaction. Its real-time inference capability opens the door to live conversational agents, telepresence applications, and assistive interfaces where non-verbal cues are as important as words.
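To put the quoted inference figures in more familiar units, here is a minimal arithmetic sketch. It uses only the two numbers reported above (47.5 FPS and a price under $0.001 per second of generated video); the function and constant names are illustrative, and because the price is an upper bound, every dollar figure it prints is a ceiling rather than an actual cost.

```python
# Figures quoted in the item above. The price is an upper bound
# ("under $0.001 per second"), so all costs below are ceilings.
FPS = 47.5
MAX_COST_PER_SECOND_USD = 0.001


def cost_ceiling(seconds: float) -> float:
    """Upper bound, in USD, on the cost of `seconds` of generated video."""
    return seconds * MAX_COST_PER_SECOND_USD


def frames_generated(seconds: float) -> float:
    """Number of frames produced in `seconds` at the quoted frame rate."""
    return seconds * FPS


# Ceiling on the cost of a single frame.
cost_per_frame = MAX_COST_PER_SECOND_USD / FPS

print(f"cost per frame: under ${cost_per_frame:.7f}")
print(f"cost per hour:  under ${cost_ceiling(3600):.2f}")
print(f"frames per hour: {frames_generated(3600):,.0f}")
```

At these rates, an hour of continuous generation stays under a few dollars, which is what makes the live conversational use cases plausible on cost grounds.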

Anthropic CEO: competitors have overbuilt compute and can't profit from it

Dario Amodei explains in a podcast why Anthropic won't spend $300 billion to $1 trillion on compute — and hints that rivals may surrender their clusters voluntarily.

In a podcast appearance with Dwarkesh Patel, Anthropic CEO Dario Amodei offered a notably incomplete answer to the question of why his company won't pursue a compute spend in the hundreds of billions. Observers filled in the gap: the real reason, unspoken, is that competitors have overbuilt. "Our competitors will willingly surrender their clusters because they have overbuilt and can't generate revenue with their models," one analyst summarized. The comment points to a deepening divide in AI economics — between companies that build at any cost and those betting that capital efficiency, not brute scale, will determine the winner. Amodei's restraint may be less about caution and more about knowing something his peers are only beginning to discover.

GPT-5.5 Pro reanalyzes a 20-year-old paper: finds errors, discovers new data

Ethan Mollick gave GPT-5.5 Pro his first published paper from graduate school and asked it to find errors and update the research. The model found new data, analyzed it, created reproducible files, and extended the key argument beyond what the original paper had established. "The interaction between AI and past scholarly work is going to get weird," Mollick wrote. With over 100,000 views, the post resonated deeply in academic circles, where the implications of AI-augmented retrospective scholarship are only beginning to be understood. The experiment suggests that decades of published research may soon be systematically re-examined by models that can spot patterns, gaps, and errors invisible to human reviewers at the time of publication.

For software, the end result is what matters and code serves as a source of truth. For knowledge work, the process is at least as important as the outcome.

Ethan Mollick on why extending programming co-work patterns to all knowledge work is fundamentally problematic

GLM-5.2 pushes open-weight models past Gemini in coding utility

Roughly 200 days after Opus 4.5's release, open-weight models have their "very practically useful" moment, says Nathan Lambert (@natolambert).

AI researcher Nathan Lambert observed that open-weight models, led by GLM-5.2, have reached a milestone: they now outpace Gemini in practical coding harness scenarios. The moment comes roughly 200 days after the release of Opus 4.5, marking what Lambert calls the point where open models became "very practically useful" — not just in benchmarks but in real developer workflows. The shift challenges the narrative that frontier capability requires closed, proprietary systems. Lambert, who tested GLM-5.2 on Fireworks AI, reported setup took under five minutes and performance was "definitely solid" in Claude Code. Several testers confirmed GLM-5.2's exceptional strength in research-acceleration scenarios, with one analyst noting it may be "straightforwardly the best research accelerator" in some configurations.

The engineering path to a top-3 AGI lab: combine four existing technologies

A provocative post laid out a recipe: take DeepSeek V3.2's attention mechanism, Kimi K2's model shape at 1T parameters and 32B active, pretrain on 100 trillion tokens, and apply Zhipu AI's Slime post-training framework. "It was so easy," the author wrote, highlighting how the building blocks for frontier capability are now publicly described — even if the capital and organizational will to assemble them remain rare. The post reflects a broader mood in the AI engineering community: the knowledge gap between labs has narrowed, and execution is now the only moat. With Chinese open models proving unexpectedly competitive, the traditional economic model of AI services faces pressure from below.
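The scale implied by the recipe can be roughed out with a back-of-the-envelope calculation. The sketch below takes the three figures from the post (1T total parameters, 32B active, 100 trillion tokens) and applies the common rule of thumb that training costs about 6 FLOPs per active parameter per token. That rule is an assumption brought in here for illustration, not something the post states, and it ignores attention cost, hardware utilization, and post-training.

```python
# Figures from the post: 1T total parameters, 32B active, 100T tokens.
TOTAL_PARAMS = 1_000_000_000_000
ACTIVE_PARAMS = 32_000_000_000
TRAINING_TOKENS = 100_000_000_000_000

# Assumption (not from the post): training compute is roughly
# 6 FLOPs per active parameter per token (forward plus backward pass).
FLOPS_PER_PARAM_PER_TOKEN = 6


def training_flops(active_params: int, tokens: int) -> int:
    """Rule-of-thumb pretraining compute for a sparse model."""
    return FLOPS_PER_PARAM_PER_TOKEN * active_params * tokens


# Share of the model's parameters used for any single token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
flops = training_flops(ACTIVE_PARAMS, TRAINING_TOKENS)

print(f"active fraction per token: {active_fraction:.1%}")
print(f"estimated pretraining compute: {flops:.2e} FLOPs")
```

Under these assumptions the estimate lands on the order of 10^25 FLOPs, which illustrates the post's point: the recipe is public, but the compute bill is not small.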

Physical plausibility signals found hidden in frozen image encoder geometry

New research shared by Yann LeCun indicates that signals of physical plausibility — whether an object configuration makes sense in the real world — are implicitly encoded in the geometry of frozen image encoders. No video training. No physics supervision. The finding hints at a new path for building AI systems that understand physical reality without explicit physics simulation, potentially accelerating robotics and embodied AI research by tapping into representations already present in vision models.

Three early steps of recursive self-improvement, summarized

A widely shared summary of recursive self-improvement (RSI) in AI companies notes that Anthropic now has 80% of its code merged by AI systems, while other frontier labs are practicing similar patterns. The concept — AI systems building better AI systems — is no longer speculative; it is operational. The Turing Post piece framed three observable stages, from automated code generation through co-engineering to autonomous architecture design, that together form the early map of RSI in practice.

Rumor: DeepSeek employee departed over agent strategy disagreement

Sources claim that a senior DeepSeek team member left after failing to convince leadership to focus on agentic training earlier and more aggressively. The company is now reportedly "all in" on agentic training and custom harnesses, but the team is still being assembled, leading some to ask whether this represents a rare strategic misstep. The claim underscores the intensifying race around agents as the next frontier, and the internal tensions it can create within even the most successful AI labs.

MiniMax M3 agent navigates Unreal Engine 5.8 for 45 minutes

In a demo condensed to a one-minute video, the Hermes Agent 0.17 paired with MiniMax M3 operated autonomously in Unreal Engine 5.8's MCP environment for 45 minutes, consuming 10 million tokens. The demonstration showcases agent capability in complex, open-ended simulation environments — a test bed for the kind of long-horizon reasoning and spatial navigation that general-purpose AI agents will need.
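For a sense of scale, the run's reported totals can be turned into average rates. This is a minimal sketch using only the numbers above (10 million tokens, 45 minutes, a one-minute video); the variable names are illustrative. The demo does not say how the tokens split between input, output, and cached context, so the result is an average over all tokens consumed, not a generation speed.

```python
# Totals reported for the demo.
TOTAL_TOKENS = 10_000_000
RUN_MINUTES = 45
VIDEO_MINUTES = 1

run_seconds = RUN_MINUTES * 60

# Average consumption over the whole run (all token types combined).
tokens_per_second = TOTAL_TOKENS / run_seconds
tokens_per_minute = TOTAL_TOKENS / RUN_MINUTES

# How much the published video compresses the real session.
speedup = RUN_MINUTES / VIDEO_MINUTES

print(f"average tokens per second: {tokens_per_second:,.0f}")
print(f"average tokens per minute: {tokens_per_minute:,.0f}")
print(f"video time compression: {speedup:.0f}x")
```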

Briefing · Models & Infrastructure
EUROPE

Europe cannot rent its way to AI sovereignty

A report shared with frontier AI lab leadership argues that leasing infrastructure is not a path to sovereignty; Europe must build independently. The conclusion was presented this week to decision-makers across the continent.

FIRST IMPRESSION

GLM-5.2: five minutes to Claude Code on Fireworks AI

Nathan Lambert tested GLM-5.2 and found it took just five minutes to set up on Fireworks AI for use in Claude Code. "First impression is definitely that GLM is really solid," he reported.

ECONOMICS

DeepSeek called the "death threshold" for AI models

If your model is only marginally better than DeepSeek, you won't survive — because DeepSeek is insanely cheap. The observation has become a shorthand for the brutal economics of the open-weight era.

RESEARCH

LLMs' "understanding of the world" is a byproduct of language modeling

François Fleuret argues that any world knowledge in current LLMs is an implicit side effect, comparable to human reasoning about abstract mathematical objects that exist only through textual description.

INDUSTRY

Frontier labs spend billions on poets, musicians, and accountants for data annotation

A thread revealed that frontier AI labs are hiring domain experts — poets, musicians, accountants, consultants — at massive scale to annotate essays, slides, and spreadsheets. It is a "brute-force bet" that rich human expertise translates to better models.

OPINION

Chollet: AI adoption increases SaaS demand, not the opposite

"The more you embrace AI, the more you need SaaS," Keras author François Chollet argues. "This is not obvious to armchair analysts who love disruption narratives, but it is obvious to people actually running companies."

LLM step-by-step plans: reasoning or illusion?

A study shared by Yann LeCun suggests that when an LLM outputs a step-by-step plan, it creates a powerful illusion that the machine is reasoning — but the underlying mechanism may not constitute reasoning at all. The research challenges assumptions about chain-of-thought interpretability.

Pushing Nvidia Blackwell to its physical limits

A technical deep-dive explores the types of matrices encountered during training and the engineering required to push Blackwell GPUs to the edge of physical possibility. The post calls for further exploration of iterative refinement and mixed-precision strategies.

New architecture vs. standard Decoder Transformer: a proper comparison

François Fleuret shared a characteristically candid image comparing his new architecture against a vanilla decoder transformer after properly normalizing for FLOPs and memory. The post underscores the importance of fair benchmarking in architecture research.

Are all AI companies now recursive self-improvement companies?

Graham Neubig asks: if every company building AI coding agents or LLMs uses its own products to build its systems, doesn't that make all of them recursive self-improvement companies by definition? The question blurs the line between marketing category and operational reality.

Open-source Swift implementation of Codex Computer Use permission flow

The permiso project on GitHub implements the accessibility permission dialog interaction from Codex Computer Use in Swift. The open-source release provides a reference for developers building agent-driven desktop interaction patterns.

IIT Bombay and BharatGen join open AI construction

India's premier technical institute and the BharatGen initiative have joined Project Tapestry as founding contributors, signaling India's intent to build frontier AI rooted in its own languages and knowledge systems rather than merely adopting Western models.

AI Engineer World's Fair 2026 expands to 4x exhibition space

The annual gathering of AI engineers will be the largest yet, with quadruple the exhibition area and four dedicated pavilions, reflecting the rapid growth of the AI engineering discipline as a distinct professional field.

LlamaIndex: agents need a native document format

As AI agents generate an ever-growing volume of documents, a format purpose-built for agent consumption and generation is becoming critical. LlamaIndex has identified two main candidates for such an agent-native document format.

Images are wonderful memory banks; language is not

François Fleuret observes that the strong redundancy and compositionality of images make them naturally navigable memory structures, while language — which evolved on top of visual cognition — lacks these properties, creating fundamental challenges for text-only LLMs.

Korean creator uses AI video to advertise products that should exist

A Korean creator is using AI video generation to produce advertisements for conceptual products — items that do not yet exist but demonstrate the persuasive power of AI-generated marketing content. The project, @404product on Instagram, is gaining attention for its creative premise.

Hotelist: AI-rated hotels to beat review manipulation

A new platform called Hotelist uses AI to rate hotels with the explicit goal of eliminating paid reviews and manipulation. The founder personally reviews each property and plans to invite a trusted network of contributors to maintain integrity.

WeChat's missing markdown support draws ire in the AI era

A developer criticized WeChat for its refusal to support markdown in an era when AI-generated text — rich in formatting — is becoming ubiquitous. The complaint reflects broader friction between legacy platforms and the needs of AI-native workflows.


FAV0 · AI Daily — editorial selection by automated systems. All content attributed to original sources.