Vercel CEO: GLM-5.2's coding ability is "genuinely shocking"
Rauchg expresses deep admiration for Zhipu's latest model, believing it will change the competitive landscape. The open-weight model is now being compared to Opus 4.8, with many developers reporting it outperforms closed alternatives.
Vercel CEO Rauchg — one of the most respected voices in frontend infrastructure — took to X to express something between admiration and alarm. "Genuinely impressed, almost shocked, at how good GLM-5.2 by Zhipu is at coding," he wrote. "This changes things." The post garnered over 6,000 likes and more than 1.2 million views within hours, signaling that the open-weight model from the Chinese lab has crossed a critical threshold in developer credibility.
The endorsement is significant because Rauchg's company, Vercel, powers the frontends of many of the world's most demanding AI products. When the CEO of a platform that serves millions of developers publicly declares a model "changes things," the ecosystem listens. GLM-5.2 is reportedly the first open-weight model to match or exceed Opus 4.8 in real-world coding benchmarks, including the Vending Bench and Claude Code environments. Multiple independent testers have confirmed that it outperforms Opus 4.8 in agentic coding scenarios, with setup taking as little as five minutes on Fireworks AI.
Vercel ships Agentic Infrastructure: sandboxed AI agent runtime
The team optimized every frame of web performance — painting, layout, WebGPU shaders — and published a blog on infrastructure purpose-built for AI agents.
Vercel's engineering team conducted a deep optimization sweep, scrutinizing blocking scripts and WebGPU shader performance, and released their findings alongside a major infrastructure announcement. The blog introduces Vercel's new agentic infrastructure: designed for code isolation, long-running tasks with fault recovery, and workflow pause-and-resume. Use cases already include Notion handling millions of agent conversations daily and Zapier serving over five million site visits monthly through agent-driven workflows.
"Everything the light touches was optimized," Rauchg noted, referencing Vercel's homepage. The team plans to update their skills.sh platform with lessons learned from the optimization effort, which touched every frame rendered on screen.
MaineCoon: first social-interaction video model runs real-time on a single H100
22B parameters, 47.5 FPS, under $0.001 per second — François Chollet calls the inference specs "really impressive."
MaineCoon is the first video generation model to focus exclusively on social interactions: facial expressions, emotional nuance, fluid conversation pacing, and audio-lip synchronization. As François Chollet, creator of Keras, noted, the model achieves these results with striking efficiency — 22 billion parameters running at 47.5 frames per second on a single H100 GPU, at a cost below one-tenth of a cent per second of generated video. The model represents a new category: video synthesis built not for spectacle but for the subtleties of human interaction. Its real-time inference capability opens the door to live conversational agents, telepresence applications, and assistive interfaces where non-verbal cues are as important as words.
Anthropic CEO: competitors have overbuilt compute and can't profit from it
Dario Amodei explains in a podcast why Anthropic won't spend $300 billion to $1 trillion on compute — and hints that rivals may surrender their clusters voluntarily.
In a podcast appearance with Dwarkesh Patel, Anthropic CEO Dario Amodei offered a notably incomplete answer to the question of why his company won't pursue a compute spend in the hundreds of billions. Observers filled in the gap: the real reason, unspoken, is that competitors have overbuilt. "Our competitors will willingly surrender their clusters because they have overbuilt and can't generate revenue with their models," one analyst summarized. The comment points to a deepening divide in AI economics — between companies that build at any cost and those betting that capital efficiency, not brute scale, will determine the winner. Amodei's restraint may be less about caution and more about knowing something his peers are only beginning to discover.
GPT-5.5 Pro reanalyzes a 20-year-old paper: finds errors, discovers new data
Ethan Mollick gave GPT-5.5 Pro his first published paper from graduate school and asked it to find errors and update the research. The model found new data, analyzed it, created reproducible files, and extended the key argument beyond what the original paper had established. "The interaction between AI and past scholarly work is going to get weird," Mollick wrote. With over 100,000 views, the post resonated deeply in academic circles, where the implications of AI-augmented retrospective scholarship are only beginning to be understood. The experiment suggests that decades of published research may soon be systematically re-examined by models that can spot patterns, gaps, and errors invisible to human reviewers at the time of publication.
For software, the end result is what matters and code serves as a source of truth. For knowledge work, the process is at least as important as the outcome.
Ethan Mollick on why extending programming co-work patterns to all knowledge work is fundamentally problematic
GLM-5.2 pushes open-weight models past Gemini in coding utility
~200 days after Opus 4.5's release, open-weight models have their "very practically useful" moment, says Natolambert.
AI researcher Nathan Lambert observed that open-weight models, led by GLM-5.2, have reached a milestone: they now outpace Gemini in practical coding harness scenarios. The moment comes roughly 200 days after the release of Opus 4.5, marking what Lambert calls the point where open models became "very practically useful" — not just in benchmarks but in real developer workflows. The shift challenges the narrative that frontier capability requires closed, proprietary systems. Lambert, who tested GLM-5.2 on Fireworks AI, reported setup took under five minutes and performance was "definitely solid" in Claude Code. Several testers confirmed GLM-5.2's exceptional strength in research-acceleration scenarios, with one analyst noting it may be "straightforwardly the best research accelerator" in some configurations.
The engineering path to a top-3 AGI lab: combine four existing technologies
A provocative post laid out a recipe: take DeepSeek V3.2's attention mechanism, Kimi K2's model shape at 1T parameters and 32B active, pretrain on 100 trillion tokens, and apply Zhipu AI's Slime post-training framework. "It was so easy," the author wrote, highlighting how the building blocks for frontier capability are now publicly described — even if the capital and organizational will to assemble them remain rare. The post reflects a broader mood in the AI engineering community: the knowledge gap between labs has narrowed, and execution is now the only moat. With Chinese open models proving unexpectedly competitive, the traditional economic model of AI services faces pressure from below.
Physical plausibility signals found hidden in frozen image encoder geometry
New research shared by Yann LeCun indicates that signals of physical plausibility — whether an object configuration makes sense in the real world — are implicitly encoded in the geometry of frozen image encoders. No video training. No physics supervision. The finding hints at a new path for building AI systems that understand physical reality without explicit physics simulation, potentially accelerating robotics and embodied AI research by tapping into representations already present in vision models.
Three early steps of recursive self-improvement, summarized
A widely shared summary of recursive self-improvement (RSI) in AI companies notes that Anthropic now has 80% of its code merged by AI systems, while other frontier labs are practicing similar patterns. The concept — AI systems building better AI systems — is no longer speculative; it is operational. The Turing Post piece framed three observable stages, from automated code generation through co-engineering to autonomous architecture design, that together form the early map of RSI in practice.
Rumor: DeepSeek employee departed over Agent strategy disagreement
Sources claim that a senior DeepSeek team member left after failing to convince leadership to focus on agentic training earlier and more aggressively. The company is now reportedly "all in" on agentic training and custom harnesses, but the team is still being assembled, leading some to ask whether this represents a rare strategic misstep. The claim underscores the intensifying race around agents as the next frontier, and the internal tensions it can create within even the most successful AI labs.
MiniMax M3 agent navigates Unreal Engine 5.8 for 45 minutes
In a demo condensed to a one-minute video, the Hermes Agent 0.17 paired with MiniMax M3 operated autonomously in Unreal Engine 5.8's MCP environment for 45 minutes, consuming 10 million tokens. The demonstration showcases agent capability in complex, open-ended simulation environments — a test bed for the kind of long-horizon reasoning and spatial navigation that general-purpose AI agents will need.
Open-source AI leadership is a prerequisite for general AI leadership
HuggingFace co-founder Clement Delangue maps the China-US timeline: open-source leadership precedes and enables general AI dominance.
Clement Delangue, co-founder of HuggingFace, laid out a thesis that is reshaping how industry observers think about the AI race. The United States led open-source AI from 2016 to 2024, he argues, but China will lead it from 2024 to 2026 — and that transition matters enormously. "It's not open-source AI leadership OR general AI leadership," Delangue wrote. "It's open-source AI leadership BEFORE general AI leadership." The implication is clear: whoever controls the open ecosystem builds the developer community, the tooling, and the trust that eventually translates into general AI supremacy. The US, he predicts, will lead general AI from 2024 to 2027, benefiting massively — but the open-source phase is the necessary on-ramp.
Nvidia is a first-tier AI lab waiting to happen
A widely discussed take argues that Nvidia has the hardware, the capital, and the talent pipeline to become a top-tier AI lab. "Jensen has got to start poaching," the post urged, reflecting a growing belief that the GPU giant's position in the ecosystem gives it an unparalleled foundation for vertical integration into model development.
Europe cannot rent its way to AI sovereignty
A report shared with frontier AI lab leadership argues that leasing infrastructure is not a path to sovereignty; Europe must build independently. The conclusion was presented this week to decision-makers across the continent.
GLM-5.2: five minutes to Claude Code on Fireworks AI
Nathan Lambert tested GLM-5.2 and found it took just five minutes to set up on Fireworks AI for use in Claude Code. "First impression is definitely that GLM is really solid," he reported.
DeepSeek called the "death threshold" for AI models
If your model is only marginally better than DeepSeek, you won't survive — because DeepSeek is insanely cheap. The observation has become a shorthand for the brutal economics of the open-weight era.
LLMs' "understanding of the world" is a byproduct of language modeling
François Fleuret argues that any world knowledge in current LLMs is an implicit side effect, comparable to human reasoning about abstract mathematical objects that exist only through textual description.
Frontier labs spend billions on poets, musicians, and accountants for data annotation
A thread revealed that frontier AI labs are hiring domain experts — poets, musicians, accountants, consultants — at massive scale to annotate essays, slides, and spreadsheets. It is a "brute-force bet" that rich human expertise translates to better models.
Chollet: AI adoption increases SaaS demand, not the opposite
"The more you embrace AI, the more you need SaaS," Keras author François Chollet argues. "This is not obvious to armchair analysts who love disruption narratives, but it is obvious to people actually running companies."
LLM step-by-step plans: reasoning or illusion?
A study shared by Yann LeCun suggests that when an LLM outputs a step-by-step plan, it creates a powerful illusion that the machine is reasoning — but the underlying mechanism may not constitute reasoning at all. The research challenges assumptions about chain-of-thought interpretability.
Pushing Nvidia Blackwell to its physical limits
A technical deep-dive explores the types of matrices encountered during training and the engineering required to push Blackwell GPUs to the edge of physical possibility. The post calls for further exploration of iterative refinement and mixed-precision strategies.
New architecture vs. standard Decoder Transformer: a proper comparison
François Fleuret shared a characteristically candid image comparing his new architecture against a vanilla decoder transformer after properly normalizing for FLOPs and memory. The post underscores the importance of fair benchmarking in architecture research.
Are all AI companies now recursive self-improvement companies?
Graham Neubig asks: if every company building AI coding agents or LLMs uses its own products to build its systems, doesn't that make all of them recursive self-improvement companies by definition? The question blurs the line between marketing category and operational reality.
Open-source Swift implementation of Codex Computer Use permission flow
The permiso project on GitHub implements the accessibility permission dialog interaction from Codex Computer Use in Swift. The open-source release provides a reference for developers building agent-driven desktop interaction patterns.
IIT Bombay and BharatGen join open AI construction
India's premier technical institute and the BharatGen initiative have joined Project Tapestry as founding contributors, signaling India's intent to build frontier AI rooted in its own languages and knowledge systems rather than merely adopting Western models.
AI Engineer World's Fair 2026 expands to 4x exhibition space
The annual gathering of AI engineers will be the largest yet, with quadruple the exhibition area and four dedicated pavilions, reflecting the rapid growth of the AI engineering discipline as a distinct professional field.
LlamaIndex: agents need a native document format
As AI agents generate an ever-growing volume of documents, a format purpose-built for agent consumption and generation is becoming critical. LlamaIndex has identified two main candidates for such an agent-native document format.
Images are wonderful memory banks; language is not
François Fleuret observes that the strong redundancy and compositionality of images make them naturally navigable memory structures, while language — which evolved on top of visual cognition — lacks these properties, creating fundamental challenges for text-only LLMs.
Korean creator uses AI video to advertise products that should exist
A Korean creator is using AI video generation to produce advertisements for conceptual products — items that do not yet exist but demonstrate the persuasive power of AI-generated marketing content. The project, @404product on Instagram, is gaining attention for its creative premise.
Hotelist: AI-rated hotels to beat review manipulation
A new platform called Hotelist uses AI to rate hotels with the explicit goal of eliminating paid reviews and manipulation. The founder personally reviews each property and plans to invite a trusted network of contributors to maintain integrity.
WeChat's missing markdown support draws ire in the AI era
A developer criticized WeChat for its refusal to support markdown in an era when AI-generated text — rich in formatting — is becoming ubiquitous. The complaint reflects broader friction between legacy platforms and the needs of AI-native workflows.