Beyond Language Modeling: A New Framework for Multimodal Pretraining
A new empirical framework for native multimodal pretraining examines how representation learning, data mixture, architectural design, and scalability interact. The work moves past simple text-to-image alignment and asks what constitutes a genuinely multimodal foundation: one where vision and language are co-learned from the start rather than glued together post hoc. Authors John Nguyen and David Fan will present the spotlight paper at ICML, with accompanying materials available at beyond-llms.github.io. The research offers empirical guidance on data construction strategies, model architecture choices, and scaling laws for multimodal systems, a timely contribution as the industry shifts from language-only models toward integrated perception and reasoning.
vLLM Cuts DeepSeek V4 Token Cost by 5x in One Month
The vLLM community cut the token-serving cost of DeepSeek V4 fivefold within a single month through a sustained optimization campaign spanning kernel rewrites, scheduler improvements, and serving-stack tuning. Day-zero integration recipes gave way to deep performance work, with successive pull requests chipping away at the cost curve. A 5x reduction means each token now costs one fifth of the starting figure, an 80% saving that matters for every startup and enterprise running DeepSeek V4 in production. The trajectory suggests that a focused open-source inference community can compress costs quickly, and for the broader ecosystem it supports community-driven inference optimization as a credible alternative to vertically integrated serving stacks.
GLM 5.2 Becomes First Open-Source Model to Lead APEX-SWE
GLM 5.2 scored 55.3% Pass@1 on the APEX-SWE integration category, making it the first open-source model to top that benchmark for software engineering evaluation. The result puts open-weight models on competitive footing with proprietary systems in code-centric reasoning tasks.
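For readers unfamiliar with the metric, the sketch below shows how a Pass@1 score is commonly computed: the share of tasks whose first generated solution passes all checks. It assumes that common definition; the function name and the sample results are illustrative and are not APEX-SWE data or its official scoring code.

```python
def pass_at_1(first_attempt_passed: list[bool]) -> float:
    """Fraction of tasks whose first generated solution passed all checks."""
    if not first_attempt_passed:
        raise ValueError("need at least one task result")
    return sum(first_attempt_passed) / len(first_attempt_passed)


# Hypothetical outcomes for 8 tasks (illustrative only, not benchmark data).
results = [True, False, True, True, False, False, True, False]
score = pass_at_1(results)
print(f"Pass@1 = {score:.1%}")  # 4 of 8 tasks pass, so this prints "Pass@1 = 50.0%"
```

Under this definition, a 55.3% score would mean that a little over half of the tasks were solved on the first try.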
Claude API Rate Limits Increased 5x, Tiers Simplified
Claude Platform API raised rate limits up to 5x at the highest tier and decoupled tiers from API spend. The latest Sonnet and Haiku models benefit immediately from the new structure.
Eventually, much of AI will converge towards intuition-guided symbolic world modeling — deep learning-guided program synthesis. It is inevitable. Symbolic modeling lets a system construct a compact, reusable, highly generalizable mental model of a problem space using minimal data.
François Chollet
CMU Launches AI Agents Course This Fall
Carnegie Mellon University is offering a new course on AI Agents, taught by Graham Neubig. Students will learn to build scaffolds, design evaluations, and train agentic LLMs using reinforcement learning, balancing theory with hands-on practice using modern frameworks.
Meta Releases Autodata Framework for High-Quality Training Data
Meta introduced Autodata, a framework that automates the generation of high-quality training data. The system targets one of the most persistent bottlenecks in frontier AI development: the scarcity of clean, diverse supervised data at scale.
vLLM Removes PagedAttention Module
Core vLLM developers deleted the PagedAttention module from the framework, marking a significant architectural evolution. The change reflects how rapidly the inference serving landscape is advancing beyond its original design assumptions.
Sakana AI Establishes Recursive Self-Improvement Lab in Tokyo
Sakana AI launched its RSI Lab, targeting autonomous optimization loops that evolve from human-driven R&D toward self-improving intelligence engines. The Tokyo-based lab is actively hiring program managers to scale its recursive self-improvement research program.
Vercel Positions AI Gateway as Token CDN
Vercel CEO Guillermo Rauch described the AI Gateway as a Content Token Delivery Network — an AI model CDN that supports dynamic routing and traffic rejection without redeployment. When Fable was suddenly retired, the gateway absorbed the impact on production workloads.
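The routing idea can be illustrated with a minimal fallback sketch. This is not Vercel's AI Gateway API: the route function, the ModelUnavailable exception, and the two stand-in backends are all hypothetical, and the code only shows how a priority-ordered routing table lets requests survive the loss of one model.

```python
from typing import Callable


class ModelUnavailable(Exception):
    """Raised by a backend that can no longer serve requests (hypothetical)."""


Backend = Callable[[str], str]


def route(prompt: str, backends: list[tuple[str, Backend]]) -> tuple[str, str]:
    """Try each backend in priority order; return (name, reply) from the first that answers."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def retired_model(prompt: str) -> str:
    # Stand-in for a model that has been withdrawn.
    raise ModelUnavailable("model retired")


def fallback_model(prompt: str) -> str:
    # Stand-in for a healthy backend; it just echoes the prompt.
    return f"echo: {prompt}"


# The routing table is plain data, so it could be swapped at runtime without a redeploy.
ROUTES = [("primary", retired_model), ("fallback", fallback_model)]

name, reply = route("hello", ROUTES)
print(name, reply)  # prints "fallback echo: hello"
```

Because the table is data and not code, an operator could reorder or drop entries while the service keeps running, which is the property the CDN analogy points at.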
Fable 5 Returns to Replit with High-Effort Mode
Replit brought Fable 5 back online for longer, more complex coding projects. Users can toggle High-effort mode in Replit Agent for the toughest builds and for sustained autonomous tasks.
GLM 5.2 DSpark Preview: First Non-DeepSeek Speculator
RedHatAI released the GLM-5.2-speculator.dspark-preview on Hugging Face — the first DSpark speculator built for a non-DeepSeek frontier model, extending speculative decoding to a new model family.
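As background on what a speculator does, here is a toy sketch of one round of greedy speculative decoding: a cheap draft model proposes several tokens and the larger target model keeps the longest prefix it agrees with. It assumes the standard greedy variant; DSpark's actual method is not described here, and the function names and the two toy models are hypothetical.

```python
from typing import Callable, Sequence

NextToken = Callable[[Sequence[int]], int]


def speculative_step(prefix: list[int], draft: NextToken, target: NextToken, k: int) -> list[int]:
    """Run one round of greedy speculative decoding and return the newly accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: list[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target model checks each proposal in order.
    #    (A real system scores all k positions in a single batched forward pass.)
    accepted: list[int] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # keep the target's token at the first mismatch, then stop
            return accepted
        accepted.append(tok)

    # 3. Every proposal matched, so the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


def toy_target(seq: Sequence[int]) -> int:
    # Toy "large" model: counts upward modulo 10.
    return (seq[-1] + 1) % 10


def toy_draft(seq: Sequence[int]) -> int:
    # Toy "small" model: agrees with the target except right after token 3.
    return 0 if seq[-1] == 3 else (seq[-1] + 1) % 10


print(speculative_step([1], toy_draft, toy_target, k=4))  # prints [2, 3, 4]
```

In this run the draft is right twice and wrong once, so three tokens are produced from a single verification round, and the output matches what the target alone would have generated.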
OpenAI Proposes Giving 5% Stake to US Government
OpenAI is reportedly exploring a plan to transfer 5% of its equity to the US government, aiming to give ordinary citizens a share of the AI dividend. The $852 billion startup's proposal would be unprecedented in scale and structure for a private technology company.
Rampart PII Removal Model Tops Hugging Face Trending
The Rampart model, built by ND Studio and the White House for PII removal and token classification, reached number one on Hugging Face trending. Clement Delangue noted it as evidence that public organizations should own their model weights rather than renting from API providers.
CS2-10k: 600,000+ Gameplay Videos Released on Hugging Face
Reka Labs published CS2-10k, containing over 600,000 egocentric gameplay videos spanning 10,000+ hours. Every frame is paired with text captions, providing rich multimodal training material for vision-language models.
80TB Astrophysics Dataset Quietly Arrives on Hugging Face
A massive 80TB dataset compiled from over 30 astrophysics sources appeared on Hugging Face, part of what Thom Wolf describes as a weekly mega-release pattern in AI-driven science. The dataset spans multiple observation modalities and research institutions.
Eve: A Next.js-Style Framework for Building Agents
Evedev released eve, described as "Next.js for agents" — a single-folder framework for building agents that are durable by default, with persistent state management and streamlined deployment patterns.
Kling AI Ad Film Wins Bronze Lion at Cannes
The AI-generated short film "Lorem Ipsum," produced by Argentine studio Purga Films using Kling AI, won a Bronze Lion at Cannes Lions in the Film B2B category — a milestone for AI-assisted creative production.
PixVerse Seedance 2.0 Converts Motion Reference to 4K Cinematic Scenes
PixVerse showcases Seedance 2.0 transforming raw motion capture and single reference images into stylized 4K animated sequences, preserving details like cape physics, landing weight, and environmental consistency across shots.
Agentic Kernel Optimization: The Future of On-Device Inference
Google Gemma's team declared that agentic kernel optimization is the future of on-device inference. Xenovac used Fable 5 to author kernels that pushed Gemma inference performance on edge hardware.
TogetherCompute Secures Series C
TogetherCompute completed its Series C round, with congratulations from MiniMax and industry peers.
Speech-to-Text Goes Live in Grok Build
Users can now dictate prompts directly to coding agents using Grok's new voice input feature.
Nuclear Startup Valar Powers NVIDIA Spark
Valar Atomics became the first nuclear startup to generate electricity, successfully powering an NVIDIA Spark.
Qualcomm Expands AI Collaboration with Hugging Face
Qualcomm and Hugging Face deepened their open-source developer AI partnership across model onboarding.
GLiNER2 PII Filter Hits 55k Downloads
The fastino/gliner2-privacy-filter-PII-multi model reached 55,000 downloads in six weeks on Hugging Face.
Q3 Drama Integrated into Anishort Platform
Vidu Q3 Drama now supports consistent character identity, 1080P visuals, and native audio-video sync.
Anthropic Hosts Life Sciences Hackathon
Anthropic and Gladstone Institutes launched "Built with Claude: Life Sciences," a global virtual hackathon.
Laguna XS 2.1 Lands on SGLang with Day-Zero Support
Poolside AI's 33B MoE model for agentic code got day-zero support on the SGLang inference framework.