Baidu Unlimited-OCR Parses Entire Books Without Zoom
Now integrated with vLLM and powered by Reference Sliding Window Attention, Baidu Unlimited-OCR keeps its KV cache at a constant size while parsing a full book, avoiding memory bloat at any output length.

Baidu's Unlimited-OCR joins the vLLM ecosystem with a notable capability for long-document processing: one-shot parsing of entire books. The core innovation, Reference Sliding Window Attention (R-SWA), keeps the KV cache at a fixed size regardless of how many pages are processed, so memory use and decoding speed do not degrade as the output grows. This matters most for academic and legal document processing, where entire volumes need to be digitized in a single pass. The vLLM integration also gives the model the framework's optimized inference pipeline, which makes large-scale OCR deployment more practical.
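To make the "fixed KV cache" idea concrete, here is a minimal sketch of a bounded cache. The summary above does not describe how R-SWA works internally, so everything here is an assumption for illustration: the class name `BoundedKVCache`, the split into pinned "reference" entries plus a sliding window of recent tokens, and the sizes used are all hypothetical and are not Baidu's implementation.

```python
from collections import deque


class BoundedKVCache:
    """Toy KV cache: pinned reference entries plus a sliding window.

    Hypothetical illustration only; not the actual R-SWA implementation.
    """

    def __init__(self, reference_entries, window_size):
        # Assumption: reference entries (e.g. encoded page tokens) are never evicted.
        self.reference = list(reference_entries)
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self.window = deque(maxlen=window_size)

    def append(self, kv_entry):
        # Add the newest generated token's entry; may evict the oldest one.
        self.window.append(kv_entry)

    def visible_entries(self):
        # What a new token could attend over: reference first, then recent tokens.
        return self.reference + list(self.window)

    def __len__(self):
        return len(self.reference) + len(self.window)


def simulate_decode(num_steps, num_reference=4, window_size=8):
    """Run a fake decoding loop and record the cache size after each step."""
    cache = BoundedKVCache([f"ref{i}" for i in range(num_reference)], window_size)
    sizes = []
    for step in range(num_steps):
        cache.append(f"tok{step}")
        sizes.append(len(cache))
    return cache, sizes


if __name__ == "__main__":
    cache, sizes = simulate_decode(1000)
    print("largest cache size seen:", max(sizes))
```

In this toy version the cache never holds more than `num_reference + window_size` entries, however many steps are decoded, which is the property the summary attributes to R-SWA.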

Seedance 2.0 Enables Precision 3D-to-Video Motion Control
PixVerse's Seedance 2.0 locks motion and camera paths using 3D channels, generating precise, high-speed racing car videos without relying on text prompts for motion cues. The 3D pass ensures consistent frame-to-frame results for cinematic sequences.
Clement Calls for Regulating Frontier API Models, Not Open-Source AI
HuggingFace CEO Clement Delangue argues that the most dangerous AI systems today are large frontier LLM APIs, not open models. His proposal recommends increasing government transparency requirements for API-based frontier models while keeping open-source AI untouched. The distinction matters because open-source models have fundamentally different risk profiles and distribution dynamics.
"Getting regulated by a government because your model is 'too dangerous' is the best marketing, especially for enterprise sales. Everyone is trying to get it now."
— Clement Delangue, CEO of HuggingFace
DeepSpec: DeepSeek Curated Model & Data Collection
deepseek-ai releases DeepSpec, a curated collection of models, datasets, papers, and Spaces on HuggingFace, providing the community with centralized access to their research artifacts and resources.
Emollick: GLM-5.2 Solid but Not Frontier; Open Weights Closing the Gap
GLM-5.2 is good but still behind GPT-5.5 and Opus 4.8, according to AI researcher Emollick. Crucially, open-source models have now crossed into GPT-5.2 capability territory — a milestone showing open weights are genuinely chasing the frontier.
Anthropic Launches Claude Tag: @Claude in Slack Channels
Claude Tag (Beta) lets Team and Enterprise users @Claude in Slack channels like a colleague. Admins pre-configure which channels, tools, data sources, and code repositories Claude can access, enabling anyone in the channel to assign tasks directly.

DeepSeek Accelerates Models via DSpark; Gemma4-12B May Be Best Local Model
DeepSeek has released accelerated models through DSpark, with Gemma4-12B as the standout. The Gemma4-12B release may include vision capabilities and could become the best local model in its weight class by a significant margin. Qwen 3.5 is excluded because DSpark lacks linear attention support.
Repo Prompt Open-Sourced After Creator Joins OpenAI
The community edition of Repo Prompt is now open-source on GitHub. After OpenAI's developer experience lead invited its creator, Provencher, to join the team, he first made the project free for paying users, then fully open-sourced it as a condition of accepting the role.
Goodside Reflects: LLMs Shattered the Expected AI Development Path
AI researcher Riley Goodside recalls believing for 25 years that AI would emerge through recursive self-improvement via reinforcement learning, not chatbots. He viewed the old analogy — making AI through chatbots is like making real flowers by sculpting wax — as essentially correct. The LLM era overturned that expectation, proving language-first approaches could achieve what iterative self-play could not.
Analysis Infers DeepSeek Profit Margins at 40-60%
TeortaxesTex infers from DSpark data that DeepSeek's margins fall between 40% and 60%, and potentially exceed 80% on GB300 hardware, challenging the common assumption that AI inference services operate at a loss.
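For readers who want to see the arithmetic behind a margin figure like this, the sketch below computes gross margin as (revenue - cost) / revenue for a single GPU-hour. The price, throughput, and hardware cost are placeholder values chosen only to make the calculation easy to follow; they are not DeepSeek's numbers and are not taken from the analysis above.

```python
def gross_margin(revenue, cost):
    """Gross margin as a fraction of revenue: (revenue - cost) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost) / revenue


def margin_per_gpu_hour(price_per_million_tokens, million_tokens_per_hour, gpu_cost_per_hour):
    """Margin for one GPU-hour of serving, given hypothetical inputs."""
    revenue = price_per_million_tokens * million_tokens_per_hour
    return gross_margin(revenue, gpu_cost_per_hour)


if __name__ == "__main__":
    # Placeholder inputs (hypothetical): $2.00 per million tokens,
    # 2.5 million tokens served per GPU-hour, $2.00 per GPU-hour.
    margin = margin_per_gpu_hour(2.0, 2.5, 2.0)
    print(f"gross margin: {margin:.0%}")
```

With the same function, faster hardware shows up as a higher `million_tokens_per_hour` for a given hourly cost, which is the kind of effect the GB300 remark points to.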
fofrAI: Gemini 3.5 Flash Is an Excellent Workhorse for Sub-Agents
AI developer fofrAI praises Gemini 3.5 Flash as determined, fast, and perfect for sub-agents, calling it the go-to workhorse for everyday AI tasks.
Model Routers Underestimate Difficulty of Non-Math Tasks
All model routers undervalue non-verifiable tasks like innovation, marketing, and qualitative analysis, assigning too little intelligence where smarter AI models matter most.
Adobe Firefly Boards Integrated Directly into Photoshop
Adobe Firefly Boards is now integrated into Photoshop, allowing users to collaborate and create using the Firefly AI canvas directly inside the design tool.
Emollick Questions Whether Gemini 3.5 Pro Is Export-Controlled
Emollick asks whether Gemini 3.5 Pro is subject to export controls, a question that points to the unclear regulatory boundaries for frontier AI systems.
Bangalore Auto Driver Shares How ChatGPT Helps Daily Life
An auto-rickshaw driver in Bengaluru uses ChatGPT for daily tasks, praising its ability to remember all queries and link conversations from four months ago. OpenAI has responded.
DiffusionBench: Comprehensive Benchmark for Diffusion Transformers
A new paper introduces DiffusionBench for holistic evaluation of diffusion transformers, providing an important reference for the research community.
Clement Delangue: AI's Future Is Multi-Model and Open-Source
HuggingFace CEO believes the future of AI involves multiple models, the majority of which will be open-source, facilitated by platforms like HuggingFace.
China's Top AI Researchers: More Output, Sharply Lower Retention
A Marco Polo snapshot based on NeurIPS and ICML 2024 and ICLR 2025 data shows China producing more top researchers while its retention of them plummets. The US advantage over China is 5.5x.
Baidu Unlimited-OCR Tops HuggingFace Trending Models
Baidu Unlimited-OCR has become the number one trending model on HuggingFace, drawing community attention for its ability to parse entire books in a single pass.

DeepSeek DSpark Speed Uplift Raises Questions About a Secret Upgrade
TeortaxesTex notes that while DeepSeek claims DSpark has been serving since May 8, the speed has clearly jumped in the last few days. The sudden improvement suggests they may be testing a better version behind the scenes, raising speculation about what comes next.

TeortaxesTex on Miscalibrated Cynicism About AI Profitability
TeortaxesTex argues that the problem is not that people fail to do the math, but that their cynicism is miscalibrated: too cynical to believe in viable subsidies, yet not cynical enough to trust that 90%+ margins are real and achievable.

VentureTwins: The Age of Hyper-Personalized AI Advertising Has Begun
Commentary notes that the era of hyper-personalized advertising has arrived, implying AI-driven customized ad generation is rapidly reshaping the consumer landscape.
VISReg: A New Variance-Invariance-Sketch Regularization Method for JEPA
A new paper proposes VISReg, a variance-invariance-sketch regularization technique for JEPA training, aiming to improve self-supervised learning performance through novel regularization.
Emollick Praises Growing Benefits of Open Science for AI Papers
AI researcher Emollick notes that open science and transparent methodologies bring increasing benefits when writing AI papers, with reproducibility and validation becoming more valuable.
MiniMax and Cysic Hackathon Winners Announced
Winners of the MiniMax AI and Cysic co-hosted hackathon are announced, showcasing projects built on the M3 model.
Nathan Lambert: Open-Source Model Diversity Brings Hope
AI researcher Nathan Lambert says the diversity of companies building open-source models is encouraging, even though their stories are often overshadowed by frontier models.
Mental Models for Benchmark Difficulty and Eval Design
Researcher CWolfe shares thoughts on benchmark difficulty, noting that evaluation design often relies on greedy heuristics and needs more systematic approaches.
SF AI Engineer Week Kicks Off at Moscone Center
AI Engineer Week starts in San Francisco with badge pick-up and new engineer orientation from 5-9 PM today at Moscone Center.
Emollick Jokes: What Model Is OpenAI Saving GPT-6 For?
Given the naming jumps in recent releases, Emollick asks which model name OpenAI is reserving for GPT-6.
TeortaxesTex Corrects: 'Duibiao' Means Head-to-Head, Not Imitate
After correction from a native speaker, TeortaxesTex notes 'duibiao' correctly means competing head-on, making him optimistic about the benchmark project.
Nathan Lambert Warns of 'Vibe Regulation' Consequences
Nathan Lambert warns that vibe regulation of frontier models has real and frightening consequences for the AI industry.
Arohan: Kernel Debates Will Drive Better Linear Algebra
AI researcher Arohan predicts that current debates in the field will lead to significantly better linear algebra kernels.
Midjourney 8.1 HD Mode Fixes Hand Generation Details
Hand generation is still problematic in Midjourney 8.1 standard mode, but an HD rerun improves hands significantly by regenerating detail rather than simply upscaling.
VentureTwins: No Surprise AI-Generated Junk Writing Is Popular
Commentary argues it is unnecessary to be shocked by AI junk writing trending, as popular books already reflect a garbage culture.
Oran_ge on Programmer Choices After AI Boosts Code Efficiency
Oran_ge asks: if code efficiency improves 10x, should programmers write 100x more code or spend the time saved on scarce, important tasks? And which will their bosses expect?
TeortaxesTex Questions GLM 5.2 Evaluation Authenticity
TeortaxesTex questions whether GLM 5.2-Cyber evaluation results are genuine or from an unrepresentative benchmark.
Clement Jokes: Government Regulation Is Best Marketing
HuggingFace CEO Clement Delangue jokes that being regulated by the government for being 'too dangerous' is the best marketing for selling to enterprises.
TeortaxesTex: High-Flyer's Charitable Donations Not One-Off
TeortaxesTex notes that charitable donations by High-Flyer (DeepSeek's parent) are not sporadic: Liang Wenfeng (alias 'Ordinary Piggy') donated 138 million RMB, which shows continuity.