Anthropic Automates 95% of Business Analytics with Claude
Anthropic shared best practices for automating 95% of business analytics queries using Claude, achieving approximately 95% accuracy. The core method involves building a strong data foundation to reduce ambiguity, establishing knowledge sources that map user questions to controlled entities, and developing skill modules as on-demand markdown readouts. Without skills, accuracy drops to just 21%. Once integrated, accuracy stabilizes above 95%. The approach frees data science teams from repetitive queries, letting them focus on causal modeling, forecasting, and other strategic work that demands deeper analytical rigor rather than routine report generation.
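The item above describes two pieces: a knowledge source that maps user wording to controlled entities, and skill notes written in markdown that are read only when needed. A minimal sketch of that idea follows. Every name, synonym, and note in it is hypothetical and chosen for illustration; it is not Anthropic's implementation.

```python
import re

# Hypothetical knowledge source: user wording -> controlled entity name.
ENTITY_SYNONYMS = {
    "revenue": "net_revenue",
    "sales": "net_revenue",
    "churn": "customer_churn_rate",
    "signups": "new_accounts",
}

# Hypothetical skill notes: one markdown readout per entity, read on demand.
SKILLS = {
    "net_revenue": "# Net revenue\nUse the finance ledger table; exclude refunds.",
    "customer_churn_rate": "# Churn\nCancelled accounts over accounts active at period start.",
}


def resolve_entities(question: str) -> list[str]:
    """Map words in the question to controlled entities, keeping first-seen order."""
    entities: list[str] = []
    for word in re.findall(r"[a-z]+", question.lower()):
        entity = ENTITY_SYNONYMS.get(word)
        if entity and entity not in entities:
            entities.append(entity)
    return entities


def load_skills(entities: list[str]) -> list[str]:
    """Return only the markdown notes that match the resolved entities."""
    return [SKILLS[e] for e in entities if e in SKILLS]


def build_context(question: str) -> str:
    """Assemble the text that would accompany the question: skill notes first."""
    notes = load_skills(resolve_entities(question))
    return "\n\n".join(notes + [f"Question: {question}"])
```

Calling build_context("How did sales and churn change?") would place the two matching notes ahead of the question, while a question that mentions no known entity passes through with no notes attached.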
NVIDIA Cosmos 3 Tops 7 Physical AI Benchmarks
NVIDIA Cosmos 3, the open omni-model for physical AI, now ranks first across seven leaderboards covering world generation, robot action policy, and industrial vision understanding. The benchmarks include Artificial Analysis, PAI-Bench, Physics-IQ, and R-Bench, confirming Cosmos 3's position as the leading open model for physical AI applications spanning manufacturing, logistics, and autonomous systems.
The US should lead on AI by continuing to develop the very best models, making sure they are safe, and getting cyber tools into the hands of trusted defenders. The new EO gets the balance right.
Sam Altman, CEO of OpenAI
Ideogram 4.0: The World's Best Open Image Model
Ideogram 4.0 has been released with open weights, supporting download, fine-tuning, and deployment on user-owned hardware. The model is available across all Ideogram plans and through the API. With strong community reception and integration on Hugging Face as ideogram-4-nf4, the release signals growing momentum for open-weight image generation models that rival proprietary alternatives. Users can generate high-resolution images with precise text rendering and exceptional design capabilities, making it suitable for brand-level creative work without vendor lock-in.
Anthropic Assesses AI Cyber Attack Defense Effectiveness
Anthropic released a safety report analyzing 832 malicious accounts, mapping activity to a long-term threat database, and evaluating the effectiveness of existing defensive techniques against AI-powered cyber attacks.
xAI Launches Grok Imagine 1.5 Preview
xAI introduces Grok Imagine 1.5 preview, offering image, video, and audio generation via API, emphasizing quality, speed, and cost optimization for developers.
MiniMax M3 Achieves 15.6x Decode Speedup with 1M Context
MiniMax M3, accelerated by Fireworks AI, delivers a 15.6x decoding speedup at 1M-token context, supporting cutting-edge coding and agent performance.
Regional Leaders Use NVIDIA Nemotron for Sovereign AI
At NVIDIA GTC Taipei, regional AI leaders build localized datasets, sovereign AI models, and agent applications tailored to local languages and economies using NVIDIA Nemotron.
NVIDIA Publishes Three Physical AI Papers at CVPR 2026
NVIDIA Research presented three papers at CVPR 2026, including GraspGen-X, the first zero-shot grasping foundation model trained on billions of simulated grasps.
Gemma 4 12B Now Available on Ollama
Google DeepMind's Gemma 4 12B model is now accessible via Ollama, supporting chat, Hermes Agent, and Claude Code launch modes for local experimentation.
Google Releases Gemma 4 12B: Laptop-Runnable Multimodal
Google released Gemma 4 12B under Apache 2.0, a dense model with encoder-free unified multimodal architecture supporting image input, tool use, and reasoning on LM Studio.
Perplexity Personal Computer Coming to Windows
Perplexity announced Personal Computer for Windows, running locally and orchestrating common apps and files, initially available to Max and Enterprise Max subscribers.
Reve 2.0 Ranks Second on Text-to-Image Leaderboard
Reve 2.0 ranks second on the arena leaderboard, behind only GPT Image 2 and ahead of Nano Banana Pro, Microsoft, and xAI, improving 125 points in 3 months.
Uber Caps Coding Agent Spending at $1,500 per Employee per Month
Uber has set a $1,500 monthly token cap per employee for each AI coding tool including Cursor and Claude Code. With engineers typically using two tools, the annual cap reaches approximately $36,000, roughly 11% of the median engineering salary of $330,000. The policy reflects enterprise efforts to manage AI tool budgets while still providing broad access. Industry analysts view the cap as a practical signal of how large organizations value coding agent productivity, moving beyond per-seat licensing toward consumption-based governance. The move follows a period of rapid adoption where uncontrolled token expenditure had begun outpacing anticipated returns.
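The arithmetic behind the annual figure and the salary share can be checked in a few lines. The numbers come from the item above; the variable and function names are illustrative only.

```python
# Figures quoted in the item above; names are illustrative.
MONTHLY_CAP_PER_TOOL_USD = 1_500
TOOLS_PER_ENGINEER = 2  # "engineers typically using two tools"
MEDIAN_SALARY_USD = 330_000


def annual_cap(monthly_cap: int, tools: int) -> int:
    """Yearly spending ceiling across all capped tools."""
    return monthly_cap * tools * 12


def share_of_salary(annual: int, salary: int) -> float:
    """Annual cap expressed as a fraction of salary."""
    return annual / salary


cap = annual_cap(MONTHLY_CAP_PER_TOOL_USD, TOOLS_PER_ENGINEER)
share = share_of_salary(cap, MEDIAN_SALARY_USD)
print(f"annual cap: ${cap:,}")          # annual cap: $36,000
print(f"share of salary: {share:.1%}")  # share of salary: 10.9%
```

The 10.9% result rounds to the "roughly 11%" quoted above.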
Replit Launches ViBench: First End-to-End App Creation Benchmark
Replit CEO Amjad Masad unveiled ViBench, the first benchmark designed for evaluating end-to-end application creation based on real-world tasks. The benchmark reveals that while GPT-5.5 leads on standard SWE benchmarks, Anthropic's Opus 4.8 remains the king of vibe coding on both price and performance when measured by complete app generation. ViBench tests models across the full development lifecycle including UI design, backend logic, database integration, and deployment readiness, filling a gap in current evaluation frameworks that focus narrowly on isolated coding tasks rather than holistic application building capability.
Claude Mythos Reaches METR 80% Task Duration Prediction
Superforecasters predicted that the METR 80% task duration would reach 3-4 hours by the end of the year; as of late May, Claude Mythos had already reached that figure, months ahead of expert forecasts.
Intel AutoRound Integrated into vLLM-Omni, Supports 4-bit
Intel's AutoRound post-training quantization is natively integrated into vLLM-Omni, reducing Qwen3-Omni-30B from 66GB to 25GB with no quality loss, quantizing once offline and serving at full fidelity.
Jensen Huang Details Enterprise Agent Technology Stack
NVIDIA CEO Jensen Huang outlined the enterprise agent stack: models, orchestration, tools plus skills, and secure runtime, introducing the NVIDIA agent toolkit for enterprise deployments.
v0 Launches Snowflake Integration in Public Beta
v0's Snowflake integration enters public beta, allowing users to generate visual dashboards from Snowflake data using prompts, connecting business intelligence with AI-driven frontend generation.
Perplexity Computer Connects 400+ Tools for Business
Perplexity Computer supports integration with over 400 tools including QuickBooks, Vercel, and Shopify, helping growing businesses orchestrate operations through a unified AI interface.
vLLM Day-0 Support for Gemma 4 12B
vLLM announced immediate support for Google Gemma 4 12B, an encoder-free unified multimodal model with 256K context, ready for production serving on day zero.
NVIDIA Upgrades Local AI Agents on DGX Spark and RTX
NVIDIA announced local AI agent upgrades on DGX Spark and RTX PC, including NVIDIA OpenShell on Windows, NVIDIA Broadcast 2.2, and RTX acceleration for Adobe and Blender.
NVIDIA and Microsoft Showcase Agentic AI at MSBuild
Jensen Huang and Satya Nadella demonstrated at MSBuild how NVIDIA and Microsoft are jointly building agentic AI from Windows devices to AI factories at scale.
Grok Model Now Available on Cloudflare AI Gateway
xAI's Grok model is now accessible via Cloudflare AI Gateway, enabling developers to integrate Grok capabilities with the gateway's observability and routing features.
MiniMax M3 Launches on SiliconFlow with 50% Discount
MiniMax M3 launches on SiliconFlow with 50% off during launch week, offering cutting-edge coding, 1M context window, and native multimodal capabilities with open weights.
Pichai: Gemma 4 12B Balances Size and Performance
Google CEO Sundar Pichai praised Gemma 4 12B for its laptop-friendly size enabling powerful multi-step reasoning and agentic workflows, excited for community adoption.
VSTAT Benchmark Released for Visual State Tracking
NYU's Saining Xie released the VSTAT benchmark and described visual state tracking, the task of constructing internal world states from partial and noisy observations, as the grand challenge for vision in the coming years.
Consumer Hardware Local LLM Ecosystem Adds 4 Models
Sebastian Raschka noted four solid additions to the open-source local LLM ecosystem that run on consumer hardware, continuing the trend toward accessible on-device AI.
Subconscious Learning Arises from Steering Vectors
Neel Nanda's research found that most interesting LLM phenomena, including subconscious learning, can be reduced to adding steering vectors, providing an elegant interpretability explanation.
Vercel CEO: v0 + Snowflake is the Killer App for Coding AI
Guillermo Rauch stated that generating frontends on business data is the killer app for coding AI, with v0 and Snowflake achieving 1000x value compared to traditional dashboards.
Claude Trigger Word Changed to Ultracode
Claude Devs announced that the trigger word for dynamic workflows has changed to "ultracode"; the old word, "workflow", still works but no longer triggers workflows unintentionally.
Grok-Powered Shopping Assistant Go Launched
xAI partnered with Gopuff and SpaceXAI to launch Go, a personal shopping assistant powered by Grok text, audio, and image models with minute-level delivery.
Runway Aleph 2.0: Video to Green Screen Without Rotoscoping
Runway Academy demonstrates converting any video into green screen footage or clean plates with Aleph 2.0, eliminating manual rotoscoping workflows.
Replit Launches SEO Agent for App Discoverability
Replit released SEO Agent that scans apps and provides fix suggestions to improve visibility in both web and AI search results.
MiniMax M3 Partners with Mem0 for Persistent Memory
MiniMax M3's 1M context window, combined with the Mem0 memory layer, supports building personalized AI agents with persistent memory, with a 50% discount during launch week.
Recraft V4.1 Delivers Brand-Level Image Generation
Recraft released V4.1 supporting fast concepts, expressive typography, and brand-level image creation for professional design workflows.
Alphabet Equity Financing Oversubscribed
Sundar Pichai announced Alphabet's equity financing to seize AI opportunities was well oversubscribed, raising substantial capital for infrastructure expansion.
Anthropic Proposes Democratic Governance for Frontier AI
Anthropic CEO Dario Amodei proposed a blueprint for democratic governance of frontier AI and building lasting safety institutions in the United States.
Lesson from Building Open Models: Talk Is Cheap
AI researcher Nathan Lambert noted the key lesson from the past year: talk is cheap, and finding people who genuinely push open models forward is crucial.
Most People Misunderstand How LLMs Actually Work
Ethan Mollick observed that most people lack accurate mental models of LLM operation, leading to persistent beliefs that AI merely copies or only produces average answers.
Hermes Launches Official Desktop Client
Open-source AI assistant Hermes released its official desktop client, gaining momentum ahead of similar products in the local agent space.
Codex Site Plugin Generates and Deploys Web Pages
Codex's new Site plugin, similar to Claude Design, generates and deploys web pages directly, limited to Business and Organization tier users for launch.
Kuaishou May Spin Off Kling AI for Separate IPO
Rumors suggest Kuaishou plans to spin off its Kling AI video generation platform for a separate public listing, though observers note the timing may be late.
Fei-Fei Li's Analysis of World Models Recommended
Venture Twins recommends Fei-Fei Li's article categorizing world models into three functions and predicting future directions for the overloaded term.
Hedra Agent 2: Turn Ideas into Reality Without a Team
Hedra launches Agent 2, helping users turn visions into reality without a ten-person team, suitable for restaurants, e-commerce, and agencies.
Step 3.7 Flash Deployable via SGLang on Modal
StepFun's open-weight model Step 3.7 Flash is now deployable on Modal's serverless AI platform using SGLang and 8x H100 GPUs.
Routing and Post-Training Open Models Yield Faster, Cheaper Systems
Hugging Face CEO Clement Delangue noted that routing and post-training open models improve accuracy while enabling faster, cheaper systems with more control and privacy.
DSPy GEPA Used in Microsoft AI Pretraining Data Curation
Stanford NLP's Omar Khattab noted that the GEPA algorithm in DSPy is used for pretraining data filtering in Microsoft's new AI model effort.
Gemma 4 Encoder-Free Architecture Performs Without Loss
Analysis finds Gemma 4's encoder-free design shows no benchmark penalty or gain, suggesting encoders may be unnecessary for multimodal models going forward.
Human Data May Be a Curse for Novel Discoveries
Discussion around recent research suggests that for novel mathematical discoveries, human training data can act as a constraint, while models without such priors can break new ground.
MiniMax M3 Featured at NVIDIA Microsoft GTC Taipei
MiniMax M3 is included in NVIDIA and Microsoft's local LLM lineup, which emphasizes open weights, 1M context, strong coding, and native multimodal capabilities for the PC era.
a16z Invests in AI Assistant Town
a16z announced investment in Town, an AI assistant that manages emails, to-do lists, meeting preparation, and reminders with full user context.
Town: The Agent for Everything Else Beyond Devin
AI engineer swyx introduced Town, describing it as Devin for Everything Else, noting it spread organically within teams without additional promotion.
China's Strongest Open-Source Model Matches GPT-5.4-mini
Analysis notes that China's strongest open-source model reaches GPT-5.4-mini level, though its calibration error metrics leave some room for improvement.
xAI TTS and STT APIs Available on Vapi_AI
xAI offers natural text-to-speech and cost-effective speech-to-text APIs on Vapi_AI for developers.
Vidu Motion Control: Replicate Dance Moves with One Click
Vidu Motion Control lets users upload characters and reference motions to generate dance videos.
Alibaba Wan Adds Line Extraction and Rendering Skills
Alibaba Wan introduces line extraction and rendering to convert complex visuals into clean line art.
PixVerse CPP 2.0 Creator Program Launched
PixVerse launched a global creator program with memberships, credits, and a $2,500 weekly prize pool.
Step 3.7 Flash Excels in Agentic Coding Tasks
Step 3.7 Flash is designed for real-world agentic coding and maintains logical and visual consistency.
Use Codex to Build and Deploy Apps for Your Team
OpenAI's CEO demonstrates using Codex to build and launch apps for team collaboration.
Hermes Desktop Can Use Local or Cloud Models via Ollama
Hermes Desktop supports Ollama integration, enabling users to choose local or cloud models.
Ethan Mollick: University-Wide AI Access Is Necessary Foundation
Many universities have deployed AI campus-wide; safe and equitable access is the necessary foundation for scholarship.
Clement Delangue: Open Models Can Also Wear Frontier Labels
Hugging Face CEO notes that naming endpoints with frontier model names can drive massive usage, revealing the marketing power of those labels.
OpenAI Limits Public Web Page Sharing to Prevent SEO Abuse
OpenAI restricted public sharing of ChatGPT-generated web pages to prevent SEO spam and ad proliferation.