Alibaba releases Qwen3.7-Max: Built for the agent era
A versatile foundation for agents that get things done — end-to-end coding, multi-file refactors, real debugging, and productivity assistance in one model.
Alibaba shipped Qwen3.7-Max, its latest flagship model purpose-built for the age of autonomous agents. The model delivers sharper scientific reasoning, stronger agentic capabilities, better coding performance, and reduced hallucination rates. On the coding front, Qwen3.7-Max handles frontend prototypes, multi-file refactors, and real debugging sessions end to end. Paired with its productivity assistant features, the model represents a convergence of reasoning and tool-use capabilities that were previously siloed. Early adopters report that the agentic workflow — planning, executing, and verifying across multiple tools — feels more cohesive than in prior generations. The model also posted a 4.8-point improvement on the Artificial Analysis Intelligence Index over Qwen3.6-Max-Preview, reflecting gains in scientific reasoning and instruction following.
Runway launches Aleph 2.0: Single-frame editing updates entire video
Edit a single frame, preview the change, and Aleph 2.0 carries that edit across the entire video.
Runway shipped Aleph 2.0, a major update to its video editing model. Users can now edit a single frame, preview the result, and have the model propagate that edit across every other frame in the timeline. The feature is available in the new Edit Studio on the web, replacing traditional frame-by-frame compositing with a single-point intervention. This approach treats video as a continuous visual field rather than a sequence of independent frames, and it points toward video generation and editing converging into one interface.
Sam Altman: New Codex launches today
OpenAI CEO Sam Altman announced the release of a new version of Codex with a characteristically brief post: "new codex ships today!" The update is part of what OpenAI calls "Codex Thursday," a cadence of regular improvements to the AI coding platform. Details remain sparse, but given the concurrent announcement of Appshots — a new feature that captures app window screenshots and text into Codex threads via a double-tap of the Command key on Mac — this appears to be a substantial refresh of the coding agent experience.
Codex introduces Appshots: One-click screenshot into context
Appshots is a new way to bring the context of what you are working on into Codex. On Mac, double-press the Command key to attach your app window to a Codex thread. Codex receives both a screenshot and the text content from the window, instantly importing visual and textual context into the coding session.
xAI integrated into OpenCode: Code with Grok
Users can now use their Grok or X Premium subscription inside OpenCode, invoking the model that powers Grok Build for high-speed coding. After installing OpenCode, run /connect and select xAI to authenticate via browser OAuth or headless mode for remote hosts.
DeepMind launches AI science toolkit for Antigravity
DeepMind launched Science Skills for Google Antigravity, integrating insights from over thirty major life science sources including UniProt and the AlphaFold Database. The toolkit is designed to accelerate day-to-day research workflows by giving AI models the right scientific toolchain.
Tencent open-sources Hy-MT2: Multilingual translation across 33 languages
Hy-MT2 is a fully open-source multilingual translation model that supports translation between 33 languages. The release also includes the Tencent Hy Translation mini-app, making the model immediately accessible to users.
Cohere open-sources Command A+ under Apache 2.0
Cohere released the command-a-plus model on Hugging Face under the Apache 2.0 license, marking a significant step in the company's open-source trajectory. The release aims to advance AI through open science and community collaboration, with the model freely available for both research and commercial use.
Command A+ supports W4A4 quantization on Hugging Face
Cohere released a W4A4 quantized version of Command A+ with virtually zero performance degradation, dramatically reducing the serving footprint for production deployments. The quantized model enables efficient inference on hardware with tighter memory constraints, lowering the barrier for self-hosted AI services.
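As a rough illustration of why 4-bit weights shrink the serving footprint, the sketch below estimates weight storage from a parameter count and a bit width. The 100-billion parameter count is a hypothetical placeholder, not Command A+'s actual size, and the estimate ignores activations, KV cache, and the small overhead of quantization scales.

```python
def weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model size, chosen only to make the arithmetic concrete.
HYPOTHETICAL_PARAMS = 100_000_000_000

bf16_gb = weight_memory_gb(HYPOTHETICAL_PARAMS, 16)  # 16-bit baseline
w4_gb = weight_memory_gb(HYPOTHETICAL_PARAMS, 4)     # 4-bit weights (the "W4" in W4A4)

print(f"16-bit weights: {bf16_gb:.0f} GB")
print(f"4-bit weights:  {w4_gb:.0f} GB")
print(f"reduction:      {bf16_gb / w4_gb:.0f}x")
```

Whatever the real parameter count, moving weights from 16 bits to 4 bits cuts weight storage by about a factor of four under this simple model.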
"It is very hard to sleep, man."
A researcher's reaction after seeing AI generate 125 pages of novel mathematical reasoning — connecting fields humans had left unlinked since 1945.
vLLM introduces elastic expert parallelism, hot-swapping deployment topology
vLLM's new Elastic Expert Parallelism eliminates the need to restart deployments when changing DP/EP topology for MoE models. Previously, scaling or swapping configuration meant a full restart with in-flight traffic dropped. Now, a single API call — curl -X POST — resizes a live deployment without interruption. This is a meaningful improvement for production MoE serving where continuous availability is critical.
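For illustration, the resize call could be issued from Python roughly as follows. The endpoint path `/scale_elastic_ep`, the `new_data_parallel_size` field, and the localhost address are assumptions (the item above only mentions `curl -X POST`), so check the vLLM documentation for the actual route and payload before relying on them.

```python
import json
import urllib.request


def build_resize_request(base_url: str, new_dp_size: int) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the server to resize.

    The route and JSON field name are assumed for this sketch.
    """
    payload = json.dumps({"new_data_parallel_size": new_dp_size}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/scale_elastic_ep",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_resize_request("http://localhost:8000", new_dp_size=4)
print(request.get_method(), request.full_url)

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the sketch runnable without a live deployment.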
Sasha Rush shares Cursor Composer training method in new talk
Sasha Rush presented an overview of the methods used at Cursor to build the Composer model in a talk titled "Training Composer." The talk outlines the training techniques, data strategies, and model architecture choices that power Cursor's AI pair-programming experience.
GPT-5.2 reaches expert level in academic paper review
Forty-five scientists spent 469 hours evaluating human and AI reviews on 82 papers. The study found that current AI reviewers are competitive with the top-rated reviewers in Nature's official peer review process. However, the researchers noted that AI reviews still have identifiable weaknesses, particularly in evaluating novelty claims and methodological nuance.
Qwen3.7-Max jumps 4.8 points, surpassing preview version
Qwen3.7-Max scored 56.6 on the Artificial Analysis Intelligence Index, up 4.8 points from the preview, with stronger scientific reasoning, agent, and coding abilities and reduced hallucination.
NVIDIA and Dell launch AI factory platform for enterprise agents
Jensen Huang and Michael Dell unveiled a major update to the Dell AI Factory with NVIDIA, a full-stack platform from deskside workstations to massive data centers powering autonomous AI agents.
Supercomputer beats Gemini 3.5 Flash in motion design benchmarks
Using Seedance 2.0, Opus 4.7 outperformed Gemini 3.5 Flash on motion design tasks, underscoring the arrival of agent-driven prompt crafting in creative AI workflows.
Simon Willison releases Datasette Agent: Conversational data analysis assistant
An AI-powered assistant that answers questions about SQLite databases and extends functionality via plugins, aimed at bringing conversational analysis to structured datasets.
ESI-Bench: New benchmark for embodied spatial intelligence
A benchmark for closed-loop perception and action, covering multi-object placement, capacity comparison, occlusion recognition, and over a dozen additional spatial reasoning tasks.
Mix-Quant: Block quantization optimized for agent LLMs
A mixed-precision method that applies quantization during prefilling while preserving full precision during decoding, tailored to the unique demands of agent-based language models.
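The idea of choosing precision by inference phase can be shown with a toy example. This is not Mix-Quant's actual algorithm: the block size, the 4-bit symmetric rounding scheme, and all function names are assumptions made for this sketch.

```python
def fake_quantize_block(block, bits=4):
    """Symmetric round-to-nearest quantize-then-dequantize of one block."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed values
    scale = max(abs(x) for x in block) / qmax
    if scale == 0.0:
        return list(block)  # all-zero block: nothing to quantize
    return [round(x / scale) * scale for x in block]


def blockwise_fake_quantize(values, block_size=4, bits=4):
    """Quantize each block with its own scale."""
    out = []
    for start in range(0, len(values), block_size):
        out.extend(fake_quantize_block(values[start:start + block_size], bits))
    return out


def weights_for_phase(weights, phase):
    """Return low-precision weights for prefill, full precision for decode."""
    if phase == "prefill":
        return blockwise_fake_quantize(weights)
    if phase == "decode":
        return list(weights)
    raise ValueError(f"unknown phase: {phase!r}")


weights = [0.5, -1.0, 0.25, 0.75, 2.0, -0.1, 0.0, 1.5]
print("prefill:", weights_for_phase(weights, "prefill"))
print("decode: ", weights_for_phase(weights, "decode"))
```

In this sketch the prefill path tolerates a small rounding error per block, while the decode path sees the original values unchanged.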
LongMINT: Evaluating memory under multi-target interference
A new benchmark that evaluates memory retention in long-horizon agent systems under interference from multiple targets, testing recall across extended interaction sequences.
Solving Erdős problem used electricity equivalent to 2-20 EV miles
The AI that autonomously solved the Erdős problem consumed between 0.6 and 6.3 kWh of electricity and roughly three gallons of water. That is less than three almonds' worth of water, and the energy equivalent of driving an electric vehicle 2 to 20 miles.
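The mileage comparison can be checked with simple arithmetic. The sketch below assumes an EV efficiency of about 0.3 kWh per mile, a typical ballpark figure that is not stated in the source.

```python
# Assumed EV efficiency; real vehicles vary with model and driving conditions.
KWH_PER_MILE = 0.3


def ev_miles(energy_kwh: float, kwh_per_mile: float = KWH_PER_MILE) -> float:
    """Convert an energy budget in kWh to approximate EV driving distance."""
    return energy_kwh / kwh_per_mile


low_kwh, high_kwh = 0.6, 6.3  # energy range reported above
print(f"{ev_miles(low_kwh):.0f} to {ev_miles(high_kwh):.0f} miles")
```

Under this assumption the range works out to roughly 2 to 21 miles, in line with the 2 to 20 miles quoted.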
Erase: FLUX now performs model-level image erasure
The FLUX model now supports Erase, allowing users to specify a mask to remove or reconstruct objects, text, and details in images while maintaining visual consistency throughout the result.
NVIDIA GTC 2026 to be held in Taipei on June 1, Jensen Huang to keynote
Huang will unveil the latest breakthroughs in AI, robotics, and accelerated computing at the Taipei keynote, with a GTC Live pregame show starting at 9 a.m. local time.
All-AI-generated feature film 'RAPHAEL' debuts at Cannes
A 100% AI-generated feature film developed by Mateo AI Studio and MBC C&I AI Content Lab premiered at Cannes, marking a milestone for AI cinema on the global stage.
Shoplift: One-click product link to platform-native ad video
Built for DTC teams, PixVerse's Shoplift generates ad videos in minutes from a product URL, automatically supplying new variants when ad performance begins to decline.
Replit Enterprise goes self-service, deploys in minutes
No contract negotiations, no waiting — Replit Enterprise now supports SSO and SCIM out of the box, letting teams start building AI-powered applications immediately.