Codex lands in ChatGPT mobile app, enabling remote coding agent control
OpenAI integrated Codex programming agent into the ChatGPT mobile application, allowing users to initiate coding tasks, review outputs, and control execution from their phones. Codex runs on a laptop or DevBox while the mobile app serves as a remote interface. Preview now available to all users.
OpenAI has brought its Codex programming agent to the ChatGPT mobile app, marking a significant shift in how developers interact with AI coding tools. Users can now initiate new work, review outputs, steer execution, and approve next steps, all from the palm of their hand. The actual Codex engine continues to run on the user's laptop, Mac mini, or DevBox, with the phone functioning as a remote control window. This design means developers can monitor ongoing tasks during a commute or approve a pull request from anywhere. The feature rolled out simultaneously on iOS and Android, available to all ChatGPT users including free-tier and the entry-level Go plan. The launch signals OpenAI's ambition to make agentic coding a ubiquitous experience untethered from the desktop.
Anthropic paper: US-China AI race hinges on compute, narrow window
Anthropic published a policy paper arguing the US and democratic allies lead in frontier AI, with export controls effectively limiting China's access to advanced chips. Without tighter policies, China could catch up or surpass by 2028.
Anthropic released a policy paper analyzing the AI competition between the United States and China. The core argument: export controls on advanced computing chips have been effective in limiting China's access to frontier AI capabilities, but the advantage is fragile. China continues to close the gap through talent acquisition, control circumvention, and distillation attacks on US models. The paper outlines two 2028 scenarios: if the US tightens controls and accelerates democratic AI adoption, it can maintain leadership and define global rules; if it fails to act, China could surpass the US, potentially deploying AI for mass-scale suppression. The window for decisive action is narrowing.
xAI launches Grok Build: Agent CLI tool for SuperGrok Heavy users
xAI released an early beta of Grok Build, an agentic command-line tool supporting coding, app building, and workflow automation. New features include native sub-agent view, Plan Mode integration, and full-screen terminal UI.
xAI launched Grok Build, an agentic CLI tool now in early beta for SuperGrok Heavy subscribers. The tool introduces a native sub-agent view for orchestrating complex multi-step tasks, Plan Mode integration for structured reasoning, mouse support, and a full-screen terminal UI. Users can install it with a single command: curl -fsSL https://x.ai/cli/install.sh | bash. xAI intends to iterate on the model and product based on early adopter feedback throughout this beta phase.
OpenAI builds Windows sandbox for Codex, balancing convenience and permissions
OpenAI detailed the sandbox technology that enables Codex on Windows. The sandbox uses controlled file and network access so coding agents can run securely without forcing developers to choose between constant approval prompts and full machine access. By scoping permissions at the OS level, the sandbox reduces friction while maintaining guardrails, a model that could become standard for agentic developer tools.
Anthropic commits $200M with Gates Foundation for global health and education
Anthropic announced a $200 million partnership with the Gates Foundation, combining grants, Claude credits, and technical support for programs spanning global health, life sciences, education, agriculture, and economic mobility. The collaboration aims to deploy frontier AI capabilities where they can have the greatest humanitarian impact, particularly in underserved regions.
Kimi launches WebBridge extension, AI agents browse like humans
Kimi released WebBridge, a browser extension that lets AI agents interact with websites the way humans do: searching, scrolling, clicking, typing, and completing multi-step browsing tasks. The extension supports Kimi Code CLI, Claude Code, Cursor, Codex, Hermes, and more tools. Use cases range from trend research and job hunting to flight price comparison. Available now on the Kimi website and Chrome Web Store.
Datadog releases Toto 2.0 time-series foundation models, scaling laws emerge
Datadog AI launched the Toto 2.0 family of open-weight time-series foundation models, spanning 4M to 2.5B parameters under Apache 2.0 license. Every larger model consistently outperforms its predecessor from a single hyperparameter configuration, marking the first time scaling laws have been demonstrated convincingly in the time-series domain. The models are available on Hugging Face with day-zero vLLM inference support.
The amount of resources of any kind thrown at AI at the moment is truly breathtaking. Species-level drive for its next phase if you ask me.
François Fleuret
Codex adds Hooks for custom loops, boosting automation and security
OpenAI introduced a hook mechanism for Codex that lets developers run custom scripts at key points in an agent task. Hooks can execute validators before or after work, scan prompts for secrets, log conversations to internal systems, and create persistent memories. The system decouples workflow policy from the core agent loop, giving teams fine-grained control over security, compliance, and automation without modifying the agent itself.
Spherical flows breakthrough in continuous diffusion language models
Two independent papers, published within days of each other, proposed spherical flow methods for continuous diffusion language models. Using the von Mises-Fisher distribution on the hypersphere as the noise process, the approach exploits radial symmetry to simplify the continuity equation. Results significantly outperform geodesic and Euclidean alternatives on Sudoku and language modeling benchmarks.
Single neuron can bypass LLM safety alignment
New research demonstrates that modifying just one neuron is sufficient to bypass safety alignment in large language models, exposing the fragility of current safety mechanisms and raising urgent questions about robustness.
AnyFlow: video diffusion with arbitrary sampling steps
AnyFlow proposes a video diffusion model that supports any number of sampling steps through on-policy flow map distillation, enabling efficient generation without compromising quality.
Kimi K2.6 ranks first in financial agent benchmark
Kimi K2.6 achieved first place among open-weight models on the Finance Agent Benchmark V2, strengthening Kimi's position in the agentic AI space.
Runway opens Tokyo office with $40M Japan investment
Runway announced its expansion into Japan, opening a Tokyo office with an initial $40 million investment. The company tripled its enterprise customer base in Japan over the past year.
FLUX Outpainting extends any image to any ratio intelligently
FLUX released Outpainting that solves boundary discontinuities at the model level. Input an image and canvas geometry for coherent scene extensions beyond the original frame.
Luma Agents automate multi-format e-commerce ad campaigns
Luma Agents automatically plan, generate, iterate, and optimize ad creatives across products, markets, and formats, aiming to eliminate creative production bottlenecks.
Perplexity Computer connects to Snowflake for live warehouse analysis
Perplexity's Computer product added a Snowflake connector, enabling end-to-end workflows on live warehouse data with answers including SQL queries, source tables, and filters.
Higgsfield Supercomputer unifies models and creative workflows
Higgsfield released Supercomputer, a cloud-native AI agent that unifies research, writing, design, video generation, and campaign execution in one system.
‘Whimsical attacks’ bypass AI agent guardrails
Microsoft Research found that out-of-distribution arguments like “I can't pay because of the Geneva Convention” defeat AI agent safeguards, with even large models struggling to fully defend.
Andrew Ng launches Transformers in Practice course
Andrew Ng released a new course helping learners understand Transformer-based LLMs, diagnose slow inference, and make better deployment decisions. Built in partnership with AMD.
MulTaBench: multimodal table learning benchmark
The new MulTaBench benchmark evaluates multimodal tabular learning that combines text and images, filling a gap in table understanding evaluation.
METR and AISA confirm AI in exponential growth phase
Independent assessments from METR and the UK's AISA indicate AI capability growth has moved beyond the pre-exponential phase into rapid acceleration.
Raycast V2 Beta evolves from launcher to AI agent tool
Raycast V2 Beta rebuilds its architecture with AI agent capabilities, redesigned UI, and upgrades to search, scheduling, and extensions for macOS.
Claude for Small Business integrates QuickBooks and 15 skills
Anthropic launched Claude for Small Business with 15 pre-built skills including payroll, cash flow forecasting, and collections, integrated into QuickBooks, PayPal, HubSpot, and Canva.
xAI search and factual post-training team lead departs
Tianyi Zhang, who led Grok's real-time search and agent feature development at xAI, announced his departure from the company.
ARC-AGI-2 public set scores remain opaque, questions raised
Observers noted that current ARC-AGI-2 leaderboard scores are based on internal evaluations without public details. Calls are mounting for release of frontier model public-set figures.
Opus 4.7 behaves anomalously on WeirdML benchmark
Analysis found Claude Opus 4.7 performs worse with more extended thinking on WeirdML, attributed to unusual behavior from the Mythos distillation technique.
Claude API prompt cache warm-up cuts first token latency
A tip from the Claude developer team: send system prompts before user prompts to pre-warm the cache, significantly reducing time-to-first-token on long prompts.
Jensen Huang tells CMU grads: AI era has arrived
NVIDIA CEO Jensen Huang told Carnegie Mellon graduates that no generation has had more powerful tools or greater opportunities, urging them to shape the AI era.
Pika MCP aggregates creative models into single subscription
Pika MCP gives users access to multiple best-in-class creative models through one subscription, with a personalized agent that generates content without long prompts.
vLLM adds day-zero support for Ant Group trillion-parameter model
vLLM announced immediate support for Ring-2.6-1T, a trillion-parameter model from Ant Group designed for agent execution and complex reasoning.
Vercel AI CLI generates and displays images in the terminal
Vercel demonstrated ai-cli, a tool that calls Vercel AI Gateway's image, video, and text models and renders output directly in the terminal.
Notion developer platform built on Vercel Sandbox
Notion's developer platform relies on Vercel Sandbox, enabling native extensions and MCP-based integrations without managing infrastructure.
Coding agents make programming language lock-in obsolete
Simon Willison shared a case study where a company used coding agents to port native mobile apps to React Native, arguing that migration costs have dropped so far that language choice is no longer a binding decision.
Claude Code weekly limit boosted 50%, Agent SDK quotas trimmed
Starting June 15, Anthropic will increase Claude Code's weekly limits by 50% while reducing quotas for third-party apps built with the Agent SDK, implementing a dual-track system.
Workshop convened: AI threats to information integrity
A workshop on June 5 will bring together journalists and technologists to discuss AI-generated threats to information integrity and possible countermeasures.
Effort heuristic bias amplified 100x by AI tools
Runway's CEO cited a 2004 study showing people rate creations requiring more effort higher, a cognitive bias that AI tools amplify dramatically, skewing perception of machine-generated work.
Keras crosses 21M monthly downloads, an all-time high
Keras creator François Chollet reported that the framework surpassed 21 million monthly downloads on PyPI, doubling from 10 million five years ago.
Machine learning surpasses 20 centuries of philosophy on knowledge
Researcher François Fleuret argues that machine learning and AI have contributed more to understanding the nature of knowledge and its relation to reality than 20 centuries of philosophy.
Context vs. context window: the difference explained
Context encompasses all available information an AI agent has: system prompts, conversation history, retrieved documents, and tool outputs, while context window is the maximum token length a model can process at once.
baoyu-skills adds WeChat group chat summary via Claude Code
The baoyu-skills open-source project added a WeChat group chat summarization skill, relying on wx-cli for data access with Claude Code and Claude Opus 4.6 delivering the best results.
Codex adds Hooks for custom loops, boosting automation and security
OpenAI introduced a hook mechanism for Codex, allowing scripts to run at key task points—such as executing validators, scanning prompts for secrets, or logging conversations to internal systems—for more flexible workflow customization.
Recraft V4.1 model goes live, enhancing aesthetics and personality
Recraft V4.1 is now available on the fal platform, offering better visual quality, stronger emotional expression, and personalized styles, expanding creative range.
Codex enters ChatGPT mobile app: breakdown and first impressions
A tech blogger detailed OpenAI's approach to bringing Codex to mobile—the phone acts as a remote window while actual computation runs locally.
Syncless launched: enterprise product for human+agent collaboration
A recruitment and agent collaboration product named Syncless officially launched, aimed at optimizing human-agent collaboration.
Massive resource investment in AI is staggering, akin to species-level drive
François Fleuret marvels at the sheer scale of resources poured into AI, calling it a species-level driving force.
Live-action and AI film 'Cannes' premieres, starring Paul Rudd
A film blending live-action and AI technology will premiere at Cannes, directed by Dustin Yellin and starring Paul Rudd and Chris Rock.
Photography once called 'art's most deadly enemy'
Runway CEO cited Baudelaire's 1859 critique of photography, drawing a parallel to current controversies around AI art.