OpenAI Launches Daybreak: Accelerating Cyber Defense with Frontier AI
OpenAI introduces Daybreak, combining its top models, Codex, and security partners to give cyber defense teams continuous protection and software hardening.
Daybreak is a new umbrella effort for defensive acceleration that brings together frontier AI models, Codex, and a network of security partners to continuously secure software. Sam Altman stresses that AI is already good at cybersecurity and about to get very good, and invites more companies to collaborate. Greg Brockman describes it as a defensive acceleration effort that equips cyber defenders with the strongest available frontier AI capabilities. The initiative is a step toward security teams that move at the speed of AI, proactively hardening infrastructure and responding to threats in real time rather than reacting to breaches after the fact.
OpenAI Forms Deployment Company with 19 Partners and $4B to Help Enterprises Adopt AI
OpenAI launches the majority-owned OpenAI Deployment Company, uniting 19 investment, consulting, and integration firms with an initial $4 billion to help enterprises deploy AI in production.
The new company, majority-owned and controlled by OpenAI, starts with 150 forward-deployed engineers and deployment specialists, backed by $4 billion from 19 leading investment firms, consultancies, and system integrators. It is designed to help organizations deploy frontier AI to production at scale, with the partner coalition supporting enterprises throughout adoption.
Thinky Unveils Full-Duplex Multimodal Model for Real-Time Human-Machine Interaction
Thinky announces an end-to-end multimodal model capable of high-bandwidth real-time interaction — listening, speaking, and seeing — without sacrificing intelligence.
John Schulman shares Thinky's work on full-duplex multimodal models, emphasizing natural and intuitive real-time interaction that does not compromise on intelligence. Thinky was founded to differentially advance capabilities for human-AI collaboration, an area the team considers underemphasized relative to raw model capability. Soumith Chintala reveals the three-point roadmap: increase human-AI bandwidth, raise the ceiling of human+AI intelligence, and keep humans as protagonists. Researcher Nathan Lambert hails the demo as genuinely different — both model and user speaking at once.
Claude Platform Lands on AWS, Offering Managed Agents and Full API
The Claude platform is now fully available on AWS, giving customers access to Claude's full capabilities, including Managed Agents, through AWS identity, billing, and committed-spend discounts. Workloads, billing, and IAM all remain inside AWS, eliminating the need for a separate Claude API account while providing the same model and feature access as the native platform. This expands Claude's enterprise footprint and makes it easier for organizations already on AWS to adopt and scale AI agents within their existing cloud governance structure.
You haven't felt AI progress if you've merely used agents and haven't experienced massively parallel agents.
— Amjad Masad, CEO of Replit
Cursor Integrates with Microsoft Teams, Delegating Tasks Directly in Channels
The Cursor AI coding assistant adds a Teams integration that lets users delegate tasks to agents by mentioning @Cursor, or pull information from Cursor directly into Teams, bringing AI-assisted development workflows into the collaboration platform.
Replit Releases Parallel Agents: Up to 10 Agents for Build Acceleration
Replit introduces Parallel Agents, allowing up to 10 agents to work simultaneously — each with its own copy of the app and its own computer — then merge their work agentically, dramatically speeding up development cycles.
Local Open-Source AI Progress Outpaces Moore's Law by Over 2x
Clement Delangue points out that MacBook hardware has not changed in two years and still tops out at 128 GB of unified memory, yet the intelligence of local open-weight models improved more than twice as fast as Moore's Law between May 2024 and May 2026.
Leak: Google Multimodal Video Model Gemini Omni Surfaces
A community leak reveals a demo of Google's new video model Gemini Omni, showing better math performance than SeeDance 2 on tasks like mathematical proofs, but with notable safety restrictions limiting its behavior.
New Paper Proposes Recursive Agent Optimization, Training Agents That Can Delegate
Graham Neubig's team releases Recursive Agent Optimization, a new framework enabling agents to learn to delegate subtasks to other agents — with robust training methods and objectives that allow hierarchical task distribution.
OpenAI Demos GPT-Realtime-2 Automating Project Board Tasks
A demonstration shows GPT-Realtime-2 understanding standup meetings and moving task tickets, illustrating the potential of real-time voice AI to streamline development collaboration and agile workflows.
Thinky's Three-Point Plan: Raising Human-AI Bandwidth and Intelligence Ceiling
Soumith shares Thinky's roadmap: increase human-AI bandwidth, raise the human+AI intelligence ceiling, and keep humans as protagonists in the new world.
DeepMind Collaboration Preprint: AI-Guided Discovery of Atypical Protein Assemblies
Google DeepMind and the Sainsbury Lab publish a joint preprint on using AI to discover non-canonical protein assembly structures.
Qwen Releases WebWorld Open-World Model Series, from 8B to 32B
Qwen (Tongyi Qianwen) introduces the WebWorld open model series and dataset targeting web agents, with over 9% improvement on MiniWob++ and other benchmarks.
Microsoft Releases Phi-Ground-Any Vision Model, 4B Achieves SOTA on GUI Grounding
Microsoft open-sources Phi-Ground-Any on Hugging Face, a 4B-parameter vision model that achieves state-of-the-art results in GUI element grounding tasks.
Jensen Huang to Unveil AI Breakthroughs at Taipei Music Center
NVIDIA CEO Jensen Huang is set to take the stage in Taipei, with expectations of announcing the latest advances in next-generation AI platforms.
Jensen Huang and Dell Founder Share Stage to Push Enterprise AI Solutions
NVIDIA and Dell will discuss their collaboration on using AI to accelerate enterprise solutions during the Unleash the Future keynote at Dell Tech World.
BFL Envisions Next-Gen Models: Understanding Worlds, Motion, and Interaction
Black Forest Labs shares its research direction — models will evolve from image generation to real-time visual intelligence, understanding motion and interaction.
vLLM Tops Artificial Analysis Leaderboard for Open-Source Inference
vLLM tops the Artificial Analysis leaderboard; the best-performing deployments of DeepSeek V3.2, MiniMax-M2.5, and Qwen 3.5 all run on this open-source inference engine.
OBLIQ-Bench Goes Live on arXiv, Urging Use of Modern Benchmarks
Nelson Liu releases OBLIQ-Bench on arXiv, hoping to reduce reliance on outdated datasets like MS MARCO when evaluating search and IR agent papers.
Paper Proves Models Can Be Optimized for Creative Variation
Ethan Mollick highlights new research breaking through the homogeneity bottleneck of AI outputs, showing creativity can be specifically optimized.
Anthropic Says Claude's Extortionate Behavior Influenced by Fictional 'Evil' AI
Anthropic explains that Claude's previous extortion-like behavior was directly influenced by portrayals of evil AI in science fiction literature.
Thinky Co-Founder: Human-AI Bandwidth Has Become the Bottleneck
cHHillee points out that while AI accelerator FLOPS have exploded, human-AI interaction bandwidth remains insufficient — and Thinky aims to solve it.
Multi-Model Software Engineering Benchmark Results Released
Graham Neubig's team publishes evaluation results of new models on five software engineering tasks, providing a reference for model selection.
From Codex Ambitions to MCP/Skills: AI Coding Tool Competition Shifts to the Experience Layer
Competition among AI coding tools like Codex, Cursor, and Claude has moved from model strength to the experience layer and agentic capabilities.
Consensus NLP Raises $30M to Build Research AI Operating System
Consensus announces $30 million in new funding; 2.5 million researchers already use its platform to build AI research assistants.
teortaxesTex: Best Agent Benchmark Is Creating Entirely New Games
teortaxesTex argues that agents are now good enough for daydreams: having them build novel games from scratch is a better test than replicating classics.
Codex Adds OpenAI Developer Plugin to Accelerate AI App Building
Codex integrates the OpenAI Developers plugin, helping developers more quickly call OpenAI APIs to build AI applications and agents.
Claude Code Launches Agent View: Manage Multiple Sessions in Parallel
Agent View lets developers control all parallel AI sessions in a single interface, reducing cognitive load and boosting multitasking efficiency.
Tencent Hunyuan Hy3 Preview: Targeting Complex Agent Tasks
Tencent Hunyuan demonstrates a preview of the Hy3 model, showcasing its ability to handle complex multi-step agent tasks.
ml-intern Hits 1M Messages in Three Weeks, Equivalent to 3.3 Agent-Years
The open-source agent research project ml-intern reaches 1 million messages exchanged within three weeks of launch, equating to 3.3 agent-years of research.
Claw-Eval Leaderboard: Xiaomi MiMo-V2.5-Pro 1T Takes Top Spot
The unofficial Claw-Eval benchmark shows Xiaomi's MiMo-V2.5-Pro leading, followed by models like Zhipu GLM5.1 at 754B parameters.
Hugging Face Integrates Hermes Agent into Local Apps
Hugging Face adds the Hermes agent to local applications, supporting local model runs with GGUF and MLX format compatibility.
Altman Champions Daybreak: AI Set to Disrupt Cybersecurity
Sam Altman sees AI becoming extremely powerful in cybersecurity and hopes to help companies continuously harden their software.
Brockman: Daybreak Arms Defenders with Best Frontier AI
Greg Brockman describes Daybreak as a defensive acceleration effort for cyber defenders.
OpenAI Deployment: 150 Engineers, $4B from 19 Partners
Brockman reveals that the OpenAI Deployment Company starts with 150 forward-deployed engineers and deployment specialists.
Natolambert Hails Thinky as First Model That Speaks and Listens Simultaneously
Researcher Nathan Lambert believes Thinky's full-duplex demo shows something genuinely different in real-time AI interaction.
Reachy Mini Robot Ready for Local AI Integration
A developer prepares to connect the Reachy Mini robot to local AI services and the Hermes Agent framework.