OpenAI Frontier Models Now on AWS Bedrock for Enterprise
OpenAI's frontier models and Codex are now available for enterprises via Amazon Bedrock, supporting secure and compliant workflows.
OpenAI announced that its frontier models and Codex are now generally available on AWS, letting enterprises build with OpenAI models on Amazon Bedrock through the security, compliance, and governance workflows they already use. This marks the beginning of a broader expansion of OpenAI's enterprise footprint. Developers can now build AI applications and software engineering workflows with OpenAI models inside the AWS environments and controls their teams already trust, pairing frontier AI capabilities with enterprise-grade infrastructure.
NVIDIA Unveils RTX Spark: 1 Petaflop Personal AI Superchip
NVIDIA RTX Spark is a 1 petaflop superchip with the full CUDA and RTX ecosystem that enables Windows-native AI agents, marking a new era for PCs.
NVIDIA announced the RTX Spark, a one-petaflop superchip that brings the full CUDA and RTX ecosystem to personal computers. The chip enables Windows-native AI agents, representing what the company calls a new beginning for personal computing. With its massive compute capability in a consumer form factor, RTX Spark could reshape how developers and creators run AI workloads locally, challenging the established boundaries between workstation and desktop AI.
xAI Composer 2.5 Integrated with Grok Build
Composer 2.5 is a fast, highly intelligent model excelling in long-running tasks and complex instruction following, now available inside Grok Build. The model is designed for precise multi-step execution across extended agent tasks.
Alibaba Qwen3.7-Plus Unifies Vision and Language
Qwen3.7-Plus is a multimodal agent model that unifies vision and language into one versatile agent foundation. It supports multimodal interactive hybrid agents with unified GUI and CLI operation across visual and text tasks.
Cosmos Coalition to Open-Source World Models
Runway, NVIDIA, and leading AI labs formed the Cosmos Coalition, a global initiative to build and open-source frontier world models for physical AI. Runway joins as a founding member alongside a set of leading research labs.
NVIDIA Vera Rubin in Full Production for Agentic AI
NVIDIA unveiled the Vera Rubin multi-rack system, including five interconnected rack-scale systems, now in full production.
The NVIDIA Vera Rubin platform is a multi-rack pod-scale system built to process agentic AI workloads and is now in full production. Through extreme co-design, Vera Rubin unifies five connected rack-scale systems, including the NVIDIA Vera Rubin NVL72, Vera CPU rack, Groq 3 LPX, and Vera networking fabric. The platform represents NVIDIA's most ambitious infrastructure play yet, purpose-built for the agentic AI era where autonomous AI systems operate at massive scale.
Runway Establishes London European HQ, Invests $100 Million
Runway announced a London European headquarters and new world model research center, planning $100M investment over 18 months and expansion through 2028.
Runway announced London as its new European headquarters and newest research hub focused on general world models. Over the next 18 months, the company plans to invest $100 million into the UK AI ecosystem, with that figure more than doubling through 2028 as European operations scale. The move signals growing European investment in physical AI and world model research, with London positioned as a key hub for frontier AI development outside North America.
One of the new, buzzy jobs in Silicon Valley is the AI Forward Deployed Engineer — an engineer embedded within a client organization to customize solutions, building and tuning agentic workflows for the client's particular needs.
Andrew Ng on the rise of AI FDE roles in Silicon Valley
Luma Launches Open Physical AI Lab for Generalization
Luma established an open-science physical AI lab to solve the generalization problem in physical AI. The lab addresses what stands between current AI systems and a future where AI can meaningfully interact with and improve the physical world through robotic and embodied systems.
Perplexity Launches Search as Code Architecture
Perplexity Agent API introduces Search as Code: agents write Python to call the search stack directly, replacing iterative function calls. Now the default in Computer mode, this architecture streamlines how AI agents interact with search infrastructure, with the aim of reducing latency and improving reasoning precision.
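The contrast with iterative function calls can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `search` function, the `Hit` type, the `agent_script` helper, and the in-memory corpus are stand-ins invented for illustration and are not Perplexity's Agent API. It only shows the general idea that one agent-written program can issue several queries, merge the results, and filter them in a single execution rather than one tool-call round trip per query.

```python
from dataclasses import dataclass

# Hypothetical in-memory corpus standing in for a real search stack.
CORPUS = {
    "doc1": "vLLM adds day-zero support for new multimodal models",
    "doc2": "Search agents write Python to call the search stack",
    "doc3": "Agents reduce latency by batching search queries in code",
    "doc4": "Rack-scale systems for agentic AI workloads",
}


@dataclass(frozen=True)
class Hit:
    doc_id: str
    text: str
    score: int  # number of query terms found in the document


def search(query: str) -> list[Hit]:
    """Hypothetical search call: rank documents by matched query terms."""
    terms = query.lower().split()
    hits = []
    for doc_id, text in CORPUS.items():
        words = text.lower().split()
        score = sum(1 for t in terms if t in words)
        if score > 0:
            hits.append(Hit(doc_id, text, score))
    return sorted(hits, key=lambda h: (-h.score, h.doc_id))


def agent_script(queries: list[str], min_score: int = 1) -> list[Hit]:
    """What an agent-written program might do in one execution:
    run several queries, keep each document's best hit, filter, and rank."""
    best: dict[str, Hit] = {}
    for q in queries:
        for hit in search(q):
            if hit.doc_id not in best or hit.score > best[hit.doc_id].score:
                best[hit.doc_id] = hit
    kept = [h for h in best.values() if h.score >= min_score]
    return sorted(kept, key=lambda h: (-h.score, h.doc_id))


results = agent_script(["search agents", "latency search"], min_score=2)
for hit in results:
    print(hit.doc_id, hit.score)
```

In an iterative function-calling design, each of the two queries above would be a separate model turn, with merging and filtering done by the model between turns; here that logic lives in ordinary code.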
Tencent Hunyuan Releases Agent Memory Plugin Hy-Memory
Hy-Memory is designed for long-term collaborative agents, based on a 6-layer memory framework and System1/System2 dual-processing system, giving agents a true "second brain." More than a retrieval tool, it enables persistent memory across extended agent tasks and collaborative workflows.
MiniMax M3 Now on Vercel AI Gateway
MiniMax M3 with 1M context and multimodal input is now available on Vercel's AI Gateway, with a 50% discount for the first week. Developers can immediately start building with frontier coding capabilities through the Vercel developer platform.
Replit: Build a Complete Business from a Single Prompt
Replit now generates websites, mobile apps, slide decks, and launch videos from one prompt. The feature includes partner perks from Stripe Atlas, QuickBooks, Mercury, and Doola for launching real businesses.
Runway Aleph 2.0 Introduces Fast Masking
Aleph 2.0 creates compositing mattes in seconds, isolating subjects from backgrounds for compositing, coloring, or applying effects. Users upload video, prompt for a white silhouette on black, review the preview, and export the matte.
Claude Resets Rate Limits and Fixes Sub-Agent Issue
Claude reset 5-hour and weekly rate limits for Pro and Max users, and fixed an issue in Claude Code where excessive parallel sub-agents burned through token usage faster than expected.
MiniMax M3 Now on Ollama Cloud
Ollama Cloud now supports the MiniMax M3 model, running in US regions with zero data retention, ready for coding and agent tasks. Launch with: ollama launch claude --model minimax-m3:cloud
Step 3.7 Flash Available for Free on kilocode
Step 3.7 Flash, designed for coding agents that need multi-step orchestration and reliable tool calling across real codebases rather than just fast replies, is now free on kilocode.
vLLM Partners with NVIDIA RTX Spark for Local Agents
vLLM collaborated with NVIDIA to run large NVFP4 models locally on DGX Spark, pushing local AI agents forward with the vLLM inference engine and RTX Spark hardware.
vLLM-Omni Offers Day-Zero Cosmos 3 Support
vLLM-Omni provides day-zero support for NVIDIA Cosmos 3, a unified Mixture-of-Transformers fusing autoregressive reasoning and diffusion generation across text, image, video, audio, and robot action — all behind a single OpenAI-compatible API.
Voice Hackathon Winner: Phone Agentic OS
A voice-first mobile OS called Agentic OS won the OpenAI Voice Hack Night People's Choice award. Users talk, agents answer and take action across the phone. The team won $50,000 in API credits.
GrepSeek: Training Search Agents for Direct Corpus Interaction
GrepSeek explores training search agents to interact directly with corpora, bypassing traditional retrieval pipelines for more efficient information access.
Microsoft and NVIDIA Put Grace and Blackwell in Laptops
Microsoft and NVIDIA are collaborating to bring Grace and Blackwell chips to laptops, challenging Apple Silicon's six-year dominance in personal computing.
Adobe Partners with NVIDIA RTX Spark
Adobe announced a partnership with NVIDIA to leverage RTX Spark in Photoshop, Premiere, and Substance 3D for optimized AI creative workflows and faster rendering.
MiniMax M3 Leads Open Models in Next.js Agent Evaluation
MiniMax M3 became the leading open-source model in Vercel's Next.js AI agent evaluation, behind only Opus and GPT-5, at roughly 10x lower cost per token.