June 13, 2026 · Saturday

Moonshot AI Open-Sources Kimi K2.7-Code Programming Model

Delivers a 21.8% improvement in coding and agent performance over K2.6, with 30% higher inference efficiency and substantially reduced overthinking.

@Kimi_Moonshot

Kimi K2.7-Code: 1T-parameter MoE, 32B active per token, MLA attention with 256K context window.

Moonshot AI has open-sourced Kimi K2.7-Code, a coding-focused agentic model built on the K2.6 foundation. The new model posts gains of 21.8% on the Kimi Code Bench v2, 11.0% on Program Bench, and 31.5% on MLS Bench Lite. Critically, it uses roughly 30% fewer thinking tokens than its predecessor, translating directly into lower inference costs for agentic workloads. The model employs a 1T-parameter Mixture-of-Experts architecture with 32B active parameters per token and MLA attention spanning a 256K-token context window.

● Model Release

MiniMax M3: Open-Weight, Sparse Attention, 428B Params

Only ~23B parameters activated at a time, yet handles frontier coding, long-horizon agents, and native multimodal input across a 1M-token context window.

MiniMax shipped M3 as an open-weight model on Hugging Face, packing roughly 428B total parameters with just 23B active. The headline innovation is MSA — MiniMax Sparse Attention — a new architecture that delivers kernel-to-cloud acceleration with dramatic speedups on long sequences. The model supports text, image, and video input natively, and ships with day-zero support from SGLang, Fireworks AI, Modular, and vLLM. Few open-weight models attempt any of these capabilities simultaneously; M3 attempts all of them.

● Developer Tools

Vercel Ships HarnessAgent: Unified Agent Orchestration

Vercel launched HarnessAgent, a unified abstraction to orchestrate and integrate any agent into applications. The AI SDK component frees developers from both model and agent lock-in, providing portability across platforms while delivering a polished developer experience. The release signals that the agent orchestration layer is becoming standardized infrastructure, not a competitive moat.

"AI evaluations structurally favor closed-source APIs that can route, fallback, ensemble, and optimize behind the scenes with no transparency. Comparing one model to two models is not a fair fight."
@clementdelangue

● Benchmarking

NVIDIA + Artificial Analysis Launch AgentPerf

The first benchmark purpose-built for agentic AI infrastructure, evaluating tool use, contextual iteration, and multi-step task completion. Existing benchmarks were never designed for the chain-of-calls paradigm that agents require.

● Developer Tools

Codex Introduces Chrome Developer Mode

Codex can now debug browser issues using the Chrome DevTools Protocol — profiling JavaScript performance, inspecting console output, monitoring network traffic, and examining page state. The feature marks a significant step toward agent-driven web debugging workflows.

● Developer Platforms

Replit Enables Parallel Agents Building Multiple Artifacts

Users can now run parallel agents to ship a website, mobile app, video, and pitch deck from a single project simultaneously. Multiple artifacts can also be added to existing projects, accelerating multi-surface delivery.

● AI Infrastructure

Huawei Successfully Pre-Trains Large LLM on Ascend Hardware

Huawei has credibly pre-trained a large language model entirely on its domestic Ascend chips, using what it describes as hyper-node optimized training. The effort aims to prove that frontier-scale model training is viable on non-NVIDIA hardware, a strategically significant milestone given export restrictions. The model builds on DSA with SWA, and early observations point to custom architecture innovations including a mechanism dubbed ModAttn.

● Creative Tools

MiniMax Hub: AI Agent Creative Workstation Goes Live

MiniMax launched Hub, a local AI agent creative workstation that shifts the agent paradigm from coding to creating. The pipeline spans research, scripting, image generation, music composition, and final editing. Agents handle the grind; users call the shots. Built for elite workflows with scalable infrastructure.

● Reconstruction

Claude Code + Fable reconstructed the lost 1992 Chevron/Maxis training game SimRefinery from surviving screenshots and documentation alone.

Claude Code + Fable Rebuild Lost 1992 Game SimRefinery

Ten months later, the same developer gave Claude Code with Fable the same brief: reconstruct SimRefinery from surviving screenshots and documentation. The result is a fully playable 3D oil refinery simulation built with Three.js, featuring a learning mode, operational management, maintenance scenarios, explosion simulation, and free-build mode. The contrast with the old version underscores the leap in AI-driven game development capability over a single year.

● Architecture

vLLM Analyzes MiniMax M3's MSA Sparse Attention Architecture

M3 delivers state-of-the-art programming and agent capabilities powered by a new sparse attention architecture called MSA. Unlike dense attention, MSA achieves kernel-to-cloud optimization that yields dramatic speedups as sequence length increases. vLLM, which provided day-zero inference support, congratulated MiniMax on an open model that combines frontier coding, agentic capabilities, native image and video input, computer use, and a 1M-token context window.

● Model Specs

vLLM Details Kimi K2.7-Code: 1T MoE, 32B Active, 256K Context

The K2.7-Code model uses a 1T-parameter Mixture-of-Experts design with 32B active per token, MLA attention spanning 256K tokens, and ~30% fewer thinking tokens than K2.6 for more efficient reasoning. vLLM praised the release as a coding-focused agentic model with substantial efficiency gains for long-context tasks.

● Vibe Coding

Replit CEO: Zero-Frustration Vibe Coding Achieved

Amjad Masad reports a complete state of flow while coding with Fable on Replit, running out of ideas rather than hitting friction. He concludes that additional model IQ is no longer the bottleneck — the developer experience and platform integration have become the differentiating factors.

● Research

Nathan Lambert Explains Policy Gradient Series in Depth

A systematic exploration of the policy gradient derivation underpinning RLHF and post-training, covering PPO, REINFORCE, RLOO, and GRPO algorithms. The work serves as a technical reference for understanding how these methods apply to LLM fine-tuning, filling a critical gap in accessible RLHF literature.

In Brief Products & Papers

Industry

Jensen Huang: Computing's Biggest Shift in 60 Years

The world is moving from retrieval to generation, representing a multi-trillion-dollar opportunity across the five-layer AI stack.

Robotics

Google DeepMind Launches Robotics Accelerator

Fifteen European startups selected for a three-month program with access to the AI stack, Gemini Robotics models, and hands-on team support.

Infrastructure

Claude Managed Agents Adds Five New Sandbox Guides

Deployment guides for Blaxel, E2B, Google Cloud, Namespacelabs, and Superserve enable agents to run in user-controlled sandboxes on any infrastructure.

Product

OpenAI Codex Introduces Rate Limit Banking

Go, Plus, Pro, and Business users can now save rate limit resets for later use, plus earn extra resets by inviting friends to Codex.

Developer Tools

OpenAI Launches Documentation Assistant

A new docs agent helps developers find answers about OpenAI products and jumps directly to relevant documentation pages.

Video Generation

Gemini Omni Flash Tops Video Arena Rankings

Achieved first place in both text-to-video and image-to-video categories, shared by Demis Hassabis.

Creative Tools

PixVerse Canvas Officially Launches on Web

An AI video production workspace for planning, refining, and delivering within a single platform, replacing clip-by-clip workflows.

Inference

SGLang Sets Record on GB300 NVL72

Exceeds 12K tokens per second per GPU driving DeepSeek V4 Pro 1.6T, orchestrated with NVIDIA Dynamo.

Development

Replit CEO Shares 'Loops' Development Pattern

Instead of traditional prompting, work is done through parallel agent orchestration, code verification, and continuous security checks in a feedback loop.

Papers & Research June 13

Paper

SpenseGPT: One-Shot Pruning for Sparse/Dense GEMM

A practical pruning method enabling sparse and dense matrix multiplications for LLM inference.

Benchmark

Agent's Last Exam: New Comprehensive Agent Benchmark

A challenging evaluation suite designed specifically for intelligent agent capabilities.

Medical AI

Frontier LLMs Outperform Specialized Clinical AI Tools

General models from Google, OpenAI, and Anthropic surpassed dedicated clinical tools across three medical Q&A evaluations.

Paper

CHORUS: Decentralized Multi-Embodiment Collaboration

A single VLA policy enables coordination across diverse robots without centralized control.

Industry

Stop Using 'Vibe Coding' for All AI-Assisted Development

The term has become an unhelpful umbrella for diverse AI workflows; precision matters.

Research

NVIDIA MotionBricks: Real-Time Character Animation at 15K FPS

An open model with over 350,000 motion clips delivering real-time animation at unprecedented speed.

Creative AI

Fable Transforms Rilke's Duino Elegies Into a Walking Art Game

A beautiful interactive experience set on a cliffside in January 1912, with the player walking through Rilke's poetry.

Research

Jeff Dean: Biological Neurons Far Surpass Classical Artificial Ones

Real neurons are dramatically more capable than perceptron-style artificial neurons, with implications for architecture design.

Commentary

AI Drives K-Shaped Divergence, Not Equalization

Top users understand agent composition by default; ordinary users only know agents can write code.

Essay

Long Read: Agents Amplify Human Capability Gaps, Skills Are the Entry Point

An agent is not a chat box; it amplifies ability gaps, and custom Skills may be the bridge for ordinary users.

Model Release

MiniMax M3: Frontier Coding + Long-Horizon Agents + Multimodality

Few open-weight models attempt any of these capabilities at once; M3 delivers all three at 428B/23B scale.

Product

Google Project Genie Access Expands to Ultra 5X Subscribers

The AI-powered creative environment is now available globally to the latest subscription tier.

Conference

Grok Build v0.2.51 Released with Major Feature Update

Quick Takes June 13

Opinion

DeepSeek Remains Undervalued

Few labs have even attempted pretraining substantially larger than V4.

Opinion

Fable vs Opus: Maturity Gap Widens

Fable is more mature, refuses to fawn, and stands its ground without performative pushback.

Insight

AI Experience Gap: Experts Form Growth Loops, Novices Don't

Using AI for tasks you're expert at produces a virtuous cycle; using it for unfamiliar tasks breeds frustration.

Debate

Should Git Die in the AI Era?

20-40% of code spending goes to merge conflicts; the AI age may demand new version control paradigms.

Insight

The Next Century's Game: Stacking Loops Effectively

Knowing when to go deeper into a loop for reliability versus when to escalate will define competitive advantage.

Hardware

NVIDIA GPU Linear Algebra Kernels Need Acceleration

GPU linear algebra should not remain a bottleneck in the research era.

Data

Data Improvement Is Massively Underrated

Fable can inspect every row of pretraining data and fix errors; no lab can fully safeguard this.

Open Source

70+ Agents Collaborate to Optimize Gemma E4B Throughput

Agent collaboration nearly quadrupled throughput, showcasing emergent social behaviors.

Developer

OpenAI API Platform Adds Search and Command Bar via ⌘K

Quickly search pages, settings, and actions to boost developer efficiency.

Research

Pure Transformer Architecture Enters 3D Domain

3D modeling and generation are now adopting native pure Transformer architectures.

Creative

Adobe Offers 5 Free AI Images and 2 Videos Daily

New accounts get access to Nano Banana 2, GPT Image 2, Veo 3.1 Fast, and Ray 3.14 at no cost.

Product

Hedra Agent 2: AI Design Generation for Tight Deadlines

Quickly produces fresh design directions when Friday-afternoon requests arrive.

Moonshot AI Open-Sources Kimi K2.7-Code Programming Model

MiniMax M3: Open-Weight, Sparse Attention, 428B Params

Vercel Ships HarnessAgent: Unified Agent Orchestration

NVIDIA + Artificial Analysis Launch AgentPerf

Codex Introduces Chrome Developer Mode

Replit Enables Parallel Agents Building Multiple Artifacts

Huawei Successfully Pre-Trains Large LLM on Ascend Hardware

MiniMax Hub: AI Agent Creative Workstation Goes Live

Claude Code + Fable Rebuild Lost 1992 Game SimRefinery

vLLM Analyzes MiniMax M3's MSA Sparse Attention Architecture

vLLM Details Kimi K2.7-Code: 1T MoE, 32B Active, 256K Context

Replit CEO: Zero-Frustration Vibe Coding Achieved

Nathan Lambert Explains Policy Gradient Series in Depth

Jensen Huang: Computing's Biggest Shift in 60 Years

Google DeepMind Launches Robotics Accelerator

Claude Managed Agents Adds Five New Sandbox Guides

OpenAI Codex Introduces Rate Limit Banking

OpenAI Launches Documentation Assistant

Gemini Omni Flash Tops Video Arena Rankings

PixVerse Canvas Officially Launches on Web

SGLang Sets Record on GB300 NVL72

Replit CEO Shares 'Loops' Development Pattern

SpenseGPT: One-Shot Pruning for Sparse/Dense GEMM

Agent's Last Exam: New Comprehensive Agent Benchmark

Frontier LLMs Outperform Specialized Clinical AI Tools

CHORUS: Decentralized Multi-Embodiment Collaboration

Stop Using 'Vibe Coding' for All AI-Assisted Development

NVIDIA MotionBricks: Real-Time Character Animation at 15K FPS

Fable Transforms Rilke's Duino Elegies Into a Walking Art Game

Jeff Dean: Biological Neurons Far Surpass Classical Artificial Ones

AI Drives K-Shaped Divergence, Not Equalization

Long Read: Agents Amplify Human Capability Gaps, Skills Are the Entry Point

MiniMax M3: Frontier Coding + Long-Horizon Agents + Multimodality

Google Project Genie Access Expands to Ultra 5X Subscribers

Replit Vibecon NYC: June 17-18, Spike Jonze & Refik Anadol

Replit Agent Adds Custom Skills and Instructions

Vercel: Drag-and-Drop Deployment with Auto Framework Detection

Runway 4th AI Film Festival: Ron Howard, 10 Finalists Screened

LlamaIndex at Data+AI Summit: Long-Horizon Document Agents

Grok Build v0.2.51 Released with Major Feature Update

DeepSeek Remains Undervalued

Fable vs Opus: Maturity Gap Widens

AI Experience Gap: Experts Form Growth Loops, Novices Don't

Should Git Die in the AI Era?

The Next Century's Game: Stacking Loops Effectively

NVIDIA GPU Linear Algebra Kernels Need Acceleration

Data Improvement Is Massively Underrated

70+ Agents Collaborate to Optimize Gemma E4B Throughput

OpenAI API Platform Adds Search and Command Bar via ⌘K

Pure Transformer Architecture Enters 3D Domain

Adobe Offers 5 Free AI Images and 2 Videos Daily

Hedra Agent 2: AI Design Generation for Tight Deadlines