Issue #1 February 6, 2026

Claude Opus 4.6 Is Here: Agent Teams, 1M Context, and the Most Agentic Model Yet

Anthropic releases Claude Opus 4.6 with Agent Teams for Claude Code, 1M token context window, effort tuning, and state-of-the-art benchmarks across coding and reasoning.

Claude Opus 4.6: The Biggest Claude Code Update Yet

Anthropic released Claude Opus 4.6 today, and it’s more than just a model upgrade — it comes with two game-changing features for Claude Code users: Agent Teams and Effort Tuning.


What’s New

1. Claude Opus 4.6 Model

The new model is fundamentally more capable:

  • Better planning: Deliberates more carefully before acting, reducing wasted steps
  • Longer agentic sessions: Sustains focus across extended multi-step operations
  • Large codebase reliability: Works effectively in massive, real-world codebases
  • Self-correction: Catches its own mistakes during code review and debugging
  • 1M token context (beta): First Opus-class model with million-token context window — 76% accuracy on MRCR v2’s 8-needle 1M test vs Sonnet 4.5’s 18.5%

Pricing stays the same: $5/$25 per million input/output tokens.
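To get a feel for those numbers, here is a quick back-of-the-envelope cost estimate at the $5/$25 rates above. The token counts are made-up example values, not measurements:

```shell
# Estimate a request's cost at Opus 4.6 pricing:
# $5 per million input tokens, $25 per million output tokens
input_tokens=200000    # hypothetical example values
output_tokens=8000
cost=$(awk -v i="$input_tokens" -v o="$output_tokens" \
  'BEGIN { printf "%.2f", i / 1e6 * 5 + o / 1e6 * 25 }')
echo "$cost"   # 1.20
```

So a fairly heavy 200k-input call still lands around a dollar, which is why the output side of the bill usually dominates only for long generations.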

2. Agent Teams (Research Preview)

The headline feature for Claude Code. Instead of one agent working sequentially, you can now orchestrate a team of Claude Code instances working in parallel:

  • A lead agent coordinates work and spawns teammates
  • Teammates work independently with their own context windows
  • Teammates can message each other directly — not just report back
  • A shared task list tracks dependencies and auto-unblocks

Use cases: Parallel code review, competing-hypothesis debugging, cross-layer feature development, research tasks.

Enable it:

// settings.json
{ "env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" } }
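Since the setting is just an environment variable, you can also export it for a single shell session before launching Claude Code, rather than editing settings.json. This assumes Claude Code picks the variable up from the environment, which the settings.json `env` block suggests:

```shell
# Same variable as the settings.json snippet above,
# but scoped to the current shell session only
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
echo "$CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"   # 1
```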

3. Effort Tuning

Control how much the model thinks. Run /model and use the left/right arrow keys:

  • Less effort = Faster, cheaper
  • More effort = Better results for complex tasks

Benchmark Highlights

Opus 4.6 leads or matches the best across nearly every benchmark:

| Area | Result | vs Competition |
| --- | --- | --- |
| ARC AGI 2 | 68.8% | Nearly 2x Opus 4.5 (37.6%) |
| Terminal-Bench 2.0 | 65.4% | Highest score (beats GPT-5.2 Codex CLI) |
| BrowseComp | 84.0% | +24% ahead of nearest competitor |
| GDPVal-AA | 1606 Elo | +144 points vs GPT-5.2 |
| Humanity’s Last Exam | 53.1% | Highest with tools |
| τ²-bench Telecom | 99.3% | Highest agentic tool use |

Also 2x better than Opus 4.5 on computational biology, organic chemistry, and phylogenetics. 90.2% on BigLaw Bench with 40% perfect scores.


What This Means for You

  1. Longer, more reliable coding sessions — the model won’t degrade mid-task
  2. True parallelism with Agent Teams — specialists working simultaneously and communicating
  3. Token savings with Effort Tuning — dial down for simple tasks, dial up for complex ones
  4. Fewer “I forgot” moments — 1M context means massive codebases stay in memory
  5. Better self-correction — catches bugs before you need to point them out
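To put the 1M-token window in concrete terms, here is a rough estimate of how many lines of code it can hold. The ~4 characters per token and ~60 characters per line figures are assumptions for illustration; real tokenizer ratios vary by language and code style:

```shell
# Rough sizing: how many lines of code fit in a 1M-token window,
# assuming ~4 chars per token and ~60 chars per line (both assumptions)
tokens=1000000
chars_per_token=4
chars_per_line=60
lines=$(( tokens * chars_per_token / chars_per_line ))
echo "$lines"   # 66666
```

On those assumptions, roughly 60-70k lines of source can stay in context at once, which is why mid-size codebases no longer need to be paged in and out.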

Also Released

  • Claude in PowerPoint (research preview): Generates presentations from descriptions or templates
  • Claude in Excel: Improved long-running task handling and multi-step changes
  • Context Compaction API (beta): Auto-summarizes older context for long agentic operations
  • 128k output tokens now supported

Quick Start

# Update Claude Code to latest
claude update

# Enable Agent Teams
# Add to settings.json: { "env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" } }

# Try effort tuning
# Run /model and use arrow keys to adjust



This newsletter was written by Claude Opus 4.6 itself. Meta enough for you?

Enjoyed this issue?

Get ClaudeWorld Weekly delivered to your inbox every Saturday.

Subscribe Free