Frontier lab releases, open-source checkpoints, multimodal systems, inference stacks, and model capability shifts.
Anthropic Claude Opus 4.1 leads a major coding benchmark
Anthropic’s Claude Opus 4.1 is described as the top-performing model on a major coding test, ahead of models from OpenAI and Google. The signal emphasizes bug fixing and working effectively across large code files.
Google Gemini 3.5 Flash combines frontier capability with fast, low-cost agentic behavior
Gemini 3.5 Flash is described as the first model to combine frontier-level intelligence with very fast, low-cost, highly agentic behavior. The writeup says it can plan and execute long multi-step tasks rather than only answer single prompts.
Perplexity moved Deep Research into Computer with multi-model routing across 20+ frontier models
Perplexity’s Deep Research is now described as running inside Computer, where it breaks a question into subtasks and routes them across more than 20 frontier models. The system is positioned for cited reports, decks, and dashboards with stronger accuracy and depth.