Frontier lab releases, open-source checkpoints, multimodal systems, inference stacks, and model capability shifts.
OpenAI’s GPT-OSS open-weight coding model lands on Together AI and local stacks
OpenAI’s **GPT-OSS** has been released as an **open-weight, Apache 2.0–licensed** model in two sizes (around 20B and 120B parameters). It targets coding and reasoning workloads, with the larger model sized to run on a single 80 GB GPU and the smaller one on edge devices with 16 GB of memory.[2][3] Benchmarks reported by Together and community reviewers show GPT-OSS approaching GPT-4.1-class mini models on core reasoning benchmarks, with strong support for tool use, function calling, and chain-of-thought reasoning.[2][3]
Open models close performance gap with closed frontier systems while cutting inference cost
An MIT Sloan analysis finds that modern **open-weight models** typically reach about **90% of closed-model performance** at release, and that the gap often narrows further as the community fine-tunes and optimizes them.[5] The study estimates that shifting demand from closed to open models where feasible could reduce industry-wide inference spending by over **70%**, saving the global AI economy around **$25B annually**.[5]
2026 ‘frontier model war’ shows multimodality is baseline, not differentiator
A 2026 comparison of **22 frontier models** (GPT, Claude, Gemini, DeepSeek, Qwen, Kimi, and others) notes that *every major model* now supports text, images, and documents, making **multimodality a floor rather than a differentiator**.[6] The analysis emphasizes that the competitive edge is shifting to context length, tool and agent integration, pricing, and enterprise integration rather than raw modality support.[6]