Cohere released Command A+ on May 20, 2026, an Apache 2.0 licensed open-weight sparse mixture-of-experts model built for enterprises that want to run artificial intelligence on their own infrastructure. The model accepts text, images and tool-use calls, supports 48 languages, and is the company’s first to use a mixture-of-experts architecture, with 218 billion total parameters, of which 25 billion are active for any given token.
The launch marks a strategic pivot for Cohere, which previously shipped Command A models under more restrictive terms. By offering the weights under a permissive Apache 2.0 license, the company is targeting governments, financial institutions and critical-infrastructure operators that demand data sovereignty and the ability to deploy AI inside air-gapped or virtual-private-cloud environments. The move directly challenges proprietary models that require data to leave a customer’s network.
The model’s hardware envelope is tailored for self-hosting. Cohere stated that the W4A4 quantized version can run on a single Nvidia B200 GPU or two H100 GPUs. Higher-precision variants need more compute: the FP8 version requires two B200s or four H100s, while the full BF16 weights demand four B200s or eight H100s. The context window allows 128,000 input tokens and up to 64,000 output tokens. All three quantization levels are downloadable from Hugging Face, and the model is also available through Cohere’s API and its Model Vault service.
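The GPU counts above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below multiplies the 218 billion parameters by the bytes each precision uses per weight and compares the result with aggregate GPU memory. The per-GPU capacities (192 GB for a B200, 80 GB for an H100) are assumed nominal figures, not numbers from Cohere, and the estimate covers weights only: it ignores the KV cache, activations and quantization metadata, so real headroom is smaller.

```python
# Rough weight-memory estimate. Weights only: KV cache, activations and
# quantization scales are ignored, so actual headroom will be smaller.

TOTAL_PARAMS = 218e9  # total parameters, as reported in the article

# Bytes per stored weight: 4-bit, 8-bit and 16-bit formats.
BYTES_PER_PARAM = {"W4A4": 0.5, "FP8": 1.0, "BF16": 2.0}

# ASSUMPTION: nominal HBM capacity per GPU in GB; shipping systems may expose less.
GPU_MEMORY_GB = {"B200": 192, "H100": 80}

# GPU counts Cohere stated for each precision level.
STATED_CONFIGS = {
    "W4A4": {"B200": 1, "H100": 2},
    "FP8": {"B200": 2, "H100": 4},
    "BF16": {"B200": 4, "H100": 8},
}


def weight_gb(precision: str) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[precision] / 1e9


def headroom_gb(precision: str, gpu: str, count: int) -> float:
    """Aggregate GPU memory left after loading the weights, in GB."""
    return GPU_MEMORY_GB[gpu] * count - weight_gb(precision)


def report() -> list[tuple[str, str, int, float, float]]:
    """One row per stated configuration: weights and leftover memory."""
    rows = []
    for precision, configs in STATED_CONFIGS.items():
        for gpu, count in configs.items():
            rows.append(
                (precision, gpu, count, weight_gb(precision),
                 headroom_gb(precision, gpu, count))
            )
    return rows


if __name__ == "__main__":
    for precision, gpu, count, weights, spare in report():
        print(f"{precision:5s} {count} x {gpu}: "
              f"weights ~{weights:.0f} GB, headroom ~{spare:.0f} GB")
```

Under these assumptions the weights come to roughly 109 GB, 218 GB and 436 GB for W4A4, FP8 and BF16, and every stated configuration leaves memory to spare for the cache and activations. The tightest fit is W4A4 on two H100s, with about 51 GB left over.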
Cohere described Command A+ as its fastest and most powerful Command model, pointing to improvements over predecessors on selected benchmarks including tau2-Bench Telecom, Terminal-Bench Hard, MMMU and MathVista. However, independent analysis published on May 21 by Artificial Analysis tempered those claims. The firm assigned the model an Intelligence Index score of 37, measured output speed at roughly 281 tokens per second on Cohere’s API, and recorded 63% on the multimodal MMMU-Pro benchmark. It also noted that Command A+ trailed peer models on the hardest scientific reasoning and coding tests—specifically HLE, GPQA Diamond, Terminal-Bench Hard and SciCode—and that its overall intelligence position remains mid-tier rather than frontier-leading.
The enterprise narrative is reinforced by Cohere’s emphasis on low hardware requirements and the ability to run in disconnected or highly regulated settings. The company’s press release described the model as built for “sovereign critical infrastructure,” a phrase aimed squarely at buyers who cannot let data transit to external cloud services. This positioning echoes a broader industry debate about where enterprise AI workloads will ultimately live, though a single product launch does not settle the contest between open-weight and fully proprietary approaches.
The release also leaves important questions unanswered. Cohere has not disclosed training data, training code or a full technical report, which means the model falls short of the strictest open-source definitions. No independent reproduction of the W4A4 deployment on production workloads has been published, and Cohere’s internal performance figures, including claims about quantization quality and speculative decoding, have not been audited by an outside party.
For executives evaluating deployment options, Command A+ offers a combination of permissive licensing and manageable hardware costs that will appeal to organizations with strict data-control requirements. Yet the independent benchmark record suggests it is best viewed as a competent, cost-conscious contender, not a new state of the art. The move is likely to press competitors to articulate more clearly how they plan to serve the enterprise self-hosting market.