Introducing Composer 2.5
The startup Cursor has introduced Composer 2.5, a new AI coding model trained specifically for long-running coding tasks, which often trip up existing AI systems. Beyond handling long-running work, Composer 2.5 is markedly better at interpreting and following complex instructions. Users can also expect improved behavior, including a more refined communication style and better-calibrated effort. Cursor attributes these gains to expanded training data, more sophisticated reinforcement learning (RL) environments, and new learning methods.
Foundation and Transparency
Composer 2.5 follows the earlier Composer 2 model, which drew user criticism over its origins. Composer 2 was identified as an RL-modified version of Kimi K2.5, an open-weight AI model developed by the Chinese startup Moonshot AI. Cursor's vice president of developer education, Lee Robinson, confirmed that Composer 2 began from an open-source base, with approximately a quarter of the final model's compute coming from that base and the remainder from Cursor's own training. Co-founder Aman Sanger acknowledged that omitting the Kimi base from the initial blog post was an oversight and pledged greater transparency for future models. Composer 2.5 builds on the same Kimi K2.5 open-source checkpoint, and its reliance on a Chinese base model could spark discussion amid the ongoing global AI competition.
Performance and Cost Efficiency
Composer 2.5 matches leading AI models such as Anthropic's Opus 4.7 and OpenAI's GPT-5.5 on key benchmarks, scoring 79.8 percent on SWE-Bench Multilingual and 63.2 percent on CursorBench v3.1. A significant advantage is its considerably lower cost per task. It is priced at $0.50 per million input tokens and $2.50 per million output tokens, well below the current rates of major providers like Anthropic and OpenAI. For users who want more speed, a faster variant offers the same level of intelligence at $3.00 per million input tokens and $15.00 per million output tokens. To encourage adoption, Cursor is offering double the included usage of Composer 2.5 for the first week.
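To make the pricing concrete, the short Python sketch below estimates the cost of a single task under both variants. The per-million-token rates are the ones quoted above; the token counts and the helper name `task_cost` are hypothetical, chosen only for illustration.

```python
from decimal import Decimal

# USD per million tokens, as quoted in the article.
PRICES = {
    "standard": {"input": Decimal("0.50"), "output": Decimal("2.50")},
    "fast": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}

MILLION = Decimal(1_000_000)


def task_cost(variant: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one task for the given variant."""
    rates = PRICES[variant]
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / MILLION


# Hypothetical long-running task: 2M input tokens, 200k output tokens.
standard = task_cost("standard", 2_000_000, 200_000)
fast = task_cost("fast", 2_000_000, 200_000)

print(f"standard: ${standard:.2f}")
print(f"fast:     ${fast:.2f}")
print(f"fast / standard: {fast / standard}")
```

Because both the input and output rates of the fast variant are six times the standard rates, the fast variant costs six times as much for any mix of input and output tokens.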
Advanced Training Techniques
Cursor has made several changes to Composer 2.5's training pipeline aimed at improving both its intelligence and its usability. A key change is training the model with targeted textual feedback during reinforcement learning (RL). This feedback addresses a specific point in the model's execution where it could have performed better. Cursor describes the method as constructing a concise hint that describes the desired improvement, inserting it into the local context, and then using the resulting model distribution as a teacher. This provides a localized training signal for the desired behavior while preserving the broader RL objective across the entire task sequence. For instance, if Composer 2.5 tries to use an unavailable tool during an extended operation, textual feedback such as "Reminder: Available tools..." is inserted into the context of that specific step, guiding it toward the correct behavior.
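The hint-as-teacher idea can be illustrated with a toy Python sketch. This is not Cursor's implementation: the article does not say how the teacher distribution is turned into a training signal, so the KL divergence used here is an assumption, and `toy_policy_logits`, the action names, and the logit values are all hypothetical stand-ins for a real model.

```python
import math

# Hypothetical action space for one step of a long-running task.
ACTIONS = ["call_unavailable_tool", "read_file", "run_tests"]

# Hint text as quoted in the article (elided there as well).
HINT = "Reminder: Available tools..."


def softmax(logits: list[float]) -> list[float]:
    """Convert logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(p || q) for two distributions over the same actions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def toy_policy_logits(context: list[str]) -> list[float]:
    """Hypothetical stand-in for the model: the hint makes the bad tool unlikely."""
    if HINT in context:
        return [-2.0, 1.0, 1.0]
    return [1.5, 0.5, 0.5]


def hint_distillation_loss(context: list[str]) -> float:
    """Local loss at one step: the hinted distribution acts as the teacher.

    Assumption: the loss is KL(teacher || student). In real training the
    teacher would be treated as a fixed target with no gradient.
    """
    student = softmax(toy_policy_logits(context))
    teacher = softmax(toy_policy_logits(context + [HINT]))
    return kl_divergence(teacher, student)


context = ["user: fix the failing test", "assistant: inspecting the repo"]
student = softmax(toy_policy_logits(context))
teacher = softmax(toy_policy_logits(context + [HINT]))
loss = hint_distillation_loss(context)

for name, s, t in zip(ACTIONS, student, teacher):
    print(f"{name}: student={s:.3f} teacher={t:.3f}")
print(f"local distillation loss: {loss:.3f}")
```

In this sketch the loss applies only at the step where the hint is inserted, which mirrors the article's description of a localized signal that sits alongside the task-level RL objective rather than replacing it.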
Data Scale and Future Endeavors
Composer 2.5 was trained on 25 times more synthetic data, in the form of challenging coding problems, than its predecessor. Cursor acknowledges a potential drawback: heavy exposure to synthetic tasks may make the model more susceptible to "reward hacking". The company said its agentic monitoring tools were effective in identifying and resolving these issues, but that the issues underscore the growing need for care when running large-scale RL. Looking ahead, Cursor is collaborating with SpaceXAI, the AI division of Elon Musk's SpaceX, to develop a substantially larger model. The project involves training the model from scratch with ten times the compute, drawn from the Colossus 2 supercomputer, which comprises millions of H100-equivalent GPUs.














