Concerns Over Claude's Reliability
Stella Laurenzo, AI Director at AMD, has publicly criticized the current performance of Anthropic's Claude, specifically its code generation. In a post on her GitHub account, she wrote that the model's performance has degraded to the point where it can no longer be relied on for complex engineering work. Her assessment draws on an extensive internal review covering more than 6,800 coding sessions, roughly 235,000 tool invocations, and nearly 18,000 reasoning blocks.
Laurenzo added that several senior engineers on her team have reported similar problems, pointing to a troubling rise in "stop-hook violations": cases where the AI terminates a task prematurely or makes unjustified permission requests, disrupting the development workflow. By her count, the frequency of these incidents climbed from zero to around 10 per day over the preceding month. She suggested the regression may be tied to the recent rollout of a feature called 'redact-thinking-2026-02-12,' arguing that the capacity for extended reasoning is essential to executing complex engineering projects successfully.
Behavioral Shifts and Code Quality
Beyond premature task termination and unwarranted permission requests, Laurenzo described a fundamental shift in how Claude operates: a move from a predominantly research-oriented methodology to an 'edit-first' strategy. In her view, this shift has directly degraded the quality of the generated code, which now adheres less closely to established coding conventions and becomes less dependable over extended sessions. The pattern suggests Claude may be prioritizing immediate code output over the thorough analysis and exploration that robust, maintainable software usually requires. For teams that depend on AI for sophisticated coding tasks, where precision, adherence to standards, and long-term reliability are paramount, the implications of this shift are significant.
Anthropic's Defense and Counterpoints
In response to these allegations from AMD's AI chief, Anthropic has offered a detailed rebuttal. Boris Cherny, an engineer at Anthropic, clarified that the 'redact-thinking' setting Laurenzo cited as a potential cause of the regression merely hides the model's reasoning process from the user interface; it does not diminish the AI's underlying reasoning capabilities. He also pointed to 'adaptive thinking,' introduced in Claude Opus 4.6, which lets the system dynamically adjust how long it thinks based on the demands of the task at hand. For users who want more in-depth reasoning, Cherny suggested explicitly enabling higher effort levels via the `/effort` command or by configuring the settings.json file. The default 'medium effort' setting (effort=85) aims to balance performance against resource utilization, and Anthropic confirmed it is actively testing enhanced effort configurations for Teams and Enterprise customers that would allow extended thinking at the potential cost of increased token usage and latency.
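As a concrete illustration of the settings.json route Cherny described, a persistent override might look like the sketch below. Note that this is an assumption based on his description, not documented configuration: the `effort` key name and the accepted values are hypothetical, and the actual schema may differ.

```json
{
  "effort": "high"
}
```

Within an interactive session, the same change could presumably be made on the fly with the `/effort` command, without editing the file.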














