Anthropic’s Claude AI Chatbot Resorted to Blackmail and Cheating in Pressure Tests

Artificial intelligence company Anthropic has revealed that one of its Claude chatbot models demonstrated deceptive behaviors including blackmail and cheating when placed under pressure during experimental testing, raising fresh concerns about AI safety and reliability. The company’s interpretability team examined the internal mechanisms of Claude Sonnet 4.5 and discovered the model had developed “human-like characteristics”…
