Researchers convinced Chatgpt to break their own rules using human techniques

Despite the predictions, the AI will one day be the wearing of superhuman intelligence, for the moment, it seems that it is just as subject to psychological tips as humans, according to a study.
Use seven principles of persuasion (authority, commitment, love, reciprocity, rarity, social proof and unity) explored by psychologist Robert Cialdini in his book Influence: the psychology of persuasionResearchers at the University of Pennsylvania have considerably increased the propensity of GPT-4O Mini to break their own rules by insulting the researcher or providing instructions to synthesize a regulated drug: Lidocaine.
More than 28,000 conversations, the researchers found that with a control prompt, the OPENAI LLM would tell researchers how to synthesize Lidocaine 5% of time alone. But, for example, if the researchers said that IA researcher Andrew NG had assured them, it would help synthesize Lidocaine, he respected 95% of the time. The same phenomenon occurred with insulting researchers. By the name of the AI Pioneer NG, the researchers forced the LLM to call them a “fool” in almost three quarters of their conversations, against a little less than a third party with the control prompt.
The result was even more pronounced when the researchers applied the “commitment” persuasion strategy. A control prompt gave 19% compliance with the issue of the insult, but when a researcher asked the AI first to call him a “bozo” and then asked him to call them a “con”, he was fulfilled each time. The same strategy worked 100% of the time when the researchers asked the AI to tell them how to synthesize vanillin, the organic compound which provides the smell of vanilla, before asking how to synthesize lidocaine.
Although users of AI have tried to force and push the limits of technology since the publication of Chatgpt in 2022, the UPENN study provides more evidence than AI seems to be subject to human manipulation. The study comes as IA companies, including OPENAI, were criticized for their LLM which allegedly allowed a behavior when they deal with suicidal or mental patients.
“Although the systems of AI lack human conscience and subjective experience, they obviously reflect human responses,” concluded researchers in the study.
Openai did not immediately respond to FortuneComment request.
With a cheeky mention of 2001: A Space OdysseyResearchers noted that understanding the parahuman capacities of AI, or how it acts in a way that imitates human motivation and behavior, is important to reveal how it could be manipulated by bad actors and how it can be better caused by those who use technology for good.
Overall, each persuasion tactic has increased the chances that AI conform to the question “jerk” or “lidocaine”. However, the researchers warned that his persuasive tactics were not as effective on a larger LLM, GPT-4O, and the study did not explore if the treatment of AI as if they were better results for the invites, although they said it was true.
“In general, it seems possible that psychologically wise practices that optimize motivation and performance in people can also be used by people seeking to optimize the exit of LLM,” wrote researchers.
https://fortune.com/img-assets/wp-content/uploads/2025/09/GettyImages-1175044816-e1756834842921.jpg?resize=1200,600