
An AI Chatbot Was Taught To Hack Other AI Chatbots
Computer scientists from Nanyang Technological University have found a way to compromise artificial intelligence (AI) chatbots. They trained a chatbot to generate prompts that bypass the safeguards of other AI-based chatbots.
The Singapore-based researchers used a two-step method for attacking large language models (LLMs), which they call Masterkey. First, they reverse engineered how LLMs detect and defend against malicious queries. Using this information, they then trained an LLM to automatically learn and generate prompts that bypass the defenses of other LLMs.












