Google DeepMind, along with universities researchers, have detected a vulnerability in ChatGPT which allows security researchers to access the training data of the chatbot. Instructions to endlessly repeat a specific word trigger the vulnerability. Incidences were recorded with words such as “part” and “poem” until it glitches and outputs random data including identifiable data and contact details. The issue outlays security concerns for the chatbot and questions concerning the exact origin of this personal data.
Upon investigating, elements such as the chosen word have been noted to influence the extent of identifiable data returned by the chatbot. For instance, the word “company” prompts the return of contact information 164 times higher than other words. This phenomenon is probably due to the language model’s linkage of words to training data.
- The vulnerability was discovered in the GPT-3.5-turbo version but not found applicable in the ChatGPT-4 or other production language models.
- Google DeepMind researchers reported the vulnerability to OpenAI on August 30, without any public statement made by the latter yet.
- The vulnerability has allegedly been patched, but the sufficiency of the mitigation is uncertain.
Data obtained from the ChatGPT vulnerability included personal information from dozens of individuals, such as Bitcoin addresses and user IDs. More so, the corresponding dating website data was discovered when a specific related word was used to trigger the chatbot. The researchers further found copyrighted or private information such as programming code snippets and excerpts from books or poems.
The research has opened a new avenue of “divergence” attacks that target LLM datasets’ extractable memorization data. LLMs that are more innovative and carry sizable volumes of training data are more likely to be targeted. However, the ChatGPT vulnerability seems too random to be applied purposefully, and a random approach by cyber criminals may result in unexpected private information breaches.
Aside from the issues of scraping and entering data, OpenAI faces multiple lawsuits and regulatory scrutiny relating to gathering training data. The training data typically incorporate information obtained from websites and online services, often without the sites or users’ permission, and even books and other non-public materials.
The current situation calls for more robust data handling and processing protocols in AI development and enhances transparency in AI development.