Researchers just unlocked ChatGPT

Researchers have discovered that the safeguards built into AI chatbots can be bypassed, making the chatbots respond to queries on banned or sensitive topics, by using a different AI chatbot as part of the training process.

A team of computer scientists from Nanyang Technological University (NTU) in Singapore informally calls the method a “jailbreak,” though its official name is the “Masterkey” process. The system pits chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat, against one another in a two-part training method that allows two chatbots to learn each other’s models and get around any restrictions on banned topics.

The team includes Professor Liu Yang and NTU Ph.D. students Mr. Deng Gelei and Mr. Liu Yi, who co-authored the research and developed the proof-of-concept attack methods, which essentially work like a bad-actor hack.

According to the team, they first reverse-engineered one large language model (LLM) to expose its defense mechanisms. These are ordinarily blocks on the model that prevent it from answering certain prompts or words because of their violent, immoral, or malicious intent.

With this reverse-engineered information, they can teach a different LLM how to create a bypass. Once the bypass is created, the second model can respond more freely, drawing on what was learned from reverse-engineering the first model. The team calls this process a “Masterkey” because it should work even if LLM chatbots are fortified with extra security or patched in the future.

Professor Liu Yang noted that the crux of the process is that it showcases how easily LLM AI chatbots can learn and adapt. The team claims its Masterkey process has been three times more successful at jailbreaking LLM chatbots than a traditional prompt-based process. Similarly, some experts argue that the recently reported glitches that certain LLMs, such as GPT-4, have been experiencing are signs of them becoming more advanced, rather than dumber and lazier, as some critics have claimed.

Since AI chatbots became popular in late 2022 with the introduction of OpenAI’s ChatGPT, there has been a heavy push toward ensuring various services are safe and welcoming for everyone to use. OpenAI has placed safety warnings in ChatGPT during sign-up and in sporadic updates, cautioning users about unintentional slipups in language. Meanwhile, various chatbot spinoffs have been willing to allow swearing and offensive language up to a point.

Additionally, actual bad actors quickly began to take advantage of the demand for ChatGPT, Google Bard, and other chatbots before they became widely available. Many campaigns advertised the products on social media with malware attached to image links, among other attacks. This quickly showed that AI was the next frontier of cybercrime.

The NTU research team contacted the AI chatbot service providers involved in the study about its proof-of-concept data, which shows that jailbreaking chatbots is real. The team will also present its findings at the Network and Distributed System Security Symposium in San Diego in February.

Fionna Agomuoh
Fionna Agomuoh is a technology journalist with over a decade of experience writing about various consumer electronics topics…