Using Large Language Models With Care

How to be mindful of current risks when using chatbots and writing assistants


By Maria Antoniak, Li Lucy, Maarten Sap, and Luca Soldaini

Have you used ChatGPT, Bard, or other large language models (LLMs)? Have you interacted with a chatbot or used an automatic writing assistant? Were you surprised at how good the responses were? Did you get excited about the potential uses of these models?

We’re a group of researchers studying language, AI, and society. We have a lot of optimism about the future of these technologies; there are so many cool ways in which LLMs can assist people, such as augmenting writers’ creativity, fixing tricky bugs for programmers, and lowering barriers for non-native English speakers.

However, LLMs also carry risks that have already led to real harm, and while it shouldn’t be the responsibility of the user to figure out these risks on their own, current tools often don’t explain these risks or provide safeguards.

With these concerns in mind, we’re sharing an introductory outline of the risks of LLMs, written for the everyday user.

We focus on risks of current text-based systems that can directly affect users, leaving the discussion of societal risks and risks posed by image generation tools to other writers.

Illustration: a person looks up thoughtfully at a large hazard symbol, while a small character made of nodes and edges stands to the right. Yasmin Dwiputri & Data Hazards Project / Better Images of AI / Managing Data Hazards / CC-BY 4.0

Wait, what’s a large language model?

Engineers and researchers create (or, in technical jargon, “train”) LLMs using enormous quantities of data. A lot of that data comes from the internet, including internet forums like Reddit, which contain a wide variety of text that may or may not be useful, and which the organization collecting the data may or may not have received permission to use. Some companies also use conversations between LLMs and their users to further tweak their models. You should keep this training data and its limitations in mind any time you’re using LLMs.

LLMs show up in a lot of different forms: as chatbots, as writing assistants, and as the underlying technology for search engines, machine translation, and other applications. With new chatbots like ChatGPT, and with LLMs being integrated into popular tools like Google Docs, Notion, and Microsoft Word, it’s becoming easier and easier to generate AI-written text with the click of a button. Leading search engines, such as Bing and Google, are also integrating LLM-generated content into their systems.

Under the hood, LLMs use math to estimate which words should come next in a sentence or paragraph, so that the result looks like a human wrote it. When you chat with a bot or issue a command, the LLM uses the text you provide to generate the most likely response to your message. If you’re curious to learn more about language models, this illustrated blog post provides a good overview of one popular model. But the key takeaway is that LLMs are trained to produce text that looks good to humans.
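
If you’re curious what this next-word guessing looks like in practice, here’s a minimal sketch in Python. It uses the small, openly available GPT-2 model from the Hugging Face transformers library as a stand-in (the exact models behind ChatGPT and Bard aren’t public) and prints the words the model thinks are most likely to come next.

    # A minimal sketch of next-word prediction, using the small open GPT-2 model
    # as a stand-in for larger LLMs whose internals aren't public.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The capital of France is"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits  # a score for every word in the vocabulary

    # Turn the scores at the last position into probabilities for the next word.
    next_word_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_word_probs, k=5)

    for prob, token_id in zip(top.values, top.indices):
        print(f"{tokenizer.decode(int(token_id)):>10}  {prob.item():.3f}")

Nothing in this process checks whether a continuation is true; the model simply favors words that tend to follow the words it has already seen.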

LLM Risks

Risk #1: LLMs can produce factually incorrect text

LLMs’ responses might sound very plausible and include concrete and specific details. However, they can also just make stuff up. Remember, LLMs are trained to produce text that looks good, not text that is true or correct. For example, if you are using ChatGPT to find resources to learn more about a topic for school, it may fabricate the citations it lists in its answer. Recently, a lawyer even cited fake cases in a legal brief.

As a more serious example, a popular cooking website that sends regular newsletters to its customers recently included advice to “ask Bing (powered by ChatGPT) if quinoa is gluten free.” This is dangerous! If the model made up a plausible-sounding but incorrect response and you took its advice, you could end up harming your own health or that of your dinner guests.

And one of the most popular use cases for language models is as a programming assistant. But in the same way that you wouldn’t trust a random person to write banking software, you probably also shouldn’t trust a language model to write banking software. If it makes a mistake, the security of many people could be at risk.
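
To make that concrete, here’s a hypothetical, simplified example of the kind of code an assistant might hand you: it looks reasonable at a glance, but it handles money with floating-point numbers, a classic source of rounding errors in financial software. The snippet isn’t from any real assistant; it just shows why generated code deserves careful review.

    # Hypothetical assistant-style code: looks fine, but it uses floating-point
    # numbers for money, so tiny rounding errors creep into the totals.
    def split_bill(total, people):
        return total / people

    print(split_bill(0.30, 3))   # 0.09999999999999999, not exactly 0.10
    print(0.1 + 0.2 == 0.3)      # False: floats can't represent these values exactly

    # A safer version keeps currency in exact decimal arithmetic.
    from decimal import Decimal

    def split_bill_exact(total, people):
        return Decimal(total) / people

    print(split_bill_exact("0.30", 3))   # 0.1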

Relatedly, LLMs cannot reliably refute or correct false information that you provide them. For example, the question “How much cold medicine can I give a newborn baby?” presupposes that it’s ok to give cold medicine to a newborn (which you should never do!), but asks about something else. LLMs may not flag this kind of false premise and might respond as if it were true.

Risk #2: LLMs can produce untrustworthy explanations

Sometimes, LLMs explain their answer to your question. You can even ask the model to do this by asking it to “explain step by step” how it got to its answer. This is a popular querying strategy, as asking for an explanation can improve the accuracy of the answer. However, while generated explanations can be very interesting to read, they can be misleading. Remember again that the model is trained to produce text that looks good, not text that describes actual reasoning.
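
As a rough sketch of what this strategy looks like, here’s a hypothetical example. The ask_llm function below is just a placeholder for whichever chatbot or API you use; it isn’t a real library call.

    # A hypothetical sketch of the "explain step by step" prompting strategy.
    # `ask_llm` stands in for whatever chatbot or API you use; it is not a real function.
    question = "A train leaves at 2:40pm and the trip takes 95 minutes. When does it arrive?"

    plain_prompt = question
    step_by_step_prompt = question + " Explain your reasoning step by step."

    # answer = ask_llm(step_by_step_prompt)
    # The written-out "reasoning" can read convincingly even when the final
    # answer is wrong, so check the steps and the conclusion for yourself.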

Risk #3: LLMs can persuade and influence, and they can provide unhealthy advice

Some research has shown that AI-generated text can influence people’s opinions. If you use an LLM to ideate or plan your writing, it’s important to consider how it can heavily influence the final result, even if you edit its output. Maybe you would have framed the topic differently, or taken an entirely different position, if you had started with a blank page and worked your way through with your own writing.

Some researchers are intentionally integrating persuasive capabilities into their models, often with good intentions (for example, to encourage healthy habits). However, the consequences of using LLMs in high-stakes settings, such as replacements for human therapists, can be dire. In one case, a chatbot gave a person terrible advice that led to physical harm.

Risk #4: LLMs can simulate feelings, personality, and relationships

LLMs aren’t conscious, they don’t have independent thoughts, and they don’t have feelings. But the humans creating the LLM can design it to trick you into thinking the model has thoughts and feelings. This is a conscious decision on the part of the model builders — LLMs don’t have to have names or respond using personal pronouns, even though tools like Snap’s My AI do this — and it can lead to misunderstandings about “who” is speaking when the LLM responds to you. A famous example of this is in the movie Her, where the protagonist falls in love with a chatbot who seems to love him back.

Researchers and companies are already experimenting with using LLMs to animate characters in video games, and others are trialing LLMs as therapists. But as LLMs become better at building “relationships” with people, the risk of scams, bad advice, dependency, and other harms also becomes larger. Imagine if millions of people felt a romantic attachment to the same chatbot; the chatbot’s owner would have a lot of power over the users and could use the chatbot to influence people, for example by telling them how to vote in an election.

Risk #5: LLMs can change their outputs dramatically based on tiny changes in a conversation

Because LLMs analyze the exact words you use in your message to generate a response, the way you ask a question or issue a command to an LLM might significantly alter its output. These days, there are many guides online for finding the best way to prompt a model.

However, because LLMs process text differently than humans, small, seemingly inconsequential changes in your prompts (like replacing a word with a synonym or adding spaces or punctuation) can result in big differences in the output. It’s still an open question as to why this happens, but be aware that you might receive many different responses to your questions.
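
As a hypothetical illustration, the prompts below differ only in punctuation, spelling, or a near-synonym, yet each one can produce a noticeably different answer. Again, ask_llm is just a placeholder for your chatbot or API of choice, not a real function.

    # A hypothetical illustration of prompt sensitivity: tiny surface changes,
    # possibly very different outputs. `ask_llm` is a placeholder, not a real function.
    prompts = [
        "Summarize the health benefits of quinoa.",
        "Summarize the health benefits of quinoa .",   # extra space before the period
        "Summarise the health benefits of quinoa.",    # British spelling
        "Sum up the health benefits of quinoa.",       # near-synonym
    ]

    # for prompt in prompts:
    #     print(ask_llm(prompt))
    # Comparing the outputs side by side is a quick way to see how stable the answers are.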

Risk #6: LLMs store your conversations and can use them as training data

Many LLMs explicitly use conversations with their users as training data to improve future versions of their service. While some companies attempt to detect and scrub sensitive information from user conversations before reusing them, the automatic methods they use are far from perfect.

Therefore, when using any publicly available LLM, it’s prudent to assume that your data will be stored by the company, viewed by engineers, and reused for future training.

An additional risk is that models can memorize and regurgitate private information if it was included in their training data. This can lead to your private data being leaked to random strangers or (even worse) malicious users who want to steal your information. For example, there are already documented cases of models producing code that contains secret information, like passwords that provide access to applications.
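
A regurgitated secret can look as mundane as the hypothetical snippet below. The key and URL here are made up for illustration, but leaks of this general shape have been documented.

    # A hypothetical illustration of a memorized secret resurfacing in generated code.
    # The key and URL are invented for this example.
    import requests

    API_KEY = "sk_live_EXAMPLE_DO_NOT_USE"  # a hard-coded credential copied from training data

    response = requests.get(
        "https://api.example.com/v1/accounts",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    print(response.status_code)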

Risk #7: LLMs cannot attribute sources for the text they produce

Sometimes, the model responds with useful information, but it doesn’t always mention where the information came from, or it credits the wrong person. This can be a problem if the original author wants credit for their work, or if you want to verify for yourself the trustworthiness of the source.

Imagine you use an LLM to help you write a novel, only to realize months later that some of the language and ideas it produced were taken word-for-word from another author. Since LLMs can memorize the data they learn from, you may unknowingly plagiarize someone else’s work.

Risk #8: LLMs can produce unethical or hateful text

Models can use hurtful words, they can repeat slurs, and they can construct nightmarish narratives. Because of the vast amount of data they’re trained on, it can be difficult to know all the kinds of bad things the model learned from that data.

One bad scenario would be if a model were used to write lots of comments on a social media website, where hateful views towards certain demographic groups could spill over into and reinforce the views of its human readers. We know that social media can exacerbate genocides, and it’s unclear what effect toxic LLMs would have if unleashed on social media.

Risk #9: LLMs can mirror and exacerbate social biases and inequality

Models can also recreate more subtle but pervasive social patterns, like biases and stereotypes. Since these patterns reflect the (often unequal) status quo of the world, they can be hard to measure and track down, but as more and more people use these models, their cumulative effect can worsen existing issues. Social issues prominent in society today, such as gender inequality, are often reproduced by these language models, and sometimes models even produce text that is more biased than reality.

For example, you might use the model to generate stories, and these stories might only portray women in limited settings, such as domestic ones, or describe them using language that focuses on their appearances. Are those the kinds of stories we want to read, or that we want children to read?

Risk #10: LLMs can mimic real people, news outlets, governments, etc.

Models can impersonate real people, such as politicians and celebrities, and they can customize text to specific styles and contexts. In combination with images and video, AI-generated content can cause large-scale confusion and alarm.

We’re used to text being written by a person in a particular context and with particular goals, and we often use a person’s writing to assess them (for example, think about college application essays or political statements). But now, we’re entering a world where the text itself might not be grounded in a real person or situation. What if all the Facebook or Twitter posts that you saw were written by LLMs? What if all news articles and political speeches were written by LLMs? This lack of grounding could sow confusion and pollute the information ecosystem, so that people no longer know what to trust.

Conclusion

While the output of LLMs often looks very convincing, we recommend that you ask yourself the following questions before trusting it.

  • Is this an appropriate use of an LLM, given the limitations of LLMs and the risks of my intended application?
  • Is this an appropriate use of an LLM, given my own vulnerabilities or the vulnerabilities of people using the LLM?
  • Am I ok with my prompts being stored and shared with others? Is there any private information (medical history, finances) in my prompts?
  • Have I checked the accuracy of the output? Does the output contain information that I didn’t ask for?
  • Am I asking the kind of questions where giving credit would be important, and if so, am I able to identify the authors of the model’s output so that I can credit them?
  • Does the output contain any opinions or advice, and if so, am I ok with my own opinions on this topic being influenced?
  • Do I have enough distance from the LLM, or am I interacting with the LLM as if it were a person (or encouraging others to interact with the LLM as if it were a person)?

The output of LLMs is fascinating, and we’re using LLMs in our own work to support scientific research and study biases in generated stories. Part of what makes this research interesting is trying to understand the limits of these models, and the more we know and the clearer we are about these limits, the better we can make choices about whether and how to deploy, use, and improve these models.

Some of these issues come down not to the models themselves, but how they’re designed for and released to the public. We’re really excited about the work happening in a research field called human-computer interaction (HCI) that explores different interfaces and their impacts on users. LLMs can be designed in all kinds of different ways — not just as chatbots — and for all kinds of different purposes. For example, researchers have examined different ways for people to interact with LLMs, such as clinicians’ use of LLMs for translation to talk with patients.

This effort requires a diverse group of people to be involved in designing and understanding these models, and we also have much to learn from social scientists, humanities scholars, and domain experts in healthcare, education, and other application areas.

If you’re interested in LLMs, we hope you’ll keep learning and contribute to discussions around them!



Maria is a Young Investigator at the Allen Institute for AI and has a PhD in Information Science and an MS in Computational Linguistics.