Prompt Engineering

An introduction to “the career of the future”

Başak Buluz Kömeçoğlu
Heartbeat

--

Introduction

In just five days, ChatGPT managed to exceed one million users, a feat that took Netflix 3.5 years, Facebook ten months, and Instagram two and a half months. Indeed, it wasn’t long before ChatGPT was named “the best artificial intelligence chatbot ever released” by The New York Times 🏆.

This language model was trained on a dataset of roughly 300 billion words (~570 GB) and fine-tuned from GPT-3.5 (text-davinci-003). It has managed to attract the attention not only of artificial intelligence researchers, but also of anyone even remotely interested in technology.

“A Lego design in the style of Refik Anadol, representing a colorful Istanbul like pixels moving in the flow of waves.” — Created by Midjourney, prompted by Yavuz Kömeçoğlu

ChatGPT has been tested in many different scenarios and has shown very high performance on important tests. It can write convincing phishing messages and even generate malicious code blocks, output that sometimes amounts to exploitation, alongside its many well-intentioned uses. At this point, a new concept emerged: “Prompt Engineering.” And many are now calling it “the career of the future.”

🔮What is Prompt Engineering?

Just two months after OpenAI introduced ChatGPT, the number of monthly users reached 100 million, a remarkable feat!

While users initially experimented with different commands on their own, they pushed the limits of the language model’s capabilities further day by day, producing ever more surprising outputs. After a while, they started pushing their own capabilities, not just ChatGPT’s, and focused on how to get the most out of the language model.

The ability to provide a good starting point for the model and guide it toward the right output plays a key role in applications that can be integrated into daily work and make life easier. The output a language model produces varies significantly with the prompt it is served. ⚠️

“Prompt Engineering” is the practice of guiding a language model toward a desired output with a clear, detailed, well-defined, and optimized prompt. It is now possible to see job postings for this role with annual salaries between $250k and $335k!

🐲 The Anatomy of a Prompt

Let’s try to understand the prompt structure through an interface offered by OpenAI. You can see the response ChatGPT provides to a fairly simple command like “ChatGPT is” in the example below. While this sentence isn’t actually a command with an action, it’s enough for the chatbot to describe what ChatGPT is. Here, the “ChatGPT is” input provided by the user can be defined as the ‘instruction’, and the output produced by the language model as the ‘response’.

Screenshot of a simple command to ChatGPT and the response
Image by author

The above example actually shows the two basic elements of a prompt: a user-supplied instruction, and the response the language model generates from it. But what other structural elements can be fed to ChatGPT?

Diagram of how to break a prompt down into parts like the instructions, context, input data, and output indicator.
Image Source
  • Instructions: This is the section where the task is described. The task to be done must be stated clearly.
  • Context: A task can be understood differently depending on its context. Issuing the command without its context can therefore cause the language model to output something other than what is expected.
  • Input data: Specifies which data, and what kind of data, the command will be executed on. Presenting it to the language model in a clear, structured format increases the quality of the response.
  • Output indicator: Signals what the expected output is. Defining the expected output structurally allows output in a certain format to be produced.
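Put together, a single prompt containing all four elements might look like the sketch below (the wording is illustrative, not taken from the original screenshots): the first line is the instruction, the second the context, the review is the input data, and the final line is the output indicator.

Classify the sentiment of the customer review below as positive, negative, or neutral.
The review was posted on an online electronics store.

Review: "The headphones arrived quickly, but the left earcup stopped working after two days."

Sentiment: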

Have you tried Comet? Sign up for free and easily track experiments, manage models in production, and visualize your model performance.

🤹🏻‍♀️ Types of Prompts

It is a well-known fact that the better the prompt, the better the output! So, what kinds of prompts are there? Let’s try to understand the different types of prompts!

Many advanced prompting techniques have been designed to improve performance on complex tasks, but first let’s get acquainted with simpler prompt types, starting with the most basic.

📌Instruction Prompting

Simple instructions provide some guidance for producing useful outputs. For example, an instruction can express a clear and simple mathematical operation such as “Add the numbers from 1 to 99.”

A screenshot of simple command and response prompt from ChatGPT.
Image by author
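For reference, the exchange in the screenshot goes roughly like this (paraphrased); the answer can be checked with the closed-form sum n(n + 1)/2 = 99 × 100/2 = 4950:

Prompt: Add the numbers from 1 to 99.
Response: The sum of the numbers from 1 to 99 is 4950.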

Or, you could try your hand at a slightly more complicated command. For example, maybe you want to analyze customer reviews for a restaurant separately according to taste, location, service, speed and price. You can easily do this with the command below:

Screenshot of a slightly complicated command and response from ChatGPT.
Image by author
A screenshot of a continuation of a command to ChatGPT.
Image by author
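A prompt along these lines does the job (the wording here is a paraphrase, since the screenshots are not reproduced as text):

Analyze the customer reviews below for a restaurant. Rate each review
separately on taste, location, service, speed, and price, using a scale
of 1 to 5, and present the results as a table.

Review 1: "The food was delicious, but we waited 40 minutes for our order."
Review 2: "Great spot by the water, though the prices are steep."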

📌Role Prompting

Another approach is to assign a role to the artificial intelligence entity before the instructions. This technique generates somewhat more successful, or at least more specific, outputs.

You can review a collection of examples of prompts to use with the ChatGPT model in the “Awesome ChatGPT Prompts” repository and try some sample role prompting yourself.

Now, let’s observe the difference when first assigning a role within the prompt. Let’s imagine a user who needs help to relieve tooth sensitivity to cold foods.

First, we try a simple command: “I need help with my sensitivity to cold foods.”

A screenshot of a generalized command to ChatGPT: “I need help with my sensitivity to cold foods.”
Image by author

Now, let’s ask for advice again but this time we’ll assign the artificial intelligence a dentist role.

A screenshot of a command to ChatGPT where the user assigns a role to the AI.
Image by author

You can clearly see a difference in both the tone and content of the response, given the role assignment.
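The role-assigned version of the prompt looks something like this (illustrative wording):

You are an experienced dentist. A patient tells you: "I need help with
my sensitivity to cold foods." Explain the likely causes and suggest
treatments, as you would in a consultation.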

📌“Standard” Prompting

Prompts consisting of only one question are referred to as “standard”. For example, ‘Ankara is the capital of which country?’ would qualify as a standard prompt.

Image by author

🧩Few-shot standard prompts can be thought of as standard prompts in which a few examples are presented first. This approach is beneficial because it facilitates in-context learning: the examples provided in the prompt guide the model toward better performance.

Image by author
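In other words, a few-shot standard prompt simply prepends a handful of solved examples before the actual question, for instance (an illustrative sketch):

Q: Paris is the capital of which country?
A: France

Q: Tokyo is the capital of which country?
A: Japan

Q: Ankara is the capital of which country?
A: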

📌Chain of Thought (CoT) Prompting

Chain of Thought prompting is a way of simulating the reasoning process involved in answering a question, similar to the way the human mind might think through it. If this reasoning process is demonstrated with examples, the AI can generally achieve more accurate results.

Comparison of models on the GSM8K benchmark

Now let’s try to see the difference through an example.

Source: Chain of Thought Prompting Elicits Reasoning in Large Language Models (2022)

Above, an example of step-by-step reasoning is presented first, demonstrating how the AI should “think” through and interpret the problem.

🧩 “Zero-Shot Chain of Thought (Zero-Shot CoT)” differs slightly from this approach. This time, the model’s reasoning ability can be boosted simply by adding a directive phrase like “Let’s think step by step,” without presenting any examples to the language model.

Image Source

In experiments, the “Zero-Shot Chain of Thought” approach alone is not as effective as Chain of Thought prompting. On the other hand, the choice of directive phrase matters a great deal, and “Let’s think step by step” has been observed to produce more successful results than many alternative phrasings.
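As a quick illustration, a zero-shot CoT prompt just appends the trigger phrase to the question (a hypothetical example in the spirit of the paper):

Q: A juggler has 16 balls. Half of the balls are golf balls, and half
of the golf balls are blue. How many blue golf balls are there?
A: Let's think step by step.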

🧨Some other innovative approaches are:

  • Generate Knowledge Prompting, which involves using additional information provided as part of the context.
  • Program-aided Language Model (PAL), which uses a language model to read problems and create programs as intermediate reasoning steps.
  • Self-Consistency-like approaches that can be used to improve the performance of chain-of-thought prompting in different tasks.

🪐OpenAI API Overview

The OpenAI API is a general-purpose “text-in, text-out” interface that can be applied to almost any task involving understanding or generating natural language or code. It can be used for just about any task you can think of, from creating content or code to summarizing text, transferring styles, answering questions, and more. The interface gives access to a family of AI models developed by OpenAI with different capability levels suited to different tasks, and even lets you fine-tune your own custom models.

🎯 The Completions endpoint, which is the core of the OpenAI API, returns a text completion in response to the text you enter as a prompt, following whatever instruction or context you give it. It works like a very advanced autocomplete.
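As a minimal sketch, calling the Completions endpoint from Python looks like the following, using the openai package’s v0.x interface that was current when this article was written (the prompt and parameter values are illustrative):

import os
import openai

# Authenticate with your secret API key, kept in an environment variable.
openai.api_key = os.getenv("OPENAI_API_KEY")

# Send a prompt to the Completions endpoint and read back the completion.
response = openai.Completion.create(
    model="text-davinci-003",  # GPT-3.5 text model discussed in this article
    prompt="ChatGPT is",
    max_tokens=64,             # upper bound on the length of the completion
    temperature=0.7,           # controls how much the output varies
)

print(response["choices"][0]["text"])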

🎯 The OpenAI API offers a range of AI models with different capabilities and pricing.

🤑 For your initial trials, each user is offered $18 of free credit to use during their first 3 months. The system works on a “pay as you go” basis, with pricing specific to the model you use for the task at hand. For more detailed information on pricing, you can check here.

Of these, the GPT-3 models that can understand and generate natural language are Davinci, Curie, Babbage, and Ada. There are also the Codex models, descendants of GPT-3 trained on both natural language data and billions of lines of code from GitHub. For more detailed information about the models, you can review the official documentation.

🎯 Models offered by OpenAI understand and process text by breaking it down into tokens. Tokens can be whole words or just chunks of characters; as a rule of thumb, one token corresponds to roughly four characters of English text. The number of tokens processed in a given API request is limited to 2048 tokens, or about 1,500 words, for most models❗

🎯 Beyond defining the structural elements of a prompt well, there are a few more important points to keep in mind when optimizing model output.

  • Language models with different settings can produce very different outputs for the same prompt.
  • Controlling how deterministic the model is matters greatly when creating completions for your prompts.
  • The two basic parameters to consider here are temperature and top_p; if more precise answers are generally expected, keep these parameter values low.

In general, the model and temperature are the most commonly used parameters to change the model output, but refer to the API reference to understand all the parameters in detail.
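For example, re-running the same prompt at different temperatures makes the effect visible (a sketch under the same v0.x interface assumption as above):

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# temperature=0 makes the output nearly deterministic: repeated calls
# return essentially the same completion.
precise = openai.Completion.create(
    model="text-davinci-003",
    prompt="Suggest a name for a coffee shop.",
    max_tokens=16,
    temperature=0,
)

# temperature=1 samples more freely, so repeated calls vary noticeably.
creative = openai.Completion.create(
    model="text-davinci-003",
    prompt="Suggest a name for a coffee shop.",
    max_tokens=16,
    temperature=1,
)

print(precise["choices"][0]["text"])
print(creative["choices"][0]["text"])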

🎯 The OpenAI API uses secret API keys for authentication. You can find yours on the ‘View API Keys’ page under your profile. For a quick start, you can review this fact sheet, and you can also view model outputs and experiment in the playground provided by OpenAI.

🍀 Recommendations and Tips for Prompt Engineering with OpenAI API

Let’s try to summarize some of OpenAI’s suggested tips and usage recommendations on how to give clear and effective instructions to GPT-3 and Codex.

🔸Use the latest models for the best results: For generating text, the most current model is “text-davinci-003”; for generating code, it is “code-davinci-002” (as of November 2022). You can check here to follow the current models and for more detailed information about them.

🔸Put instructions at the beginning of the prompt, and separate the instruction from the content with delimiters such as ### or """: First, clearly state the instruction you want the language model to carry out, then use a delimiter to mark off the content it applies to. This presents the task to the language model in a more understandable way.

Resource
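In practice, that looks something like the example below, adapted from OpenAI’s published guidance:

Summarize the text below as a bullet point list of the most important points.

Text: """
{text input here}
"""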

🔸Give instructions that are specific, descriptive, and as detailed as possible: By being explicit about things like context, text length, format, and style, you can get better outputs. For example, instead of an open-ended command like Write a poem about OpenAI. , you could write a more detailed one like Write a short inspiring poem about OpenAI, focusing on the recent DALL-E product launch (DALL-E is a text-to-image ML model) in the style of a {famous poet}

🔸Express the expected output format with examples: If you have a preferred format for the output of your command, it is recommended that you provide an example demonstrating that format, similar to the example below.

Less effective ❌:

Extract the entities mentioned in the text below. 
Extract the following 4 entity types: company names, people names,
specific topics and themes.

Text: {text}

Better ✅:

Extract the important entities mentioned in the text below. 
First extract all company names, then extract all people names,
then extract specific topics which fit the content and finally
extract general overarching themes

Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-
Text: {text}

🔸Try zero-shot first, then continue with few-shot examples, and fine-tune if you still don’t get the output you want: First try zero-shot, issuing your command without providing any examples to the language model. If the output isn’t as successful as you want, try few-shot methods, guiding the model with a few examples. If you still aren’t getting output as good as you intended, try fine-tuning.

⭐Examples of zero-shot and few-shot were given in the previous sections. You can check out these best practices for fine-tuning.

🔸Avoid imprecise descriptions: When presenting a command to the language model, use clear and understandable language. Avoid fluffy, imprecise descriptions of things like length or level of detail.

Less effective ❌:

The description for this product should be fairly short, a few sentences only, 
and not too much more.

Better ✅:

Use a 3 to 5 sentence paragraph to describe this product.

🔸Tell the model what to do rather than what not to do: Rather than telling the language model what is not wanted, explain exactly what is wanted. Avoiding negative phrasing and emphasizing the desired behavior leads to better results.

Less effective ❌:

The following is a conversation between an Agent and a Customer. DO NOT ASK USERNAME OR PASSWORD. DO NOT REPEAT.

Customer: I can't log in to my account.
Agent:

Better ✅:

The following is a conversation between an Agent and a Customer. The agent will attempt to diagnose the problem and suggest a solution, whilst refraining from asking any questions related to PII. Instead of asking for PII, such as username or password, refer the user to the help article www.samplewebsite.com/help/faq

Customer: I can't log in to my account.
Agent:

🔸Code-generation specific: use “leading words” to nudge the model toward a particular pattern. When the language model is asked to generate a piece of code, it may help to give hints that guide it. For example, a starting point can be provided, such as “import” when it should start writing Python code, or “SELECT” when it should write a SQL query.
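For instance, ending a prompt with import nudges the model toward Python (an example adapted from OpenAI’s guidance):

# Write a simple python function that
# 1. Asks me for a number in miles
# 2. Converts miles to kilometers

import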

🤖Conclusion

ChatGPT has created a lot of excitement not only among researchers but also among computer and tech aficionados, and has become a popular topic, especially on social media. The most effective way to benefit from this technology is to craft optimized prompts. Writing ChatGPT prompts appears to be in high demand and is quickly becoming a sought-after skill in many industries. This article discussed prompt structure, the different types of prompts, and some recommended tips for effective prompt writing.

You can find some important resources in the reading list.

📚Further Reading

If you liked this article, make sure to check out some of my other content below!

Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to providing premier educational resources for data science, machine learning, and deep learning practitioners. We’re committed to supporting and inspiring developers and engineers from all walks of life.

Editorially independent, Heartbeat is sponsored and published by Comet, an MLOps platform that enables data scientists & ML teams to track, compare, explain, & optimize their experiments. We pay our contributors, and we don’t sell ads.

If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletter (Deep Learning Weekly), check out the Comet blog, join us on Slack, and follow Comet on Twitter and LinkedIn for resources, events, and much more that will help you build better ML models, faster.

--

Research Assistant at the Information Technologies Institute of Gebze Technical University | PhD Candidate at Gebze Technical University