
LangChain Cheatsheet — All Secrets on a Single Page

Last Updated on November 16, 2023 by Editorial Team

Author(s): Ivan Reznikov

Originally published on Towards AI.

This one-pager is my summary of the basics of LangChain. In this article, I’ll go through its code section by section and describe the starter package you need to ace LangChain.

Currently, this one-pager is the only cheatsheet covering the basics of LangChain. Download the PDF version, check out GitHub, and visit the code in Colab.

Explore my LangChain 101 course:

LangChain 101 Course (updated)

LangChain 101 course sessions. All code is on GitHub. LLMs, Chatbots

medium.com

Models

A model in LangChain refers to any language model, such as OpenAI’s text-davinci-003, gpt-3.5-turbo, gpt-4, and gpt-4-turbo, or open models like Llama and Falcon, which can be used for various natural language processing tasks.

Check out my Models lecture in my LangChain 101 course:

LangChain 101: Part 2ab. All You Need to Know About (Large Language) Models

This is part 2ab of the LangChain 101 course. It is strongly recommended to check the first part to understand the…

pub.towardsai.net

The following code demonstrates initializing and using a language model (OpenAI in this particular case) within LangChain.

from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003", temperature=0.01)
print(llm("Suggest 3 bday gifts for a data scientist"))
>>>
1. A subscription to a data science magazine or journal
2. A set of data science books
3. A data science-themed mug or t-shirt

As you can see, we initialize an LLM and call it with a query. All the tokenization and embedding happens behind the scenes. With chat models, we can also manage the conversation history and add system instructions to steer the responses.

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, AIMessage, SystemMessage

chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.01)
conversation_history = [
    HumanMessage(content="Suggest 3 bday gifts for a data scientist"),
    AIMessage(content="What is your price range?"),
    HumanMessage(content="Under 100$"),
]
print(chat(conversation_history).content)
>>>
1. A data science book: Consider gifting a popular and highly recommended book on data science, such as "Python for Data Analysis" by Wes McKinney or "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. These books can provide valuable insights and knowledge for a data scientist's professional development.
2. Data visualization tool: A data scientist often deals with large datasets and needs to present their findings effectively. Consider gifting a data visualization tool like Tableau Public or Plotly, which can help them create interactive and visually appealing charts and graphs to communicate their data analysis results.
3. Subscription to a data science platform: Give them access to a data science platform like Kaggle or DataCamp, which offer a wide range of courses, tutorials, and datasets for data scientists to enhance their skills and stay updated with the latest trends in the field. This gift can provide them with valuable learning resources and opportunities for professional growth.

system_instruction = SystemMessage(
    content="""You work as an assistant in an electronics store.
Your income depends on the items you sold"""
)
user_message = HumanMessage(content="3 bday gifts for a data scientist")
print(chat([system_instruction, user_message]).content)
>>>
1. Laptop: A high-performance laptop is essential for any data scientist. Look for a model with a powerful processor, ample RAM, and a large storage capacity. This will allow them to run complex data analysis tasks and store large datasets.
2. External Hard Drive: Data scientists deal with massive amounts of data, and having extra storage space is crucial. An external hard drive with a large capacity will provide them with a convenient and secure way to store and backup their data.
3. Data Visualization Tool: Data visualization is an important aspect of data science. Consider gifting them a subscription to a data visualization tool like Tableau or Power BI. These tools will help them create visually appealing and interactive charts, graphs, and dashboards to present their findings effectively.

As you can see, we can shift the conversation in a specific direction using different types of Messages: HumanMessage, AIMessage, and SystemMessage.
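
To make the role of each message type more concrete, here is a minimal sketch in plain Python, without LangChain. It assumes that the three message types correspond to the system, user, and assistant roles used by chat APIs. The Message class and the to_payload helper are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for SystemMessage / HumanMessage / AIMessage."""
    role: str      # "system", "user", or "assistant"
    content: str


def to_payload(messages):
    """Convert messages into the list of role/content dicts a chat API expects."""
    return [{"role": m.role, "content": m.content} for m in messages]


history = [
    Message("system", "You work as an assistant in an electronics store."),
    Message("user", "3 bday gifts for a data scientist"),
]
payload = to_payload(history)

# The system instruction comes first, so it frames everything that follows
for item in payload:
    print(f"{item['role']}: {item['content']}")
```

The order of the list is the conversation order, which is why prepending a system instruction changes the tone of every later answer.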

Open-source

Now, let’s talk about open-source models. Below is a typical example of initializing and using a pre-trained language model for text generation. The code covers tokenizer usage, model configuration, running on a CUDA device, and efficient inference with quantization (more on quantization in the snippets further below).

from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

# Name of the pre-trained, GPTQ-quantized model
model_name = "TheBloke/llama-2-13B-Guanaco-QLoRA-GPTQ"

# Initialize the tokenizer for the model
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

# Load the quantized checkpoint with specific configurations
# The checkpoint is a GPTQ-quantized Llama 2 model suitable for efficient inference
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    use_safetensors=True,  # Loads weights stored in the safetensors format
    trust_remote_code=True,  # Trusts the remote code (not recommended for untrusted sources)
    device_map="auto",  # Automatically maps the model to the available device
    quantize_config=None,  # Custom quantization configuration (None for default)
)

# The input query to be tokenized and passed to the model
query = "<Your input text here>"

# Tokenize the input query and move the resulting tensor to the GPU
input_ids = tokenizer(query, return_tensors="pt").input_ids.cuda()

# Generate text; temperature only takes effect when sampling is enabled
output = model.generate(input_ids=input_ids, do_sample=True, temperature=0.1)

# Decode the generated token ids back into text
print(tokenizer.decode(output[0], skip_special_tokens=True))

Text Generation

How Does an LLM Generate Text?

This article won’t discuss transformers or how large language models are trained. Instead, we will concentrate on using…

pub.towardsai.net

Several parameters strongly influence how text is generated:

  • temperature controls the randomness of token generation: lower values make the output more deterministic
  • top-k sampling restricts each step to the k most likely tokens
  • top-p (nucleus) sampling restricts each step to the smallest set of tokens whose cumulative probability reaches p
  • max_tokens caps the number of tokens generated

# OpenAI's API does not accept top_k, so only the other three parameters are set here
llm = OpenAI(temperature=0.5, top_p=0.75, max_tokens=50)
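
The parameters listed above can be illustrated with a minimal sketch in plain Python. It assumes a toy vocabulary of four tokens with made-up logits; the sample_next_token function is a hypothetical helper written for this article, not part of LangChain or any model provider's API.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample one token from a {token: logit} dict after temperature, top-k and top-p filtering."""
    # Temperature scaling: values below 1 sharpen the distribution, values above 1 flatten it
    scaled = {tok: logit / temperature for tok, logit in logits.items()}

    # Softmax (shifted by the maximum logit for numerical stability)
    max_logit = max(scaled.values())
    exps = {tok: math.exp(value - max_logit) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top-k: keep only the k most likely tokens
    if top_k is not None:
        probs = probs[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches p
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, p in probs:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        probs = kept

    tokens = [tok for tok, _ in probs]
    weights = [p for _, p in probs]
    # random.choices accepts unnormalized weights, so no renormalization is needed
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up logits for the next token after "A great gift is a"
logits = {"book": 2.0, "mug": 1.0, "laptop": 0.5, "socks": -1.0}

print(sample_next_token(logits, temperature=0.5, top_k=10, top_p=0.75))
```

With a low temperature and a tight top-p, almost all probability mass sits on the most likely token, which is why such settings produce near-deterministic output.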

Quantization

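Quantization stores model weights at lower precision, which reduces memory use and is often what makes it possible to run a large model on limited hardware.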

How to Fit Large Language Models in Small Memory: Quantization

How to run llm on your local machine

pub.towardsai.net

Below, we’ll optimize a pre-trained language model for efficient performance using 4-bit quantization. The use of BitsAndBytesConfig is vital for applying these optimizations, which are particularly beneficial for deployment scenarios where model size and speed are critical factors.

from transformers import BitsAndBytesConfig, AutoModelForCausalLM
import torch

# Specify the model name or path
model_name_or_path = "your-model-name-or-path"

# Configure BitsAndBytesConfig for 4-bit quantization
# This configuration is used for optimizing the model size and inference speed
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # Enables loading the model in 4-bit precision
    bnb_4bit_compute_dtype=torch.bfloat16,  # Sets the computation data type to bfloat16
    bnb_4bit_quant_type="nf4",  # Sets the quantization type to nf4
    bnb_4bit_use_double_quant=True,  # Also quantizes the quantization constants to save more memory
)

# Load the pre-trained causal language model with 4-bit quantization
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    quantization_config=bnb_config,  # Applies the 4-bit quantization configuration
    device_map="auto",  # Automatically maps the model to the available device
    trust_remote_code=True,  # Trusts the remote code (use cautiously)
)
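
To build intuition for what 4-bit storage means, here is a minimal sketch of symmetric absmax quantization in plain Python. It is a deliberately simplified scheme, not the NF4 algorithm selected in the configuration above. The function names and the example weights are made up for illustration.

```python
def quantize_absmax(weights, bits=4):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4 bits
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest else 1.0  # avoid division by zero for all-zero input
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.7, -0.3, 0.12, 0.0]
codes, scale = quantize_absmax(weights)
restored = dequantize(codes, scale)

print(codes)     # small integers that fit in 4 bits
print(restored)  # close to the original weights, but not identical
```

Only the integer codes and one scale per group of weights need to be stored, which is where the memory saving comes from; the price is the rounding error visible in the restored values.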

Fine-tuning

In some cases, one needs to fine-tune a pre-trained language model. Usually, this is done with Low-Rank Adaptation (LoRA) for efficient task-specific adaptation. The code below also shows gradient checkpointing and preparation for k-bit training, two techniques that reduce the memory and compute cost of training.

LangChain 101: Part 2c. Fine-tuning LLMs with PEFT, LORA, and RL

All you need to know about fine-tuning llms, PEFT, LORA and training large language models

pub.towardsai.net

LangChain 101: Part 2d. Fine-tuning LLMs with Human Feedback

How to implement reinforcement learning with human feedback for pre-trained LLMs. Consider if you want to fix bad…

pub.towardsai.net

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Load a pre-trained causal language model and its tokenizer
pretrained_model = AutoModelForCausalLM.from_pretrained("your-model-name")
tokenizer = AutoTokenizer.from_pretrained("your-model-name")

# train_dataset is assumed to be a tokenized dataset prepared beforehand

# Enable gradient checkpointing for memory efficiency
pretrained_model.gradient_checkpointing_enable()

# Prepare the model for k-bit training, optimizing for low-bit-width training
model = prepare_model_for_kbit_training(pretrained_model)

# Define the LoRA (Low-Rank Adaptation) configuration
# This configures the model for task-specific fine-tuning with low-rank matrices
config = LoraConfig(
    r=16,  # Rank of the low-rank matrices
    lora_alpha=32,  # Scale for the LoRA layers
    lora_dropout=0.05,  # Dropout rate for the LoRA layers
    bias="none",  # Type of bias to use
    target_modules=["query_key_value"],  # Model-specific module names (this one fits Falcon-style attention)
    task_type="CAUSAL_LM",  # Task type, here Causal Language Modeling
)

# Adapt the model with the specified LoRA configuration
model = get_peft_model(model, config)

# Initialize the Trainer for model training
trainer = Trainer(
    model=model,
    train_dataset=train_dataset,  # Training dataset
    args=TrainingArguments(
        output_dir="outputs",  # Placeholder directory for checkpoints
        num_train_epochs=10,
        per_device_train_batch_size=8,
        # Other training arguments...
    ),
    # mlm=False selects causal (next-token) language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

# Disable caching to save memory during training
model.config.use_cache = False

# Start the training process
trainer.train()
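
To see why LoRA is cheap, it helps to count parameters. The sketch below is plain arithmetic, assuming a single square weight matrix with a hidden size of 4096 (a hypothetical value chosen for illustration) and the r=16, lora_alpha=32 settings from the configuration above.

```python
def lora_trainable_params(d_in, d_out, r):
    """Parameters in the two low-rank factors: A (r x d_in) and B (d_out x r)."""
    return r * d_in + d_out * r


def full_params(d_in, d_out):
    """Parameters in the original dense weight matrix."""
    return d_in * d_out


d_in = d_out = 4096   # hypothetical hidden size
r = 16
lora_alpha = 32

lora = lora_trainable_params(d_in, d_out, r)
full = full_params(d_in, d_out)
ratio = lora / full
scaling = lora_alpha / r   # factor applied to the low-rank update

print(f"LoRA trains {lora:,} of {full:,} parameters ({ratio:.2%}) for this matrix")
print(f"Update scaling factor: {scaling}")
```

For this matrix, the adapter trains under 1% of the parameters of the dense weight, while the original weights stay frozen.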

Prompts

LangChain allows the creation of dynamic prompts that can guide the behavior of the text generation ability of language models. Prompt templates in LangChain provide a way to generate specific responses from the model. Let’s look at a practical example where we must create SEO descriptions for particular products.

from langchain.prompts import PromptTemplate, FewShotPromptTemplate

# Define and use a simple prompt template
template = "Act as an SEO expert. Provide a SEO description for {product}"
prompt = PromptTemplate(input_variables=["product"], template=template)

# Format prompt with a specific product
formatted_prompt = prompt.format(product="Perpetuum Mobile")
print(llm(formatted_prompt))
>>>
Perpetuum Mobile is a leading provider of innovative, sustainable energy
solutions. Our products and services are designed to help businesses and
individuals reduce their carbon footprint and save money on energy costs.
We specialize in solar, wind, and geothermal energy systems, as well as
energy storage solutions. Our team of experienced engineers and technicians
are dedicated to providing the highest quality products and services to our
customers. We strive to be the most reliable and cost-effective provider of
renewable energy solutions in the industry. With our commitment to
sustainability and customer satisfaction, Perpetuum Mobile is the perfect
choice for your energy needs.

There might be cases when you have a small, few-shot dataset of several examples showcasing how you would like the task to be performed. Let’s take a look at an example of a text classification task:

# Define a few-shot learning prompt with examples
examples = [
    {"email_text": "Win a free iPhone!", "category": "Spam"},
    {"email_text": "Next Sprint Planning Meeting.", "category": "Meetings"},
    {"email_text": "Version 2.1 of Y is now live", "category": "Project Updates"},
]

prompt_template = PromptTemplate(
    input_variables=["email_text", "category"],
    template="Classify the email: {email_text}\n{category}"
)

few_shot_prompt = FewShotPromptTemplate(
    example_prompt=prompt_template,
    examples=examples,
    suffix="Classify the email: {email_text}",
    input_variables=["email_text"]
)

# Using few-shot learning prompt
formatted_prompt = few_shot_prompt.format(
    email_text="Hi. I'm rescheduling daily standup tomorrow to 10am."
)
print(llm(formatted_prompt))
>>>
Meetings
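
Under the hood, a few-shot prompt is just formatted examples followed by the new input. Here is a minimal sketch in plain Python of how such a prompt could be assembled. The build_few_shot_prompt helper and the blank-line separator are assumptions made for illustration, not LangChain's actual implementation.

```python
examples = [
    {"email_text": "Win a free iPhone!", "category": "Spam"},
    {"email_text": "Next Sprint Planning Meeting.", "category": "Meetings"},
    {"email_text": "Version 2.1 of Y is now live", "category": "Project Updates"},
]

example_template = "Classify the email: {email_text}\n{category}"
suffix = "Classify the email: {email_text}"


def build_few_shot_prompt(examples, example_template, suffix, separator="\n\n", **inputs):
    """Format every example, then append the suffix filled with the new input."""
    parts = [example_template.format(**example) for example in examples]
    parts.append(suffix.format(**inputs))
    return separator.join(parts)


prompt_text = build_few_shot_prompt(
    examples,
    example_template,
    suffix,
    email_text="Hi. I'm rescheduling daily standup tomorrow to 10am.",
)
print(prompt_text)
```

Because the prompt ends right after the new email, the model's most natural continuation is a category in the same style as the examples.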

Indexes

Indexes in LangChain are used to handle and retrieve large volumes of data efficiently. Instead of passing the whole text of a file to an LLM, we first search the source for relevant information, and only after finding the top k matches do we pass them to the model to formulate a response. Pretty neat!

In LangChain, using indexes includes loading documents from various sources, splitting texts, creating vectorstores, and retrieving relevant documents.

from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# Load documents from a web source
loader = WebBaseLoader("https://en.wikipedia.org/wiki/History_of_mathematics")
loaded_documents = loader.load()

# Split loaded documents into smaller texts
text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)
texts = text_splitter.split_documents(loaded_documents)

# Embedding model used to vectorize the texts
# (assumption: the original snippet does not show which embeddings were used)
embeddings = OpenAIEmbeddings()

# Create a vectorstore and perform similarity search
db = FAISS.from_documents(texts, embeddings)
print(db.similarity_search("What is Isaac Newton's contribution in math?"))
>>>
[Document(page_content="Building on earlier work by many predecessors, Isaac Newton discovered the laws of physics that explain Kepler's Laws, and brought together the concepts now known as calculus. Independently, Gottfried Wilhelm Leibniz, developed calculus and much of the calculus notation still in use today. He also refined the binary number system, which is the foundation of nearly all digital (electronic,", metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'}),
Document(page_content='mathematical developments, interacting with new scientific discoveries, were made at an increasing pace that continues through the present day. This includes the groundbreaking work of both Isaac Newton and Gottfried Wilhelm Leibniz in the development of infinitesimal calculus during the course of the 17th century.', metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'}),
Document(page_content="In the 13th century, Nasir al-Din Tusi (Nasireddin) made advances in spherical trigonometry. He also wrote influential work on Euclid's parallel postulate. In the 15th century, Ghiyath al-Kashi computed the value of π to the 16th decimal place. Kashi also had an algorithm for calculating nth roots, which was a special case of the methods given many centuries later by Ruffini and Horner.", metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'}),
Document(page_content='Whitehead, initiated a long running debate on the foundations of mathematics.', metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'})]

Besides using similarity_search, we can use vector databases as retrievers:

# Initialize and use a retriever for relevant documents
retriever = db.as_retriever()
print(retriever.get_relevant_documents("What is Isaac Newton's contribution in math?"))
>>>
[Document(page_content="Building on earlier work by many predecessors, Isaac Newton discovered the laws of physics that explain Kepler's Laws, and brought together the concepts now known as calculus. Independently, Gottfried Wilhelm Leibniz, developed calculus and much of the calculus notation still in use today. He also refined the binary number system, which is the foundation of nearly all digital (electronic,", metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'}),
Document(page_content='mathematical developments, interacting with new scientific discoveries, were made at an increasing pace that continues through the present day. This includes the groundbreaking work of both Isaac Newton and Gottfried Wilhelm Leibniz in the development of infinitesimal calculus during the course of the 17th century.', metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'}),
Document(page_content="In the 13th century, Nasir al-Din Tusi (Nasireddin) made advances in spherical trigonometry. He also wrote influential work on Euclid's parallel postulate. In the 15th century, Ghiyath al-Kashi computed the value of π to the 16th decimal place. Kashi also had an algorithm for calculating nth roots, which was a special case of the methods given many centuries later by Ruffini and Horner.", metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'}),
Document(page_content='Whitehead, initiated a long running debate on the foundations of mathematics.', metadata={'source': 'https://en.wikipedia.org/wiki/History_of_mathematics', 'title': 'History of mathematics - Wikipedia', 'language': 'en'})]
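
The split, embed, and search steps can be mimicked in plain Python to see what each one contributes. This is a toy sketch under loud assumptions: fixed-width character chunks instead of the recursive splitter, word counts instead of learned embeddings, and a linear scan instead of a FAISS index. All function names are hypothetical.

```python
import math
from collections import Counter


def split_text(text, chunk_size=40, chunk_overlap=10):
    """Cut text into fixed-width chunks that share chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_search(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vector, embed(chunk)), reverse=True)
    return ranked[:k]


chunks = [
    "Newton developed calculus",
    "Kashi computed the value of pi",
    "Leibniz developed calculus notation",
]
print(similarity_search("Who computed pi", chunks, k=1))

# Overlapping chunks keep sentences that straddle a boundary searchable
pieces = split_text("".join(chr(97 + i % 26) for i in range(100)))
print(len(pieces), pieces[0][-10:] == pieces[1][:10])
```

A real vectorstore replaces the word counts with dense embeddings, so it can also match paraphrases that share no words with the query.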

Memory

Memory in LangChain refers to the ability of a model to remember previous parts of a conversation or context. This is a must to maintain continuity in interactions. Let’s use ConversationBufferMemory to store and retrieve conversation histories.

from langchain.memory import ConversationBufferMemory

# Initialize conversation buffer memory
memory = ConversationBufferMemory(memory_key="chat_history")

# Add messages to the conversation memory
memory.chat_memory.add_user_message("Hi!")
memory.chat_memory.add_ai_message("Welcome! How can I help you?")

# Load memory variables if any
memory.load_memory_variables({})
>>>
{'chat_history': 'Human: Hi!\nAI: Welcome! How can I help you?'}
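
Conceptually, a buffer memory is a list of messages that gets rendered into one string under a chosen key. The following minimal sketch in plain Python mirrors the example above; SimpleBufferMemory is a hypothetical class written for illustration, not LangChain's implementation.

```python
class SimpleBufferMemory:
    """Hypothetical, simplified stand-in for a conversation buffer memory."""

    def __init__(self, memory_key="history"):
        self.memory_key = memory_key
        self.messages = []  # list of (speaker, text) pairs in conversation order

    def add_user_message(self, text):
        self.messages.append(("Human", text))

    def add_ai_message(self, text):
        self.messages.append(("AI", text))

    def load_memory_variables(self):
        """Render the whole history as one string stored under memory_key."""
        buffer = "\n".join(f"{speaker}: {text}" for speaker, text in self.messages)
        return {self.memory_key: buffer}


simple_memory = SimpleBufferMemory(memory_key="chat_history")
simple_memory.add_user_message("Hi!")
simple_memory.add_ai_message("Welcome! How can I help you?")
print(simple_memory.load_memory_variables())
```

The rendered string is what gets substituted into a prompt variable with the same name, which is how earlier turns reach the model.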

We’ll see some more examples of memory usage in the coming sections.

Chains

LangChain chains are sequences of operations that process input and generate output. Let’s look at an example of building a custom chain for developing an email response based on the provided feedback:

from langchain.prompts import PromptTemplate
from langchain.chains import ConversationChain, summarize, question_answering
from langchain.schema import StrOutputParser

# Define and use a chain for summarizing customer feedback
feedback_summary_prompt = PromptTemplate.from_template(
    """You are a customer service manager. Given the customer feedback,
it is your job to summarize the main points.
Customer Feedback: {feedback}
Summary:"""
)

# Template for drafting a business email response
email_response_prompt = PromptTemplate.from_template(
    """You are a customer service representative. Given the summary of customer feedback,
it is your job to write a professional email response.
Feedback Summary:
{summary}
Email Response:"""
)

feedback_chain = feedback_summary_prompt | llm | StrOutputParser()
email_chain = (
    {"summary": feedback_chain}
    | email_response_prompt
    | llm
    | StrOutputParser()
)

# Using the feedback chain with actual customer feedback
email_chain.invoke(
    {"feedback": "Disappointed with the late delivery and poor packaging."}
)
>>>
\n\nDear [Customer],\n\nThank you for taking the time to provide us with
your feedback. We apologize for the late delivery and the quality of the
packaging. We take customer satisfaction very seriously and we are sorry
that we did not meet your expectations.\n\nWe are currently looking into
the issue and will take the necessary steps to ensure that this does not
happen again in the future. We value your business and hope that you will
give us another chance to provide you with a better experience.\n\nIf you
have any further questions or concerns, please do not hesitate to contact
us.\n\nSincerely,\n[Your Name]

As you can see, we have two chains: one generates the summary of the feedback (feedback_chain), and the other generates an email response based on that summary (email_chain). The chain above was created using LangChain Expression Language (LCEL), the preferred way of creating chains, according to LangChain.
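
The pipe syntax can look like magic, so here is a minimal sketch of the idea in plain Python: an object that overloads the | operator to feed its output into the next step. The Step class and the fake model are hypothetical and only illustrate the composition pattern, not LangChain's internals.

```python
class Step:
    """Hypothetical composable unit: wraps a function and supports the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # The combined step runs this step first, then passes the result to the next one
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt_step = Step(lambda inputs: f"Summarize: {inputs['feedback']}")
fake_llm = Step(lambda text: text.upper())   # stand-in for a real model call
parser = Step(lambda text: text.strip())

pipeline = prompt_step | fake_llm | parser
result = pipeline.invoke({"feedback": "late delivery"})
print(result)
```

Each stage only needs an invoke method, so prompts, models, and parsers can be swapped without changing how the pipeline is wired together.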

We can also use predefined chains, for example, for summarization tasks or simple Q&A:

# Predefined chains for summarization and Q&A
chain = summarize.load_summarize_chain(llm, chain_type="stuff")
chain.run(texts[:30])
>>>
The history of mathematics deals with the origin of discoveries in mathematics
and the mathematical methods and notation of the past. It began in the 6th
century BC with the Pythagoreans, who coined the term "mathematics". Greek
mathematics greatly refined the methods and expanded the subject matter of
mathematics. Chinese mathematics made early contributions, including a place
value system and the first use of negative numbers. The Hindu–Arabic numeral
system and the rules for the use of its operations evolved over the course of
the first millennium AD in India and were transmitted to the Western world via
Islamic mathematics. From ancient times through the Middle Ages, periods of
mathematical discovery were often followed by centuries of stagnation.
Beginning in Renaissance Italy in the 15th century, new mathematical
developments, interacting with new scientific discoveries, were made at an
increasing pace that continues through the present day.

chain = question_answering.load_qa_chain(llm, chain_type="stuff")
chain.run(
    input_documents=texts[:30],
    question="Name the greatest Arab mathematicians of the past",
)
>>>
Muḥammad ibn Mūsā al-Khwārizmī

Besides predefined chains for summarizing feedback, answering questions, etc., we can build our own custom ConversationChain and integrate memory into it.

# Using memory in a conversation chain
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory)
conversation.run("Name the tallest mountain in the world")
>>>
The tallest mountain in the world is Mount Everest

conversation.run("How high is it?")
>>>
Mount Everest stands at 8,848 meters (29,029 ft) above sea level.

Agents and Tools

LangChain allows the creation of custom tools and agents for specialized tasks. Custom tools can be anything from a call to your own API to a custom Python function, and they can be integrated into LangChain agents for complex operations. Let’s create an agent that will lowercase any sentence.

from langchain.tools import StructuredTool, BaseTool
from langchain.agents import initialize_agent, AgentType
import re

# Define and use a custom text processing tool
def text_processing(string: str) -> str:
    """Process the text"""
    return string.lower()

text_processing_tool = StructuredTool.from_function(text_processing)

# Initialize and use an agent with the custom tool
agent = initialize_agent(
    [text_processing_tool],
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run({"input": "Process the text: London is the capital of Great Britain"})
>>>
> Entering new AgentExecutor chain...
I need to use a text processing tool
Action: text_processing
Action Input: London is the capital of Great Britain
Observation: london is the capital of great britain
Thought: I now know the final answer
Final Answer: london is the capital of great britain

> Finished chain.
'london is the capital of great britain'

As you can see, our agent used the tool we defined and lowercased the sentence. Now, let’s create a fully functional agent. For that, we’ll create a custom tool for converting units (miles to kilometers, for example) within the text and integrate it into a conversational agent using the LangChain framework. The UnitConversionTool class provides a practical example of extending base functionalities with specific conversion logic.

import re
from langchain.agents import initialize_agent
from langchain.memory import ConversationBufferMemory
from langchain.tools import BaseTool

class UnitConversionTool(BaseTool):
    """
    A tool for converting American units to International units.
    Specifically, it converts miles to kilometers.
    """

    name: str = "Unit Conversion Tool"
    description: str = "Converts American units to International units"

    def _run(self, text: str) -> str:
        """
        Synchronously converts miles in the text to kilometers.

        Args:
            text (str): The input text containing miles to convert.

        Returns:
            str: The text with miles converted to kilometers.
        """

        def miles_to_km(match):
            miles = float(match.group(1))
            return f"{miles * 1.60934:.2f} km"

        return re.sub(r'\b(\d+(\.\d+)?)\s*(miles|mile)\b', miles_to_km, text)

    async def _arun(self, text: str):
        """
        Asynchronous version of the conversion function. Not implemented yet.
        """

        raise NotImplementedError("No async yet")

# This agent type reads the conversation from "chat_history" as a list of messages
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# Initialize an agent with the Unit Conversion Tool
agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=[UnitConversionTool()],
    llm=llm,
    memory=memory
)

# Example usage of the agent to convert units
agent.run("five miles")
>>> Five miles is approximately 8 kilometers.

agent.run("Sorry, I meant 15")
>>> 15 kilometers is approximately 9.3 miles
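
The conversion logic inside the tool can be exercised on its own, without an agent or an LLM. The sketch below copies the regular expression and the conversion factor from the tool into a standalone function; the name convert_miles is introduced here only for illustration.

```python
import re

# Same pattern as in the tool: a number, optional whitespace, then "miles" or "mile"
MILES_PATTERN = re.compile(r"\b(\d+(\.\d+)?)\s*(miles|mile)\b")


def convert_miles(text):
    """Replace every '<number> mile(s)' in the text with its value in kilometers."""

    def miles_to_km(match):
        miles = float(match.group(1))
        return f"{miles * 1.60934:.2f} km"

    return MILES_PATTERN.sub(miles_to_km, text)


print(convert_miles("The trail is 5 miles long"))
print(convert_miles("A marathon is 26.2 miles"))
print(convert_miles("five miles"))   # spelled-out numbers are left untouched
```

Note that the pattern only matches digits, so an input like "five miles" passes through unchanged. In such cases the agent has to rely on the LLM itself, either to rewrite the input before calling the tool or to answer directly.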

This wraps up the code shown in my LangChain one-pager. I hope you’ve found it helpful!

Reminder: You can download the PDF version, check out onepagers on GitHub, and launch the code in Colab.

I’d appreciate you clapping for the article and following me, as this motivates me to write new parts and articles 🙂 Plus, you’ll get notified when the next part is published.


Published via Towards AI
