Artificial Intelligence Zone

Accelerate NLP inference with ONNX Runtime on AWS Graviton processors

AWS Machine Learning Blog

MAY 15, 2024

The first figure illustrates the ONNX software stack, highlighting (in orange) the components optimized to improve inference performance on the AWS Graviton3 platform. You can see that for the BERT, RoBERTa, and GPT2 models, the throughput improvement is up to 65% for the same fp32 model inference.

NLP

NLP BERT Natural Language Processing Software Development

Transformer Library by HuggingFace

Mlearning.ai

FEBRUARY 19, 2024

Let’s examine some code examples to illustrate the usage of the pipeline function. Text generation: in the previous code, we did not specify a particular model, so the pipeline uses the gpt2 model by default. In the following code, however, we explicitly select distilgpt2, which can be seen as a compact version of the gpt2 model.
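The article's own listing is not part of this excerpt, so the following is only a minimal sketch of what selecting distilgpt2 explicitly could look like with the Hugging Face transformers pipeline function. The helper name generate_text and its parameter names are hypothetical and chosen for illustration.

```python
def generate_text(prompt, model_name="distilgpt2", max_new_tokens=20):
    """Generate a continuation of `prompt` with a text-generation pipeline.

    `generate_text` is a hypothetical helper, not code from the article.
    """
    # Imported lazily so that defining the helper does not require the
    # transformers package to be loaded up front.
    from transformers import pipeline

    # Passing model= overrides the pipeline's default text-generation model.
    generator = pipeline("text-generation", model=model_name)

    # The pipeline returns a list of dicts, one per generated sequence.
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        num_return_sequences=1,
    )
    return outputs[0]["generated_text"]
```

Calling generate_text("Hello, I'm a language model,") downloads the distilgpt2 weights on first use and returns the prompt followed by the generated continuation.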

NLP

NLP BERT AI

Get started with generative AI on AWS using Amazon SageMaker JumpStart

AWS Machine Learning Blog

MAY 4, 2023

This post provides an overview of generative AI with a real customer use case, provides a concise description and outlines its benefits, references an easy-to-follow demo of AWS DeepComposer for creating new musical compositions, and outlines how to get started using Amazon SageMaker JumpStart for deploying GPT2, Stable Diffusion 2.0,

Generative AI

Generative AI Machine Learning AI

How GPT3 Works - Visualizations and Animations

Jay Alammar

JULY 26, 2020

This is a high-level view of that process: You can see a detailed explanation of everything inside the decoder in my blog post The Illustrated GPT2. Each of these layers has its own 1.8B parameters to make its calculations. That is where the “magic” happens. This is an X-ray of an input and response (“Okay human”) within GPT3.
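As a rough check on the per-layer figure, the arithmetic can be sketched as follows. The layer count of 96 and the total of about 175B parameters are assumptions taken from the published description of the largest GPT3 model, not from this excerpt, and 1.8B is itself a rounded number, so the remainder is only approximate.

```python
# Back-of-the-envelope check of the per-layer parameter figure.
NUM_LAYERS = 96                     # assumption: not stated in the excerpt
PARAMS_PER_LAYER = 1_800_000_000    # "1.8B" from the excerpt (rounded)
TOTAL_REPORTED = 175_000_000_000    # assumption: commonly cited GPT3 size

# Parameters accounted for by the stacked decoder layers.
params_in_layers = NUM_LAYERS * PARAMS_PER_LAYER

# Whatever is left over would sit outside the layers (for example embeddings).
remainder = TOTAL_REPORTED - params_in_layers

print(f"In decoder layers: {params_in_layers / 1e9:.1f}B")
print(f"Outside the layers (approx.): {remainder / 1e9:.1f}B")
```

Under these assumptions, the decoder layers hold nearly all of the model's parameters.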

Deep Learning

Deep Learning Robotics Automation ML

Exploring Creativity in Large Language Models: From GPT-2 to GPT-4

Topbots

APRIL 28, 2023

These examples and analyses illustrate that creativity tests with a single correct answer might be limited. This finding illustrates how GPT-4 seems to have “cracked the code” for generating what it considers to be diverse words. Starbucks is a well-known coffee shop, so it's possible that his character would have frequented Starbucks.

Large Language Models

Large Language Models Natural Language Processing Robotics Explainability

Interfaces for Explaining Transformer Language Models

Jay Alammar

DECEMBER 16, 2020

We illustrate how some key interpretability methods apply to transformer-based language models. The first example for this interface asks GPT2-XL for William Shakespeare's date of birth. GPT2-XL is able to tell the birth date of William Shakespeare expressed in two tokens. Tap or hover over the output tokens.

Explainability

Explainability Auto-complete Auto-classification Neural Network

Deploy large models at high performance using FasterTransformer on Amazon SageMaker

AWS Machine Learning Blog

APRIL 17, 2023

An example pipeline for large model hosting with runtime partitioning is illustrated in the following diagram.
Supported models: All Hugging Face models; All GPT family, Stable Diffusion, and T5 family; GPT2/OPT/BLOOM/T5; GPT2/OPT/GPTJ/GPT-NeoX*
Streaming tokens: ✓ ✓
Model partitioning on CPU memory: ✓
Fast model loading: ✓ ✓ ✓

Prompt Engineer

Prompt Engineer Prompt Engineering Deep Learning Large Language Models

Boost inference performance for Mixtral and Llama 2 models with new Amazon SageMaker containers

AWS Machine Learning Blog

APRIL 8, 2024

of the DLC grows the list of supported models for JIT compilation, introducing Baichuan, ChatGLM, GPT2, GPT-J, InternLM, Mistral, Mixtral, Qwen, SantaCoder and StarCoder models. The following figure illustrates multi-head, grouped-query, and multi-query attention methods (source). Version 0.26.0
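The difference between the three attention methods named above comes down to how many key/value heads the query heads share. The sketch below illustrates only that sharing pattern, not the attention computation itself; the function names and the choice of 8 query heads are hypothetical.

```python
def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Return the index of the key/value head used by a given query head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    # Consecutive query heads are grouped; each group shares one KV head.
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


def head_mapping(num_query_heads, num_kv_heads):
    """List the KV head index for every query head."""
    return [
        kv_head_for_query_head(q, num_query_heads, num_kv_heads)
        for q in range(num_query_heads)
    ]


NUM_QUERY_HEADS = 8  # illustrative value only

# Multi-head attention: every query head has its own KV head.
multi_head = head_mapping(NUM_QUERY_HEADS, num_kv_heads=8)

# Grouped-query attention: query heads are split into groups sharing a KV head.
grouped_query = head_mapping(NUM_QUERY_HEADS, num_kv_heads=2)

# Multi-query attention: all query heads share a single KV head.
multi_query = head_mapping(NUM_QUERY_HEADS, num_kv_heads=1)

print("multi-head:   ", multi_head)
print("grouped-query:", grouped_query)
print("multi-query:  ", multi_query)
```

Fewer key/value heads mean a smaller key/value cache at inference time, which is the usual motivation for the grouped-query and multi-query variants.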

Auto-complete

Auto-complete LLM Deep Learning Auto-classification

Google’s Dr. Arsanjani on Enterprise Foundation Model Challenges

Snorkel AI

MARCH 2, 2023

It came into its own with the creation of the transformer architecture: Google’s BERT, OpenAI’s GPT2 and then GPT3, LaMDA for conversation, Meena and Sparrow from Google DeepMind. And this example illustrates that it may be challenging to constrain the possible nefarious misuses of a foundational model.

Large Language Models

Large Language Models Prompt Engineer Prompt Engineering Neural Network
