Artificial Intelligence Zone

Modeling Extremely Large Images with xT

BAIR

MARCH 21, 2024

Computer Vision

On the Stepwise Nature of Self-Supervised Learning

BAIR

JULY 10, 2023

Figure 1: stepwise behavior in self-supervised learning. When training common SSL algorithms, we find that the loss descends in a stepwise fashion (top left) and the learned embeddings iteratively increase in dimensionality (bottom left). Direct visualization of embeddings (right; top three PCA directions shown) confirms that embeddings are initially collapsed to a point, which then expands to a 1D manifold, a 2D manifold, and beyond concurrently with steps in the loss.

Neural Network

Neural Network, Deep Learning, Algorithm

Rethinking the Role of PPO in RLHF

BAIR

OCTOBER 16, 2023

TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way? Figure 1: This diagram illustrates the difference between reinforcement learning from absolute feedback and relative feedback.

Algorithm

Algorithm, Large Language Models, LLM, OpenAI

Goal Representations for Instruction Following

BAIR

OCTOBER 17, 2023

A longstanding goal of the field of robot learning has been to create generalist agents that can perform tasks for humans. Natural language has the potential to be an easy-to-use interface for humans to specify arbitrary tasks, but it is difficult to train robots to follow language instructions.

Robotics

Asymmetric Certified Robustness via Feature-Convex Neural Networks

BAIR

NOVEMBER 14, 2023

TLDR: We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios. This focused setting allows us to introduce feature-convex classifiers, which produce closed-form and deterministic certified radii on the order of milliseconds.

Neural Network

Neural Network, Machine Learning, Deep Learning

Ghostbuster: Detecting Text Ghostwritten by Large Language Models

BAIR

NOVEMBER 14, 2023

Large Language Models

Large Language Models, ChatGPT, AI

The Shift from Models to Compound AI Systems

BAIR

FEBRUARY 18, 2024

LLM

LLM, Neural Network, AI

2024 BAIR Graduate Directory

BAIR

MARCH 11, 2024

Robotics

Robotics, Natural Language Processing, Machine Learning, Computer Vision

Training Diffusion Models with Reinforcement Learning

BAIR

JULY 14, 2023

Diffusion models have recently emerged as the de facto standard for generating complex, high-dimensional outputs. You may know them for their ability to produce stunning AI art and hyper-realistic synthetic images, but they have also found success in other applications such as drug design and continuous control.

Algorithm

Algorithm, Robotics, Neural Network, Chatbots

Generating 3D Molecular Conformers via Equivariant Coarse-Graining and Aggregated Attention

BAIR

JUNE 29, 2023

Figure 1: CoarsenConf architecture. (II) Equivariant MLPs are applied to learn the mean and log variance of both the posterior and prior distributions. (III) The posterior (training) or prior (inference) is sampled and fed into the Channel Selection module, where an attention layer is used to learn the optimal pathway from CG to FG structure. (IV) Given the FG latent vector and the RDKit approximation, the decoder $p_\theta(X \mid \mathcal{R}, z)$ learns to recover the low-energy FG structure through

Deep Learning

Deep Learning, Algorithm

GPT-4 + Stable-Diffusion = ?: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

BAIR

MAY 23, 2023

TL;DR: Text Prompt -> LLM -> Intermediate Representation (such as an image layout) -> Stable Diffusion -> Image. Recent advancements in text-to-image generation with diffusion models have yielded remarkable results synthesizing highly realistic and diverse images. However, despite their impressive capabilities, diffusion models, such as Stable Diffusion, often struggle to accurately follow the prompts when spatial or common sense reasoning is required.

Large Language Models

Large Language Models, LLM

Interactive Fleet Learning

BAIR

APRIL 6, 2023

Figure 1: “Interactive Fleet Learning” (IFL) refers to robot fleets in industry and academia that fall back on human teleoperators when necessary and continually learn from them over time. In the last few years we have seen an exciting development in robotics and artificial intelligence: large fleets of robots have left the lab and entered the real world.

Robotics

Robotics, Algorithm, Continuous Learning, Artificial Intelligence

Koala: A Dialogue Model for Academic Research

BAIR

APRIL 3, 2023

Large Language Models

Large Language Models, ChatGPT, OpenAI, Chatbots

Offline RL Made Easier: No TD Learning, Advantage Reweighting, or Transformers

BAIR

APRIL 20, 2022

A demonstration of the RvS policy we learn with just supervised learning and a depth-two MLP. It uses no TD learning, advantage reweighting, or Transformers! Offline reinforcement learning (RL) is conventionally approached using value-based methods based on temporal difference (TD) learning. However, many recent algorithms reframe RL as a supervised learning problem.

Algorithm

Algorithm, Robotics, Automation

Should I Use Offline RL or Imitation Learning?

BAIR

APRIL 25, 2022

Figure 1: Summary of our recommendations for when a practitioner should use BC and various imitation-learning-style methods, and when they should use offline RL approaches. Offline reinforcement learning allows learning policies from previously collected data, which has profound implications for applying RL in domains where running trial-and-error learning is impractical or dangerous, such as safety-critical settings like autonomous driving or medical treatment planning.

Robotics

Robotics, Algorithm, Explainability

Designing Societally Beneficial Reinforcement Learning Systems

BAIR

APRIL 29, 2022

Deep reinforcement learning (DRL) is transitioning from a research field focused on game playing to a technology with real-world applications. Notable examples include DeepMind’s work on controlling a nuclear reactor or on improving YouTube video compression, or Tesla attempting to use a method inspired by MuZero for autonomous vehicle behavior planning.

Robotics

Robotics, Machine Learning, Computer Vision, ML

Rethinking Human-in-the-Loop for Artificial Augmented Intelligence

BAIR

MAY 3, 2022

Figure 1: In real-world applications, we think there exists a human-machine loop where humans and machines are mutually augmenting each other. We call it Artificial Augmented Intelligence. How do we build and evaluate an AI system for real-world applications? In most AI research, the evaluation of AI methods involves a training-validation-testing process.

AI Modeling

AI Modeling, Automation, AI Developer, AI Development

The Berkeley Crossword Solver

BAIR

MAY 20, 2022

We recently published the Berkeley Crossword Solver (BCS), the current state of the art for solving American-style crossword puzzles. The BCS combines neural question answering and probabilistic inference to achieve near-perfect performance on most American-style crossword puzzles, like the one shown below. Figure 1: Example American-style crossword puzzle. An earlier version of the BCS, in conjunction with Dr.Fill, was the first computer program to outscore all human competitors in the world’s t

FIGS: Attaining XGBoost-level performance with the interpretability and speed of CART

BAIR

JUNE 30, 2022

FIGS (Fast Interpretable Greedy-tree Sums): A method for building interpretable models by simultaneously growing an ensemble of decision trees in competition with one another. Recent machine-learning advances have led to increasingly complex predictive models, often at the cost of interpretability. We often need interpretability, particularly in high-stakes applications such as in clinical decision-making; interpretable models help with all kinds of things, such as identifying errors, leveraging

Machine Learning

Machine Learning, Algorithm

Why do Policy Gradient Methods work so well in Cooperative MARL? Evidence from Policy Representation

BAIR

JULY 10, 2022

In cooperative multi-agent reinforcement learning (MARL), policy gradient (PG) methods are typically believed to be less sample efficient than value decomposition (VD) methods, because PG methods are on-policy whereas VD methods are off-policy. However, some recent empirical studies demonstrate that with proper input representation and hyper-parameter tuning, multi-agent PG can achieve surprisingly strong performance compared to off-policy VD methods.

Algorithm

Reverse engineering the NTK: towards first-principles architecture design

BAIR

AUGUST 29, 2022

Deep neural networks have enabled technological wonders ranging from voice recognition to machine translation to protein engineering, but their design and application are nonetheless notoriously unprincipled. The development of tools and methods to guide this process is one of the grand challenges of deep learning theory. In Reverse Engineering the Neural Tangent Kernel, we propose a paradigm for bringing some principle to the art of architecture design using recent theoretical breakthroughs: fir

Neural Network

Neural Network, Deep Learning

Keeping Learning-Based Control Safe by Regulating Distributional Shift

BAIR

SEPTEMBER 19, 2022

To regulate the distribution shift experienced by learning-based controllers, we seek a mechanism for constraining the agent to regions of high data density throughout its trajectory (left). Here, we present an approach which achieves this goal by combining features of density models (middle) and Lyapunov functions (right). In order to make use of machine learning and reinforcement learning in controlling real world systems, we must design algorithms which not only achieve good performance, but a

Machine Learning

Machine Learning, Neural Network, Algorithm, Robotics

Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

BAIR

JANUARY 20, 2023

Reinforcement learning provides a conceptual framework for autonomous agents to learn from experience, analogously to how one might train a pet with treats. But practical applications of reinforcement learning are often far from natural: instead of using RL to learn through trial and error by actually attempting the desired task, typical RL applications use a separate (usually simulated) training phase.

Robotics

Robotics, Continuous Learning, Algorithm
