
Meet Dolma: An Open English Corpus of 3T Tokens for Language Model Pretraining Research

Marktechpost

Large Language Models (LLMs) have gained significant importance for handling Natural Language Processing (NLP) tasks such as question answering, text summarization, and few-shot learning. In their recent study, a team of researchers discusses transparency and openness in language model pretraining.


XLang NLP Lab Researchers Propose Lemur: The State-of-the-Art Open Pretrained Large Language Models Balancing Text and Code Capabilities

Marktechpost

In a world increasingly driven by the intersection of language and technology, the demand for versatile and powerful language models has never been greater. Traditional large language models (LLMs) have excelled in textual comprehension or coding tasks but seldom managed to strike a harmonious balance between the two.


Researchers from UC Berkeley and Meta Present AST-T5: A Novel Pretraining Paradigm that Harnesses the Power of Abstract Syntax Trees (ASTs) to Boost the Performance of Code-Centric Language Models

Marktechpost

These models, trained on extensive code datasets such as GitHub, excel in tasks like text-to-code conversion, code-to-code transpilation, and understanding code. However, many current models merely treat code as sequences of subword tokens, overlooking its structure.


Databricks claims DBRX sets ‘a new standard’ for open-source LLMs

AI News

Databricks has announced the launch of DBRX, a powerful new open-source large language model that it claims sets a new bar for open models by outperforming established options like GPT-3.5 on industry benchmarks. It even outperforms Anthropic’s closed-source model Claude on certain benchmarks.


Researchers from Cerebras & Neural Magic Introduce Sparse Llama: The First Production LLM based on Llama at 70% Sparsity

Marktechpost

Natural Language Processing (NLP) is a cutting-edge field that enables machines to understand, interpret, and generate human language. It has applications in various domains, such as language translation, text summarization, sentiment analysis, and the development of conversational agents.


01.AI Introduces Yi-1.5-34B Model: An Upgraded Version of Yi with a High-Quality Corpus of 500B Tokens and Fine-Tuned on 3M Diverse Fine-Tuning Samples

Marktechpost

Positioned as a major improvement over its predecessors, the Yi-1.5-34B model introduced by 01.AI bridges the gap between Llama 3 8B and 70B. The team of researchers has explored the model's creation and its possible effects on the AI community in depth.


Researchers at Intel Labs Introduce LLaVA-Gemma: A Compact Vision-Language Model Leveraging the Gemma Large Language Model in Two Variants (Gemma-2B and Gemma-7B)

Marktechpost

Recent advancements in large language models (LLMs) and Multimodal Foundation Models (MMFMs) have spurred interest in large multimodal models (LMMs). Models like GPT-4, LLaVA, and their derivatives have shown remarkable performance in vision-language tasks such as Visual Question Answering and image captioning.