LoRA vs. QLoRA


LoRA (low-rank adaptation) and QLoRA (quantized low-rank adaptation) are both techniques for fine-tuning AI models. More specifically, they are forms of parameter-efficient fine-tuning (PEFT), an approach that has gained popularity because it is more resource-efficient than other methods of fine-tuning large language models (LLMs).

LoRA and QLoRA both help fine-tune LLMs more efficiently, but differ in how they manipulate the model and utilize storage to reach intended results.


LLMs are complex models made up of large numbers of parameters, sometimes reaching into the billions. These parameters determine how much information the model can learn. More parameters generally mean more data storage and, overall, a more capable model.

Traditional fine-tuning requires refitting (updating or adjusting) every individual parameter in order to adapt the LLM. That can mean adjusting billions of parameters, which takes a large amount of compute time and money.
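
To give a sense of scale, here is a rough, illustrative estimate. It assumes a hypothetical 7-billion-parameter model trained with a standard mixed-precision Adam-style optimizer, which typically needs around 16 bytes of GPU memory per trainable parameter; the exact figure varies by framework and precision settings.

```python
# Rough, illustrative memory estimate for FULL fine-tuning of a 7B-parameter model.
# Assumes ~16 bytes per trainable parameter (fp16 weights and gradients plus
# fp32 optimizer states); actual requirements vary by framework and settings.
params = 7e9
bytes_per_param = 16
print(f"Full fine-tuning: ~{params * bytes_per_param / 1e9:.0f} GB of weight and optimizer memory")
# -> roughly 112 GB, before activation memory is even counted
```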

Updating every parameter can also lead to “overfitting,” a term used to describe an AI model that learns “noise,” or unhelpful details, in addition to the general patterns in its training data.


Imagine a teacher and their classroom. The class has learned math all year long. Just before the test, the teacher emphasizes the importance of long division. During the test, many of the students find themselves so preoccupied with long division that they forget key mathematical concepts needed for questions that are just as important. This is what overfitting can do to an LLM during traditional fine-tuning.

In addition to issues with overfitting, traditional fine-tuning also presents a significant cost when it comes to resources.

LoRA and QLoRA are both fine-tuning techniques that provide shortcuts to improve the efficiency of full fine-tuning. Instead of retraining all of a model's parameters, they freeze the original weights and train small added matrices that capture only the parameters necessary to learn the new information.

To follow our metaphor, these fine-tuning techniques are able to introduce new topics efficiently, without distracting the model from other topics on the test.
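
To make this concrete, here is a minimal sketch (in PyTorch, with illustrative layer sizes and hyperparameters) of the idea behind LoRA: the original weight matrix stays frozen, and two much smaller low-rank matrices, A and B, are trained so that their product represents the update that full fine-tuning would otherwise have made. QLoRA follows the same pattern but additionally stores the frozen base weights in a quantized format, such as 4-bit, to save even more memory.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A linear layer with a LoRA adapter: output = base(x) + scaling * x @ A^T @ B^T.

    The base weight is frozen; only the small matrices A and B are trained.
    Names and sizes here are illustrative, not tied to any particular model.
    """
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False  # freeze the original weights

        # Low-rank factors: far fewer trainable values than the full weight matrix.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Frozen path plus the trainable low-rank update.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

# Example: a 4096x4096 layer holds ~16.8 million weights, but with rank r=8
# the trainable LoRA parameters number only 2 * 8 * 4096 = 65,536.
layer = LoRALinear(4096, 4096, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"Trainable parameters: {trainable:,}")
```

Because only the low-rank matrices receive gradient updates, the gradient and optimizer memory shrink dramatically, which is where most of the efficiency gains of LoRA (and, with quantized base weights, QLoRA) come from.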

Learn more about parameter-efficient fine-tuning (PEFT)
