Llama 3 - Meta’s most sophisticated large language model
April 23, 2024

Why in news?

Meta has introduced its most capable Large Language Model (LLM), Meta Llama 3. It has also introduced an image generator that updates pictures in real time as the user types out the prompt.

What’s in today’s article?

  • Large Language Models (LLMs)
  • Generative Pre-trained Transformers (GPTs)
  • Llama 3

Large Language Models (LLMs)

  • Large language models use deep learning techniques to process large amounts of text.
  • They work by ingesting vast amounts of text and learning the structure and meaning of language from it.
  • LLMs are trained to identify meanings and relationships between words.
  • The more training data a model is fed, the better it becomes at understanding and producing text.
    • The training data usually comes from large datasets such as Wikipedia, OpenWebText, and the Common Crawl corpus.
    • These contain large amounts of text data, which the models use to understand and generate natural language.
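The idea that a model "learns relationships between words" from training text can be illustrated with a deliberately simple sketch. Real LLMs use neural networks rather than count tables, but the core task is the same: predict the next token from what came before. The corpus and function names below are purely illustrative, not anything from Meta's code.

```python
# Toy illustration of learning word relationships from text:
# count which word follows which (a bigram table), then predict
# the most likely next word. Real LLMs replace the counting with
# deep neural networks, but the objective is the same.
from collections import Counter, defaultdict

corpus = (
    "large language models process text . "
    "large language models learn from text ."
).split()

# For each word, count the words observed immediately after it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequently observed word after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("language"))  # prints "models" for this tiny corpus
```

More training text fills in more of the table, which is the toy analogue of the point above: more data makes the model better at producing plausible continuations.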

Generative Pre-trained Transformers (GPTs)

  • GPTs are a type of LLM that use transformer neural networks to generate human-like text.
  • GPTs are trained on large amounts of unlabelled text data from the internet, enabling them to understand and generate coherent and contextually relevant text.
  • They can be fine-tuned for specific tasks such as language generation, sentiment analysis, language modelling, machine translation, and text classification.
  • GPTs use self-attention mechanisms to focus on different parts of the input text during each processing step.
  • This allows GPT models to capture more context and improve performance on natural language processing (NLP) tasks.
    • NLP is the ability of a computer program to understand human language as it is spoken and written, referred to as natural language.
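The self-attention mechanism described above can be sketched in a few lines of plain Python. This is an illustrative scaled dot-product attention over toy vectors, not the actual GPT or Llama implementation (which uses learned query/key/value projections and many attention heads):

```python
# Minimal sketch of self-attention: every position scores every other
# position, a softmax turns the scores into weights, and the output for
# each position is a weighted blend of all positions' vectors. This is
# how the model "focuses on different parts of the input text".
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]  # weights sum to 1

def self_attention(vectors):
    """Scaled dot-product attention of each vector over all vectors."""
    d = len(vectors[0])
    outputs = []
    for q in vectors:
        # Score this position against every position in the input.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in vectors]
        weights = softmax(scores)
        # Blend all positions' vectors according to the weights.
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors))
                        for i in range(d)])
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token vectors
print(self_attention(tokens))
```

Because each output mixes in information from every other position, the model captures context from the whole input at each processing step, which is what improves performance on NLP tasks.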

Llama 3

  • About
    • Llama (Large Language Model Meta AI) is a family of LLMs introduced by Meta AI in February 2023.
    • The first version of the model was released in four sizes — 7B, 13B, 33B, and 65B parameters.
    • As per the reports, the 13B model of Llama outperformed OpenAI’s GPT-3, which had 175 billion parameters.
      • Parameters are a measure of the size and complexity of an AI model.
      • Generally, a larger number of parameters means an AI model is more complex and powerful.
  • Features
    • Llama 3 is claimed to be Meta’s most sophisticated model, with significant progress in performance and AI capabilities.
    • Llama 3, which is based on the Llama 2 architecture, has been released in two sizes, 8B and 70B parameters.
    • Both sizes come with a base model and an instruction-tuned version that has been designed to augment performance in specific tasks.
      • The instruction-tuned version is designed to power AI chatbots that hold conversations with users.
    • For now, Meta has released text-based models in the Llama 3 collection of models.
      • However, the company plans to make Llama 3 multilingual and multimodal and to support longer context, all while continuing to improve performance across LLM abilities such as coding and reasoning.
    • All Llama 3 models support context lengths of 8,000 tokens, allowing longer interactions and more complex input handling than Llama 2 or Llama 1.
      • A longer context means users can supply more content in their prompts and the model can produce more content in its responses.
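What an 8,000-token context length means in practice can be sketched as follows. The whitespace split below is a crude stand-in for a real tokeniser (Llama 3 actually uses a subword tokeniser), and the function names are hypothetical:

```python
# Sketch of a context window: the model can only attend to a fixed
# number of tokens, so prompt plus conversation history must fit, and
# anything older is typically truncated away.
CONTEXT_LENGTH = 8_000  # Llama 3's context length in tokens

def fits_in_context(prompt, history):
    """Check whether the history plus new prompt fits the window."""
    tokens = (history + " " + prompt).split()  # crude stand-in tokeniser
    return len(tokens) <= CONTEXT_LENGTH

def truncate_to_context(text):
    """Keep only the most recent CONTEXT_LENGTH tokens."""
    tokens = text.split()
    return " ".join(tokens[-CONTEXT_LENGTH:])

long_history = "word " * 10_000
print(fits_in_context("hello", long_history))           # False: over 8,000 tokens
print(len(truncate_to_context(long_history).split()))   # 8000
```

Doubling the window (Llama 2 supported 4,096 tokens) roughly doubles how much conversation history or document text can be handled in one pass.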