Language Models

Grok-2 [Blog]
- xAI
- Grok-2 Beta was released on 2024/08/13.
Gemma 2: Improving Open Language Models at a Practical Size (arXiv:2408.00118) [arXiv] [Code]
- Gemma Team, Google DeepMind
- Gemma 2
- Models: https://www.kaggle.com/models/google/gemma
The Llama 3 Herd of Models (arXiv:2407.21783) [arXiv] [Blog] [Code]
- Meta AI
- Llama 3
- Models
  - Llama 3 8B: https://huggingface.co/meta-llama/Meta-Llama-3-8B
  - Llama 3 70B
  - Llama 3 405B
Mixtral of Experts (arXiv:2401.04088) [arXiv] [Blog] [Code]
- Mistral AI
- Mixtral 8x7B
- Model: https://huggingface.co/mistralai/Mixtral-8x7B-v0.1
Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv:2307.09288) [Paper] [Homepage]
- Meta AI
- Llama 2
- Released under a permissive community license; available for commercial use.
LLaMA: Open and Efficient Foundation Language Models (arXiv:2302.13971) [Paper] [Code]
- Meta AI
- Sizes: 6.7B, 13B, 32.5B, and 65.2B parameters
- Open-access
PaLM: Scaling Language Modeling with Pathways (JMLR 2023) [Paper] [PaLM API]
- 540B parameters; access to the PaLM API opened in March 2023.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (arXiv:2211.05100) [Paper] [Model] [Blog]
- 176B parameters
- Open-access
OPT: Open Pre-trained Transformer Language Models (arXiv:2205.01068) [Paper] [Code]
- Meta AI
- Model sizes range from 125M to 175B parameters.
- Open-access
