LiteLlama
2024-01-07 https://huggingface.co/ahxt/LiteLlama-460M-1T
In this series of repos, we present an open-source reproduction of Meta AI's LLaMA 2 at significantly reduced model sizes: LiteLlama-460M-1T has 460M parameters and was trained on 1T tokens.
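A minimal sketch of trying the checkpoint linked above with the Hugging Face transformers auto classes; it assumes the repo loads as a standard causal LM, and the prompt and generation settings are only illustrative:

```python
# Sketch: load LiteLlama-460M-1T as a standard causal LM (assumption:
# the checkpoint works with the AutoModelForCausalLM/AutoTokenizer classes).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ahxt/LiteLlama-460M-1T"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt; greedy decoding keeps the output deterministic.
inputs = tokenizer("Q: What is the largest bird?\nA:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```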
TinyLlama
2024-01-04 https://github.com/jzhang38/TinyLlama https://arxiv.org/abs/2401.02385 https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With proper optimization, this can be achieved within “just” 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.
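A minimal sketch of prompting the chat checkpoint linked above through the transformers text-generation pipeline; it assumes the tokenizer ships a chat template, and the messages and sampling parameters are only illustrative:

```python
# Sketch: chat with TinyLlama-1.1B-Chat-v1.0 via the text-generation pipeline
# (assumption: the tokenizer provides a chat template for apply_chat_template).
from transformers import pipeline

pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a 1.1B-parameter model good for?"},
]
# Render the messages into the model's expected prompt format.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
out = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.95)
print(out[0]["generated_text"])
```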