Welcome to LLaMA, my library for training and fine-tuning the LLaMA model. I find that implementing something from scratch is a good way to understand it more deeply, and I hope the simplicity of this repo makes it a useful starting point for beginners.
Currently, this library supports:
- Flash Attention, Triton RMSNorm, and Flash RoPE (Triton/CUDA-accelerated kernels)
- KV Cache (see the sketch after this list)
- Tensor Parallelism
- DDP with gradient bucketing
- Speedup and loss benchmark results under `LLaMA/tools/benchmark`
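
For readers new to the KV cache idea mentioned above, here is a minimal, self-contained PyTorch sketch of the general technique: keys and values from earlier decoding steps are cached so each new step only runs attention for the newest token. The `KVCache` class, shapes, and variable names are hypothetical illustrations for this README, not this repo's actual API.

```python
import torch

class KVCache:
    """Toy KV cache: accumulates keys/values along the sequence axis."""

    def __init__(self):
        self.k = None  # (batch, n_heads, seq_len, head_dim)
        self.v = None

    def update(self, k_new, v_new):
        # Append this step's keys/values to the cache along dim=2 (sequence).
        if self.k is None:
            self.k, self.v = k_new, v_new
        else:
            self.k = torch.cat([self.k, k_new], dim=2)
            self.v = torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

# Hypothetical decode step: one new query token attends over the full cache.
cache = KVCache()
head_dim = 64
q = torch.randn(1, 8, 1, head_dim)                     # newest token's query
k, v = cache.update(torch.randn(1, 8, 1, head_dim),    # newest token's key/value
                    torch.randn(1, 8, 1, head_dim))
attn = torch.softmax(q @ k.transpose(-2, -1) / head_dim ** 0.5, dim=-1) @ v
```

The trade-off is memory for compute: the cache grows with the generated sequence, but each step avoids re-encoding the entire prefix.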
I'm actively working on integrating the following features:
- Training on real data
- More benchmarks
- ZeRO optimizer