Latest GGML v2 format for LLaMa-13B

Files added:
- README.md
- llama-13b.fp16.ggml.bin
- llama-13b.ggml.q4_0.bin
- llama-13b.ggml.q4_1.bin
- llama-13b.ggml.q5_0.bin
- llama-13b.ggml.q5_1.bin
- llama-13b.ggml.q8_0.bin
README.md (added)
---
inference: false
license: other
---

# LLaMa 13B GGML

This repo contains GGML format model files for the original LLaMa.

These files are for CPU (+ CUDA) inference using [llama.cpp](https://github.com/ggerganov/llama.cpp).
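As a hedged sketch of what running one of these files looks like: the binary location, model choice, and flag values below are assumptions for illustration, not part of this repo. The script only assembles and prints the command; uncomment the last line to actually run it against a compiled llama.cpp `main` binary.

```shell
# Sketch only: assumes a compiled llama.cpp `main` binary in the current
# directory and one of this repo's model files alongside it.
MODEL=llama-13b.ggml.q5_1.bin
PROMPT="Building a website can be done in 10 simple steps:"

# -m model file, -p prompt, -n number of tokens to generate, -t CPU threads
CMD="./main -m $MODEL -p \"$PROMPT\" -n 128 -t 8"
echo "$CMD"

# Uncomment to actually run it:
# eval "$CMD"
```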
I've uploaded them mostly for my own convenience, allowing me to easily grab them if and when I need them for future testing and comparisons.

## Provided files

The following formats are included:

* float16
* q4_0 - 4-bit
* q4_1 - 4-bit
* q5_0 - 5-bit
* q5_1 - 5-bit
* q8_0 - 8-bit
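Each quantised file above is derived from the float16 file with llama.cpp's `quantize` tool. A hedged sketch of that step follows; the file names are this repo's, but passing the type as a name (`q4_0`) is an assumption about the build in use (older builds took a numeric type id instead).

```shell
# Sketch only: reproduce the q4_0 file from the fp16 file with llama.cpp's
# `quantize` tool. Only runs if both the source file and the binary exist;
# otherwise it just prints the command it would have run.
SRC=llama-13b.fp16.ggml.bin
DST=llama-13b.ggml.q4_0.bin

if [ -f "$SRC" ] && [ -x ./quantize ]; then
  ./quantize "$SRC" "$DST" q4_0
else
  echo "would run: ./quantize $SRC $DST q4_0"
fi
```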
## THESE FILES REQUIRE THE LATEST LLAMA.CPP (May 12th 2023 - commit `b9fd7ee`)!

llama.cpp recently made a breaking change to its quantisation methods.

I have quantised the GGML files in this repo with the latest version. You will therefore need llama.cpp compiled on May 12th or later (commit `b9fd7ee` or later) to use them.

I will not be providing GGML formats for the older llama.cpp code. They're already uploaded all over HF if you really need them!
llama-13b.fp16.ggml.bin (added, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:2b206e9b21fb1076f11cafc624e2af97c9e48ea09312a0962153acc20d45f808
size 26033013888
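The three-line entries above are Git LFS pointer files: the repo stores only the `oid` (a sha256 digest of the real blob) and its `size` in bytes, and `git lfs pull` fetches the actual model file. A small self-contained sketch of reading those two fields out of a pointer (the `pointer.txt` scratch file is illustrative only):

```shell
# Write a copy of the fp16 pointer shown above to a scratch file.
cat > pointer.txt <<'EOF'
version https://git-lfs.github.com/spec/v1
oid sha256:2b206e9b21fb1076f11cafc624e2af97c9e48ea09312a0962153acc20d45f808
size 26033013888
EOF

# Extract the sha256 digest and the byte size from the pointer.
OID=$(awk '/^oid sha256:/ { sub("sha256:", "", $2); print $2 }' pointer.txt)
SIZE=$(awk '/^size / { print $2 }' pointer.txt)
echo "oid=$OID size=$SIZE"

# After `git lfs pull`, the downloaded blob can be checked against the pointer:
# [ "$(sha256sum llama-13b.fp16.ggml.bin | cut -d' ' -f1)" = "$OID" ]
rm pointer.txt
```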
llama-13b.ggml.q4_0.bin (added, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:489b28377429646520b99cb071e108ee61a08a997103a8ecc37626a0e6f82fbf
size 8136770688
llama-13b.ggml.q4_1.bin (added, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:b6b2b7b91d3dc51c168599a219c92c4e1db51bfcbb6c824efbf5f632802db44c
size 9763701888
llama-13b.ggml.q5_0.bin (added, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:4b937c76a9f57da06421d0e9583699bbaa123e64575289c6858233d06564de9e
size 8950236288
llama-13b.ggml.q5_1.bin (added, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:5d54dbc45b820064ac387da5fc60bd36adb406134617c28b271ad8a4dc692bec
size 9763701888
llama-13b.ggml.q8_0.bin (added, Git LFS pointer)

version https://git-lfs.github.com/spec/v1
oid sha256:4a6e06643c99e9280d22e38279be7bd78243f6e661bc565060e16e52a2f9cff2
size 14644495488