Upload 2 files
README.md
CHANGED
|
@@ -24,7 +24,7 @@ pipeline_tag: text-to-speech
|
|
| 24 |
|
| 25 |
| Model | Published | Training Data | Compute (A100 80GB) | Langs & Voices | SHA256 |
|
| 26 |
| ----- | --------- | ------------- | ------------------- | -------------- | ------ |
|
| 27 |
-
| **v1.0** | **2025 Jan 27** | **Few hundred hrs** | **$1000 for 1000 hrs** | [**6 &
|
| 28 |
| [v0.19](https://huggingface.co/hexgrad/kLegacy/tree/main/v0.19) | 2024 Dec 25 | <100 hrs | $400 for 500 hrs | 1 & 10 | `3b0c392f` |
|
| 29 |
|
| 30 |
### Usage
|
|
|
|
| 24 |
|
| 25 |
| Model | Published | Training Data | Compute (A100 80GB) | Langs & Voices | SHA256 |
|
| 26 |
| ----- | --------- | ------------- | ------------------- | -------------- | ------ |
|
| 27 |
+
| **v1.0** | **2025 Jan 27** | **Few hundred hrs** | **$1000 for 1000 hrs** | [**6 & 47**](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md) | `496dba11` |
|
| 28 |
| [v0.19](https://huggingface.co/hexgrad/kLegacy/tree/main/v0.19) | 2024 Dec 25 | <100 hrs | $400 for 500 hrs | 1 & 10 | `3b0c392f` |
|
| 29 |
|
| 30 |
### Usage
|
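The SHA256 column in the table above lists eight hexadecimal characters per model. Assuming those are the leading characters of each weight file's SHA-256 digest (the diff does not state this explicitly), a downloaded file could be checked with a sketch like the one below. The function names and the file path in the usage comment are hypothetical.

```python
import hashlib


def sha256_prefix(path, length=8, chunk_size=1 << 20):
    """Return the first `length` hex characters of the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


def matches_listed_hash(path, listed):
    """Compare a file against a short hash as listed in the table (assumed to be a digest prefix)."""
    return sha256_prefix(path, len(listed)) == listed.lower()


# Hypothetical usage; the path is a placeholder, not a name taken from the diff:
# matches_listed_hash("path/to/downloaded-model-file", "496dba11")
```

An eight-character prefix is enough to catch a wrong or corrupted download, but it is far too short to serve as a security guarantee.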
VOICES.md
CHANGED
|
@@ -10,14 +10,15 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 10 |
|
| 11 |
**Training Duration**
|
| 12 |
- How much audio was seen during training? Smaller durations result in a lower overall grade.
|
| 13 |
-
- 10 hours <= HH hours < 100 hours
|
| 14 |
- 1 hour <= H hours < 10 hours
|
| 15 |
- 10 minutes <= MM minutes < 100 minutes
|
| 16 |
-
- 1 minute <=
|
| 17 |
|
| 18 |
### American English 🇺🇸
|
| 19 |
|
| 20 |
-
- [`misaki[en]`](https://github.com/hexgrad/misaki)
|
|
|
|
| 21 |
|
| 22 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|
| 23 |
| ---- | ------ | -------------- | ----------------- | ------------- | ------ |
|
|
@@ -30,7 +31,7 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 30 |
| af_nova | 🚺 | B | MM minutes | C | `e0233676` |
|
| 31 |
| af_river | 🚺 | C | MM minutes | D | `e149459b` |
|
| 32 |
| af_sarah | 🚺 | B | H hours | C+ | `49bd364e` |
|
| 33 |
-
| af_sky |
|
| 34 |
| am_adam | 🚹 | D | H hours | F+ | `ced7e284` |
|
| 35 |
| am_echo | 🚹 | C | MM minutes | D | `8bcfdc85` |
|
| 36 |
| am_eric | 🚹 | C | MM minutes | D | `ada66f0e` |
|
|
@@ -39,10 +40,12 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 39 |
| am_michael | 🚹 | B | H hours | C+ | `9a443b79` |
|
| 40 |
| am_onyx | 🚹 | C | MM minutes | D | `e8452be1` |
|
| 41 |
| am_puck | 🚹 | B | H hours | C+ | `dd1d8973` |
|
|
|
|
| 42 |
|
| 43 |
### British English 🇬🇧
|
| 44 |
|
| 45 |
-
- [`misaki[en]`](https://github.com/hexgrad/misaki)
|
|
|
|
| 46 |
|
| 47 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|
| 48 |
| ---- | ------ | -------------- | ----------------- | ------------- | ------ |
|
|
@@ -57,6 +60,7 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 57 |
|
| 58 |
### French 🇫🇷
|
| 59 |
|
|
|
|
| 60 |
- espeak-ng `fr-fr`
|
| 61 |
- Total French training data: <11 hours
|
| 62 |
|
|
@@ -66,6 +70,7 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 66 |
|
| 67 |
### Hindi 🇮🇳
|
| 68 |
|
|
|
|
| 69 |
- espeak-ng `hi`
|
| 70 |
- Total Hindi training data: H hours
|
| 71 |
|
|
@@ -78,6 +83,7 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 78 |
|
| 79 |
### Italian 🇮🇹
|
| 80 |
|
|
|
|
| 81 |
- espeak-ng `it`
|
| 82 |
- Total Italian training data: H hours
|
| 83 |
|
|
@@ -88,20 +94,20 @@ Subjectively, voices will sound better or worse to different people.
|
|
| 88 |
|
| 89 |
### Japanese 🇯🇵
|
| 90 |
|
| 91 |
-
- [`misaki[ja]`](https://github.com/hexgrad/misaki)
|
| 92 |
- Total Japanese training data: H hours
|
| 93 |
|
| 94 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 | CC BY |
|
| 95 |
| ---- | ------ | -------------- | ----------------- | ------------- | ------ | ----- |
|
| 96 |
| jf_alpha | 🚺 | B | H hours | C+ | `1bf4c9dc` | |
|
| 97 |
| jf_gongitsune | 🚺 | B | MM minutes | C | `1b171917` | [gongitsune](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__gongitsune.txt) |
|
| 98 |
-
| jf_nezumi |
|
| 99 |
| jf_tebukuro | 🚺 | B | MM minutes | C | `0d691790` | [tebukurowokaini](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__tebukurowokaini.txt) |
|
| 100 |
-
| jm_kumo |
|
| 101 |
|
| 102 |
### Mandarin Chinese 🇨🇳
|
| 103 |
|
| 104 |
-
- [`misaki[zh]`](https://github.com/hexgrad/misaki)
|
| 105 |
- Total Mandarin Chinese training data: H hours
|
| 106 |
|
| 107 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|
|
|
|
| 10 |
|
| 11 |
**Training Duration**
|
| 12 |
- How much audio was seen during training? Smaller durations result in a lower overall grade.
|
| 13 |
+
- 10 hours <= **HH hours** < 100 hours
|
| 14 |
- 1 hour <= H hours < 10 hours
|
| 15 |
- 10 minutes <= MM minutes < 100 minutes
|
| 16 |
+
- 1 minute <= _M minutes_ < 10 minutes 🤏
|
| 17 |
|
| 18 |
### American English 🇺🇸
|
| 19 |
|
| 20 |
+
- `lang_code='a'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
|
| 21 |
+
- espeak-ng `en-us` fallback
|
| 22 |
|
| 23 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|
| 24 |
| ---- | ------ | -------------- | ----------------- | ------------- | ------ |
|
|
|
|
| 31 |
| af_nova | 🚺 | B | MM minutes | C | `e0233676` |
|
| 32 |
| af_river | 🚺 | C | MM minutes | D | `e149459b` |
|
| 33 |
| af_sarah | 🚺 | B | H hours | C+ | `49bd364e` |
|
| 34 |
+
| af_sky | 🚺🤏 | B | _M minutes_ | C- | `c799548a` |
|
| 35 |
| am_adam | 🚹 | D | H hours | F+ | `ced7e284` |
|
| 36 |
| am_echo | 🚹 | C | MM minutes | D | `8bcfdc85` |
|
| 37 |
| am_eric | 🚹 | C | MM minutes | D | `ada66f0e` |
|
|
|
|
| 40 |
| am_michael | 🚹 | B | H hours | C+ | `9a443b79` |
|
| 41 |
| am_onyx | 🚹 | C | MM minutes | D | `e8452be1` |
|
| 42 |
| am_puck | 🚹 | B | H hours | C+ | `dd1d8973` |
|
| 43 |
+
| am_santa | 🚹🤏 | C | _M minutes_ | D- | `7f2f7582` |
|
| 44 |
|
| 45 |
### British English 🇬🇧
|
| 46 |
|
| 47 |
+
- `lang_code='b'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
|
| 48 |
+
- espeak-ng `en-gb` fallback
|
| 49 |
|
| 50 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|
| 51 |
| ---- | ------ | -------------- | ----------------- | ------------- | ------ |
|
|
|
|
| 60 |
|
| 61 |
### French 🇫🇷
|
| 62 |
|
| 63 |
+
- `lang_code='f'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
|
| 64 |
- espeak-ng `fr-fr`
|
| 65 |
- Total French training data: <11 hours
|
| 66 |
|
|
|
|
| 70 |
|
| 71 |
### Hindi 🇮🇳
|
| 72 |
|
| 73 |
+
- `lang_code='h'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
|
| 74 |
- espeak-ng `hi`
|
| 75 |
- Total Hindi training data: H hours
|
| 76 |
|
|
|
|
| 83 |
|
| 84 |
### Italian 🇮🇹
|
| 85 |
|
| 86 |
+
- `lang_code='i'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
|
| 87 |
- espeak-ng `it`
|
| 88 |
- Total Italian training data: H hours
|
| 89 |
|
|
|
|
| 94 |
|
| 95 |
### Japanese 🇯🇵
|
| 96 |
|
| 97 |
+
- `lang_code='j'` in [`misaki[ja]`](https://github.com/hexgrad/misaki)
|
| 98 |
- Total Japanese training data: H hours
|
| 99 |
|
| 100 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 | CC BY |
|
| 101 |
| ---- | ------ | -------------- | ----------------- | ------------- | ------ | ----- |
|
| 102 |
| jf_alpha | 🚺 | B | H hours | C+ | `1bf4c9dc` | |
|
| 103 |
| jf_gongitsune | 🚺 | B | MM minutes | C | `1b171917` | [gongitsune](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__gongitsune.txt) |
|
| 104 |
+
| jf_nezumi | 🚺🤏 | B | _M minutes_ | C- | `d83f007a` | [nezuminoyomeiri](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__nezuminoyomeiri.txt) |
|
| 105 |
| jf_tebukuro | 🚺 | B | MM minutes | C | `0d691790` | [tebukurowokaini](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__tebukurowokaini.txt) |
|
| 106 |
+
| jm_kumo | 🚹🤏 | B | _M minutes_ | C- | `98340afd` | [kumonoito](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__kumonoito.txt) |
|
| 107 |
|
| 108 |
### Mandarin Chinese 🇨🇳
|
| 109 |
|
| 110 |
+
- `lang_code='z'` in [`misaki[zh]`](https://github.com/hexgrad/misaki)
|
| 111 |
- Total Mandarin Chinese training data: H hours
|
| 112 |
|
| 113 |
| Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
|
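The **Training Duration** legend changed in VOICES.md maps an amount of training audio to one of four labels. As a rough illustration only, the mapping might be sketched as below. The function name is hypothetical, and because the "MM minutes" range (10 to 100 minutes) overlaps the "H hours" range (1 to 10 hours) between 60 and 100 minutes, this sketch assumes the hours label wins from 60 minutes onward; the legend itself does not say which applies.

```python
def duration_bucket(minutes: float) -> str:
    """Map a training duration in minutes to a legend label.

    Assumption: durations of 60 minutes or more are labelled in hours,
    which resolves the overlap between "MM minutes" and "H hours".
    """
    if minutes < 1:
        raise ValueError("legend starts at 1 minute")
    hours = minutes / 60
    if hours >= 100:
        raise ValueError("legend stops below 100 hours")
    if hours >= 10:
        return "HH hours"    # 10 hours <= HH hours < 100 hours
    if hours >= 1:
        return "H hours"     # 1 hour <= H hours < 10 hours
    if minutes >= 10:
        return "MM minutes"  # 10 minutes <= MM minutes < 100 minutes (capped at 60 here)
    return "M minutes"       # 1 minute <= M minutes < 10 minutes


for sample in (5, 30, 90, 600):
    print(sample, "->", duration_bucket(sample))
```

Under that assumption, a voice trained on 5 minutes of audio falls in the new _M minutes_ bucket, the smallest one in the legend.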