Generate Chinese flashcards from raw learning text using Wiktionary and OpenAI.
- Python 3.9+
- OpenAI API key
- Install dependencies:

      python3 -m pip install -r requirements.txt

- Create a `.env` in the project root:

      OPENAI_API_KEY=sk-...
      # Optional model override
      OPENAI_MODEL=gpt-4o

- Create instance directories under `output/`, each with an `input.txt`:

      mkdir -p output/book
      cp your_input.txt output/book/input.txt

- Run the generator (streams progress, skips existing `.md` files, halts on first error):

      .venv/bin/python generate.py --verbose

- Set up venv + deps:

      make setup

- Run the generator via the Makefile:

      make generate

- Per-word files: `<HEADWORD>.md` in each instance directory under `output/`.
- Automatically creates/uses `extracted.txt` in each instance to list vocab; edit it to change the processing set/order.
- Skips any vocab that already has `<HEADWORD>.md` in the instance directory.
- Recurses for multi‑character words: writes parent, subword character cards, and component cards until no more named components.
- Halts on the first BLOCKED with a detailed reason; re‑run to resume (completed files are skipped).
- Pronunciation is Pinyin‑only (tone marks), multiple readings separated by `/`.
- Examples are formatted as `ZH (pinyin) - EN`.
- `.env` is auto-loaded on startup (no manual export needed).
- The tool calls OpenAI for judgment steps (headword extraction and single-headword field extraction), then validates and writes Markdown following a fixed schema.
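
The skip-and-resume behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `generate.py`: the `Blocked` exception, the `read_vocab` and `process_instance` helpers, and the `make_card` callable are all hypothetical names, and `extracted.txt` is assumed to hold one headword per line.

```python
from pathlib import Path


class Blocked(Exception):
    """Hypothetical error type standing in for a BLOCKED result."""


def read_vocab(instance_dir: Path) -> list[str]:
    """Return headwords from extracted.txt, keeping file order.

    Assumes one headword per line; blank lines are ignored.
    """
    text = (instance_dir / "extracted.txt").read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def process_instance(instance_dir: Path, make_card) -> list[str]:
    """Write <HEADWORD>.md for each pending headword; return the ones written.

    make_card is a hypothetical callable (headword -> Markdown string) that
    raises Blocked when a card cannot be produced.
    """
    written = []
    for headword in read_vocab(instance_dir):
        target = instance_dir / f"{headword}.md"
        if target.exists():
            continue  # already written by an earlier run
        # If make_card raises Blocked, nothing is written for this headword
        # and the exception halts the run. Files written so far stay on disk,
        # so the next run skips them and resumes here.
        target.write_text(make_card(headword), encoding="utf-8")
        written.append(headword)
    return written
```

Because completed files double as the progress record, no separate state file is needed in this sketch: deleting a `.md` file is enough to have that headword regenerated on the next run.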
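
To make the pronunciation and example conventions concrete, here is a small sketch of the two formats. The function names are hypothetical, and joining readings with a bare `/` (no surrounding spaces) is an assumption; the tool's actual schema validation may differ.

```python
def format_pronunciation(readings: list[str]) -> str:
    """Join one or more tone-marked Pinyin readings with '/'."""
    return "/".join(r.strip() for r in readings if r.strip())


def format_example(zh: str, pinyin: str, en: str) -> str:
    """Render one example in the 'ZH (pinyin) - EN' shape."""
    return f"{zh.strip()} ({pinyin.strip()}) - {en.strip()}"


print(format_pronunciation(["háng", "xíng"]))       # háng/xíng
print(format_example("你好", "nǐ hǎo", "hello"))    # 你好 (nǐ hǎo) - hello
```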
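
For readers curious what auto-loading a `.env` involves, the following is a standard-library sketch. It is not necessarily how this tool does it (a dotenv library may well be used instead); `load_env_file` and `apply_env` are hypothetical helpers, and only plain `KEY=VALUE` lines and `#` comments are handled.

```python
import os
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values: dict[str, str]) -> None:
    """Copy parsed values into os.environ without overriding existing ones."""
    for key, value in values.items():
        os.environ.setdefault(key, value)
```

Using `setdefault` means a variable already exported in the shell wins over the file, which is a common convention but an assumption here.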