Consolidate the scattered build inputs (dictionaries/english/, dictprep/russian/) into one sources/ tree keyed by the variant labels (scrabble_en, scrabble_ru, erudit_ru), and move the Russian prep pipeline to tools/.

The dawg outputs and their filenames (en_sowpods, ru_scrabble, ru_erudit) are unchanged and rebuild byte-identical, so the release artifact and the backend are unaffected.

ru_stage2.py OUT_DIR and the ruwords flag defaults are repointed to sources/scrabble_ru/. The Makefile, CI, the cmd/builddict default, and the README are updated, and the pipeline intermediates are git-ignored.

Verified: make dawg output is byte-identical to the committed baseline; py_compile and go vet pass on the moved tools. The full Russian regeneration pipeline (pymorphy3/libmorph/orfo PDF) was not run here.
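The byte-identical claim can be re-checked by hashing each rebuilt output against a saved copy of the baseline. The following is a minimal sketch, not part of this commit: the helper names `sha256_of` and `changed_outputs`, the two directory paths, and the `.dawg` extension on `en_sowpods` and `ru_erudit` are assumptions made for illustration (only `dawg/ru_scrabble.dawg` is named in the README below).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_outputs(baseline_dir: Path, rebuilt_dir: Path, names: list[str]) -> list[str]:
    """Return the names whose rebuilt bytes differ from the baseline.

    A file missing on either side is reported as changed.
    """
    changed = []
    for name in names:
        old, new = baseline_dir / name, rebuilt_dir / name
        if not (old.is_file() and new.is_file()) or sha256_of(old) != sha256_of(new):
            changed.append(name)
    return changed


if __name__ == "__main__":
    # Hypothetical layout: a copy of the committed outputs next to a fresh build.
    outputs = ["en_sowpods.dawg", "ru_scrabble.dawg", "ru_erudit.dawg"]  # extensions assumed
    diff = changed_outputs(Path("baseline/dawg"), Path("dawg"), outputs)
    print("byte-identical" if not diff else f"changed: {diff}")
```

An empty result means every listed output matches its baseline byte for byte; any name in the list points at an output that drifted or was not produced.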
@@ -0,0 +1,9 @@
# scrabble_ru source

`scrabble.txt`: Russian Scrabble common nouns (nominative singular), produced by the prep
pipeline under [`../../tools/`](../../tools/README.md) from the Russian academic orthographic
dictionary, cross-checked against OpenCorpora and libmorph. `manual_confirm.txt` holds the
hand-reviewed additions the pipeline merges in. Built to `dawg/ru_scrabble.dawg` (`make dawg-ru`).

The pipeline's uncommitted intermediates (`orfo_dict_2025.txt`, `all.txt`, debug dumps) are
regenerated here locally and are git-ignored.
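
As a rough illustration of the merge step mentioned above (hand-reviewed additions folded into the generated list), the sketch below unions two word lists. It is not the pipeline's code from `tools/`. The function name `merge_word_lists`, the one-word-per-line layout, the lowercasing, and the `#` comment convention are all assumptions.

```python
def merge_word_lists(generated: list[str], confirmed: list[str]) -> list[str]:
    """Union two word lists into one sorted, de-duplicated list.

    Assumed conventions (hypothetical, not taken from the real pipeline):
    one word per line, case-insensitive, blank lines and '#' comments skipped.
    """
    words = set()
    for line in (*generated, *confirmed):
        word = line.strip().lower()
        if word and not word.startswith("#"):
            words.add(word)
    # Plain code-point order; note that this places 'ё' after 'я'.
    return sorted(words)


if __name__ == "__main__":
    generated = ["кот\n", "дом\n"]
    confirmed = ["# hand-reviewed additions\n", "Кот\n", "ёж\n", "\n"]
    print(merge_word_lists(generated, confirmed))
```

Sorting by code point keeps the output deterministic, which matters when the downstream artifact is expected to rebuild byte-identical. A locale-aware order would need an explicit collation key.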