scrabble-dictionary/README.md
Ilia Denisov 1d34753611
CI: build-only validation (no make/python/contexts); commit folded erudit.txt
- build.yaml dropped the release step whose ${{ github.* }} contexts failed the Gitea
  workflow compile (the run produced 0 jobs); it now inlines go run (no make dependency)
  and reads the committed dictprep/russian/erudit.txt (no python dependency).
- erudit.txt is scrabble.txt with Ё→Е folded (dictprep/fold_yo.py); it reproduces the
  canonical ru_erudit.dawg byte-for-byte. Release artifacts are published manually for now
  (see README).
2026-06-04 19:43:44 +02:00


scrabble-dictionary

Versioned dictionary artifacts for the Scrabble game backend: the word-list sources and the build pipeline that produces the dictionary DAWGs, published as a release artifact (the DAWGs are data, not a Go module).

The build uses the published scrabble-solver dictdawg/wordlist packages (pinned in go.mod) over github.com/iliadenisov/{dafsa,alphabet} (v1.1.0), so the on-disk format and letter indexing match the running backend exactly. There is no index drift, because the backend pins the same dafsa/alphabet versions. The DAWGs this repo builds are byte-identical to the solver's committed test fixtures.
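
The byte-identity claim can be checked mechanically. The following is a minimal sketch, assuming the freshly built DAWGs and the solver's fixtures sit in two local directories; the function name and both directory arguments are hypothetical, since the fixture location is not stated here.

```python
import filecmp
from pathlib import Path


def mismatched_dawgs(built_dir, fixture_dir, names):
    """Return the names whose built file is missing or differs from its fixture."""
    mismatched = []
    for name in names:
        built = Path(built_dir) / name
        fixture = Path(fixture_dir) / name
        # shallow=False forces a content comparison instead of trusting size/mtime.
        same = (
            built.is_file()
            and fixture.is_file()
            and filecmp.cmp(built, fixture, shallow=False)
        )
        if not same:
            mismatched.append(name)
    return mismatched
```

An empty result means every listed file matched byte for byte; any returned name points at a file worth diffing before a release.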

Artifact

make dawg builds three DAWGs into dawg/:

| file | variant | source |
| --- | --- | --- |
| en_sowpods.dawg | English (SOWPODS) | dictionaries/english/sowpods.txt |
| ru_scrabble.dawg | Russian Scrabble | dictprep/russian/scrabble.txt |
| ru_erudit.dawg | Эрудит | dictprep/russian/erudit.txt (Ё→Е folded scrabble.txt, via dictprep/fold_yo.py) |

CI (.gitea/workflows/build.yaml) rebuilds them on every push/PR as a validation gate, using an inlined go run so the runner needs neither make nor python. Release artifacts are published per version (see Release below): the three DAWGs are packaged flat into scrabble-dawg-<tag>.tar.gz and attached to the Gitea release for the vX.Y.Z tag. The backend deploy unpacks that tarball into BACKEND_DICT_DIR. One semver label versions the whole set, and releases are additive: a new version is a new release and never breaks a running backend.
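
As an illustration of the unpack step, here is a minimal sketch of a deploy-side helper that checks the tarball really is flat and contains exactly the three DAWGs before writing them into the dictionary directory. The helper name and its checks are assumptions for illustration, not the backend's actual deploy code.

```python
import tarfile
from pathlib import Path

# The three files the release tarball is documented to contain.
EXPECTED = {"en_sowpods.dawg", "ru_scrabble.dawg", "ru_erudit.dawg"}


def unpack_dawgs(tarball, dict_dir):
    """Extract a flat scrabble-dawg tarball into dict_dir after checking its contents."""
    with tarfile.open(tarball, "r:gz") as tar:
        members = tar.getmembers()
        names = {m.name for m in members}
        # "Flat" here means regular files at the top level: no directories, no prefixes.
        if names != EXPECTED or not all(m.isfile() for m in members):
            raise ValueError(f"unexpected tarball contents: {sorted(names)}")
        target = Path(dict_dir)
        target.mkdir(parents=True, exist_ok=True)
        for m in members:
            # Write by bare member name so nothing can land outside dict_dir.
            (target / m.name).write_bytes(tar.extractfile(m).read())
    return sorted(names)
```

Rejecting anything but the expected three names keeps a malformed or partially built archive from silently replacing a working dictionary set.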

Sources / provenance

  • English: dictionaries/english/sowpods.txt, vendored from kamilmielnik/scrabble-dictionaries.
  • Russian: dictprep/russian/scrabble.txt, derived from the Russian academic orthographic dictionary by the tooling under dictprep/ (see dictprep/README.md); dictprep/russian/erudit.txt is its Ё→Е folded form (dictprep/fold_yo.py). Only the prepared word lists are vendored; the heavy upstream source (the orfo PDF/text) is not.

Build

make dawg     # -> dawg/{en_sowpods,ru_scrabble,ru_erudit}.dawg

Requires Go; module dependencies are fetched with GOPRIVATE=gitea.iliadenisov.ru/*, which the Makefile exports. Python is not needed for the build because the Ё→Е fold is committed as erudit.txt. To regenerate it, run python3 dictprep/fold_yo.py dictprep/russian/scrabble.txt > dictprep/russian/erudit.txt.
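
For readers unfamiliar with the fold, the sketch below shows the transformation in isolation. It is not the contents of dictprep/fold_yo.py: it assumes one word per line, maps both cases of Ё, and drops duplicates that the fold creates, which the real script may or may not do.

```python
# Hypothetical illustration of a Ё→Е fold; not the actual dictprep/fold_yo.py.
YO_FOLD = str.maketrans({
    "\u0401": "\u0415",  # Ё -> Е (uppercase)
    "\u0451": "\u0435",  # ё -> е (lowercase)
})


def fold_yo(word):
    """Replace every Ё/ё in a word with Е/е, leaving other letters untouched."""
    return word.translate(YO_FOLD)


def fold_lines(lines):
    """Fold each line; skip blanks and duplicates, preserving first-seen order."""
    seen = set()
    out = []
    for line in lines:
        folded = fold_yo(line.strip())
        if folded and folded not in seen:
            seen.add(folded)
            out.append(folded)
    return out
```

Deduplication matters in this sketch because two distinct source words that differ only in Ё versus Е collapse to the same folded spelling.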

Release

CI builds and validates the DAWGs but does not upload them: the release upload needs a write token, which is kept out of CI for now (automating the upload is a future enhancement). To publish a version, tag it and attach the artifact to its Gitea release:

make dawg
tar czf scrabble-dawg-vX.Y.Z.tar.gz -C dawg en_sowpods.dawg ru_scrabble.dawg ru_erudit.dawg
# create the Gitea release for tag vX.Y.Z and upload scrabble-dawg-vX.Y.Z.tar.gz as an asset

The backend consumes it at https://gitea.iliadenisov.ru/developer/scrabble-dictionary/releases/download/vX.Y.Z/scrabble-dawg-vX.Y.Z.tar.gz.
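
The URL pattern above can be derived from the tag alone. A minimal sketch, assuming tags always have the plain vX.Y.Z shape with no pre-release suffix; the function name and the strict tag check are illustrative assumptions.

```python
import re

BASE = "https://gitea.iliadenisov.ru/developer/scrabble-dictionary/releases/download"
# Assumption: release tags are exactly v<major>.<minor>.<patch>.
TAG_RE = re.compile(r"v\d+\.\d+\.\d+")


def release_url(tag):
    """Build the download URL of the DAWG tarball for a given release tag."""
    if not TAG_RE.fullmatch(tag):
        raise ValueError(f"not a vX.Y.Z tag: {tag!r}")
    return f"{BASE}/{tag}/scrabble-dawg-{tag}.tar.gz"
```

For a hypothetical tag v1.2.3 this yields a URL ending in /releases/download/v1.2.3/scrabble-dawg-v1.2.3.tar.gz.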