Merge pull request 'Implement Scrabble move generator (DAWG)' (#1) from feat/scrabble-solver into master
Reviewed-on: owner/scrabble-solver#1 Reviewed-by: Ilia <3+owner@noreply.gitea.iliadenisov.ru>
This commit was merged in pull request #1.
+18
@@ -0,0 +1,18 @@
# Cached serialized dictionaries, built from the dictionaries/ submodule by
# cmd/builddict. They are reproducible artifacts, not source.
/testdata/*.dawg
/testdata/*.gaddag
/testdata/*.bin

# Local scratch
/tmp/

# Compiled libmorph bridge (build artifact; see dictprep/README.md)
/dictprep/libmorph_check

# Stage 2 --dump debug buckets (committed: all, scrabble, manual_confirm, orfo_dict_2025)
/dictprep/russian/undefined.txt
/dictprep/russian/adjectives.txt
/dictprep/russian/verbs.txt
/dictprep/russian/singulars.txt
/dictprep/russian/fate.tsv
@@ -0,0 +1,3 @@
[submodule "dictionaries"]
	path = dictionaries
	url = https://github.com/kamilmielnik/scrabble-dictionaries
+164
@@ -0,0 +1,164 @@
# Scrabble move generation: algorithm reference (single source of truth)

This is the authoritative description of the algorithm the solver implements: the DAWG
move generator of Appel & Jacobson. It is distilled from the source paper (extracted
from the PDF in this repo) plus our adaptation. **Work from this file; do not re-parse
the PDFs.**

Source: **[AJ88]** A. Appel, G. Jacobson, *The World's Fastest Scrabble Program*,
Commun. ACM 31(5):572–578, 585 (1988).

> A GADDAG (Gordon, 1994) was also implemented and benchmarked against the DAWG, then
> **removed**: for a scoring solver it was ~7× larger and no faster. See `RESULTS.md`.

Notation: letters are alphabet **indexes** (English `a..z` → `0..25`). "Across" =
horizontal play; "down" = vertical. The board grid and all I/O use the byte encoding in
`internal/encoding` (low 6 bits = letter+1, `0` = empty, bit 7 = blank).

---

## 1. Reduction to one dimension [AJ88 §3.1]

Every play is either **across** (one row) or **down** (one column). Down plays are
across plays on the **transposed** board, so the generator implements only "across" and
runs on the board and/or its transpose. This is the two-mode requirement:

- `OnlyHorizontal` = across on the board.
- `OnlyVertical` = across on the transpose.
- `Both` = both. (A move found on the transpose has its coordinates swapped back.)

### Anchors [AJ88 §3.1.2]

An across word must include a newly-placed tile adjacent to a tile already on the board.
Generation starts only from **anchors** (empty squares orthogonally adjacent to a
filled square), which guarantees connectivity and prunes most of the search. The very
first move has no anchors and is special-cased through the board's centre square.

### Cross-checks (cross-sets) [AJ88 §3.1.1]

For each empty square the **cross-set** is the set of letters that, placed there, form a
legal **perpendicular** word with the tiles above/below it; it is stored as a bit-vector
(`letterSet`). A square with no perpendicular neighbour allows every letter. Cross-sets
let move generation stay one-dimensional. Off-board squares are treated as blocking
sentinels.

---

## 2. Lexicon: trie → DAWG [AJ88 §3.2]

The lexicon is a trie whose edges are labelled by letters; a word is a root-to-node
path; terminal nodes are marked. Minimizing the trie (merging equivalent sub-tries)
yields a **DAWG**, the minimum-state automaton for the word set. `dafsa` is exactly such
a minimized, bit-packed DAWG, with a compact ≤6-bit alphabet and a **per-node** `final`
flag. Its node identity is a bit-offset; edges of a node are sorted by letter.

---

## 3. Move generation [AJ88 §3.3] (verbatim)

Two phases per anchor: place a **left part**, then **extend right**. The left part is
either tiles already on the board (read them, then `ExtendRight` from the corresponding
node) or rack tiles placed by a pruned DAWG traversal bounded by `limit` = number of
non-anchor squares to the left (this bound, reset at each anchor, makes each move appear
once).

```
ExtendRight(PartialWord, node N, square):
    if square is vacant then
        if N is terminal then LegalMove(PartialWord)
        for each edge E out of N:
            let l = letter of E
            if l is in the rack and l is in the cross-set of square then
                remove a tile l from the rack
                ExtendRight(PartialWord || l, node(E), square+1)
                put the tile l back into the rack
    else
        let l = letter occupying square
        if N has an edge labelled l leading to N' then
            ExtendRight(PartialWord || l, N', square+1)

LeftPart(PartialWord, node N, limit):
    ExtendRight(PartialWord, N, AnchorSquare)
    if limit > 0 then
        for each edge E out of N:
            let l = letter of E
            if l is in the rack then
                remove a tile l from the rack
                LeftPart(PartialWord || l, node(E), limit - 1)
                put the tile l back into the rack
```

To generate from an anchor with `k` non-anchor squares to its left: `LeftPart("", root, k)`.
Implementation notes:

- A word is **recorded only past the anchor** (`square > anchor`), so every recorded play
  covers the anchor square, which is the connection guaranteed by the anchor's filled
  neighbour.
- **Empty prefix** when a tile sits left of the anchor: skip `LeftPart`; `ExtendRight`
  directly from the node reached by walking the on-board left context.
- **Blanks**: when scanning the rack for a letter, also allow a blank to stand for it;
  the placed tile is flagged (bit 7) and scores 0; restore on backtrack.
- **First move**: the only anchor is the centre; the left limit equals its column, so the
  word covers the centre.

This maps onto the `dafsa` traversal API: `Cursor.Root`, `Cursor.Next(node, letter)`
(an edge, with the destination's `final` flag), `Cursor.Final(node)`, and `Cursor.Arcs`
(enumerate a node's edges, used for placements and cross-sets).

---

## 4. Cross-set computation

For a square with tiles `above` (top→bottom) and `below`, the cross-set is
`{ X : above·X·below ∈ dict }`:

- **Right extension** (no `below`): deterministic. `X` just completes the prefix
  `above`. Walk `above` to a node, then read the **completers** (one arc enumeration:
  the letters whose arc leads straight to an accepting node).
- **Left extension** (tiles `below`): non-deterministic. Probe each `X` (walk `above`,
  `X`, then `below`, test acceptance). This is the asymmetry inherent to a left-to-right
  DAWG.

Cross-sets are recomputed per generation for the squares that need them (cached within a
call). Scoring is done separately by `Evaluate` (§5), so cross-sums are not precomputed.

---

## 5. Scoring (full tournament rules + breakdown)

A play's score = main word + every cross-word formed + the all-tiles bonus. Per word:

- Sum `tileValue(letter)` over its tiles; a **blank** scores 0.
- A **letter** premium (DL/TL) multiplies the value of a tile placed on it **only when
  newly placed** this turn.
- A **word** premium (DW/TW) multiplies the whole word **only when a newly-placed tile
  sits on it**; multiple word premiums multiply.
- Each cross-word counts its new tile plus the existing perpendicular run.

The all-tiles bonus is added when the play uses a full rack. Board geometry, premium
layout, tile values/counts, blank count, rack size and the bonus are all part of the
`rules.Ruleset` (`English`, `RussianScrabble`, `Erudit`). `Evaluate` returns the main
word, the cross-words and a per-tile breakdown. `ValidatePlay` adds dictionary and
connectivity checks.

---

## 6. Rulesets

- **English** Scrabble: 15×15, standard premiums, 26 letters, 100 tiles (98 + 2 blanks),
  7-tile rack, 50-point bonus.
- **Russian Scrabble**: 33-letter alphabet (incl. Ё), standard board, 104 tiles,
  50-point bonus.
- **Эрудит**: 33-letter alphabet with **Ё unused** (no tile; fold Ё→Е when preparing the
  dictionary with `wordlist.FoldYo`, an out-of-engine step), the **centre does not double**,
  131 tiles (128 + 3 blanks), blanks score 0, **15-point bonus**.

---

## 7. Special cases checklist

- **First move**: no anchors; the play must cover the centre.
- **Blanks**: any letter, score 0, flagged via bit 7; expand to every cross-set-allowed
  letter during generation.
- **Off-board sentinels**: stop extension at the edge.
- **A single newly-placed tile** can form both an across and a down word.
- **Dedup**: each legal move is generated once (anchor + left-part limit); a canonical
  move key guards against any residual duplicate.
@@ -0,0 +1,49 @@
# scrabble-solver: project guide

A Go library that, given a dictionary, a board position and a rack, returns every legal
play ranked by score, and scores/validates arbitrary plays. The move generator is the
**DAWG** algorithm (Appel & Jacobson) over `github.com/iliadenisov/dafsa`, a bit-packed,
minimised DAWG with a compact ≤63-symbol alphabet. A GADDAG generator was also built,
measured by self-play, and **removed**: DAWG won for this scoring-solver workload
(~7× smaller, comparable speed); see `RESULTS.md`.

Module `scrabble-solver`, Go 1.26. Rulesets: English Scrabble, Russian Scrabble, and
Russian **Эрудит** (`rules` package); Эрудит has no Ё tile and folds Ё→Е in its dictionary.

## Layout

- `scrabble/`: the public API. `Solver` (`NewSolver`, `GenerateMoves`, `ScorePlay`,
  `ValidatePlay`), the `Move`/`Placement`/`Word` types, the DAWG generator and scoring.
- `board/`, `rack/`, `rules/`: board grid (+ transpose), rack as per-letter counts,
  and rulesets (geometry, premium layout, tile values/counts, alphabet, bonus), namely
  `rules.English()`, `rules.RussianScrabble()`, `rules.Erudit()`.
- `internal/`: `dictdawg` (build/load/serialise DAWGs over dafsa), `wordlist`
  (encode/filter/sort/dedupe + `FoldYo`), `graph`, `dict`.
- `cmd/builddict`: word list → serialised DAWG (`-alphabet latin|russian`).
- `cmd/stress`, `selfplay/`: the self-play stress harness behind `RESULTS.md`.
- `dawg/`: the **committed** dictionaries `en_sowpods.dawg`, `ru_scrabble.dawg`,
  `ru_erudit.dawg` (Ё→Е folded). Rebuild with `make dawg`.
- `dictionaries/`: `kamilmielnik/scrabble-dictionaries` git submodule (English source).
- `dictprep/`: self-contained tooling that turns the Russian academic orthographic
  dictionary into a common-noun word list. See `dictprep/README.md`. Committed output is
  `dictprep/russian/{all,scrabble}.txt` (+ `orfo_dict_2025.{pdf,txt}`, `manual_confirm.txt`).
  Running Stage 2 needs a Python venv with `mawo-pymorphy3` and the `libmorph` apt packages
  (see `dictprep/README.md`).

## Build & test

    go test ./...        # all packages green; also run go vet ./... and gofmt
    make dawg            # rebuild dawg/*.dawg from the word lists

Scoring and move generation are validated against **real tournament games** in GCG format
(`scrabble/gcg_test.go` + `scrabble/testdata/*.gcg`, including the 700+ club): for every
move the test checks the score, the running total, and that the generator actually
produces the played move with that score. These are canonical plays, not invented cases.

## Key facts

- Compact byte encoding: low 6 bits = alphabet index; `0x80` = blank/wildcard (board, rack
  and output bytes only, never inside the graph). The public API is byte-indexed.
- DAWG is the production generator; the GADDAG was removed after measurement.
- Detailed docs: `ALGORITHM.md` (the algorithm, single source of truth), `PLAN.md`
  (design and decisions), `RESULTS.md` (DAWG-vs-GADDAG), `dictprep/README.md` (RU pipeline).
@@ -0,0 +1,28 @@
# Scrabble-solver build helpers.
#
# `make dawg` (re)builds the committed dictionary DAWGs under dawg/ from their word lists:
#   en_sowpods.dawg   English SOWPODS (Latin alphabet)
#   ru_scrabble.dawg  Russian Scrabble nouns (Cyrillic, 33 letters)
#   ru_erudit.dawg    Эрудит (the same list with Ё→Е folded and de-duped)

GO ?= go
PYTHON ?= python3
DAWG_DIR := dawg
BUILDDICT := $(GO) run ./cmd/builddict

.PHONY: dawg dawg-en dawg-ru dawg-erudit clean-dawg

dawg: dawg-en dawg-ru dawg-erudit

dawg-en:
	$(BUILDDICT) -dict dictionaries/english/sowpods.txt -alphabet latin -name en_sowpods -out $(DAWG_DIR)

dawg-ru:
	$(BUILDDICT) -dict dictprep/russian/scrabble.txt -alphabet russian -name ru_scrabble -out $(DAWG_DIR)

dawg-erudit:
	$(PYTHON) dictprep/fold_yo.py dictprep/russian/scrabble.txt > /tmp/ru_erudit_words.txt
	$(BUILDDICT) -dict /tmp/ru_erudit_words.txt -alphabet russian -name ru_erudit -out $(DAWG_DIR)

clean-dawg:
	rm -f $(DAWG_DIR)/*.dawg
@@ -0,0 +1,174 @@
# Scrabble Solver: Implementation Plan

## Outcome (current state)

Both generators were implemented and verified to produce identical moves, then compared
by self-play stress test (`RESULTS.md`). The **GADDAG was removed**: for a scoring solver
it was ~7× larger and no faster than the **DAWG**, which is now the sole generator.
Shipped: the DAWG generator, full scoring + breakdown, the public `Solver`
(`GenerateMoves`/`ScorePlay`/`ValidatePlay`), and three rulesets (English Scrabble,
Russian Scrabble, Эрудит). The rest of this document is the original roadmap, kept for
history; the DAWG/GADDAG comparison it describes is preserved in `RESULTS.md`.

## Context

We are building a Go library that, given a dictionary, a current game position and a
player's rack, returns every legal new play ranked by descending score. The core is a
fast finite-automaton move generator based on two papers (analysed in `ALGORITHM.md`):

- Appel & Jacobson, *The World's Fastest Scrabble Program* (CACM 1988): the **DAWG**
  algorithm (cross-checks, anchor squares, `LeftPart`/`ExtendRight`, edge encoding,
  cross-sums for scoring, transpose for the perpendicular direction).
- Gordon, *A Faster Scrabble Move Generation Algorithm* (SP&E 1994): the **GADDAG**
  (`REV(x)◊y` representation, single `Gen`/`GoOn` generator, deterministic cross-sets,
  construction algorithm).

The graph engine is `github.com/iliadenisov/dafsa`, a compact, bit-packed minimized
DAWG with a 6-bit "compact alphabet" (`alphabet.Indexer`, ≤63 symbols) and an
index-based (`*B`) API, checked out locally at `../dafsa`.

### Headline approach

**Implement BOTH generators (DAWG and GADDAG) behind one shared `Generator`
interface, then decide which becomes the production default empirically**, via a clean
self-play stress test (two greedy players, several games) on the *same* dictionary,
measuring speed and memory. The choice is made **after** implementation and measurement.
Both implementations are kept; the comparison output (`RESULTS.md`) picks the default.

### Locked decisions

| # | Topic | Decision |
|---|---|---|
| 1 | Core algorithm | **Implement both**: DAWG (Appel-Jacobson) and GADDAG (Gordon, over `dafsa` as a DAWG of `{REV(x)◊y}` with per-node final flags). Pick the default after a self-play stress test. |
| 2 | dafsa changes | **Edit `../dafsa` directly**, wire via `go.mod replace`. Leave a spec/CHANGELOG there. |
| 3 | Ruleset scope | Default **standard English Scrabble**, fully **parameterizable** (board geometry, premium layout DL/TL/DW/TW, tile values & counts, alphabet, blank count, bingo bonus). Must support **Russian "Эрудит"** (same 15×15 board + premiums; different tile values/counts; Cyrillic alphabet; one word per move, horizontal **or** vertical). |
| 4 | Scoring | **Full tournament scoring + breakdown**: main word + all cross-words + premiums (newly-placed tiles only) + bingo bonus; result carries formed cross-words and a per-tile breakdown. |
| 5 | Symbol encoding | **`0x80` = wildcard/blank** flag (board/rack/output only, never in the graph). **GADDAG separator `◊` = index == alphabet size** (`cbits` minimal; measured optimum). |
| 6 | State model | **Compact byte board** is the generation core; a structured **`Play`** type + a constructor that applies plays to build a board provide the full game-state overlay. |
| 7 | API scope | **Generation + scoring + validation** of arbitrary plays. |
| 8 | Dictionary | `kamilmielnik/scrabble-dictionaries` as a **git submodule**; `cmd/builddict` builds serialized structures **cached in `testdata/` (gitignored)**. English now; Russian later. |

### Static structure probe (informs expectations; NOT the decision)

Full SOWPODS (267,752 words, Σlen = 2,439,269), built through `dafsa`:

| Structure | nodes | bytes | bits/char | build | ns/arc |
|---|---:|---:|---:|---:|---:|
| DAWG (a–z) | 77,808 | 750 KB | 2.46 | 186 ms | 48.7 |
| GADDAG sep=size(26) · cbits5 | 587,940 | 5.37 MB | 17.61 | 2.92 s | 61.0 |
| GADDAG sep=62 · cbits6 (measured) | 587,940 | 5.53 MB | 18.13 | n/a | 60.9 |
| GADDAG sep=0x40 · cbits7 (extrapolated) | 587,940 | ~5.69 MB | ~18.6 | n/a | ~61 |

The GADDAG is ~7× the size of the DAWG and ~25% costlier per arc, but is expected to be
~2× faster at actual move generation (Gordon Table IV: ~2.5× fewer arcs). The stress test
settles it.

## Deliverable documents

1. **`ALGORITHM.md`**: single source of truth (verbatim pseudocode + our adaptation).
2. **`PLAN.md`**: this plan.
3. **`RESULTS.md`**: stress-test comparison + the production-default decision.

## Architecture & package layout

```
scrabble-solver/
  go.mod                  # + replace github.com/iliadenisov/dafsa => ../dafsa
  PLAN.md  ALGORITHM.md  RESULTS.md  README.md
  dictionaries/           # git submodule: kamilmielnik/scrabble-dictionaries
  testdata/               # gitignored: cached serialized DAWG + GADDAG
  internal/
    gaddag/               # REV(x)◊y transform + build + traversal wrapper over dafsa
    dictdawg/             # plain-DAWG build + traversal wrapper over dafsa
    encoding/             # byte conventions (wildcard 0x80, separator, board cells)
  board/                  # compact board grid, transpose, premium layout
  rack/                   # rack as per-letter counts + blanks
  rules/                  # Ruleset: geometry, premiums, tile values/counts, alphabet, bonus
  scrabble/ (public pkg)  # Solver + Generator interface; Play/Move types;
    gen_dawg.go           #   DAWG generator (LeftPart/ExtendRight)
    gen_gaddag.go         #   GADDAG generator (Gen/GoOn)
  selfplay/               # bag + greedy player + game loop (self-play engine)
  cmd/builddict/          # word list -> serialized DAWG/GADDAG -> testdata
  cmd/stress/             # run N self-play games per generator, emit comparison
```

Shared `Generator` interface so the harness can swap implementations:

```go
type Generator interface {
	GenerateMoves(b *board.Board, r rack.Rack, mode Mode) []Move // ranked, descending score
	Name() string
}
```

Board, rack, rules and **scoring are shared**; cross-set computation is per-generator
(DAWG: probe the dictionary DAWG incl. the non-deterministic left set; GADDAG:
deterministic GADDAG walk); that difference is part of what is measured.

## Changes to `../dafsa` (additive ⇒ backward compatible)

1. **Low-level traversal API**: `type Node` (opaque bit-offset); `Root() Node`;
   `Next(n, ch) (child, final, ok)` (= Gordon's `NextArc`, wraps private `getEdge`);
   `Arcs(n, fn)` (wraps `getNode`, for blanks/cross-sets); a reusable allocation-free
   cursor for hot-path `Next`.
2. **Custom-alphabet persistence**: `WriteTo`/`SaveWith` (allow non-embedded alphabet);
   `ReadWith`/`LoadWith` (inject a known indexer, skip language reconstruction).
3. (Optional) accurate serialized arc/node count; document that `NumEdges()` is build-time.

## Data model & compact formats

- **Byte symbol**: low 6 bits = alphabet index; `0x80` = wildcard/blank (I/O only);
  `◊` = index `len(alphabet)` (GADDAG graph only).
- **Board**: `[]byte`, row-major. `0` = empty; occupied = `letterIndex+1`; blank =
  `(letterIndex+1) | 0x80`. Helpers: `At`, `Set`, `Transpose`, premium lookup.
- **Rack**: `[]byte` counts, length `alphabetSize+1`; last slot = blank count.
- **`Play`**: `{Row, Col; Dir; Tiles []byte (0x80 flags); Main; CrossWords; Score;
  Breakdown}`, the input for apply/validate/score and the output element.
- **Modes**: `Both`, `Horizontal`, `Vertical`.
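
The rack layout described above (per-letter counts with the blank count in the last slot) can be sketched as follows. The type and method names are hypothetical, and the policy in `take` (prefer a real tile, fall back to a blank) is an assumption made for this illustration; a full generator would try both options.

```go
package main

import "fmt"

// rack holds per-letter tile counts; the last slot counts blanks.
type rack []byte

func newRack(alphabetSize int) rack { return make(rack, alphabetSize+1) }

// add puts one tile in the rack. The input is an alphabet index; a byte carrying
// the 0x80 flag counts as a blank.
func (r rack) add(t byte) {
	if t&0x80 != 0 {
		r[len(r)-1]++
		return
	}
	r[t&0x3F]++
}

// take removes a tile for letter index l, falling back to a blank. It returns the
// byte to place on the board (index+1, with 0x80 set for a blank), or ok=false when
// neither a real tile nor a blank is available.
func (r rack) take(l byte) (placed byte, ok bool) {
	switch {
	case r[l] > 0:
		r[l]--
		return l + 1, true
	case r[len(r)-1] > 0:
		r[len(r)-1]--
		return (l + 1) | 0x80, true
	}
	return 0, false
}

func main() {
	r := newRack(26)
	r.add(0)    // one 'a'
	r.add(0x80) // one blank
	for i := 0; i < 3; i++ {
		placed, ok := r.take(0)
		fmt.Printf("0x%02X %v\n", placed, ok)
	}
}
```

The three calls in `main` yield a real `a` (`0x01`), then the blank standing for `a` (`0x81`), then a failure once the rack is exhausted.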

## Staged implementation

- **Stage 0.** Scaffolding & docs: `ALGORITHM.md`, `PLAN.md`, `dictionaries/` submodule,
  `go.mod` replace, `.gitignore`.
- **Stage 1.** dafsa traversal API (shared): `Node`, `Root`, `Next`, `Arcs`, cursor; tests.
- **Stage 2.** dafsa custom-alphabet persistence: `SaveWith`/`ReadWith`; round-trip.
- **Stage 3.** Shared infra: encoding, board (+transpose), rack, rules (EN; Эрудит stub),
  scoring, `Generator` interface, `Move`/`Play` types.
- **Stage 4.** Dictionary build: `internal/dictdawg` + `internal/gaddag`; `cmd/builddict`
  caching serialized DAWG **and** GADDAG in `testdata`.
- **Stage 5.** Cross-sets: DAWG cross-checks (incl. non-deterministic left set) and GADDAG
  deterministic cross-sets; validated against each other + brute force on a small lexicon.
- **Stage 6.** DAWG generator (`LeftPart`/`ExtendRight`).
- **Stage 7.** GADDAG generator (`Gen`/`GoOn`).
- **Stage 8.** Correctness gate: DAWG and GADDAG identical move sets on random positions
  (each move once) + brute force on a tiny dictionary. Must pass before perf comparison.
- **Stage 9.** Self-play stress test: `selfplay` engine (bag, racks, greedy policy,
  seeded RNG, end conditions); `cmd/stress` plays N games per generator measuring time,
  arcs, allocations (`runtime.MemStats`), peak RSS (`/proc/self/status` VmHWM), footprint;
  emit `RESULTS.md`.
- **Stage 10.** Decision + public API: choose default from `RESULTS.md` (both selectable);
  finalize `Solver` API, Play↔board constructors, examples.
- **Stage 11.** Polish: benchmarks, README, optional prebuilt-graph distribution.

## Verification

- `go test ./...`, `go vet`, lint green per stage.
- Mutual oracle (Stage 8): identical move sets; brute force on a tiny dictionary.
- Build EN structures from the SOWPODS submodule via `cmd/builddict`; run `GenerateMoves`
  on canonical positions (e.g. Gordon's "CARE on ABLE") and assert top moves/scores.
- Run `cmd/stress` (100–1000 seeded games per generator) → `RESULTS.md`.

## Assumptions & caveats

- Both algorithms ship; the production default is decided by the stress test. Both remain
  selectable.
- Self-play policy defaults to **greedy** (deterministic tie-break, seeded RNG); tunable.
- Separator = real 27th token (`◊`, index = size, `cbits=5`); `0x40` reserved on the board.
  Wildcard/blank = `0x80`, never in the graph.
- **Stateless per-call** generation in v1 (anchors + cross-sets recomputed per call);
  incremental maintenance is a later optimization (both generators run stateless, a
  fairness note for the comparison).
- Persistence stores only the graph; the (custom) alphabet is injected on load.
- Russian "Эрудит" alphabet specifics (Е/Ё handling, tile values/counts) resolved at
  Stage 3/4; "one word per move, H or V" is satisfied by the modes.
- The final-flag GADDAG is larger than Gordon's letter-set form; letter-sets-on-arcs
  remain a possible future optimization.
@@ -0,0 +1,89 @@
# scrabble-solver

A Go library that, given a dictionary, a board position and a rack, returns **every
legal play ranked by score**, and also **scores** or **validates** arbitrary plays. The
move generator is the DAWG algorithm of Appel & Jacobson, *The World's Fastest Scrabble
Program*. It operates on compact byte-indexed inputs/outputs and is dictionary-driven via
[`github.com/iliadenisov/dafsa`](https://github.com/iliadenisov/dafsa).

See [`ALGORITHM.md`](ALGORITHM.md) for the algorithm (the single source of truth) and
[`RESULTS.md`](RESULTS.md) for the DAWG-vs-GADDAG benchmark that settled the design.

## Status

- DAWG move generation (across / down / both orientations), with full tournament scoring
  (cross-words, premiums, all-tiles bonus) and a per-tile breakdown.
- Public `Solver`: `GenerateMoves` (ranked), `ScorePlay`, `ValidatePlay`.
- Rulesets: **English** Scrabble, **Russian** Scrabble, **Эрудит**; `rules.Ruleset` is
  fully parameterizable (board, premiums, tile values/counts, blanks, rack, bonus).
- A GADDAG (Gordon) was implemented, benchmarked and then **removed**: for a scoring
  solver it was ~7× larger and no faster.

## Layout

```
scrabble/            public API: Solver, Move/Play types, DAWG generator, scoring, validation
board/ rack/ rules/  board grid (+transpose), rack, rulesets (English/Russian/Эрудит)
internal/            encoding (byte conventions), wordlist, dictdawg, dict, graph
cmd/builddict/       word list -> serialized DAWG in testdata
cmd/stress/          greedy self-play benchmark of the generator
selfplay/            bag + greedy player + game loop
```

## Setup

```sh
git submodule update --init     # the dictionaries submodule (SOWPODS, TWL06, …)
go run ./cmd/builddict          # build testdata/sowpods.dawg (≈0.2 s, ~730 KB)
```

`go.mod` carries `replace github.com/iliadenisov/dafsa => ../dafsa`: the solver needs
dafsa's low-level traversal `Cursor` (see the patch notes in `../dafsa/SCRABBLE_API.md`).

## Usage

```go
rs := rules.English()
finder, _ := dict.EnglishDAWG() // loads testdata/sowpods.dawg
s := scrabble.NewSolver(rs, finder)

b := board.New(rs.Rows, rs.Cols) // empty board (first move)

r := rack.New(rs.Size()) // rack "friends"
tiles, _ := rs.Alphabet.Encode("friends")
for _, t := range tiles {
	r.Add(t)
}

moves := s.GenerateMoves(b, r, scrabble.Both) // ranked, highest score first
best := moves[0]
// best.Main / best.Cross hold the words (alphabet indexes; decode via rs.Alphabet),
// best.Tiles the placed tiles (with blank flags), best.Score the total.

// Score or validate an arbitrary play (placed tiles + direction):
m, err := s.ValidatePlay(b, scrabble.Horizontal, best.Tiles)
_ = m
_ = err
```

Words and tiles are alphabet **indexes** throughout (no string wrapper); convert with the
ruleset's `alphabet.Indexer` (`Encode`/`Decode`) when you need text.

## Rulesets

`rules.English()`, `rules.RussianScrabble()`, `rules.Erudit()`, or build your own with
`rules.FromTemplate(...)`. For Эрудит, fold Ё→Е while preparing the dictionary with
`wordlist.FoldYo` (the engine treats them as one letter; it is a dictionary-prep step).

## Benchmark

```sh
go run ./cmd/stress -games 100   # greedy AI-vs-AI self-play; reports speed and memory
```

## Tests

```sh
go test ./...           # unit tests + a brute-force move-generation oracle
go test ./... -short    # skips the full-dictionary game test
```
+87
@@ -0,0 +1,87 @@
# DAWG vs GADDAG: self-play stress test

> **Note.** The GADDAG generator has since been **removed** from the codebase; the DAWG
> is the sole move generator. This document is kept as the record of the comparison that
> justified that choice. `cmd/stress` now benchmarks the DAWG alone.

The two move generators were compared by playing greedy AI-vs-AI games on the same
dictionary with the same seeds, measuring speed and memory. Reproduce with:

```
go run ./cmd/builddict            # build testdata/sowpods.{dawg,gaddag}
go run ./cmd/stress -games 200
```

## Setup

- Dictionary: English **SOWPODS**, 267,752 words (2–15 letters).
- Board: standard 15×15; greedy player (highest-scoring move), seeded RNG.
- 200 games per generator, identical seeds; mode = both orientations.
- Machine: this dev container (Go 1.26, 12 cores; single-threaded run).

## Results (200 games)

Includes the **deterministic cross-set optimization** (one arc enumeration via
`completers()` for one-sided squares instead of probing every letter); both one-sided
cases are deterministic for the GADDAG, only the right-extension is for the DAWG.

| metric | DAWG | GADDAG |
|---|---:|---:|
| structure size | **732.5 KB** | 5.12 MB |
| games / turns / plays | 200 / 4783 / 4769 | 200 / 4783 / 4769 |
| moves generated | 3,880,236 | 3,880,236 |
| generation time | **23.68 s** | 25.98 s |
| µs / move-generation call | **4951** | 5431 |
| moves generated / sec | **163,863** | 149,383 |
| arcs traversed | **261.8 M** | 276.0 M |
| arcs / move generated | **67.5** | 71.1 |
| heap allocated | 16.79 GB | 16.79 GB |
| GC cycles | 5624 | 5605 |
| avg final game score | 849.3 | 849.3 |

`GADDAG vs DAWG: 1.10× generation time, 7.16× structure size, 1.00× heap.`
Peak process RSS (both structures mapped): ~42 MB.

The optimization cut the GADDAG's cross-set arcs (285.5 M → 276.0 M) and narrowed the
arc gap (1.08× → 1.05×), but the verdict is unchanged: **the GADDAG is still ~10 %
slower, 7× larger and traverses slightly more arcs.** End-to-end time is dominated by
the shared per-move scoring (`Evaluate`, ~3.9 M calls), not by graph search, so the
search-algorithm difference barely moves the total, and what difference remains favours
the narrower, smaller DAWG.

## Interpretation

- **Correctness.** Both generators produce the *identical* set of moves and scores at
  every position (identical turns/plays/moves/score above, and the Stage-8 mutual oracle
  agreeing on 119 positions over 5 games). They differ only in how they search.
- **Speed.** The DAWG is ~10 % faster end-to-end and traverses ~5 % **fewer** arcs.
|
||||
- **The GADDAG does not win here, contrary to Gordon's paper.** Two reasons:
|
||||
1. We use the **final-flag GADDAG** (the minimized DAWG of `REV(x)◊y`, completion via
|
||||
accepting states) so that `dafsa` can be used essentially unmodified. This variant
|
||||
lacks Gordon's *letter-sets-on-arcs* compression, so it is both ~7× larger and has
|
||||
wider nodes — it traverses *more* arcs, not fewer, erasing the theoretical edge.
|
||||
2. The workload is a **scoring solver**: every generated move is scored (cross-words,
|
||||
premiums, bonus) by the shared `Evaluate`. That shared per-move cost dominates, so
|
||||
the search-algorithm difference is small — and what remains favours the simpler,
|
||||
narrower DAWG.
|
||||
- **Memory.** The GADDAG structure is 7.16× larger (5.12 MB vs 732 KB). Per-move heap
|
||||
allocation is identical (dominated by shared scoring), and overall RSS is modest.
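
To make the size argument in reason 1 concrete, here is a small Python sketch of how a `REV(x)◊y` word set is derived from a plain word. It is an illustration only: the function name is hypothetical, and keeping the separator on the final split (empty `y`) is an assumption about the final-flag variant, not something taken from the solver's code.

```python
SEP = "◊"  # separator symbol, as written in REV(x)◊y


def gaddag_entries(word):
    """One entry per non-empty prefix x of word: reversed x, separator, remainder y."""
    return [word[:i][::-1] + SEP + word[i:] for i in range(1, len(word) + 1)]


for entry in gaddag_entries("care"):
    print(entry)
```

A word of length n contributes n strings to the GADDAG against one to the DAWG, which is where the larger structure comes from before minimization recovers part of the difference.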
|
||||
|
||||
## Decision
|
||||
|
||||
**Use the DAWG (Appel-Jacobson) generator as the production default.** For this library
|
||||
(a move generator that scores and ranks every play) it is smaller (7×), at least as fast,
|
||||
and simpler to operate: it needs no separator symbol or custom alphabet, `dafsa`'s
|
||||
`Save`/`Load` work unchanged, and it requires the fewest `dafsa` additions (only the
|
||||
shared low-level traversal API).
|
||||
|
||||
The **GADDAG generator is kept and remains selectable** (`scrabble.NewGADDAGGenerator`),
|
||||
both as an alternative and as a continuing correctness oracle for the DAWG.
|
||||
|
||||
### Caveats / what could change the picture
|
||||
|
||||
- A **letter-set GADDAG** (Gordon's true compressed form) plus **incremental cross-set
|
||||
maintenance** would shrink the GADDAG and cut its arc count; it might then beat the DAWG
|
||||
on raw move generation. This was not pursued: the DAWG already meets the goal, and the
|
||||
7× size gap is decisive for a scoring-solver workload where generation is not the
|
||||
bottleneck. It remains a documented future optimization.
|
||||
+105
@@ -0,0 +1,105 @@
|
||||
// Package board holds the compact game board: a row-major grid of cell bytes encoded
|
||||
// per internal/encoding (0 = empty, letter+1, with 0x80 marking a blank). It is
|
||||
// otherwise alphabet-agnostic.
|
||||
package board
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"unicode"
|
||||
|
||||
"github.com/iliadenisov/alphabet"
|
||||
|
||||
"scrabble-solver/internal/encoding"
|
||||
)
|
||||
|
||||
// Board is a row-major grid of encoded cells.
|
||||
type Board struct {
|
||||
rows, cols int
|
||||
cells []byte
|
||||
}
|
||||
|
||||
// New returns an empty rows×cols board.
|
||||
func New(rows, cols int) *Board {
|
||||
return &Board{rows: rows, cols: cols, cells: make([]byte, rows*cols)}
|
||||
}
|
||||
|
||||
// Rows returns the number of rows.
|
||||
func (b *Board) Rows() int { return b.rows }
|
||||
|
||||
// Cols returns the number of columns.
|
||||
func (b *Board) Cols() int { return b.cols }
|
||||
|
||||
// At returns the encoded cell at (r, c).
|
||||
func (b *Board) At(r, c int) byte { return b.cells[r*b.cols+c] }
|
||||
|
||||
// Set stores the encoded cell v at (r, c).
|
||||
func (b *Board) Set(r, c int, v byte) { b.cells[r*b.cols+c] = v }
|
||||
|
||||
// InBounds reports whether (r, c) is on the board.
|
||||
func (b *Board) InBounds(r, c int) bool {
|
||||
return r >= 0 && r < b.rows && c >= 0 && c < b.cols
|
||||
}
|
||||
|
||||
// Empty reports whether (r, c) is an empty square.
|
||||
func (b *Board) Empty(r, c int) bool { return encoding.IsEmpty(b.cells[r*b.cols+c]) }
|
||||
|
||||
// Filled reports whether (r, c) is on the board and occupied.
|
||||
func (b *Board) Filled(r, c int) bool {
|
||||
return b.InBounds(r, c) && !encoding.IsEmpty(b.cells[r*b.cols+c])
|
||||
}
|
||||
|
||||
// IsEmpty reports whether the whole board is empty (used for the first move).
|
||||
func (b *Board) IsEmpty() bool {
|
||||
for _, c := range b.cells {
|
||||
if !encoding.IsEmpty(c) {
|
||||
return false
|
||||
}
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
// Clone returns a deep copy of the board.
|
||||
func (b *Board) Clone() *Board {
|
||||
cp := &Board{rows: b.rows, cols: b.cols, cells: make([]byte, len(b.cells))}
|
||||
copy(cp.cells, b.cells)
|
||||
return cp
|
||||
}
|
||||
|
||||
// Transpose returns a new board with rows and columns swapped, turning vertical lines
|
||||
// into horizontal ones. Down-play generation runs on the transpose.
|
||||
func (b *Board) Transpose() *Board {
|
||||
t := &Board{rows: b.cols, cols: b.rows, cells: make([]byte, len(b.cells))}
|
||||
for r := range b.rows {
|
||||
for c := range b.cols {
|
||||
t.cells[c*t.cols+r] = b.cells[r*b.cols+c]
|
||||
}
|
||||
}
|
||||
return t
|
||||
}
|
||||
|
||||
// Parse builds a board from text rows: '.' (or space) is an empty square, a lowercase
|
||||
// letter is a normal tile, and an uppercase letter is a blank standing for that letter.
|
||||
// Letters are resolved through idx.
|
||||
func Parse(rows []string, idx alphabet.Indexer) (*Board, error) {
|
||||
if len(rows) == 0 {
|
||||
return nil, fmt.Errorf("board: no rows")
|
||||
}
|
||||
cols := len([]rune(rows[0]))
|
||||
b := New(len(rows), cols)
|
||||
for r, line := range rows {
|
||||
runes := []rune(line)
|
||||
for c := 0; c < cols && c < len(runes); c++ {
|
||||
ch := runes[c]
|
||||
if ch == '.' || ch == ' ' {
|
||||
continue
|
||||
}
|
||||
blank := unicode.IsUpper(ch)
|
||||
li, err := idx.Index(string(unicode.ToLower(ch)))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("board: row %d col %d %q: %w", r, c, string(ch), err)
|
||||
}
|
||||
b.Set(r, c, encoding.Cell(li, blank))
|
||||
}
|
||||
}
|
||||
return b, nil
|
||||
}
|
||||
@@ -0,0 +1,92 @@
|
||||
package board_test
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/iliadenisov/alphabet"
|
||||
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/internal/encoding"
|
||||
)
|
||||
|
||||
func TestParseAndAccess(t *testing.T) {
|
||||
b, err := board.Parse([]string{
|
||||
"cat",
|
||||
"o..",
|
||||
"W..", // blank standing for 'w'
|
||||
}, alphabet.Latin())
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if b.Rows() != 3 || b.Cols() != 3 {
|
||||
t.Fatalf("size = %dx%d, want 3x3", b.Rows(), b.Cols())
|
||||
}
|
||||
if b.IsEmpty() {
|
||||
t.Error("IsEmpty = true for a non-empty board")
|
||||
}
|
||||
if !b.Empty(1, 1) {
|
||||
t.Error("(1,1) should be empty")
|
||||
}
|
||||
if !b.Filled(0, 0) {
|
||||
t.Error("(0,0) should be filled")
|
||||
}
|
||||
|
||||
// 'c' = index 2, normal tile.
|
||||
if got := b.At(0, 0); got != encoding.Cell(2, false) {
|
||||
t.Errorf("At(0,0) = %#x, want %#x", got, encoding.Cell(2, false))
|
||||
}
|
||||
if encoding.IsBlank(b.At(0, 0)) {
|
||||
t.Error("(0,0) wrongly marked blank")
|
||||
}
|
||||
// 'W' = blank for index 22.
|
||||
if got := b.At(2, 0); got != encoding.Cell(22, true) {
|
||||
t.Errorf("At(2,0) = %#x, want blank w", got)
|
||||
}
|
||||
if !encoding.IsBlank(b.At(2, 0)) {
|
||||
t.Error("(2,0) should be a blank")
|
||||
}
|
||||
}
|
||||
|
||||
func TestNewIsEmpty(t *testing.T) {
|
||||
if !board.New(15, 15).IsEmpty() {
|
||||
t.Error("new board not empty")
|
||||
}
|
||||
}
|
||||
|
||||
func TestTranspose(t *testing.T) {
|
||||
b, _ := board.Parse([]string{
|
||||
"ab",
|
||||
"..",
|
||||
"cd",
|
||||
}, alphabet.Latin())
|
||||
tr := b.Transpose()
|
||||
if tr.Rows() != 2 || tr.Cols() != 3 {
|
||||
t.Fatalf("transpose size = %dx%d, want 2x3", tr.Rows(), tr.Cols())
|
||||
}
|
||||
if tr.At(0, 0) != b.At(0, 0) || tr.At(1, 0) != b.At(0, 1) || tr.At(0, 2) != b.At(2, 0) {
|
||||
t.Error("transpose did not swap coordinates")
|
||||
}
|
||||
|
||||
// Transposing twice restores the original.
|
||||
back := tr.Transpose()
|
||||
for r := range b.Rows() {
|
||||
for c := range b.Cols() {
|
||||
if back.At(r, c) != b.At(r, c) {
|
||||
t.Fatalf("double transpose differs at (%d,%d)", r, c)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestClone(t *testing.T) {
|
||||
b := board.New(3, 3)
|
||||
b.Set(1, 1, encoding.Cell(0, false))
|
||||
cp := b.Clone()
|
||||
cp.Set(0, 0, encoding.Cell(1, false))
|
||||
if !b.Empty(0, 0) {
|
||||
t.Error("mutating clone changed the original")
|
||||
}
|
||||
if cp.Empty(1, 1) {
|
||||
t.Error("clone lost original content")
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,75 @@
|
||||
// Command builddict converts a word list into a serialized DAWG. By default it reads the
|
||||
// English SOWPODS list (Latin alphabet); pass -alphabet russian for the Cyrillic lists.
|
||||
package main
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"log"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"time"
|
||||
|
||||
"github.com/iliadenisov/alphabet"
|
||||
|
||||
"scrabble-solver/internal/dictdawg"
|
||||
"scrabble-solver/internal/wordlist"
|
||||
)
|
||||
|
||||
func main() {
|
||||
dict := flag.String("dict", "dictionaries/english/sowpods.txt", "word list file (one word per line)")
|
||||
out := flag.String("out", "testdata", "output directory")
|
||||
name := flag.String("name", "sowpods", "base name for the output file")
|
||||
minLen := flag.Int("min", 2, "minimum word length")
|
||||
maxLen := flag.Int("max", 15, "maximum word length")
|
||||
alpha := flag.String("alphabet", "latin", "alphabet: latin (English) or russian")
|
||||
flag.Parse()
|
||||
|
||||
var idx alphabet.Indexer
|
||||
switch *alpha {
|
||||
case "latin":
|
||||
idx = alphabet.Latin()
|
||||
case "russian":
|
||||
idx = alphabet.Embedded(alphabet.Langs.LangRu)
|
||||
default:
|
||||
log.Fatalf("unknown -alphabet %q (want latin or russian)", *alpha)
|
||||
}
|
||||
|
||||
t0 := time.Now()
|
||||
words, err := wordlist.Read(*dict, idx, *minLen, *maxLen)
|
||||
if err != nil {
|
||||
log.Fatalf("read %s: %v", *dict, err)
|
||||
}
|
||||
fmt.Printf("loaded %d words from %s in %s\n", len(words), *dict, time.Since(t0).Round(time.Millisecond))
|
||||
|
||||
if err := os.MkdirAll(*out, 0o755); err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
|
||||
t := time.Now()
|
||||
f, err := dictdawg.Build(idx, words)
|
||||
if err != nil {
|
||||
log.Fatalf("build dawg: %v", err)
|
||||
}
|
||||
path := filepath.Join(*out, *name+".dawg")
|
||||
if err := dictdawg.Save(f, path); err != nil {
|
||||
log.Fatalf("save: %v", err)
|
||||
}
|
||||
size := int64(0)
|
||||
if fi, err := os.Stat(path); err == nil {
|
||||
size = fi.Size()
|
||||
}
|
||||
fmt.Printf("DAWG %d nodes, %s, built+saved in %s -> %s\n",
|
||||
f.NumNodes(), humanBytes(size), time.Since(t).Round(time.Millisecond), path)
|
||||
}
|
||||
|
||||
func humanBytes(n int64) string {
|
||||
switch {
|
||||
case n >= 1<<20:
|
||||
return fmt.Sprintf("%.2f MB", float64(n)/(1<<20))
|
||||
case n >= 1<<10:
|
||||
return fmt.Sprintf("%.1f KB", float64(n)/(1<<10))
|
||||
default:
|
||||
return fmt.Sprintf("%d B", n)
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,104 @@
|
||||
// Command stress plays many greedy AI-vs-AI games and reports the DAWG move generator's
|
||||
// speed and memory. It is a benchmark / regression tool for the production generator.
|
||||
package main
|
||||
|
||||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"log"
|
||||
"os"
|
||||
"runtime"
|
||||
"strconv"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"scrabble-solver/internal/dict"
|
||||
"scrabble-solver/rules"
|
||||
"scrabble-solver/scrabble"
|
||||
"scrabble-solver/selfplay"
|
||||
)
|
||||
|
||||
func main() {
|
||||
games := flag.Int("games", 100, "games to play")
|
||||
flag.Parse()
|
||||
|
||||
rs := rules.English()
|
||||
if !dict.EnglishAvailable() {
|
||||
log.Fatal("English dictionary not available; run `go run ./cmd/builddict` first")
|
||||
}
|
||||
f, err := dict.EnglishDAWG()
|
||||
if err != nil {
|
||||
log.Fatalf("load dawg: %v", err)
|
||||
}
|
||||
gen := scrabble.NewDAWGGenerator(rs, f)
|
||||
structSize := fileSize(dict.DAWGCache())
|
||||
|
||||
runtime.GC()
|
||||
var m0 runtime.MemStats
|
||||
runtime.ReadMemStats(&m0)
|
||||
start := time.Now()
|
||||
|
||||
var turns, plays, movesGen int
|
||||
var genTime time.Duration
|
||||
var score float64
|
||||
for seed := 1; seed <= *games; seed++ {
|
||||
res := selfplay.PlayGame(rs, gen, scrabble.Both, int64(seed), nil)
|
||||
turns += res.Turns
|
||||
plays += res.Plays
|
||||
movesGen += res.MovesGenerated
|
||||
genTime += res.GenTime
|
||||
score += float64(res.Scores[0] + res.Scores[1])
|
||||
}
|
||||
wall := time.Since(start)
|
||||
var m1 runtime.MemStats
|
||||
runtime.ReadMemStats(&m1)
|
||||
|
||||
fmt.Printf("DAWG · English SOWPODS · %d games · board %dx%d · greedy self-play\n\n", *games, rs.Rows, rs.Cols)
|
||||
fmt.Printf(" structure size %s\n", humanBytes(structSize))
|
||||
fmt.Printf(" turns / plays %d / %d\n", turns, plays)
|
||||
fmt.Printf(" moves generated %d\n", movesGen)
|
||||
fmt.Printf(" generation time %s (%.1f µs/turn)\n",
|
||||
genTime.Round(time.Millisecond), float64(genTime.Microseconds())/float64(turns))
|
||||
fmt.Printf(" moves generated/sec %.0f\n", float64(movesGen)/genTime.Seconds())
|
||||
fmt.Printf(" wall time %s\n", wall.Round(time.Millisecond))
|
||||
fmt.Printf(" heap allocated %s (%d GC cycles)\n",
|
||||
humanBytes(int64(m1.TotalAlloc-m0.TotalAlloc)), m1.NumGC-m0.NumGC)
|
||||
fmt.Printf(" avg final game score %.1f\n", score/float64(*games))
|
||||
fmt.Printf(" peak process RSS %s\n", humanKB(peakRSS()))
|
||||
}
|
||||
|
||||
func fileSize(p string) int64 {
|
||||
if fi, err := os.Stat(p); err == nil {
|
||||
return fi.Size()
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func peakRSS() int64 {
|
||||
data, err := os.ReadFile("/proc/self/status")
|
||||
if err != nil {
|
||||
return 0
|
||||
}
|
||||
for line := range strings.SplitSeq(string(data), "\n") {
|
||||
if rest, ok := strings.CutPrefix(line, "VmHWM:"); ok {
|
||||
if f := strings.Fields(rest); len(f) > 0 {
|
||||
kb, _ := strconv.ParseInt(f[0], 10, 64)
|
||||
return kb
|
||||
}
|
||||
}
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func humanBytes(n int64) string {
|
||||
switch {
|
||||
case n >= 1<<20:
|
||||
return fmt.Sprintf("%.2f MB", float64(n)/(1<<20))
|
||||
case n >= 1<<10:
|
||||
return fmt.Sprintf("%.1f KB", float64(n)/(1<<10))
|
||||
default:
|
||||
return fmt.Sprintf("%d B", n)
|
||||
}
|
||||
}
|
||||
|
||||
func humanKB(kb int64) string { return humanBytes(kb * 1024) }
|
||||
Binary file not shown.
Binary file not shown.
Binary file not shown.
Submodule
+1
Submodule dictionaries added at 92f81b2861
@@ -0,0 +1,164 @@
|
||||
# Russian word-list preparation (`dictprep`)
|
||||
|
||||
Builds the Russian **noun** word list for the Scrabble/Эрудит solver out of the official
|
||||
Russian academic **orthographic dictionary**, cross-checked against two independent
|
||||
morphological dictionaries.
|
||||
|
||||
The goal of the pipeline is a list of **common nouns in the nominative singular**
|
||||
(`dictprep/russian/scrabble.txt`), plus an ambiguous tail for manual review.
|
||||
|
||||
> This directory is self-contained tooling for *building* the word list. It is not part
|
||||
> of the solver library. The committed result lives in `dictprep/russian/`.
|
||||
|
||||
## Source
|
||||
|
||||
`orfo_dict_2025.pdf` — *Русский орфографический словарь РАН* (≈ 200 000 entries), the
|
||||
authority for **spelling**. It encodes declension type in its grammatical notes but does
|
||||
**not** reliably mark part of speech.
|
||||
|
||||
- Source: <https://ruslang.ru/sites/default/files/doc/normativnyje_slovari/orfograficheskij_slovar.pdf>
|
||||
- Mirror: <https://rus-gos.spbu.ru/index.php/dictionary>
|
||||
|
||||
The PDF is git-ignored (large, third-party); place it here as `orfo_dict_2025.pdf`. Its
|
||||
pdftotext output is committed as `russian/orfo_dict_2025.txt`, so the word list rebuilds
|
||||
from the text alone — the binary PDF is needed only to regenerate that text.
|
||||
|
||||
## Outputs (`dictprep/russian/`)
|
||||
|
||||
The committed result is **four** files; every other bucket stays in the Stage-2
|
||||
process's memory (dump it with `--dump`, query it with `--trace WORD`).
|
||||
|
||||
| File | Committed | Meaning |
|------|:--:|---------|
| `orfo_dict_2025.txt` | ✓ | the pdftotext output, the parsed source of truth (the PDF binary is not needed to rebuild). |
| `all.txt` | ✓ | Stage 1 base: every clean Cyrillic headword/variant; a plural headword with a singular is replaced by that singular. |
| `manual_confirm.txt` | ✓ | hand-reviewed nouns from the undefined tail; the brain merges them into the result. |
| `scrabble.txt` | ✓ | **Stage 2 result**: common nouns, nominative singular (+ pluralia tantum), length 2–15; the working dictionary. |
| `undefined.txt` | ✗ | the ambiguous tail; kept in memory, written only with `--dump`. |
|
||||
|
||||
`--dump` also writes `adjectives.txt`, `verbs.txt`, `singulars.txt` and `fate.tsv` (every
|
||||
word with the reason it did or did not reach the dictionary); these are git-ignored debug
|
||||
artifacts. Stage 1 also writes `/tmp/ru_{skip,singulars,variants}.txt`, intermediate inputs
|
||||
the brain consumes.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
```sh
# 1. pdftotext (Poppler)
sudo apt-get install -y poppler-utils

# 2. Go toolchain (Stage 1), already required by the parent module

# 3. Python + the OpenCorpora analyser (Stage 2)
sudo apt-get install -y python3-venv python3-pip
python3 -m venv ru-venv
ru-venv/bin/pip install mawo-pymorphy3   # bundles OpenCorpora 2025 (words.dawg)

# 4. libmorph, the independent morphological dictionary (Stage 2 cross-check)
sudo apt-get install -y morphrus morphrus-dev moonycode-dev morphapi-dev
g++ -std=c++17 -O2 dictprep/libmorph_check.cpp -lmorphrus -lmoonycode -o dictprep/libmorph_check
```
|
||||
|
||||
If `dictprep/libmorph_check` is absent, Stage 2 still runs — it simply drops libmorph from
|
||||
the stack and reports `libmorph_helper=MISSING`.
|
||||
|
||||
## How to run
|
||||
|
||||
```sh
# Stage 0: PDF -> plain text (committed as the source of truth; run once)
pdftotext dictprep/orfo_dict_2025.pdf dictprep/russian/orfo_dict_2025.txt

# Stage 1: build the base word list (Go): dictprep/russian/all.txt + /tmp/ru_*.txt
go run ./dictprep/ruwords

# Stage 2: the brain (Python + mawo + libmorph): writes scrabble.txt
ru-venv/bin/python dictprep/ru_stage2.py

# ask how a word did or did not reach the dictionary
ru-venv/bin/python dictprep/ru_stage2.py --trace травмпункт
# also write the in-memory buckets (undefined, adjectives, verbs, singulars, fate.tsv)
ru-venv/bin/python dictprep/ru_stage2.py --dump
```
|
||||
|
||||
`-from`/`-to` (defaulting to 452/168808) bound the column word-list section of
|
||||
`russian/orfo_dict_2025.txt` (line 452 = the first entry `а1, …`; line 168808 = the last,
|
||||
`я́щурный`). The preface above line 452 is prose and is skipped. Verify these bounds if the
|
||||
PDF is re-exported.
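
One way to verify the bounds after a re-export is to slice the 1-based inclusive range and look at its first and last headwords. The Python sketch below assumes the expected headwords quoted above; both function names and the default arguments are hypothetical helpers, not part of `ruwords` or `ru_stage2.py`.

```python
STRESS = ("\u0300", "\u0301")  # combining grave / acute used as stress marks


def strip_stress(s):
    return "".join(c for c in s if c not in STRESS)


def word_list_section(lines, first=452, last=168808):
    """Return the 1-based inclusive range [first, last] of lines."""
    return lines[first - 1:last]


def bounds_look_right(lines, first, last, first_hw="а1", last_hw="ящурный"):
    """True when the section starts and ends with the expected headwords."""
    section = word_list_section(lines, first, last)
    if not section:
        return False
    head = strip_stress(section[0].lstrip())   # lstrip also drops a leading \f
    tail = strip_stress(section[-1].lstrip())
    return head.startswith(first_hw) and tail.startswith(last_hw)
```

Called on the lines of `russian/orfo_dict_2025.txt` with the default bounds, a `False` result would signal that the line numbers have shifted.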
|
||||
|
||||
## Algorithm
|
||||
|
||||
### Stage 1 — `ruwords` (Go)
|
||||
|
||||
Per dictionary line in `[from, to]` it collects, normalised (stress marks U+0300/U+0301
|
||||
stripped, lowercased, `ё` kept, hyphenated/capitalised/non-Cyrillic rejected):
|
||||
|
||||
- the **headword** (leading token). Leading whitespace including the form-feed `\f`
|
||||
pdftotext puts at every page top is trimmed — otherwise the first headword of each page
|
||||
is lost;
|
||||
- the **singular of a plural headword** when the entry gives it after `ед.`, in full
|
||||
(`ящеры, …, ед. ящер`) or as a replacement suffix (`…, ед. -вец`, spliced where the
|
||||
suffix best overlaps the headword); the plural is then dropped (a plural that has a
|
||||
singular is never needed) and the singular is also recorded (`/tmp/ru_singulars.txt`);
|
||||
- **variant headwords** after `и` that carry their own grammatical note
|
||||
(`аблатив, -а и аблятив, -а`; `регги и реггей, нескл.`), excluding inflected forms.
|
||||
|
||||
Everything else (every maximal Cyrillic token not selected above) goes to
|
||||
`/tmp/ru_skip.txt`, a safety net for a later morphology re-check.
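
The normalisation rule can be sketched in a few lines of Python. This is an illustration of the rule as stated, not the Go code in `ruwords`; the function name is hypothetical, and treating "capitalised" as "first letter is upper case" is an assumption.

```python
STRESS_MARKS = {"\u0300", "\u0301"}  # combining grave and acute accents


def normalise(token):
    """Return the cleaned word, or None when the token must be rejected."""
    bare = "".join(ch for ch in token if ch not in STRESS_MARKS)
    if not bare or bare[0].isupper():  # empty or capitalised: reject
        return None
    word = bare.lower()
    # Only а..я plus ё survive; a hyphen or a Latin letter rejects the token.
    if all(("а" <= ch <= "я") or ch == "ё" for ch in word):
        return word
    return None
```

For example, `normalise("я\u0301щурный")` yields `ящурный`, while `Москва`, `кое-кто` and `abc` are all rejected.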
|
||||
|
||||
### Stage 2 — `ru_stage2.py` (Python)
|
||||
|
||||
Each Stage-1 word (length 2–15) is routed by three sources, most authoritative first:
|
||||
|
||||
1. **OpenCorpora** (`words.dawg`, read directly — *not* the predictor): a common-noun
|
||||
reading ⇒ keep the OpenCorpora lemma. The full OpenCorpora common-noun lexicon is also
|
||||
added (so nouns absent from the PDF are included).
|
||||
2. **libmorph** (independent dictionary, via `libmorph_check`): a common-noun reading ⇒
|
||||
keep the libmorph lemma. The two dictionaries are treated as **complementary** — a noun
|
||||
reading in *either* is enough (their disagreements were reviewed and resolved this way,
|
||||
since each is incomplete in different places). A singular reconstructed from "ед." that
|
||||
neither dictionary knows is accepted as a noun (the orthographic note attests it).
|
||||
3. A word **both dictionaries miss** is classified by the orthographic **note**
|
||||
(`-ая, -ое` ⇒ adjective; `-ть`, `сов./несов.` ⇒ verb; single genitive `-а/-и` or
|
||||
`нескл., м./ж./с.` ⇒ noun). A note-noun goes straight to `scrabble.txt`; an adjective or
|
||||
verb is dropped; anything undecided goes to `undefined.txt`.
|
||||
4. **Variant rescue**: when the dictionary joins two spellings with "и" (`травмопункт и
|
||||
травмпункт`, `регги и реггей`) and one is already a confirmed noun, the other is moved
|
||||
from review/undefined into the result as well, propagated transitively through chains.
|
||||
The plural-form variants the dictionaries already resolve never reach this step.
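
The transitive propagation in step 4 amounts to a graph search over the "и" variant pairs. A minimal Python sketch follows, assuming the pairs are available as 2-tuples; `rescue_variants` is a hypothetical name and the real script may organise this differently.

```python
from collections import defaultdict, deque


def rescue_variants(confirmed, pairs):
    """Extend the confirmed nouns with every spelling linked to one by variant pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)

    result = set(confirmed)
    queue = deque(w for w in confirmed if w in graph)
    while queue:
        word = queue.popleft()
        for variant in graph[word]:
            if variant not in result:  # newly rescued: follow its links too
                result.add(variant)
                queue.append(variant)
    return result
```

With `травмопункт` confirmed and the pair (`травмопункт`, `травмпункт`), the second spelling is rescued; a pair with no confirmed member, such as (`регги`, `реггей`) on its own, is left untouched.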
|
||||
|
||||
The nominative singular always comes from the dictionary that recognised the word, or from
|
||||
the orthographic `ед.` note — never from a predictor guess (libmorph and the predictor
|
||||
mis-lemmatise out-of-dictionary words, e.g. `витебчане → витебчан` instead of `витебчанин`).
|
||||
|
||||
### The libmorph bridge — `libmorph_check.cpp`
|
||||
|
||||
libmorph (A. Kovalenko, MIT) ships as `libmorphrus.so`. `libmorph_check` is a thin
|
||||
stdin→stdout filter: one UTF-8 word per line in, one line out:
|
||||
|
||||
```
<known>\t<pos>:<lemma>\t<pos>:<lemma>...
```
|
||||
|
||||
`<known>` is `CheckWord` (1 = in the dictionary). `<pos>` is `wdInfo & 0x3f`, the part of
|
||||
speech. The codes were reverse-engineered (the docs omit the table):
|
||||
|
||||
| codes | part of speech |
|------|----------------|
| **7–21, 24** | **noun** (all genders / declensions / animacy; pluralia tantum is 24) |
| 1–3 | verb |
| 25, 27 | adjective |
| 28–32 | pronoun |
| 33–36 | numeral |
| 38–39 | **proper noun** (excluded) |
| 48–58 | comparative/adverb |
| 49–53 | function words |
|
||||
|
||||
The analyser instance is requested with the key `libmorph.api.v4:utf-8` so words are
|
||||
passed and lemmas returned in UTF-8.
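
A consumer of the bridge only needs to split each output line on tabs and colons. The Python sketch below parses one line and picks out the common-noun lemmas using the noun codes from the table; `parse_line` is a hypothetical helper written for this README, not a copy of the Stage-2 code.

```python
NOUN_CODES = set(range(7, 22)) | {24}  # 7..21 plus 24 (pluralia tantum)


def parse_line(line):
    """Split one bridge output line into (known, lexemes, noun_lemmas)."""
    fields = line.rstrip("\n").split("\t")
    known = fields[0] == "1"
    lexemes = []
    for field in fields[1:]:
        code, sep, lemma = field.partition(":")
        if sep and code.isdigit():
            lexemes.append((int(code), lemma))
    noun_lemmas = [lemma for code, lemma in lexemes if code in NOUN_CODES]
    return known, lexemes, noun_lemmas
```

A line with no lexeme fields, such as a bare `0`, parses to `(False, [], [])`, which is what an unknown word looks like.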
|
||||
|
||||
## Notes & caveats
|
||||
|
||||
- The hard tail (≈ 35 000 Stage-1 words / our candidates) is in **no** morphological
|
||||
dictionary; only the orthographic dictionary attests them, so the PDF note is the sole
|
||||
signal there. Compound and very recent nouns (`робототехник`, `толкинист`) live here.
|
||||
- OpenCorpora and libmorph are near-equal in size (≈ 99 500 words each on `all.txt`)
|
||||
and ≈ 96 % overlapping, but **complementary** (each contributes ≈ 2 200 unique nouns),
|
||||
which is why both are kept. The mawo *predictor* "knows" ~98 % of everything by guessing
|
||||
and is therefore used only as a weak confirming vote, never as dictionary membership.
|
||||
- Licensing: OpenCorpora data is CC BY-SA 3.0; libmorph is MIT; the orthographic
|
||||
dictionary has its own copyright. A list derived from CC BY-SA data inherits that licence.
|
||||
@@ -0,0 +1,27 @@
|
||||
#!/usr/bin/env python3
"""Fold Ё/ё → Е/е in a word list and de-duplicate: the dictionary prep for "Эрудит".

The Эрудит ruleset has no Ё tile and treats Е/Ё as one letter, so its dictionary must be
folded before the DAWG is built. Folding merges pairs like ёж/еж, hence the de-dup. Output
is sorted (Russian order over the 32 folded letters) and LF-separated.

Run: python3 dictprep/fold_yo.py dictprep/russian/scrabble.txt > /tmp/ru_erudit_words.txt
"""
import sys

ORDER = {c: i for i, c in enumerate("абвгдежзийклмнопрстуфхцчшщъыьэюя")}  # 32 letters, no ё


def key(w):
    return [ORDER.get(c, 99) for c in w]


def main():
    src = sys.argv[1] if len(sys.argv) > 1 else "/dev/stdin"
    words = {line.strip().replace("ё", "е").replace("Ё", "Е") for line in open(src, encoding="utf-8")}
    words.discard("")
    sys.stdout.write("\n".join(sorted(words, key=key)) + "\n")


if __name__ == "__main__":
    main()
|
||||
@@ -0,0 +1,47 @@
|
||||
// libmorph_check: a thin stdin->stdout bridge to the libmorph Russian morphological
// analyser, for use by the Stage-2 classifier (dictprep/ru_stage2.py).
//
// Reads one word per line. Bytes are passed through verbatim, so the caller must encode
// them in the charset named by the factory key (UTF-8 by default, see main). For each
// word it writes a line:
//
// <known>\t<pos>:<lemma>\t<pos>:<lemma>...
//
// where <known> is CheckWord's result (1 = in the dictionary, 0 = not), and each
// following field is one lexeme: its part of speech (wdInfo & 0x3f) and lemma.
//
// Build: g++ -std=c++17 -O2 dictprep/libmorph_check.cpp -lmorphrus -lmoonycode -o dictprep/libmorph_check
|
||||
#include <libmorph/rus.h>
|
||||
#include <libmorph/api.hpp>
|
||||
#include <cstdio>
|
||||
#include <iostream>
|
||||
#include <string>
|
||||
|
||||
int main(int argc, char** argv) {
|
||||
// The factory key selects the code page: "libmorph.api.v4:<charset>". Use the
|
||||
// UTF-8 instance so words pass through verbatim. IMlmaMbXX only adds non-virtual
|
||||
// convenience wrappers over IMlmaMb, so the filled pointer can be used as such.
|
||||
const char* key = argc > 1 ? argv[1] : "libmorph.api.v4:utf-8";
|
||||
IMlmaMbXX* mlma = nullptr;
|
||||
int rc = mlmaruGetAPI(key, (void**)&mlma);
|
||||
if (mlma == nullptr) {
|
||||
std::fprintf(stderr, "libmorph_check: GetAPI('%s') failed, rc=%d\n", key, rc);
|
||||
return 1;
|
||||
}
|
||||
std::string line;
|
||||
while (std::getline(std::cin, line)) {
|
||||
if (!line.empty() && line.back() == '\r') line.pop_back();
|
||||
IMlmaMbXX::inword w(line.c_str(), line.size());
|
||||
int known = mlma->CheckWord(w, sfIgnoreCapitals);
|
||||
std::cout << known;
|
||||
try {
|
||||
for (auto& lx : mlma->Lemmatize(w, sfIgnoreCapitals)) {
|
||||
unsigned pos = lx.ngrams > 0 ? (lx.pgrams[0].wdInfo & 0x3f) : 0xffu;
|
||||
std::cout << '\t' << pos << ':' << (lx.plemma ? lx.plemma : "");
|
||||
}
|
||||
} catch (...) {
|
||||
}
|
||||
std::cout << '\n';
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
Binary file not shown.
@@ -0,0 +1,341 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Stage 2 — the "brain" of the Russian Scrabble word-list pipeline.
|
||||
|
||||
It reads the Stage-1 base word list (built once by ruwords so the heavy PDF is not
|
||||
re-parsed) together with the grammatical notes and the singular/variant structure, runs
|
||||
the whole noun-selection logic in memory, and writes a minimal result:
|
||||
|
||||
dictprep/russian/scrabble.txt — the working dictionary (common nouns, nom. sing.)
|
||||
dictprep/russian/undefined.txt — the ambiguous tail, left for manual review
|
||||
|
||||
(dictprep/russian/all.txt is the Stage-1 base.) Every other bucket — adjectives, verbs,
|
||||
the merged note-nouns, singulars, variants — stays in memory. Pass --dump to also write
|
||||
them; pass --trace WORD to ask how a single word did or did not reach the dictionary.
|
||||
|
||||
Note: all.txt is a plain word list, so the grammatical notes, "ед." singulars and "и"
|
||||
variants are read from the pdftotext output (russian/orfo_dict_2025.txt) and the Stage-1 side files; the
|
||||
expensive PDF parse itself runs only once.
|
||||
|
||||
Sources, most authoritative first: OpenCorpora (mawo-pymorphy3), libmorph (libmorph_check),
|
||||
and the orthographic dictionary's own notes. See dictprep/README.md.
|
||||
|
||||
Run: ru-venv/bin/python dictprep/ru_stage2.py [--dump] [--trace WORD]
|
||||
"""
|
||||
import argparse
|
||||
import os
|
||||
import re
|
||||
import subprocess
|
||||
|
||||
HERE = os.path.dirname(os.path.abspath(__file__))
|
||||
OUT_DIR = os.path.join(HERE, "russian")
|
||||
SLOV = os.path.join(OUT_DIR, "orfo_dict_2025.txt") # committed pdftotext output (source of truth)
|
||||
WL_FROM, WL_TO = 452, 168808 # 1-based inclusive bounds of the column word-list section
|
||||
OC_CACHE = "/tmp/oc_nouns.txt"
|
||||
LIBMORPH_BIN = os.path.join(HERE, "libmorph_check")
|
||||
|
||||
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
|
||||
ORDER = {c: i for i, c in enumerate(ALPHABET)}
|
||||
PROPER = {"Name", "Surn", "Patr", "Geox", "Orgn", "Trad"}
|
||||
LIBMORPH_NOUN_CODES = set(range(7, 22)) | {24} # 7..21 plus 24 (pluralia tantum)
|
||||
ADJ_END = {"ая", "яя", "ое", "ее", "ье", "ья", "ьи"}
|
||||
VERB3 = ("ет", "ёт", "ит", "ют", "ут", "ает", "яет", "ует", "уют", "нет", "жет", "чет")
|
||||
GENPL = ("ов", "ёв", "ев", "ей")
|
||||
|
||||
|
||||
def key(w):
    return [ORDER.get(c, 99) for c in w]


def destress(s):
    return "".join(c for c in s if ord(c) not in (0x0300, 0x0301)).lower()


def cyr_ok(w):
    return 2 <= len(w) <= 15 and all(("а" <= c <= "я") or c == "ё" for c in w)


def load(p):
    if not os.path.exists(p):
        return []
    with open(p, encoding="utf-8") as fh:
        return [l.strip() for l in fh if l.strip()]


def write(path, words):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(sorted(set(words), key=key)) + "\n")
|
||||
|
||||
|
||||
import mawo_pymorphy3 # noqa: E402
|
||||
|
||||
M = mawo_pymorphy3.MorphAnalyzer()
|
||||
D = M._dawg_dict
|
||||
|
||||
|
||||
def oc_noun_lemmas():
    """Every common-noun lemma (nom. sing. / pluralia tantum) in OpenCorpora's words.dawg."""
    gp, pt = D.get_paradigm, D.parse_tag_string
    para0, tagc = {}, {}

    def g0(pid):
        r = para0.get(pid)
        if r is None:
            suf0, tag0, pre0 = gp(pid, 0)
            _, gr = pt(tag0)
            r = (pre0, suf0, gr)
            para0[pid] = r
        return r

    def gt(pid, idx):
        k = (pid, idx)
        r = tagc.get(k)
        if r is None:
            suf, tag, pre = gp(pid, idx)
            pos, gr = pt(tag)
            r = (suf, pre, pos, gr)
            tagc[k] = r
        return r

    out = set()
    for word, rec in D.words_dawg.iteritems():
        pid, idx = rec
        suf, pre, pos, gr = gt(pid, idx)
        if pos != "NOUN":
            continue
        pre0, suf0, gr0 = g0(pid)
        if (PROPER & gr) or (PROPER & gr0):
            continue
        stem = word[len(pre):len(word) - len(suf)] if suf else word[len(pre):]
        out.add(pre0 + stem + suf0)
    return {w for w in out if cyr_ok(w)}
|
||||
|
||||
|
||||
def oc_status(word):
    """(is_common_noun, in_dictionary) for word, from OpenCorpora only."""
    parses = D.get_word_parses(word)
    if not parses:
        return False, False
    gp, pt = D.get_paradigm, D.parse_tag_string
    for pid, idx in parses:
        suf, tag, pre = gp(pid, idx)
        pos, gr = pt(tag)
        if pos == "NOUN":
            _, tag0, _ = gp(pid, 0)
            _, gr0 = pt(tag0)
            if not (PROPER & gr or PROPER & gr0):
                return True, True
    return False, True


def libmorph_analyze(words):
    """Map each word to (known, noun_lemma, codes) per libmorph; noun_lemma is None when it
    is not a common noun there. Empty result if the helper binary is not built."""
    words = list(words)
    if not words or not os.path.exists(LIBMORPH_BIN):
        return {}
    proc = subprocess.run([LIBMORPH_BIN], input="\n".join(words), capture_output=True, text=True)
    out = {}
    for w, line in zip(words, proc.stdout.split("\n")):
        fields = line.split("\t")
        known = fields[:1] == ["1"]
        codes, noun_lemmas = set(), []
        for field in fields[1:]:
            code, _, lex = field.partition(":")
            if code.isdigit():
                codes.add(int(code))
                if int(code) in LIBMORPH_NOUN_CODES:
                    noun_lemmas.append(lex)
        lemma = (w if w in noun_lemmas else noun_lemmas[0]) if noun_lemmas else None
        out[w] = (known, lemma, codes)
    return out


def build_notes():
    """Map each headword (destressed, lowercased) to its grammatical note."""
    def is_hw(ch):
        o = ord(ch)
        return (0x0430 <= o <= 0x044F) or (0x0410 <= o <= 0x042F) or o in (0x0401, 0x0451, 0x0300, 0x0301)

    hmap = {}
    lines = open(SLOV, encoding="utf-8").read().split("\n")
    for l in lines[WL_FROM - 1:WL_TO]:
        s = l.lstrip()
        e = 0
        for ch in s:
            if is_hw(ch):
                e += 1
            else:
                break
        hw = destress(s[:e])
        if hw and hw not in hmap:
            hmap[hw] = destress(s[e:]).strip()
    return hmap


def classify(w, note):
    """Coarse part of speech of an out-of-dictionary word from its PDF note."""
    if note is None:
        return "amb"
    n = re.sub(r"\([^)]*\)", "", note).strip()  # drop domain/etymology parentheticals
    if "кр. ф" in n or "кр.ф" in n or "прич." in n or "прил." in n:
        return "adj"
    ends = re.findall(r"-([а-яё]+)", n)
    if any(e in ADJ_END for e in ends):
        return "adj"
    if "сов." in n or "несов." in n or "безл." in n:
        return "verb"
    if w.endswith("ся"):  # reflexive: no Russian noun ends in -ся
        return "verb"
    if any(e.endswith(VERB3) for e in ends) and not any(m in n for m in ("ед.", "тв.", "род.", "м.", "ж.", "с.")):
        return "verb"
    if n == "" and w.endswith(("ый", "ий", "ой", "ая", "ое", "ые", "ие", "яя", "ее")):
        return "adj"
    if "нескл" in n:
        return "noun" if any(g in n for g in ("м.", "ж.", "с.", "мн.")) else "amb"
    if ends:
        return "noun"
    if n == "" and w.endswith(("ать", "ять", "еть", "ить", "оть", "уть", "ыть", "ти", "чь")):
        return "verb"
    return "amb"


def singular(w, note):
    """Nominative singular of a noun headword from the PDF note (authoritative) or, for a
    plural headword without an explicit singular, the mawo lemma; pluralia tantum kept."""
    n = note or ""
    full = re.search(r"ед\.\s+([а-яё]+)", n)
    if full:
        return full.group(1)
    suf = re.search(r"ед\.\s+-([а-яё]+)", n)
    if suf:
        s = suf.group(1)
        i = w.rfind(s[0])
        return w[:i] + s if i > 0 else w
    ends = re.findall(r"-([а-яё]+)", re.sub(r"\([^)]*\)", "", n))
    if ends and ends[0].endswith(GENPL):
        for p in M.parse(w):
            if str(p.tag.POS) == "NOUN":
                return p.normal_form
        return w
    return w


def build():
    """Run the whole pipeline in memory. Returns the result sets plus a `fate` map giving
    every word's outcome, so a word's path can be traced or the buckets dumped."""
    oc = set(load(OC_CACHE)) or oc_noun_lemmas()
    if not os.path.exists(OC_CACHE):
        write(OC_CACHE, oc)
    hmap = build_notes()
    all_words = load(os.path.join(OUT_DIR, "all.txt"))
    ed_nouns = set(load("/tmp/ru_singulars.txt"))
    pairs = [tuple(p) for l in load("/tmp/ru_variants.txt") if len(p := l.split("\t")) == 2]
    pdf = [w for w in all_words if cyr_ok(w)]
    lm = libmorph_analyze(pdf)

    def to_singular(w):
        s = singular(w, hmap.get(w))
        return s if cyr_ok(s) else w

    fate = {}
    scrabble = set(oc)
    adj, verb, amb = [], [], []
    for w in pdf:
        oc_noun, oc_known = oc_status(w)
        if oc_noun:
            fate[w] = "scrabble: сущ. по OpenCorpora"
            continue
        lm_known, lm_lemma, _ = lm.get(w, (False, None, frozenset()))
        if lm_lemma is not None:
            s = lm_lemma if cyr_ok(lm_lemma) else to_singular(w)
            scrabble.add(s)
            fate[w] = "scrabble: сущ. по libmorph" + ("" if s == w else f" → {s}")
            continue
        if oc_known or lm_known:
            fate[w] = "отброшено: словарь знает как не-существительное"
            continue
        if w in ed_nouns:
            scrabble.add(w)
            fate[w] = "scrabble: ед.ч. по помете «ед.»"
            continue
        c = classify(w, hmap.get(w))
        if c == "noun":
            s = to_singular(w)
            scrabble.add(s)
            fate[w] = "scrabble: сущ. по помете орфословаря" + ("" if s == w else f" → {s}")
        elif c == "adj":
            adj.append(w)
            fate[w] = "отброшено: прилагательное (помета орфословаря)"
        elif c == "verb":
            verb.append(w)
            fate[w] = "отброшено: глагол (помета орфословаря)"
        else:
            amb.append(w)
            fate[w] = "undefined: неоднозначное (нет в словарях, помета не определяет)"

    # Manual confirmations: nouns the maintainer approved from the undefined tail.
    for w in load(os.path.join(OUT_DIR, "manual_confirm.txt")):
        if cyr_ok(w):
            scrabble.add(w)
            fate[w] = "scrabble: подтверждено вручную (manual_confirm.txt)"

    # Variant rescue: a word joined by "и" to a confirmed noun is itself a noun.
    pending = set(amb) - scrabble
    changed = True
    while changed:
        changed = False
        for a, b in pairs:
            for x, y in ((a, b), (b, a)):
                if x in scrabble and y in pending:
                    scrabble.add(y)
                    pending.discard(y)
                    fate[y] = f"scrabble: вариант от «{x}» (через «и»)"
                    changed = True

    undefined = [w for w in amb if w not in scrabble]
    return {
        "oc": oc, "scrabble": scrabble, "undefined": undefined,
        "adjectives": adj, "verbs": verb, "singulars": ed_nouns,
        "fate": fate, "all": set(all_words),
    }


def trace(word, r):
    w = destress(word)
    if w in r["fate"]:
        return r["fate"][w]
    if w in r["scrabble"]:
        return "scrabble: лексикон OpenCorpora" if w in r["oc"] else "scrabble: производная/лемма"
    if w not in r["all"]:
        return "нет в russian_all (не извлечено на Stage 1 — нет в .pdf, либо имя собств./дефис/форма)"
    if not cyr_ok(w):
        return "отсеяно: длина или символы вне диапазона (2–15 кириллица)"
    return "не определено"


def main():
    ap = argparse.ArgumentParser(description="Stage 2 brain: build the noun dictionary, trace a word, or dump buckets.")
    ap.add_argument("--dump", action="store_true", help="also write the in-memory buckets (undefined, adjectives, verbs, singulars, fate)")
    ap.add_argument("--trace", metavar="WORD", help="report how WORD did or did not reach the dictionary, then exit")
    args = ap.parse_args()

    r = build()
    if args.trace:
        print(f"{args.trace}: {trace(args.trace, r)}")
        return

    write(os.path.join(OUT_DIR, "scrabble.txt"), r["scrabble"])
    print(f"=> dictprep/russian/scrabble.txt {len(r['scrabble'])}")
    print(f" undefined kept in memory: {len(set(r['undefined']))} (use --dump to write it)")
    if args.dump:
        write(os.path.join(OUT_DIR, "undefined.txt"), r["undefined"])
        write(os.path.join(OUT_DIR, "adjectives.txt"), r["adjectives"])
        write(os.path.join(OUT_DIR, "verbs.txt"), r["verbs"])
        write(os.path.join(OUT_DIR, "singulars.txt"), r["singulars"])
        fate_path = os.path.join(OUT_DIR, "fate.tsv")
        os.makedirs(OUT_DIR, exist_ok=True)
        with open(fate_path, "w", encoding="utf-8") as f:
            for w in sorted(r["fate"], key=key):
                f.write(f"{w}\t{r['fate'][w]}\n")
        print(f" dumped: undefined.txt ({len(set(r['undefined']))}), adjectives.txt, verbs.txt, singulars.txt, fate.tsv")


if __name__ == "__main__":
    main()
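The variant rescue step in `build()` is a fixpoint iteration: it keeps sweeping the pair list until a full pass promotes nothing, so a chain of variants is resolved even when the pairs are listed in an unhelpful order. Below is a minimal sketch of that idea in Go (the repository's main language). The function name `rescueVariants`, the map-based sets and the placeholder words `n0`..`n3` are assumptions made for illustration; they do not appear in the pipeline.

```go
package main

import "fmt"

// rescueVariants promotes every pending word that is paired with an accepted
// word, and repeats until a whole pass changes nothing (a fixpoint).
// Hypothetical helper: it mirrors the shape of the loop in build(), not its API.
func rescueVariants(accepted, pending map[string]bool, pairs [][2]string) {
	for changed := true; changed; {
		changed = false
		for _, p := range pairs {
			// A pair is symmetric, so try it in both directions.
			for _, d := range [][2]string{{p[0], p[1]}, {p[1], p[0]}} {
				x, y := d[0], d[1]
				if accepted[x] && pending[y] {
					accepted[y] = true
					delete(pending, y)
					changed = true
				}
			}
		}
	}
}

func main() {
	accepted := map[string]bool{"n0": true}
	pending := map[string]bool{"n1": true, "n2": true, "n3": true}
	// The pairs are deliberately listed "backwards": each pass can promote
	// only one more word, so three productive passes are needed.
	pairs := [][2]string{{"n2", "n3"}, {"n1", "n2"}, {"n0", "n1"}}
	rescueVariants(accepted, pending, pairs)
	fmt.Println(len(accepted), len(pending)) // prints "4 0"
}
```

A single pass over these pairs would promote only `n1`; repeating until nothing changes is what makes the result independent of pair order.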
+148900
File diff suppressed because it is too large
@@ -0,0 +1,135 @@
артгруппа
бутень
вебинар
видеодневник
водозащита
генацвале
жакоб
оберфюрер
околоть
особина
полбазара
полбака
полбалкона
полбанана
полбарана
полбатальона
полбатона
полбиблиотеки
полблокнота
полбокала
полбуханки
полвагона
полвечера
полвзвода
полвинта
полгазеты
полгектара
полгостиницы
полграмма
полгруппы
полдачи
полдвора
полдекабря
полдеревни
полдетсада
полдивана
полдивизии
полдыни
полжурнала
ползавода
ползарплаты
полздания
полканикул
полканистры
полкартофелины
полкастрюли
полквартиры
полкилограмма
полкласса
полкниги
полколлекции
полкольца
полкоманды
полкоробки
полкочана
полкурса
полкуска
полмагазина
полмандарина
полмарта
полматча
полмиллиметра
полмузея
полноября
полпакета
полпарка
полпартии
полпинты
полпирога
полпирожка
полпируэта
полпоезда
полполена
полполка
полполки
полполосы
полпомидора
полпоросёнка
полпосёлка
полпредовский
полпроцента
полпузырька
полрайона
полромана
полроты
полрулона
полряда
полсада
полсажени
полсезона
полсентября
полсловаря
полсостава
полсрока
полстада
полстены
полстолетия
полстраницы
полстроки
полтаблетки
полтайма
полтакта
полтарелки
полтетради
полтома
полтона
полторта
полтысячелетия
полтюбика
полусанаторий
полфакультета
полфевраля
полфлакона
полфразы
полхаты
полцарства
полцентнера
полцистерны
полчайника
полчемодана
полшажка
полшажочка
полшара
полшкафа
полшколы
полщеки
принт
промо
рентгеноаппарат
сивец
соцнаём
срывка
флеш
флешмобер
шиноремонт
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,434 @@
// Command ruwords extracts a clean Cyrillic word list from the plain text of a Russian
// orthographic dictionary (the output of `pdftotext`).
//
// Stage 1 (this tool): from the column word-list section [from, to] it collects, per
// entry, the headword (the leading token). When the headword is plural and the entry
// gives its singular after "ед." — in full ("ящеры, …, ед. ящер") or as a replacement
// suffix ("…, ед. -вец") — only the singular is kept, since a plural that has a singular
// is never needed. It drops stress marks, lowercases, keeps ё, and discards proper nouns
// (capitalized), hyphenated words, acronyms and non-Cyrillic tokens. The result is
// de-duplicated and sorted in Russian alphabetical order (ё right after е), LF-separated.
//
// It also collects a variant headword joined by "и" when it carries its own grammatical
// note (e.g. "аблатив, -а и аблятив, -а"). Suffix-singular reconstruction is heuristic;
// Stage 2 (dictprep/ru_stage2.py) re-checks the words against real dictionaries.
//
//	pdftotext dictprep/orfo_dict_2025.pdf /tmp/slov.txt
//	go run ./dictprep/ruwords -in /tmp/slov.txt -from 452 -to 168808 \
//	    -out russian_all.txt -skip russian_skip.txt
package main

import (
	"bufio"
	"flag"
	"fmt"
	"log"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"unicode"
)

// ruAlphabet is the Russian alphabet in collation order (ё directly after е).
const ruAlphabet = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

var ruRank = func() map[rune]int {
	m := make(map[rune]int, len(ruAlphabet))
	for i, r := range []rune(ruAlphabet) {
		m[r] = i
	}
	return m
}()

func isCyrLetter(r rune) bool {
	return (r >= 'а' && r <= 'я') || (r >= 'А' && r <= 'Я') || r == 'ё' || r == 'Ё'
}

func isUpperCyr(r rune) bool { return (r >= 'А' && r <= 'Я') || r == 'Ё' }

func isStress(r rune) bool { return r == 0x0300 || r == 0x0301 }

// cleanWord normalizes a run of letters/stress-marks into a lowercase Cyrillic word, or
// returns ok=false for proper nouns (capitalized), hyphenated or non-Cyrillic runs.
func cleanWord(run []rune) (string, bool) {
	if len(run) == 0 || isUpperCyr(run[0]) {
		return "", false
	}
	var b strings.Builder
	for _, r := range run {
		switch {
		case isStress(r), r == '\u00ad': // drop stress accents and soft hyphens (U+00AD)
		case r == '-': // a real hyphen means a hyphenated word: reject it
			return "", false
		default:
			b.WriteRune(unicode.ToLower(r))
		}
	}
	w := b.String()
	if w == "" {
		return "", false
	}
	for _, r := range w {
		if !((r >= 'а' && r <= 'я') || r == 'ё') {
			return "", false
		}
	}
	return w, true
}

// headword returns the entry's headword: the leading run of letters, stress marks and
// hyphens, normalized.
func headword(line string) (string, bool) {
	// Trim leading whitespace, including the form-feed (U+000C) that pdftotext puts at
	// the top of each page — otherwise the first headword on every page is lost.
	line = strings.TrimLeftFunc(line, unicode.IsSpace)
	var run []rune
	for _, r := range line {
		// '\u00ad' is the soft hyphen; cleanWord drops it again.
		if isCyrLetter(r) || isStress(r) || r == '-' || r == '\u00ad' {
			run = append(run, r)
		} else {
			break
		}
	}
	return cleanWord(run)
}

// embeddedSingulars returns the singular form of a plural headword spelled out after
// "ед.", either in full ("ед. ящер") or as a replacement suffix ("ед. -вец",
// reconstructed from headword). It skips gender marks ("ед. м") and abbreviations that
// merely start with "ед." ("ед. измер.", "ден. ед.").
func embeddedSingulars(line, headword string) []string {
	var out []string
	for i := 0; ; {
		j := strings.Index(line[i:], "ед.")
		if j < 0 {
			break
		}
		i += j + len("ед.")
		rest := strings.TrimLeft(line[i:], " \t")

		if strings.HasPrefix(rest, "-") { // suffix form: reconstruct from the headword
			var suf []rune
			for _, r := range rest[len("-"):] {
				if isCyrLetter(r) || isStress(r) {
					suf = append(suf, r)
				} else {
					break
				}
			}
			if s, ok := cleanWord(suf); ok && len([]rune(s)) >= 2 {
				if recon := reconstructSingular(headword, s); recon != "" {
					out = append(out, recon)
				}
			}
			continue
		}

		var run []rune
		consumed := 0
		for _, r := range rest {
			if isCyrLetter(r) || isStress(r) {
				run = append(run, r)
				consumed += len(string(r))
			} else {
				break
			}
		}
		if len(run) == 0 {
			continue
		}
		if strings.HasPrefix(rest[consumed:], ".") {
			continue // an abbreviation like "ед. измер." rather than a singular form
		}
		w, ok := cleanWord(run)
		if !ok || len([]rune(w)) < 2 { // 2+ letters excludes the gender marks м/ж/с
			continue
		}
		out = append(out, w)
	}
	return out
}

// reconstructSingular builds the singular from a plural headword and the replacement
// suffix from "ед. -<suffix>". It splices at the headword position k where hw[k:] shares
// the longest common prefix with the suffix (the leftmost such k on ties), and returns ""
// when no position matches even one letter. It is a heuristic; Stage 2 re-checks the
// words against real dictionaries.
func reconstructSingular(headword, suffix string) string {
	hw, sf := []rune(headword), []rune(suffix)
	bestK, bestLen := -1, 0
	for k := 0; k < len(hw); k++ {
		m := 0
		for k+m < len(hw) && m < len(sf) && hw[k+m] == sf[m] {
			m++
		}
		if m > bestLen {
			bestK, bestLen = k, m
		}
	}
	if bestK < 0 {
		return ""
	}
	return string(hw[:bestK]) + suffix
}

// headwordNotes are the grammatical notes that mark a parallel headword (a lemma) after
// "и", as opposed to an inflected form. A "-" ending also marks one; form labels such as
// деепр. (gerund) or сравн. (comparative) deliberately do not.
var headwordNotes = map[string]bool{
	"нескл": true, "неизм": true, "предлог": true, "предл": true, "нареч": true,
	"нар": true, "прил": true, "союз": true, "частица": true, "част": true,
	"межд": true, "мн": true, "ед": true, "тв": true, "числ": true, "мест": true,
	"м": true, "ж": true, "с": true, "вводн": true, "сказ": true,
}

// variantNoteOK reports whether the note following a candidate variant marks a headword:
// a "-" inflection ending or one of headwordNotes (and not a bare inflected word).
func variantNoteOK(note string) bool {
	if strings.HasPrefix(note, "-") {
		return true
	}
	var stem []rune
	for _, r := range note {
		if (r >= 'а' && r <= 'я') || r == 'ё' {
			stem = append(stem, r)
		} else {
			break
		}
	}
	return headwordNotes[string(stem)]
}

// variants returns the second (and further) headwords of an entry, written as a parallel
// form after " и ", e.g. "аблатив, -а и аблятив, -а" yields "аблятив" and "регги и реггей,
// нескл." yields "реггей". Requiring a headword note after the comma keeps this from
// matching "и" inside examples or picking up inflected forms.
func variants(line string) []string {
	var out []string
	const sep = " и "
	for i := 0; ; {
		j := strings.Index(line[i:], sep)
		if j < 0 {
			break
		}
		i += j + len(sep)
		rest := line[i:]
		var run []rune
		consumed := 0
		for _, r := range rest {
			if isCyrLetter(r) || isStress(r) {
				run = append(run, r)
				consumed += len(string(r))
			} else {
				break
			}
		}
		if len(run) == 0 {
			continue
		}
		after := rest[consumed:]
		if !strings.HasPrefix(after, ", ") || !variantNoteOK(after[len(", "):]) {
			continue
		}
		if w, ok := cleanWord(run); ok && len([]rune(w)) >= 2 {
			out = append(out, w)
		}
	}
	return out
}

// normToken normalizes any token (a run of letters and stress marks) for the skip set:
// lowercase, stress removed, kept only if it is 2+ all-Cyrillic letters. Unlike
// cleanWord it does NOT reject capitalized tokens — a lowercased proper noun belongs in
// the skip set so it can be re-checked by a morphological analyzer.
func normToken(run []rune) (string, bool) {
	var b strings.Builder
	for _, r := range run {
		if isStress(r) {
			continue
		}
		b.WriteRune(unicode.ToLower(r))
	}
	w := b.String()
	if len([]rune(w)) < 2 {
		return "", false
	}
	for _, r := range w {
		if !((r >= 'а' && r <= 'я') || r == 'ё') {
			return "", false
		}
	}
	return w, true
}

// tokens returns every maximal run of Cyrillic letters (plus stress marks) in the line,
// normalized; runs are split on every other character (so hyphens split a word).
func tokens(line string) []string {
	var out []string
	var run []rune
	flush := func() {
		if len(run) > 0 {
			if w, ok := normToken(run); ok {
				out = append(out, w)
			}
			run = run[:0]
		}
	}
	for _, r := range line {
		if isCyrLetter(r) || isStress(r) {
			run = append(run, r)
		} else {
			flush()
		}
	}
	flush()
	return out
}

func lessRu(a, b string) bool {
	ra, rb := []rune(a), []rune(b)
	for i := 0; i < len(ra) && i < len(rb); i++ {
		if ra[i] != rb[i] {
			return ruRank[ra[i]] < ruRank[rb[i]]
		}
	}
	return len(ra) < len(rb)
}

func sortedRu(set map[string]struct{}) []string {
	words := make([]string, 0, len(set))
	for w := range set {
		words = append(words, w)
	}
	sort.Slice(words, func(i, j int) bool { return lessRu(words[i], words[j]) })
	return words
}

func writeWords(path string, words []string) error {
	if dir := filepath.Dir(path); dir != "" && dir != "." {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return err
		}
	}
	o, err := os.Create(path)
	if err != nil {
		return err
	}
	w := bufio.NewWriter(o)
	for _, word := range words {
		w.WriteString(word)
		w.WriteByte('\n')
	}
	if err := w.Flush(); err != nil {
		o.Close()
		return err
	}
	return o.Close()
}

func main() {
	in := flag.String("in", "dictprep/russian/orfo_dict_2025.txt", "plain-text dictionary (pdftotext output)")
	out := flag.String("out", "dictprep/russian/all.txt", "output: the base word list (clean headwords + reconstructed singulars + variants)")
	skip := flag.String("skip", "/tmp/ru_skip.txt", "output: every other token, for a later morphology re-check")
	sings := flag.String("singulars", "/tmp/ru_singulars.txt", "output: singulars reconstructed from \"ед.\" (known nouns)")
	varsOut := flag.String("variants", "/tmp/ru_variants.txt", "output: variant pairs joined by \"и\" (primary<TAB>variant)")
	from := flag.Int("from", 452, "first line of the word-list section (1-based, inclusive)")
	to := flag.Int("to", 168808, "last line of the word-list section (inclusive)")
	flag.Parse()
	if *in == "" {
		log.Fatal("ruwords: -in is required")
	}

	f, err := os.Open(*in)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	all := make(map[string]struct{})
	allTokens := make(map[string]struct{})
	singulars := make(map[string]struct{})
	variantPairs := make(map[string]struct{})
	entries, fromHead, fromSing, fromVar := 0, 0, 0, 0
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 1<<20), 1<<20)
	for line := 0; sc.Scan(); {
		line++
		if line < *from || line > *to {
			continue
		}
		entries++
		text := sc.Text()
		hw, hwOK := headword(text)
		var sings []string
		if hwOK {
			sings = embeddedSingulars(text, hw)
		}
		primary := ""
		if len(sings) > 0 {
			// the headword is plural and the entry gives its singular: keep only the singular
			primary = sings[0]
			for _, w := range sings {
				if _, seen := all[w]; !seen {
					fromSing++
					all[w] = struct{}{}
				}
				singulars[w] = struct{}{}
			}
		} else if hwOK {
			primary = hw
			if _, seen := all[hw]; !seen {
				fromHead++
			}
			all[hw] = struct{}{}
		}
		for _, w := range variants(text) {
			if _, seen := all[w]; !seen {
				fromVar++
				all[w] = struct{}{}
			}
			if primary != "" && primary != w {
				variantPairs[primary+"\t"+w] = struct{}{}
			}
		}
		for _, w := range tokens(text) {
			allTokens[w] = struct{}{}
		}
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}

	skipSet := make(map[string]struct{})
	for w := range allTokens {
		if _, ok := all[w]; !ok {
			skipSet[w] = struct{}{}
		}
	}

	allWords := sortedRu(all)
	skipWords := sortedRu(skipSet)
	if err := writeWords(*out, allWords); err != nil {
		log.Fatal(err)
	}
	if err := writeWords(*skip, skipWords); err != nil {
		log.Fatal(err)
	}
	if err := writeWords(*sings, sortedRu(singulars)); err != nil {
		log.Fatal(err)
	}
	pairList := make([]string, 0, len(variantPairs))
	for p := range variantPairs {
		pairList = append(pairList, p)
	}
	sort.Strings(pairList)
	if err := writeWords(*varsOut, pairList); err != nil {
		log.Fatal(err)
	}

	fmt.Printf("scanned %d entries\n", entries)
	fmt.Printf(" %-20s %7d words (%d headwords + %d embedded singulars + %d variants)\n", *out, len(allWords), fromHead, fromSing, fromVar)
	fmt.Printf(" %-20s %7d words (tokens not in %s; for a morphology re-check)\n", *skip, len(skipWords), *out)
	fmt.Printf(" %-20s %7d words (singulars from \"ед.\"; known nouns)\n", *sings, len(singulars))
	fmt.Printf(" %-20s %7d pairs (variants joined by \"и\")\n", *varsOut, len(variantPairs))
}
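Two pieces of ruwords are easier to follow with concrete inputs: the suffix splice in `reconstructSingular` and the ё-aware ordering in `lessRu`. The sketch below copies both functions so that it runs on its own. The helper `sortRu` and the sample words are assumptions chosen for illustration, not entries taken from the dictionary.

```go
package main

import (
	"fmt"
	"sort"
)

// ruAlphabet is the collation string used by ruwords (ё directly after е).
const ruAlphabet = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

// ruRank maps each letter to its position in ruAlphabet.
var ruRank = func() map[rune]int {
	m := make(map[rune]int)
	for i, r := range []rune(ruAlphabet) {
		m[r] = i
	}
	return m
}()

// lessRu compares two lowercase Cyrillic words by alphabet rank, not code point.
func lessRu(a, b string) bool {
	ra, rb := []rune(a), []rune(b)
	for i := 0; i < len(ra) && i < len(rb); i++ {
		if ra[i] != rb[i] {
			return ruRank[ra[i]] < ruRank[rb[i]]
		}
	}
	return len(ra) < len(rb)
}

// reconstructSingular splices suffix onto headword at the leftmost position
// whose tail shares the longest common prefix with the suffix.
func reconstructSingular(headword, suffix string) string {
	hw, sf := []rune(headword), []rune(suffix)
	bestK, bestLen := -1, 0
	for k := 0; k < len(hw); k++ {
		m := 0
		for k+m < len(hw) && m < len(sf) && hw[k+m] == sf[m] {
			m++
		}
		if m > bestLen {
			bestK, bestLen = k, m
		}
	}
	if bestK < 0 {
		return ""
	}
	return string(hw[:bestK]) + suffix
}

// sortRu returns a sorted copy of words (hypothetical helper for this demo).
func sortRu(words []string) []string {
	out := append([]string(nil), words...)
	sort.Slice(out, func(i, j int) bool { return lessRu(out[i], out[j]) })
	return out
}

func main() {
	// "н" first occurs at index 6 of the headword, so "нец" replaces "нцы".
	fmt.Println(reconstructSingular("абиссинцы", "нец")) // абиссинец
	fmt.Println(reconstructSingular("австрийцы", "иец")) // австриец
	// A plain code-point sort would put "ёж" last, because ё is U+0451.
	fmt.Println(sortRu([]string{"жук", "ёж", "ель", "еда"})) // [еда ель ёж жук]
}
```

Because ties go to the leftmost match, a suffix whose first letter also occurs early in the headword can splice too far left; that is one reason Stage 2 re-checks the reconstructed words.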
@@ -0,0 +1,10 @@
module scrabble-solver

go 1.26.3

require (
	github.com/iliadenisov/alphabet v1.1.0
	github.com/iliadenisov/dafsa v1.1.0
)

require golang.org/x/exp v0.0.0-20201008143054-e3b2a7f2fdc7 // indirect
@@ -0,0 +1,29 @@
dmitri.shuralyov.com/gpu/mtl v0.0.0-20190408044501-666a987793e9/go.mod h1:H6x//7gZCb22OMCxBHrMx7a5I7Hp++hsVxbQ4BYO7hU=
github.com/BurntSushi/xgb v0.0.0-20160522181843-27f122750802/go.mod h1:IVnqGOEym/WlBOVXweHU+Q+/VP0lqqI8lqeDx9IjBqo=
github.com/go-gl/glfw/v3.3/glfw v0.0.0-20200222043503-6f7a984d4dc4/go.mod h1:tQ2UAYgL5IevRw8kRxooKSPJfGvJ9fJQFa0TUsXzTg8=
github.com/iliadenisov/alphabet v1.1.0 h1:d87N7Rmpjj9FgL7bvEaqLdaIaNch2hC6HvkbKGhn7Hk=
github.com/iliadenisov/alphabet v1.1.0/go.mod h1:h6BhDBiJBLhMEb5XfsqJXZop3hhwXaD8lc5yf38Baqw=
github.com/iliadenisov/dafsa v1.1.0 h1:NV1ZOstMdHXI/cCyAZKOD3qnKLoYdMUunA0+Baj7vR4=
github.com/iliadenisov/dafsa v1.1.0/go.mod h1:mG6Y0DdfRrqdXGqTEMb9Zx0Fl0NkP3ZDYesvxR+e14o=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/exp v0.0.0-20190306152737-a1d7652674e8/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20201008143054-e3b2a7f2fdc7 h1:2/QncOxxpPAdiH+E00abYw/SaQG353gltz79Nl1zrYE=
golang.org/x/exp v0.0.0-20201008143054-e3b2a7f2fdc7/go.mod h1:1phAWC201xIgDyaFpmDeZkgf70Q4Pd/CNqfRtVPtxNw=
golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js=
golang.org/x/image v0.0.0-20190802002840-cff245a6509b/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
golang.org/x/mobile v0.0.0-20190719004257-d2bd2a29d028/go.mod h1:E/iHnbuqvinMTCcRqshq8CkpyQDoeVncDDYHnLhea+o=
golang.org/x/mod v0.1.1-0.20191105210325-c90efee705ee/go.mod h1:QqPTAvyqsEbceGzBzNggFXnrqF1CaUcvgkdR5Ot7KZg=
golang.org/x/mod v0.3.1-0.20200828183125-ce943fd02449/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190312061237-fead79001313/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20191001151750-bb3f8db39f24/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20200207183749-b753a1ba74fa/go.mod h1:TB2adYChydJhpapKDTa4BR/hXlZSLoq2Wpct/0txZ28=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
@@ -0,0 +1,76 @@
// Package dict loads the English test dictionary as a DAWG, preferring the serialized
// cache under testdata and falling back to building from the dictionaries submodule.
// Paths are resolved relative to the repository root so it works both from the repo root
// (commands) and from a package directory (tests).
package dict

import (
	"os"
	"path/filepath"

	"github.com/iliadenisov/alphabet"
	dawg "github.com/iliadenisov/dafsa"

	"scrabble-solver/internal/dictdawg"
	"scrabble-solver/internal/wordlist"
)

// MinLen and MaxLen bound playable word lengths (a 15x15 board holds at most 15).
const (
	MinLen = 2
	MaxLen = 15
)

func exists(p string) bool { _, err := os.Stat(p); return err == nil }

// Root returns the repository root by walking up from the working directory to the
// directory containing go.mod, or "." if none is found.
func Root() string {
	dir, err := os.Getwd()
	if err != nil {
		return "."
	}
	for {
		if exists(filepath.Join(dir, "go.mod")) {
			return dir
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return "."
		}
		dir = parent
	}
}

// DAWGCache and WordlistPath locate the English cache file and source word list,
// relative to the repository root.
func DAWGCache() string { return filepath.Join(Root(), "testdata", "sowpods.dawg") }
func WordlistPath() string { return filepath.Join(Root(), "dictionaries", "english", "sowpods.txt") }

// EnglishAvailable reports whether the English dictionary can be loaded (cache or source).
func EnglishAvailable() bool {
	return exists(DAWGCache()) || exists(WordlistPath())
}

// EnglishWords returns the encoded English word list (from the submodule source).
func EnglishWords() ([][]byte, error) {
	return wordlist.Read(WordlistPath(), alphabet.Latin(), MinLen, MaxLen)
}

// EnglishDAWG returns the English DAWG, loading the cache if present, otherwise building
// it from the word list and caching it (best effort).
func EnglishDAWG() (dawg.Finder, error) {
	if exists(DAWGCache()) {
		return dictdawg.Load(DAWGCache())
	}
	words, err := EnglishWords()
	if err != nil {
		return nil, err
	}
	f, err := dictdawg.Build(alphabet.Latin(), words)
	if err != nil {
		return nil, err
	}
	_ = dictdawg.Save(f, DAWGCache())
	return f, nil
}
@@ -0,0 +1,30 @@
// Package dictdawg builds a plain left-to-right DAWG of a dictionary, as used by the
// Appel-Jacobson move generator.
package dictdawg

import (
	"github.com/iliadenisov/alphabet"
	dawg "github.com/iliadenisov/dafsa"
)

// Build returns a DAWG Finder over words, which must be alphabet-index slices sorted by
// index order and de-duplicated (see wordlist.Encode).
func Build(idx alphabet.Indexer, words [][]byte) (dawg.Finder, error) {
	d := dawg.New(idx)
	for _, w := range words {
		if err := d.AddB(w); err != nil {
			return nil, err
		}
	}
	return d.Finish(), nil
}

// Save writes the DAWG to filename. It requires an embedded alphabet (for example
// alphabet.Latin()), so that Load can reconstruct it.
func Save(f dawg.Finder, filename string) error {
	_, err := f.Save(filename)
	return err
}

// Load reopens a DAWG saved with Save.
func Load(filename string) (dawg.Finder, error) { return dawg.Load(filename) }
@@ -0,0 +1,44 @@
package dictdawg_test

import (
	"path/filepath"
	"testing"

	"github.com/iliadenisov/alphabet"

	"scrabble-solver/internal/dictdawg"
	"scrabble-solver/internal/wordlist"
)

func TestBuildAndQuery(t *testing.T) {
	words := wordlist.Encode([]string{"care", "cares", "cat"}, alphabet.Latin(), 2, 15)
	f, err := dictdawg.Build(alphabet.Latin(), words)
	if err != nil {
		t.Fatal(err)
	}
	if f.NumAdded() != 3 {
		t.Fatalf("NumAdded = %d, want 3", f.NumAdded())
	}
	if i := f.IndexOfB([]byte{2, 0, 17, 4}); i != 0 { // care
		t.Errorf("IndexOf(care) = %d, want 0", i)
	}
	if i := f.IndexOfB([]byte{2, 0, 19}); i != 2 { // cat
		t.Errorf("IndexOf(cat) = %d, want 2", i)
	}
	if i := f.IndexOfB([]byte{2, 0, 17}); i != -1 { // car (absent)
		t.Errorf("IndexOf(car) = %d, want -1", i)
	}

	path := filepath.Join(t.TempDir(), "d.dawg")
	if err := dictdawg.Save(f, path); err != nil {
		t.Fatal(err)
	}
	g, err := dictdawg.Load(path)
	if err != nil {
		t.Fatal(err)
	}
	defer g.Close()
	if i := g.IndexOfB([]byte{2, 0, 17, 4, 18}); i != 1 { // cares
		t.Errorf("loaded IndexOf(cares) = %d, want 1", i)
	}
}
@@ -0,0 +1,43 @@
// Package encoding defines the compact byte conventions shared by the board, rack,
// move output and (for letters) the dictionary graph.
//
// One uniform "symbol byte" is used everywhere:
//
//	bits 0..5  the alphabet letter index plus one (1..63); 0 means "empty / no tile"
//	bit 6      reserved (unused)
//	bit 7      Blank: the tile is a blank standing for that letter; it scores 0
//
// The +1 offset lets 0 mean an empty board square. The same byte represents a board
// cell, a placed tile and a rack tile; the graph stores raw letter indexes (without the
// +1).
package encoding

const (
	// Blank flags a tile as a blank standing for its letter; a blank scores 0.
	Blank byte = 0x80

	// Empty is the value of an unoccupied board square.
	Empty byte = 0x00

	letterBits byte = 0x3f // low 6 bits: letter index + 1
)

// Cell builds the byte for a tile of the given alphabet letter index. When blank is
// true the tile is marked as a blank (it scores 0).
func Cell(letter byte, blank bool) byte {
	c := (letter + 1) & letterBits
	if blank {
		c |= Blank
	}
	return c
}

// IsEmpty reports whether a board cell is unoccupied.
func IsEmpty(cell byte) bool { return cell&letterBits == 0 }

// Letter returns the alphabet letter index of a non-empty cell or tile byte. The
// result is meaningless for an empty cell.
func Letter(cell byte) byte { return (cell & letterBits) - 1 }

// IsBlank reports whether a cell or tile byte is a blank (scores 0).
func IsBlank(cell byte) bool { return cell&Blank != 0 }
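The symbol-byte layout described in the package comment above can be tried in isolation. The following is a minimal standalone sketch, not part of the diff: the lower-case names (`cell`, `letter`, `isBlank`, `isEmpty`) are illustrative stand-ins for the package's exported functions, re-declared so the program compiles without the repository.

```go
package main

import "fmt"

const (
	blankFlag  byte = 0x80 // bit 7: the tile is a blank
	letterMask byte = 0x3f // bits 0..5: letter index + 1
)

// cell packs a 0-based letter index and a blank flag into one symbol byte.
func cell(letter byte, blank bool) byte {
	c := (letter + 1) & letterMask
	if blank {
		c |= blankFlag
	}
	return c
}

// letter recovers the 0-based letter index from a non-empty symbol byte.
func letter(c byte) byte { return (c & letterMask) - 1 }

// isBlank reports whether bit 7 is set.
func isBlank(c byte) bool { return c&blankFlag != 0 }

// isEmpty reports whether the letter bits are all zero.
func isEmpty(c byte) bool { return c&letterMask == 0 }

func main() {
	a := cell(0, false) // 'a' as an ordinary tile
	z := cell(25, true) // a blank standing for 'z'
	fmt.Printf("a=%#x z=%#x\n", a, z)
	fmt.Println(letter(a), isBlank(a), letter(z), isBlank(z), isEmpty(0))
}
```

Because of the +1 offset, letter index 0 never collides with the empty square, which is exactly what TestEmpty in the next file checks.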
@@ -0,0 +1,39 @@
package encoding

import "testing"

func TestCellRoundTrip(t *testing.T) {
	for letter := range byte(26) {
		c := Cell(letter, false)
		if IsEmpty(c) {
			t.Errorf("Cell(%d,false) reports empty", letter)
		}
		if IsBlank(c) {
			t.Errorf("Cell(%d,false) reports blank", letter)
		}
		if got := Letter(c); got != letter {
			t.Errorf("Letter(Cell(%d,false)) = %d", letter, got)
		}

		b := Cell(letter, true)
		if !IsBlank(b) {
			t.Errorf("Cell(%d,true) not blank", letter)
		}
		if got := Letter(b); got != letter {
			t.Errorf("Letter(Cell(%d,true)) = %d, want %d", letter, got, letter)
		}
	}
}

func TestEmpty(t *testing.T) {
	if !IsEmpty(Empty) {
		t.Error("IsEmpty(Empty) = false")
	}
	if IsEmpty(Cell(0, false)) {
		t.Error("IsEmpty(Cell('a')) = true")
	}
	// 'a' (index 0) must not collide with empty.
	if Cell(0, false) == Empty {
		t.Error("Cell('a') collides with Empty")
	}
}
@@ -0,0 +1,21 @@
// Package graph provides thin, reusable helpers over a dafsa Cursor that the move
// generator builds on. It keeps the rest of the solver from depending on dafsa
// traversal details directly.
package graph

import dawg "github.com/iliadenisov/dafsa"

// Spell follows the given alphabet indices from the cursor's root. It returns the
// state reached, whether that state is accepting, and whether the whole path exists.
// When ok is false the path ran into a missing edge; n and final are meaningless.
func Spell(c *dawg.Cursor, indices []byte) (n dawg.Node, final, ok bool) {
	n = c.Root()
	final = c.Final(n)
	for _, ix := range indices {
		n, final, ok = c.Next(n, ix)
		if !ok {
			return n, false, false
		}
	}
	return n, final, true
}
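The contract of Spell (path exists versus path ends on an accepting state) can be illustrated without the dafsa dependency. The sketch below is standalone and not part of the diff; it swaps the DAWG cursor for a plain map-based trie, and the `node` type and its methods are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// node is a hypothetical trie node standing in for a dafsa state.
type node struct {
	next  map[byte]*node
	final bool
}

func newNode() *node { return &node{next: map[byte]*node{}} }

// add inserts word below n, marking the last node as accepting.
func (n *node) add(word string) {
	for i := 0; i < len(word); i++ {
		child, ok := n.next[word[i]]
		if !ok {
			child = newNode()
			n.next[word[i]] = child
		}
		n = child
	}
	n.final = true
}

// spell follows word from root. ok is false when an edge is missing;
// final reports whether the state reached is accepting.
func spell(root *node, word string) (final, ok bool) {
	n := root
	for i := 0; i < len(word); i++ {
		child, found := n.next[word[i]]
		if !found {
			return false, false
		}
		n = child
	}
	return n.final, true
}

func main() {
	root := newNode()
	for _, w := range []string{"cat", "cats", "dog"} {
		root.add(w)
	}
	for _, w := range []string{"cat", "ca", "xyz"} {
		final, ok := spell(root, w)
		fmt.Printf("%s: ok=%v final=%v\n", w, ok, final)
	}
}
```

A prefix such as "ca" is a valid path that is not a word, which is the distinction the move generator relies on when it extends a partial word.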
@@ -0,0 +1,43 @@
package graph_test

import (
	"testing"

	"github.com/iliadenisov/alphabet"
	dawg "github.com/iliadenisov/dafsa"

	"scrabble-solver/internal/graph"
)

// TestSpellSmoke also exercises the go.mod replace => ../dafsa wiring and the new
// dafsa traversal API end-to-end from the solver module.
func TestSpellSmoke(t *testing.T) {
	d := dawg.New(alphabet.Latin())
	for _, w := range []string{"cat", "cats", "dog"} {
		if err := d.Add(w); err != nil {
			t.Fatalf("Add(%q): %v", w, err)
		}
	}
	c, err := dawg.NewCursor(d.Finish())
	if err != nil {
		t.Fatal(err)
	}

	enc := func(s string) []byte {
		b, err := alphabet.Latin().Encode(s)
		if err != nil {
			t.Fatalf("Encode(%q): %v", s, err)
		}
		return b
	}

	if _, final, ok := graph.Spell(c, enc("cat")); !ok || !final {
		t.Errorf("Spell(cat): ok=%v final=%v, want both true", ok, final)
	}
	if _, final, ok := graph.Spell(c, enc("ca")); !ok || final {
		t.Errorf("Spell(ca): ok=%v final=%v, want ok=true final=false", ok, final)
	}
	if _, _, ok := graph.Spell(c, enc("xyz")); ok {
		t.Errorf("Spell(xyz): ok=true, want false")
	}
}
@@ -0,0 +1,77 @@
// Package wordlist reads dictionaries and encodes them into alphabet-index words,
// ready to add to a DAWG.
package wordlist

import (
	"bufio"
	"bytes"
	"os"
	"sort"
	"strings"

	"github.com/iliadenisov/alphabet"
)

// Encode turns words into alphabet-index slices, keeping only those whose length is in
// [minLen, maxLen] and whose characters all belong to idx's alphabet (case-folded).
// The result is sorted by index order and de-duplicated, as a DAWG builder requires.
func Encode(words []string, idx alphabet.Indexer, minLen, maxLen int) [][]byte {
	out := make([][]byte, 0, len(words))
	for _, w := range words {
		w = strings.TrimSpace(w)
		if w == "" {
			continue
		}
		b, err := idx.Encode(strings.ToLower(w))
		if err != nil {
			continue
		}
		if len(b) < minLen || len(b) > maxLen {
			continue
		}
		out = append(out, b)
	}
	sort.Slice(out, func(i, j int) bool { return bytes.Compare(out[i], out[j]) < 0 })
	return Dedupe(out)
}

// Read is Encode applied to the lines (one word per line) of the file at path.
func Read(path string, idx alphabet.Indexer, minLen, maxLen int) ([][]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var words []string
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 1<<20), 1<<20)
	for sc.Scan() {
		words = append(words, sc.Text())
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}
	return Encode(words, idx, minLen, maxLen), nil
}

// FoldYo replaces Ё/ё with Е/е. The Russian "Эрудит" variant has no Ё tile and treats
// Е and Ё as the same letter, so apply this when preparing an Эрудит dictionary (it is a
// dictionary-preparation step, not an engine behaviour).
func FoldYo(s string) string {
	return strings.NewReplacer("ё", "е", "Ё", "Е").Replace(s)
}

// Dedupe removes adjacent duplicates from a sorted slice of index words in place.
func Dedupe(s [][]byte) [][]byte {
	if len(s) == 0 {
		return s
	}
	out := s[:1]
	for i := 1; i < len(s); i++ {
		if !bytes.Equal(s[i], s[i-1]) {
			out = append(out, s[i])
		}
	}
	return out
}
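The filter, sort and de-duplicate pipeline of Encode can be reproduced with the standard library alone. The sketch below is standalone and not part of the diff; `encodeLatin` is a hypothetical stand-in for `alphabet.Indexer.Encode` that accepts only `a..z`, and `encodeAll` is an illustrative name.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"strings"
)

// encodeLatin maps a..z to 0..25 and rejects any other byte (hypothetical
// replacement for the alphabet package's encoder).
func encodeLatin(w string) ([]byte, bool) {
	out := make([]byte, 0, len(w))
	for i := 0; i < len(w); i++ {
		if w[i] < 'a' || w[i] > 'z' {
			return nil, false
		}
		out = append(out, w[i]-'a')
	}
	return out, true
}

// encodeAll trims and lower-cases each word, drops words that fail to encode
// or fall outside [minLen, maxLen], then sorts by index order and removes
// adjacent duplicates.
func encodeAll(words []string, minLen, maxLen int) [][]byte {
	var out [][]byte
	for _, w := range words {
		b, ok := encodeLatin(strings.ToLower(strings.TrimSpace(w)))
		if !ok || len(b) < minLen || len(b) > maxLen {
			continue
		}
		out = append(out, b)
	}
	sort.Slice(out, func(i, j int) bool { return bytes.Compare(out[i], out[j]) < 0 })
	uniq := out[:0]
	for i, b := range out {
		if i == 0 || !bytes.Equal(b, out[i-1]) {
			uniq = append(uniq, b)
		}
	}
	return uniq
}

func main() {
	in := []string{"cat", "CATS", "ab", "b", "abcdefghi", "cat", " do ", "qu1rk"}
	fmt.Println(encodeAll(in, 2, 8))
}
```

Sorting before de-duplicating is what lets a single pass over adjacent elements remove every duplicate.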
@@ -0,0 +1,37 @@
package wordlist

import (
	"testing"

	"github.com/iliadenisov/alphabet"
)

func TestFoldYo(t *testing.T) {
	if got := FoldYo("ёлка"); got != "елка" {
		t.Errorf("FoldYo(ёлка) = %q, want елка", got)
	}
	if got := FoldYo("Ёжик"); got != "Ежик" {
		t.Errorf("FoldYo(Ёжик) = %q, want Ежик", got)
	}
}

func TestEncodeFilterSortDedupe(t *testing.T) {
	got := Encode([]string{
		"cat", "CATS", "ab", "b", "abcdefghi", "cat", " do ", "qu1rk",
	}, alphabet.Latin(), 2, 8)

	want := [][]byte{
		{0, 1},         // ab
		{2, 0, 19},     // cat
		{2, 0, 19, 18}, // cats (from CATS, case-folded)
		{3, 14},        // do (trimmed)
	}
	if len(got) != len(want) {
		t.Fatalf("got %d words %v, want %d", len(got), got, len(want))
	}
	for i := range want {
		if string(got[i]) != string(want[i]) {
			t.Errorf("word %d = %v, want %v", i, got[i], want[i])
		}
	}
}
@@ -0,0 +1,57 @@
// Package rack represents a player's rack as per-letter tile counts plus blanks.
package rack

// Rack holds tile counts: one slot per alphabet letter index plus a final slot for
// blanks. Like a Go slice or map, a Rack value shares its underlying storage with its
// copies; use Clone for an independent rack. The move generator mutates a single Rack
// in place (removing a tile, recursing, putting it back).
type Rack struct {
	counts []int
}

// New returns an empty rack for an alphabet of the given size.
func New(alphabetSize int) Rack {
	return Rack{counts: make([]int, alphabetSize+1)}
}

func (r Rack) blankIdx() int { return len(r.counts) - 1 }

// Count returns how many tiles of the given letter index are on the rack.
func (r Rack) Count(letter byte) int { return r.counts[letter] }

// Has reports whether at least one tile of the given letter index is on the rack.
func (r Rack) Has(letter byte) bool { return r.counts[letter] > 0 }

// Blanks returns the number of blank tiles on the rack.
func (r Rack) Blanks() int { return r.counts[r.blankIdx()] }

// Total returns the number of tiles on the rack, blanks included.
func (r Rack) Total() int {
	n := 0
	for _, c := range r.counts {
		n += c
	}
	return n
}

// Empty reports whether the rack holds no tiles.
func (r Rack) Empty() bool { return r.Total() == 0 }

// Add puts a tile of the given letter index onto the rack.
func (r Rack) Add(letter byte) { r.counts[letter]++ }

// AddBlank puts a blank tile onto the rack.
func (r Rack) AddBlank() { r.counts[r.blankIdx()]++ }

// Remove takes one tile of the given letter index off the rack.
func (r Rack) Remove(letter byte) { r.counts[letter]-- }

// RemoveBlank takes one blank tile off the rack.
func (r Rack) RemoveBlank() { r.counts[r.blankIdx()]-- }

// Clone returns an independent copy of the rack.
func (r Rack) Clone() Rack {
	c := make([]int, len(r.counts))
	copy(c, r.counts)
	return Rack{counts: c}
}
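The Rack doc comment mentions that the move generator removes a tile, recurses, and puts it back on one shared rack. The standalone sketch below, not part of the diff, shows that backtracking pattern on a cut-down `rack` type; `arrangements` is a hypothetical helper invented for the example and is not the solver's generator.

```go
package main

import "fmt"

// rack wraps a slice, so value copies share the same counts, as described
// for rack.Rack above.
type rack struct{ counts []int }

func (r rack) add(l byte)    { r.counts[l]++ }
func (r rack) remove(l byte) { r.counts[l]-- }

// arrangements counts the distinct ordered sequences of n tiles that can be
// drawn from r. It mutates r in place and restores it before returning:
// remove a tile, recurse, put the tile back.
func arrangements(r rack, n int) int {
	if n == 0 {
		return 1
	}
	total := 0
	for l := range r.counts {
		if r.counts[l] == 0 {
			continue
		}
		r.remove(byte(l))
		total += arrangements(r, n-1)
		r.add(byte(l))
	}
	return total
}

func main() {
	r := rack{counts: make([]int, 26)}
	r.add(0) // a
	r.add(0) // a
	r.add(2) // c
	fmt.Println(arrangements(r, 2), r.counts[0], r.counts[2])
}
```

Passing the rack by value is cheap because only the slice header is copied, and the put-back step is what keeps the caller's rack unchanged after the search.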
@@ -0,0 +1,51 @@
package rack

import "testing"

func TestRackBasics(t *testing.T) {
	r := New(26)
	if !r.Empty() || r.Total() != 0 {
		t.Fatal("new rack not empty")
	}

	r.Add(0) // a
	r.Add(0)
	r.Add(2) // c
	r.AddBlank()

	if r.Count(0) != 2 {
		t.Errorf("Count(a) = %d, want 2", r.Count(0))
	}
	if !r.Has(2) || r.Has(1) {
		t.Errorf("Has c=%v b=%v, want true,false", r.Has(2), r.Has(1))
	}
	if r.Blanks() != 1 {
		t.Errorf("Blanks = %d, want 1", r.Blanks())
	}
	if r.Total() != 4 {
		t.Errorf("Total = %d, want 4", r.Total())
	}

	r.Remove(0)
	if r.Count(0) != 1 {
		t.Errorf("after Remove, Count(a) = %d, want 1", r.Count(0))
	}
	r.RemoveBlank()
	if r.Blanks() != 0 {
		t.Errorf("after RemoveBlank, Blanks = %d, want 0", r.Blanks())
	}
}

func TestRackCloneIndependent(t *testing.T) {
	r := New(26)
	r.Add(0)
	cp := r.Clone()
	cp.Add(0)
	cp.AddBlank()
	if r.Count(0) != 1 || r.Blanks() != 0 {
		t.Errorf("mutating clone changed original: a=%d blanks=%d", r.Count(0), r.Blanks())
	}
	if cp.Count(0) != 2 || cp.Blanks() != 1 {
		t.Errorf("clone wrong: a=%d blanks=%d", cp.Count(0), cp.Blanks())
	}
}
@@ -0,0 +1,61 @@
package rules

import "testing"

func sumCounts(c []int) int {
	s := 0
	for _, v := range c {
		s += v
	}
	return s
}

func TestRussianScrabble(t *testing.T) {
	rs := RussianScrabble()
	if err := rs.Validate(); err != nil {
		t.Fatal(err)
	}
	if rs.Size() != 33 {
		t.Errorf("alphabet size %d, want 33", rs.Size())
	}
	if n := sumCounts(rs.Counts); n != 102 || n+rs.Blanks != 104 {
		t.Errorf("bag = %d letters + %d blanks, want 102+2=104", n, rs.Blanks)
	}
	if rs.Bingo != 50 {
		t.Errorf("bonus %d, want 50", rs.Bingo)
	}
	if rs.Premium(7, 7) != DW {
		t.Errorf("centre premium %d, want DW", rs.Premium(7, 7))
	}
	if rs.Values[6] != 3 || rs.Counts[6] != 1 { // ё
		t.Errorf("ё = value %d count %d, want 3/1", rs.Values[6], rs.Counts[6])
	}
}

func TestErudit(t *testing.T) {
	rs := Erudit()
	if err := rs.Validate(); err != nil {
		t.Fatal(err)
	}
	if rs.Size() != 33 {
		t.Errorf("alphabet size %d, want 33", rs.Size())
	}
	if n := sumCounts(rs.Counts); n != 128 || n+rs.Blanks != 131 {
		t.Errorf("bag = %d letters + %d blanks, want 128+3=131", n, rs.Blanks)
	}
	if rs.Bingo != 15 {
		t.Errorf("bonus %d, want 15", rs.Bingo)
	}
	if rs.Center != 7*15+7 {
		t.Errorf("centre index %d, want %d", rs.Center, 7*15+7)
	}
	if rs.Premium(7, 7) != None {
		t.Errorf("centre premium %d, want None (Эрудит centre does not double)", rs.Premium(7, 7))
	}
	if rs.Counts[6] != 0 { // ё has no tile
		t.Errorf("ё count %d, want 0", rs.Counts[6])
	}
	if rs.Premium(0, 0) != TW {
		t.Errorf("corner premium %d, want TW (board otherwise standard)", rs.Premium(0, 0))
	}
}
+221
@@ -0,0 +1,221 @@
// Package rules describes a Scrabble variant: board geometry, premium-square layout,
// the letter alphabet, per-letter tile values and bag counts, blanks, rack size and
// the all-tiles bonus. English() returns standard English Scrabble; the Ruleset type
// is general enough for other variants such as Russian "Эрудит" (the same board apart
// from a non-doubling centre, with a different alphabet and tile values/counts).
package rules

import (
	"fmt"
	"strings"

	"github.com/iliadenisov/alphabet"
)

// Premium is the bonus kind of a board square.
type Premium uint8

const (
	None Premium = iota
	DL           // double letter
	TL           // triple letter
	DW           // double word
	TW           // triple word
)

// LetterMult is the multiplier a premium applies to the tile placed on it.
func (p Premium) LetterMult() int {
	switch p {
	case DL:
		return 2
	case TL:
		return 3
	default:
		return 1
	}
}

// WordMult is the multiplier a premium applies to a word passing through it.
func (p Premium) WordMult() int {
	switch p {
	case DW:
		return 2
	case TW:
		return 3
	default:
		return 1
	}
}

// Ruleset is a complete description of a Scrabble variant.
type Ruleset struct {
	Name       string
	Rows, Cols int
	Alphabet   alphabet.Indexer // letter alphabet (no separator)
	Values     []int            // tile value per letter index; len == Alphabet.Size()
	Counts     []int            // bag count per letter index; len == Alphabet.Size()
	Blanks     int              // number of blank tiles in the bag
	RackSize   int              // tiles drawn to a full rack
	Bingo      int              // bonus for using the whole rack in one play
	Center     int              // row-major index of the centre square (first-move anchor)
	premiums   []Premium        // row-major premium per square
}

// Premium returns the premium of square (r, c).
func (rs *Ruleset) Premium(r, c int) Premium { return rs.premiums[r*rs.Cols+c] }

// PremiumAt returns the premium of the row-major square index i.
func (rs *Ruleset) PremiumAt(i int) Premium { return rs.premiums[i] }

// Size returns the number of letters in the alphabet (excluding blanks).
func (rs *Ruleset) Size() int { return rs.Alphabet.Size() }

// Validate checks that the slices are consistent with the alphabet and board.
func (rs *Ruleset) Validate() error {
	n := rs.Alphabet.Size()
	if len(rs.Values) != n {
		return fmt.Errorf("rules %q: %d values for %d letters", rs.Name, len(rs.Values), n)
	}
	if len(rs.Counts) != n {
		return fmt.Errorf("rules %q: %d counts for %d letters", rs.Name, len(rs.Counts), n)
	}
	if len(rs.premiums) != rs.Rows*rs.Cols {
		return fmt.Errorf("rules %q: %d premiums for a %dx%d board", rs.Name, len(rs.premiums), rs.Rows, rs.Cols)
	}
	if rs.Center < 0 || rs.Center >= rs.Rows*rs.Cols {
		return fmt.Errorf("rules %q: centre %d out of range", rs.Name, rs.Center)
	}
	return nil
}

// standardBoard is the classic 15x15 premium layout: T=triple word, D=double word,
// t=triple letter, d=double letter, .=plain, *=centre (a double word).
const standardBoard = `T..d...T...d..T
.D...t...t...D.
..D...d.d...D..
d..D...d...D..d
....D.....D....
.t...t...t...t.
..d...d.d...d..
T..d...*...d..T
..d...d.d...d..
.t...t...t...t.
....D.....D....
d..D...d...D..d
..D...d.d...D..
.D...t...t...D.
T..d...T...d..T`

// parsePremiums turns a board template into a premium grid and the centre index.
// The centre index is -1 when the template marks no centre square.
func parsePremiums(s string) (rows, cols int, prem []Premium, center int) {
	lines := strings.Split(strings.TrimSpace(s), "\n")
	rows = len(lines)
	cols = len(strings.TrimRight(lines[0], "\r"))
	prem = make([]Premium, rows*cols)
	center = -1
	for r, line := range lines {
		line = strings.TrimRight(line, "\r")
		for c := 0; c < cols && c < len(line); c++ {
			var p Premium
			switch line[c] {
			case 'd':
				p = DL
			case 't':
				p = TL
			case 'D':
				p = DW
			case 'T':
				p = TW
			case '*': // centre square, a double word
				p = DW
				center = r*cols + c
			case '+': // centre square with no premium
				center = r*cols + c
			}
			prem[r*cols+c] = p
		}
	}
	return rows, cols, prem, center
}

// FromTemplate builds a ruleset from a premium-layout template (see standardBoard for
// the character legend; '+' marks a centre square with no premium). It returns an error
// if the resulting ruleset is inconsistent.
func FromTemplate(name string, idx alphabet.Indexer, values, counts []int, blanks, rackSize, bingo int, template string) (*Ruleset, error) {
	rows, cols, prem, center := parsePremiums(template)
	rs := &Ruleset{
		Name: name, Rows: rows, Cols: cols, Alphabet: idx,
		Values: values, Counts: counts,
		Blanks: blanks, RackSize: rackSize, Bingo: bingo,
		Center: center, premiums: prem,
	}
	if err := rs.Validate(); err != nil {
		return nil, err
	}
	return rs, nil
}

// English returns the standard English Scrabble ruleset (15x15, the classic premium
// layout, English tile values and distribution, 2 blanks, a 7-tile rack and a 50-point
// bingo bonus).
func English() *Ruleset {
	rs, err := FromTemplate("English Scrabble", alphabet.Latin(),
		// a b c d e f g h i j k l m n o p q r s t u v w x y z
		[]int{1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10},
		[]int{9, 2, 2, 4, 12, 2, 3, 2, 9, 1, 1, 4, 2, 6, 8, 2, 1, 6, 4, 6, 4, 2, 2, 1, 2, 1},
		2, 7, 50, standardBoard)
	if err != nil {
		panic(err) // a programming error in this package, not a runtime condition
	}
	return rs
}

// eruditBoard is the standard 15x15 layout but with a non-doubling centre ('+'), as in
// the Russian "Эрудит" variant.
const eruditBoard = `T..d...T...d..T
.D...t...t...D.
..D...d.d...D..
d..D...d...D..d
....D.....D....
.t...t...t...t.
..d...d.d...d..
T..d...+...d..T
..d...d.d...d..
.t...t...t...t.
....D.....D....
d..D...d...D..d
..D...d.d...D..
.D...t...t...D.
T..d...T...d..T`

// russian returns the embedded 33-letter Russian alphabet (а..я including ё at index 6).
func russian() alphabet.Indexer { return alphabet.Embedded(alphabet.Langs.LangRu) }

// RussianScrabble returns the Russian Scrabble ruleset: the 33-letter alphabet, the
// standard board, 2 blanks, a 7-tile rack and a 50-point bonus.
func RussianScrabble() *Ruleset {
	rs, err := FromTemplate("Russian Scrabble", russian(),
		// а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я
		[]int{1, 3, 1, 3, 2, 1, 3, 5, 5, 1, 4, 2, 2, 2, 1, 1, 2, 1, 1, 1, 2, 10, 5, 5, 5, 8, 10, 10, 4, 3, 8, 8, 3},
		[]int{8, 2, 4, 2, 4, 8, 1, 1, 2, 5, 1, 4, 4, 3, 5, 10, 4, 5, 5, 5, 4, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 2},
		2, 7, 50, standardBoard)
	if err != nil {
		panic(err)
	}
	return rs
}

// Erudit returns the Russian "Эрудит" ruleset. Ё carries no tiles (count 0); fold Ё→Е
// when preparing the dictionary (see wordlist.FoldYo). The centre square does not double
// the word, there are 3 blanks (each scoring 0), and the all-tiles bonus is 15.
func Erudit() *Ruleset {
	rs, err := FromTemplate("Эрудит", russian(),
		// а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я
		[]int{1, 3, 2, 3, 2, 1, 0, 5, 5, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 3, 10, 5, 10, 5, 10, 10, 10, 5, 5, 10, 10, 3},
		[]int{10, 3, 5, 3, 5, 9, 0, 2, 2, 8, 4, 6, 4, 5, 8, 10, 6, 6, 6, 5, 3, 1, 2, 1, 2, 1, 1, 1, 2, 2, 1, 1, 3},
		3, 7, 15, eruditBoard)
	if err != nil {
		panic(err)
	}
	return rs
}
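LetterMult and WordMult are meant to be combined when a play is scored. The standalone sketch below, not part of the diff, shows one way that combination could look for a single word. It assumes every tile of the word is newly placed, so every square's premium applies; the solver's actual scoring code is not shown in this section, and `scoreFresh` and the lower-case premium names are hypothetical.

```go
package main

import "fmt"

type premium uint8

const (
	none premium = iota
	dl           // double letter
	tl           // triple letter
	dw           // double word
	tw           // triple word
)

// letterMult is the multiplier applied to the tile on the square.
func (p premium) letterMult() int {
	switch p {
	case dl:
		return 2
	case tl:
		return 3
	}
	return 1
}

// wordMult is the multiplier applied to the whole word.
func (p premium) wordMult() int {
	switch p {
	case dw:
		return 2
	case tw:
		return 3
	}
	return 1
}

// scoreFresh scores one word whose tiles are all newly placed (an assumption
// of this sketch). values[i] is the tile value on squares[i]; pass 0 for a
// blank. Letter premiums are summed first, then word premiums multiply.
func scoreFresh(values []int, squares []premium) int {
	sum, mult := 0, 1
	for i, v := range values {
		sum += v * squares[i].letterMult()
		mult *= squares[i].wordMult()
	}
	return sum * mult
}

func main() {
	// "cat" (c=3, a=1, t=1) on double-letter, plain and double-word squares.
	fmt.Println(scoreFresh([]int{3, 1, 1}, []premium{dl, none, dw}))
}
```

Word multipliers compound, so a play covering two triple-word squares multiplies the letter sum by nine.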
@@ -0,0 +1,82 @@
package rules

import "testing"

func TestEnglishConsistency(t *testing.T) {
	rs := English()
	if err := rs.Validate(); err != nil {
		t.Fatalf("Validate: %v", err)
	}
	if rs.Rows != 15 || rs.Cols != 15 {
		t.Errorf("board = %dx%d, want 15x15", rs.Rows, rs.Cols)
	}
	if rs.Size() != 26 {
		t.Errorf("alphabet size = %d, want 26", rs.Size())
	}
	if rs.Center != 7*15+7 {
		t.Errorf("centre = %d, want %d", rs.Center, 7*15+7)
	}

	letters := 0
	for _, c := range rs.Counts {
		letters += c
	}
	if letters != 98 {
		t.Errorf("sum(Counts) = %d, want 98", letters)
	}
	if rs.Blanks != 2 || letters+rs.Blanks != 100 {
		t.Errorf("bag = %d letters + %d blanks, want 98+2=100", letters, rs.Blanks)
	}

	points := 0
	for i := range rs.Values {
		points += rs.Values[i] * rs.Counts[i]
	}
	if points != 187 {
		t.Errorf("total bag points = %d, want 187", points)
	}
}

func TestEnglishPremiums(t *testing.T) {
	rs := English()

	spot := []struct {
		r, c int
		want Premium
	}{
		{0, 0, TW}, {0, 7, TW}, {0, 14, TW}, {7, 0, TW}, {14, 7, TW},
		{7, 7, DW}, {1, 1, DW}, {4, 4, DW},
		{1, 5, TL}, {5, 5, TL}, {9, 9, TL},
		{0, 3, DL}, {3, 0, DL}, {6, 6, DL},
		{0, 1, None}, {7, 1, None},
	}
	for _, s := range spot {
		if got := rs.Premium(s.r, s.c); got != s.want {
			t.Errorf("Premium(%d,%d) = %d, want %d", s.r, s.c, got, s.want)
		}
	}

	// Census of premium squares for the standard board.
	census := map[Premium]int{}
	for i := range rs.Rows * rs.Cols {
		census[rs.PremiumAt(i)]++
	}
	want := map[Premium]int{None: 164, DL: 24, TL: 12, DW: 17, TW: 8}
	for p, n := range want {
		if census[p] != n {
			t.Errorf("premium %d count = %d, want %d", p, census[p], n)
		}
	}

	// The standard board is symmetric under transpose and 180° rotation.
	for r := range rs.Rows {
		for c := range rs.Cols {
			if rs.Premium(r, c) != rs.Premium(c, r) {
				t.Errorf("not transpose-symmetric at (%d,%d)", r, c)
			}
			if rs.Premium(r, c) != rs.Premium(rs.Rows-1-r, rs.Cols-1-c) {
				t.Errorf("not 180°-symmetric at (%d,%d)", r, c)
			}
		}
	}
}
@@ -0,0 +1,14 @@
package scrabble

import (
	"scrabble-solver/board"
	"scrabble-solver/internal/encoding"
)

// Apply places a move's newly-placed tiles on the board. The move must be legal for the
// board (as produced by a generator, or validated); Apply does not re-check it.
func Apply(b *board.Board, m Move) {
	for _, t := range m.Tiles {
		b.Set(t.Row, t.Col, encoding.Cell(t.Letter, t.Blank))
	}
}
@@ -0,0 +1,108 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
dawg "github.com/iliadenisov/dafsa"
|
||||
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/internal/encoding"
|
||||
)
|
||||
|
||||
// letterSet is a bit set over alphabet letter indexes (alphabets are at most 63
|
||||
// letters, so a uint64 suffices). It encodes a square's cross-set: the letters that,
|
||||
// placed on the square, form a legal perpendicular word.
|
||||
type letterSet uint64
|
||||
|
||||
func (s letterSet) has(l byte) bool { return s&(letterSet(1)<<l) != 0 }
|
||||
|
||||
// fullSet is the cross-set of a square with no perpendicular neighbours: every letter
|
||||
// is allowed.
|
||||
func fullSet(size int) letterSet { return letterSet(uint64(1)<<uint(size)) - 1 }
|
||||
|
||||
// columnContext returns the contiguous run of filled cells immediately above and below
|
||||
// the empty square (r, c), each read top to bottom, as alphabet letter indexes. These
|
||||
// are the tiles a perpendicular (vertical) word through (r, c) would include.
|
||||
func columnContext(b *board.Board, r, c int) (above, below []byte) {
|
||||
start := r
|
||||
for start-1 >= 0 && b.Filled(start-1, c) {
|
||||
start--
|
||||
}
|
||||
for rr := start; rr < r; rr++ {
|
||||
above = append(above, encoding.Letter(b.At(rr, c)))
|
||||
}
|
||||
|
||||
end := r
|
||||
for end+1 < b.Rows() && b.Filled(end+1, c) {
|
||||
end++
|
||||
}
|
||||
for rr := r + 1; rr <= end; rr++ {
|
||||
below = append(below, encoding.Letter(b.At(rr, c)))
|
||||
}
|
||||
return above, below
|
||||
}
|
||||
|
||||
// completers returns the letters X (< size) that complete a word when followed from
|
||||
// state: those whose arc leads directly to an accepting node. It is a single arc
|
||||
// enumeration — the deterministic cross-set primitive.
|
||||
func completers(cur *dawg.Cursor, state dawg.Node, size int) letterSet {
|
||||
var set letterSet
|
||||
lim := byte(size)
|
||||
cur.Arcs(state, func(a dawg.Arc) bool {
|
||||
if a.Final && a.Label < lim {
|
||||
set |= letterSet(1) << a.Label
|
||||
}
|
||||
return true
|
||||
})
|
||||
return set
|
||||
}
|
||||
|
||||
// walk follows word left to right from the cursor's root.
|
||||
func walk(cur *dawg.Cursor, word []byte) (dawg.Node, bool) {
|
||||
n := cur.Root()
|
||||
for _, l := range word {
|
||||
var ok bool
|
||||
if n, _, ok = cur.Next(n, l); !ok {
|
||||
return n, false
|
||||
}
|
||||
}
|
||||
return n, true
|
||||
}
|
||||
|
||||
// dawgCrossSet returns the letters X for which above·X·below is a stored word. A right
|
||||
// extension (no tiles below) is deterministic — X just completes the prefix above. A
|
||||
// left extension (tiles below) is non-deterministic and must probe each X.
|
||||
func dawgCrossSet(cur *dawg.Cursor, above, below []byte, size int) letterSet {
|
||||
switch {
|
||||
case len(above) == 0 && len(below) == 0:
|
||||
return fullSet(size)
|
||||
case len(below) == 0:
|
||||
node, ok := walk(cur, above)
|
||||
if !ok {
|
||||
return 0
|
||||
}
|
||||
return completers(cur, node, size)
|
||||
default:
|
||||
node := cur.Root()
|
||||
if len(above) > 0 {
|
||||
var ok bool
|
||||
if node, ok = walk(cur, above); !ok {
|
||||
return 0
|
||||
}
|
||||
}
|
||||
var set letterSet
|
||||
for x := range size {
|
||||
m, final, ok := cur.Next(node, byte(x))
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
for _, l := range below {
|
||||
if m, final, ok = cur.Next(m, l); !ok {
|
||||
break
|
||||
}
|
||||
}
|
||||
if ok && final {
|
||||
set |= letterSet(1) << uint(x)
|
||||
}
|
||||
}
|
||||
return set
|
||||
}
|
||||
}
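The cross-set code above relies on `letterSet`, `fullSet` and `has`, which are defined outside this part of the diff. The following is a minimal sketch of how such a bitmask could look, assuming a `uint64` backing type; the real definitions may differ, and `crossSetOf` is a hypothetical helper added only for illustration.

```go
package main

// letterSet is a bitmask over alphabet indexes: bit i set means letter i is allowed.
// Assumption: uint64 is chosen here only so alphabets of up to 64 letters fit.
type letterSet uint64

// fullSet returns the set containing every letter 0..size-1.
func fullSet(size int) letterSet {
	if size >= 64 {
		return ^letterSet(0)
	}
	return letterSet(1)<<uint(size) - 1
}

// has reports whether letter l is in the set.
func (s letterSet) has(l byte) bool {
	return l < 64 && s&(letterSet(1)<<l) != 0
}

// crossSetOf builds a set from explicit letter indexes (hypothetical helper).
func crossSetOf(letters ...byte) letterSet {
	var s letterSet
	for _, l := range letters {
		s |= letterSet(1) << l
	}
	return s
}
```

With this layout, the `c_t` cross-set from the test below would be `crossSetOf(0, 14, 20)`, i.e. the letters a, o and u, and membership checks during `extendRight` cost one shift and one mask.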
|
||||
@@ -0,0 +1,75 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/iliadenisov/alphabet"
|
||||
dawg "github.com/iliadenisov/dafsa"
|
||||
|
||||
"scrabble-solver/internal/dictdawg"
|
||||
"scrabble-solver/internal/wordlist"
|
||||
)
|
||||
|
||||
func bruteCrossSet(words [][]byte, above, below []byte, size int) letterSet {
|
||||
set := make(map[string]bool, len(words))
|
||||
for _, w := range words {
|
||||
set[string(w)] = true
|
||||
}
|
||||
var out letterSet
|
||||
for x := range size {
|
||||
w := make([]byte, 0, len(above)+1+len(below))
|
||||
w = append(w, above...)
|
||||
w = append(w, byte(x))
|
||||
w = append(w, below...)
|
||||
if set[string(w)] {
|
||||
out |= letterSet(1) << uint(x)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func TestDAWGCrossSetMatchesBruteForce(t *testing.T) {
|
||||
const size = 26
|
||||
words := wordlist.Encode(
|
||||
[]string{"cat", "cot", "cut", "cap", "cab", "at", "it"},
|
||||
alphabet.Latin(), 2, 15)
|
||||
|
||||
finder, err := dictdawg.Build(alphabet.Latin(), words)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
cur, err := dawg.NewCursor(finder)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
cases := []struct {
|
||||
name string
|
||||
above, below []byte
|
||||
}{
|
||||
{"c_t", []byte{2}, []byte{19}}, // expect {a,o,u}
|
||||
{"_t", nil, []byte{19}}, // expect {a,i}
|
||||
{"c_", []byte{2}, nil}, // expect {} (no two-letter c-words)
|
||||
{"a_t", []byte{0}, []byte{19}}, // expect {}
|
||||
}
|
||||
for _, tc := range cases {
|
||||
want := bruteCrossSet(words, tc.above, tc.below, size)
|
||||
if got := dawgCrossSet(cur, tc.above, tc.below, size); got != want {
|
||||
t.Errorf("%s: dawgCrossSet = %026b, want %026b", tc.name, got, want)
|
||||
}
|
||||
}
|
||||
|
||||
// c_t must be exactly {a(0), o(14), u(20)}.
|
||||
want := letterSet(0)
|
||||
for _, x := range []byte{0, 14, 20} {
|
||||
want |= letterSet(1) << x
|
||||
}
|
||||
if got := dawgCrossSet(cur, []byte{2}, []byte{19}, size); got != want {
|
||||
t.Errorf("c_t cross-set = %026b, want {a,o,u} = %026b", got, want)
|
||||
}
|
||||
|
||||
// No perpendicular neighbours: every letter is allowed.
|
||||
if got := dawgCrossSet(cur, nil, nil, size); got != fullSet(size) {
|
||||
t.Errorf("empty context = %026b, want full", got)
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,187 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/internal/dictdawg"
|
||||
"scrabble-solver/rules"
|
||||
)
|
||||
|
||||
// TestScoreRealGames replays real tournament games recorded in GCG format and checks that
|
||||
// our scoring reproduces, move for move, the score and running total written in the
|
||||
// protocol. This validates the scoring engine against canonical play, not invented cases.
|
||||
//
|
||||
// The games come from cross-tables.com (annotated self-play) and are stored under
|
||||
// testdata/. They use the standard English board and SOWPODS, so the test loads the
|
||||
// committed dawg/en_sowpods.dawg (build it with `make dawg`).
|
||||
func TestScoreRealGames(t *testing.T) {
|
||||
finder, err := dictdawg.Load("../dawg/en_sowpods.dawg")
|
||||
if err != nil {
|
||||
t.Skipf("need dawg/en_sowpods.dawg (run `make dawg`): %v", err)
|
||||
}
|
||||
s := NewSolver(rules.English(), finder)
|
||||
games, _ := filepath.Glob("testdata/*.gcg")
|
||||
if len(games) == 0 {
|
||||
t.Fatal("no GCG games in testdata/")
|
||||
}
|
||||
for _, g := range games {
|
||||
t.Run(filepath.Base(g), func(t *testing.T) { replayGCG(t, s, g) })
|
||||
}
|
||||
}
|
||||
|
||||
// parsePos decodes a GCG coordinate into a 0-based square and orientation. A leading digit
|
||||
// means an across (horizontal) play ("11J"), a leading letter means a down (vertical) one
|
||||
// ("H7"); rows are 1..15 and columns A..O.
|
||||
func parsePos(p string) (row, col int, dir Direction, ok bool) {
|
||||
if len(p) < 2 {
|
||||
return 0, 0, 0, false
|
||||
}
|
||||
if p[0] >= '1' && p[0] <= '9' { // number first -> horizontal (across)
|
||||
i := 0
|
||||
for i < len(p) && p[i] >= '0' && p[i] <= '9' {
|
||||
i++
|
||||
}
|
||||
if i+1 != len(p) || p[i] < 'A' || p[i] > 'O' {
|
||||
return 0, 0, 0, false
|
||||
}
|
||||
n, err := strconv.Atoi(p[:i])
if err != nil || n < 1 || n > 15 { // rows are 1..15
return 0, 0, 0, false
}
|
||||
return n - 1, int(p[i] - 'A'), Horizontal, true
|
||||
}
|
||||
if p[0] >= 'A' && p[0] <= 'O' { // letter first -> vertical (down)
|
||||
for i := 1; i < len(p); i++ {
|
||||
if p[i] < '0' || p[i] > '9' {
|
||||
return 0, 0, 0, false
|
||||
}
|
||||
}
|
||||
n, err := strconv.Atoi(p[1:])
if err != nil || n < 1 || n > 15 { // rows are 1..15
return 0, 0, 0, false
}
|
||||
return n - 1, int(p[0] - 'A'), Vertical, true
|
||||
}
|
||||
return 0, 0, 0, false
|
||||
}
|
||||
|
||||
// parseRack splits a GCG rack ("ACEILRT", "?GOORRS") into lowercase letters and a blank
|
||||
// count, ready for makeRack.
|
||||
func parseRack(s string) (string, int) {
|
||||
var letters []rune
|
||||
blanks := 0
|
||||
for _, ch := range s {
|
||||
switch {
|
||||
case ch == '?':
|
||||
blanks++
|
||||
case ch >= 'A' && ch <= 'Z':
|
||||
letters = append(letters, ch+('a'-'A'))
|
||||
case ch >= 'a' && ch <= 'z':
|
||||
letters = append(letters, ch)
|
||||
}
|
||||
}
|
||||
return string(letters), blanks
|
||||
}
|
||||
|
||||
// parseWord turns a GCG word into the newly-placed tiles starting at (row,col) along dir.
|
||||
// "." marks an existing played-through tile (skipped); a lowercase letter is a blank.
|
||||
func parseWord(word string, row, col int, dir Direction) []Placement {
|
||||
var ts []Placement
|
||||
for _, ch := range word {
|
||||
if ch != '.' {
|
||||
switch {
|
||||
case ch >= 'A' && ch <= 'Z':
|
||||
ts = append(ts, Placement{Row: row, Col: col, Letter: byte(ch - 'A')})
|
||||
case ch >= 'a' && ch <= 'z':
|
||||
ts = append(ts, Placement{Row: row, Col: col, Letter: byte(ch - 'a'), Blank: true})
|
||||
}
|
||||
}
|
||||
if dir == Horizontal {
|
||||
col++
|
||||
} else {
|
||||
row++
|
||||
}
|
||||
}
|
||||
return ts
|
||||
}
|
||||
|
||||
func replayGCG(t *testing.T, s *Solver, path string) {
|
||||
f, err := os.Open(path)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
defer f.Close()
|
||||
|
||||
b := board.New(s.rules.Rows, s.rules.Cols)
|
||||
total := map[string]int{}
|
||||
var last Move // the last applied play, undone by a phony withdrawal ("--")
|
||||
plays := 0
|
||||
sc := bufio.NewScanner(f)
|
||||
for sc.Scan() {
|
||||
line := sc.Text()
|
||||
if !strings.HasPrefix(line, ">") {
|
||||
continue // pragma or note
|
||||
}
|
||||
colon := strings.Index(line, ":")
if colon < 0 {
continue // malformed event line: no "player:" prefix to slice
}
|
||||
player := line[1:colon]
|
||||
toks := strings.Fields(line[colon+1:])
|
||||
if len(toks) < 2 {
|
||||
continue
|
||||
}
|
||||
|
||||
// score = the +N/-N token; cumulative = the last token.
|
||||
want, _ := strconv.Atoi(strings.TrimPrefix(toks[len(toks)-2], "+"))
|
||||
cumul, _ := strconv.Atoi(toks[len(toks)-1])
|
||||
|
||||
switch row, col, dir, ok := parsePos(toks[1]); {
|
||||
case ok: // a regular play: RACK POS WORD +SCORE CUMUL
|
||||
if len(toks) < 5 {
t.Fatalf("%s: malformed play line %q", path, line)
}
ts := parseWord(toks[2], row, col, dir)
|
||||
m, err := s.ScorePlay(b, dir, ts)
|
||||
if err != nil {
|
||||
t.Fatalf("%s: ScorePlay %q at %s: %v", path, toks[2], toks[1], err)
|
||||
}
|
||||
if m.Score != want {
|
||||
t.Errorf("%s: %q at %s scored %d, want %d", path, toks[2], toks[1], m.Score, want)
|
||||
}
|
||||
// A dictionary-valid play must also be produced by the generator from the
|
||||
// player's rack; phonies (not in SOWPODS) are correctly never generated.
|
||||
if _, verr := s.ValidatePlay(b, dir, ts); verr == nil {
|
||||
key, found := moveKey(dir, ts), false
|
||||
for _, mv := range s.GenerateMoves(b, makeRack(parseRack(toks[0])), Both) {
|
||||
if mv.Key() == key {
|
||||
found = true
|
||||
if mv.Score != want {
|
||||
t.Errorf("%s: generated %q at %s scored %d, want %d", path, toks[2], toks[1], mv.Score, want)
|
||||
}
|
||||
break
|
||||
}
|
||||
}
|
||||
if !found {
|
||||
t.Errorf("%s: generator did not produce %q at %s from rack %s", path, toks[2], toks[1], toks[0])
|
||||
}
|
||||
}
|
||||
Apply(b, m)
|
||||
last = m
|
||||
total[player] += m.Score
|
||||
plays++
|
||||
case toks[1] == "--": // a challenged-off phony: undo the previous play
|
||||
for _, p := range last.Tiles {
|
||||
b.Set(p.Row, p.Col, 0)
|
||||
}
|
||||
last = Move{}
|
||||
total[player] += want
|
||||
default: // pass, exchange, challenge bonus, time penalty, end-game rack adjustment
|
||||
total[player] += want
|
||||
}
|
||||
if total[player] != cumul {
|
||||
t.Errorf("%s: %s running total %d, want %d (after %q)", path, player, total[player], cumul, line)
|
||||
}
|
||||
}
|
||||
if err := sc.Err(); err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
if plays == 0 {
|
||||
t.Fatalf("%s: no plays parsed", path)
|
||||
}
|
||||
t.Logf("%s: %d scored plays, final totals %v", path, plays, total)
|
||||
}
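The replay loop above reads each GCG event as `>PLAYER: ... +SCORE CUMUL` and takes the last two tokens as the move score and running total. As a rough sketch of that tokenization in isolation, the hypothetical `parseEvent` helper below (not part of this PR) returns `ok=false` for pragmas and malformed lines instead of indexing past the slice; the sample lines and their numbers are made up for illustration.

```go
package main

import (
	"strconv"
	"strings"
)

// gcgEvent is a hypothetical, simplified view of one GCG event line.
type gcgEvent struct {
	Player string
	Fields []string // tokens between the colon and the two trailing numbers
	Score  int      // the signed +N/-N token
	Total  int      // the running total (last token)
}

// parseEvent splits a line such as ">Alice: ACEILRT 8H TRIAL +12 12".
func parseEvent(line string) (ev gcgEvent, ok bool) {
	if !strings.HasPrefix(line, ">") {
		return ev, false // pragma or note
	}
	colon := strings.Index(line, ":")
	if colon < 0 {
		return ev, false // no player prefix
	}
	toks := strings.Fields(line[colon+1:])
	if len(toks) < 2 {
		return ev, false
	}
	score, err := strconv.Atoi(strings.TrimPrefix(toks[len(toks)-2], "+"))
	if err != nil {
		return ev, false
	}
	total, err := strconv.Atoi(toks[len(toks)-1])
	if err != nil {
		return ev, false
	}
	return gcgEvent{
		Player: line[1:colon],
		Fields: toks[:len(toks)-2],
		Score:  score,
		Total:  total,
	}, true
}
```

For a regular play `Fields` holds the rack, position and word; for a withdrawal it holds the rack and `--`, with a negative `Score`.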
|
||||
@@ -0,0 +1,56 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/rack"
|
||||
"scrabble-solver/rules"
|
||||
)
|
||||
|
||||
// generateBoth runs an across-generator on the board (for horizontal plays) and on its
|
||||
// transpose (for vertical plays), as selected by mode, then scores and de-duplicates the
|
||||
// results. runAcross reports placements in the coordinates of the board it is given; for
|
||||
// the transpose pass they are mapped back to the real board.
|
||||
func generateBoth(b *board.Board, rs *rules.Ruleset, rk rack.Rack, mode Mode,
|
||||
runAcross func(bd *board.Board, rk rack.Rack, emit func([]Placement))) []Move {
|
||||
|
||||
rk = rk.Clone() // generation mutates the rack in place and restores it
|
||||
var moves []Move
|
||||
seen := make(map[string]struct{})
|
||||
emit := func(dir Direction, placements []Placement) {
|
||||
key := moveKey(dir, placements)
|
||||
if _, dup := seen[key]; dup {
|
||||
return
|
||||
}
|
||||
m, err := Evaluate(b, rs, dir, placements)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
seen[key] = struct{}{}
|
||||
moves = append(moves, m)
|
||||
}
|
||||
|
||||
if mode.Includes(Horizontal) {
|
||||
runAcross(b, rk, func(p []Placement) { emit(Horizontal, p) })
|
||||
}
|
||||
if mode.Includes(Vertical) {
|
||||
tb := b.Transpose()
|
||||
runAcross(tb, rk, func(p []Placement) {
|
||||
rp := make([]Placement, len(p))
|
||||
for i, pl := range p {
|
||||
rp[i] = Placement{Row: pl.Col, Col: pl.Row, Letter: pl.Letter, Blank: pl.Blank}
|
||||
}
|
||||
emit(Vertical, rp)
|
||||
})
|
||||
}
|
||||
return moves
|
||||
}
|
||||
|
||||
// centerFor returns the centre square in bd's coordinates. bd is either the real board
|
||||
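// or its transpose; the ruleset stores the centre on the real board. The transpose is
// recognised by its swapped dimensions, so on a square board the centre is returned
// unswapped, which is correct only when the centre lies on the main diagonal.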
|
||||
func centerFor(bd *board.Board, rs *rules.Ruleset) (row, col int) {
|
||||
r, c := rs.Center/rs.Cols, rs.Center%rs.Cols
|
||||
if bd.Rows() == rs.Rows && bd.Cols() == rs.Cols {
|
||||
return r, c
|
||||
}
|
||||
return c, r // transposed
|
||||
}
|
||||
@@ -0,0 +1,221 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
dawg "github.com/iliadenisov/dafsa"
|
||||
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/internal/encoding"
|
||||
"scrabble-solver/rack"
|
||||
"scrabble-solver/rules"
|
||||
)
|
||||
|
||||
// DAWGGenerator generates moves with the Appel-Jacobson two-phase algorithm
|
||||
// (LeftPart then ExtendRight) over a plain left-to-right DAWG.
|
||||
type DAWGGenerator struct {
|
||||
rules *rules.Ruleset
|
||||
finder dawg.Finder
|
||||
}
|
||||
|
||||
// NewDAWGGenerator builds a DAWG generator for the ruleset over the dictionary finder.
|
||||
func NewDAWGGenerator(rs *rules.Ruleset, finder dawg.Finder) *DAWGGenerator {
|
||||
return &DAWGGenerator{rules: rs, finder: finder}
|
||||
}
|
||||
|
||||
// Name identifies the generator.
|
||||
func (g *DAWGGenerator) Name() string { return "dawg" }
|
||||
|
||||
// GenerateMoves returns every legal play for rk on b in the orientations selected by mode.
|
||||
func (g *DAWGGenerator) GenerateMoves(b *board.Board, rk rack.Rack, mode Mode) []Move {
|
||||
return generateBoth(b, g.rules, rk, mode, g.runAcross)
|
||||
}
|
||||
|
||||
// tileInfo is a tentatively placed left-part tile (its column is fixed only once the
|
||||
// left part's length is known, at record time).
|
||||
type tileInfo struct {
|
||||
letter byte
|
||||
blank bool
|
||||
}
|
||||
|
||||
// acrossGen carries the state of one across-generation pass over a board.
|
||||
type acrossGen struct {
|
||||
bd *board.Board
|
||||
cur *dawg.Cursor
|
||||
rs *rules.Ruleset
|
||||
rk rack.Rack
|
||||
size int
|
||||
cross func(r, c int) letterSet
|
||||
emit func(placements []Placement) // placements in bd's coordinates
|
||||
|
||||
row int
|
||||
left []tileInfo // left-part tiles, in word (left-to-right) order
|
||||
right []Placement // right-part tiles, with their columns
|
||||
}
|
||||
|
||||
// runAcross generates all across plays on bd (cross-sets are computed as vertical words
|
||||
// on bd) and reports each via emit in bd's coordinates.
|
||||
func (g *DAWGGenerator) runAcross(bd *board.Board, rk rack.Rack, emit func([]Placement)) {
|
||||
cur, err := dawg.NewCursor(g.finder)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
size := g.rules.Size()
|
||||
|
||||
cross := make([]letterSet, bd.Rows()*bd.Cols())
|
||||
known := make([]bool, bd.Rows()*bd.Cols())
|
||||
crossFn := func(r, c int) letterSet {
|
||||
i := r*bd.Cols() + c
|
||||
if !known[i] {
|
||||
above, below := columnContext(bd, r, c)
|
||||
cross[i] = dawgCrossSet(cur, above, below, size)
|
||||
known[i] = true
|
||||
}
|
||||
return cross[i]
|
||||
}
|
||||
|
||||
ag := &acrossGen{bd: bd, cur: cur, rs: g.rules, rk: rk, size: size, cross: crossFn, emit: emit}
|
||||
|
||||
firstMove := bd.IsEmpty()
|
||||
centerRow, centerCol := centerFor(bd, g.rules)
|
||||
for row := range bd.Rows() {
|
||||
ag.generateRow(row, firstMove, centerRow, centerCol)
|
||||
}
|
||||
}
|
||||
|
||||
func (g *acrossGen) generateRow(row int, firstMove bool, centerRow, centerCol int) {
|
||||
g.row = row
|
||||
limit := 0
|
||||
for col := range g.bd.Cols() {
|
||||
if !g.bd.Empty(row, col) {
|
||||
limit = 0
|
||||
continue
|
||||
}
|
||||
anchor := false
|
||||
if firstMove {
|
||||
anchor = row == centerRow && col == centerCol
|
||||
} else {
|
||||
anchor = g.hasFilledNeighbor(row, col)
|
||||
}
|
||||
if !anchor {
|
||||
limit++
|
||||
continue
|
||||
}
|
||||
g.left = g.left[:0]
|
||||
g.right = g.right[:0]
|
||||
if col > 0 && g.bd.Filled(row, col-1) {
|
||||
if node, ok := g.walkPrefix(row, col); ok {
|
||||
g.extendRight(node, col, col)
|
||||
}
|
||||
} else {
|
||||
g.leftPart(g.cur.Root(), col, limit)
|
||||
}
|
||||
limit = 0
|
||||
}
|
||||
}
|
||||
|
||||
func (g *acrossGen) hasFilledNeighbor(r, c int) bool {
|
||||
return g.bd.Filled(r-1, c) || g.bd.Filled(r+1, c) || g.bd.Filled(r, c-1) || g.bd.Filled(r, c+1)
|
||||
}
|
||||
|
||||
// walkPrefix walks the DAWG through the contiguous filled run ending at col-1, returning
|
||||
// the node reached and whether that prefix exists in the dictionary.
|
||||
func (g *acrossGen) walkPrefix(row, col int) (dawg.Node, bool) {
|
||||
start := col - 1
|
||||
for start-1 >= 0 && g.bd.Filled(row, start-1) {
|
||||
start--
|
||||
}
|
||||
node := g.cur.Root()
|
||||
for c := start; c < col; c++ {
|
||||
var ok bool
|
||||
node, _, ok = g.cur.Next(node, encoding.Letter(g.bd.At(row, c)))
|
||||
if !ok {
|
||||
return node, false
|
||||
}
|
||||
}
|
||||
return node, true
|
||||
}
|
||||
|
||||
// leftPart places left-part tiles from the rack (up to limit, on the empty squares left
|
||||
// of the anchor), calling extendRight after each prefix.
|
||||
func (g *acrossGen) leftPart(node dawg.Node, anchorCol, limit int) {
|
||||
g.extendRight(node, anchorCol, anchorCol)
|
||||
if limit == 0 {
|
||||
return
|
||||
}
|
||||
g.cur.Arcs(node, func(a dawg.Arc) bool {
|
||||
l := a.Label
|
||||
if g.rk.Has(l) {
|
||||
g.rk.Remove(l)
|
||||
g.left = append(g.left, tileInfo{letter: l})
|
||||
g.leftPart(a.Dest, anchorCol, limit-1)
|
||||
g.left = g.left[:len(g.left)-1]
|
||||
g.rk.Add(l)
|
||||
}
|
||||
if g.rk.Blanks() > 0 {
|
||||
g.rk.RemoveBlank()
|
||||
g.left = append(g.left, tileInfo{letter: l, blank: true})
|
||||
g.leftPart(a.Dest, anchorCol, limit-1)
|
||||
g.left = g.left[:len(g.left)-1]
|
||||
g.rk.AddBlank()
|
||||
}
|
||||
return true
|
||||
})
|
||||
}
|
||||
|
||||
// extendRight extends the word rightward from col, placing rack tiles on empty squares
|
||||
// (constrained by cross-sets) and following tiles already on the board. A word is
|
||||
// recorded only past the anchor, so the play covers the anchor square.
|
||||
func (g *acrossGen) extendRight(node dawg.Node, col, anchorCol int) {
|
||||
if col >= g.bd.Cols() {
|
||||
if col > anchorCol && g.cur.Final(node) {
|
||||
g.record(anchorCol)
|
||||
}
|
||||
return
|
||||
}
|
||||
if !g.bd.Empty(g.row, col) {
|
||||
if dest, _, ok := g.cur.Next(node, encoding.Letter(g.bd.At(g.row, col))); ok {
|
||||
g.extendRight(dest, col+1, anchorCol)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
if col > anchorCol && g.cur.Final(node) {
|
||||
g.record(anchorCol)
|
||||
}
|
||||
cross := g.cross(g.row, col)
|
||||
g.cur.Arcs(node, func(a dawg.Arc) bool {
|
||||
l := a.Label
|
||||
if !cross.has(l) {
|
||||
return true
|
||||
}
|
||||
if g.rk.Has(l) {
|
||||
g.rk.Remove(l)
|
||||
g.right = append(g.right, Placement{Row: g.row, Col: col, Letter: l})
|
||||
g.extendRight(a.Dest, col+1, anchorCol)
|
||||
g.right = g.right[:len(g.right)-1]
|
||||
g.rk.Add(l)
|
||||
}
|
||||
if g.rk.Blanks() > 0 {
|
||||
g.rk.RemoveBlank()
|
||||
g.right = append(g.right, Placement{Row: g.row, Col: col, Letter: l, Blank: true})
|
||||
g.extendRight(a.Dest, col+1, anchorCol)
|
||||
g.right = g.right[:len(g.right)-1]
|
||||
g.rk.AddBlank()
|
||||
}
|
||||
return true
|
||||
})
|
||||
}
|
||||
|
||||
// record assembles the play's placements (left part at fixed columns, then the right
|
||||
// part) and reports it. It skips plays that lay no new tile.
|
||||
func (g *acrossGen) record(anchorCol int) {
|
||||
if len(g.left)+len(g.right) == 0 {
|
||||
return
|
||||
}
|
||||
placements := make([]Placement, 0, len(g.left)+len(g.right))
|
||||
leftStart := anchorCol - len(g.left)
|
||||
for i, t := range g.left {
|
||||
placements = append(placements, Placement{Row: g.row, Col: leftStart + i, Letter: t.letter, Blank: t.blank})
|
||||
}
|
||||
placements = append(placements, g.right...)
|
||||
g.emit(placements)
|
||||
}
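The `limit` bookkeeping in `generateRow` can be easier to follow on a single row in isolation. The sketch below assumes two hypothetical boolean slices in place of board queries: `filled[c]` for squares that hold a tile and `touches[c]` for empty squares with a filled neighbour in any direction. It only reports each anchor and its left-part budget; it does not generate words.

```go
package main

// anchor describes one anchor square in a row: its column and how many empty,
// non-anchor squares lie immediately to its left (the left-part budget).
type anchor struct{ Col, Limit int }

// rowAnchors scans a row left to right, mirroring the three branches of generateRow.
func rowAnchors(filled, touches []bool) []anchor {
	var out []anchor
	limit := 0
	for c := range filled {
		switch {
		case filled[c]:
			limit = 0 // a tile ends the run of free squares
		case !touches[c]:
			limit++ // free square usable by a later anchor's left part
		default:
			out = append(out, anchor{Col: c, Limit: limit})
			limit = 0 // squares left of this anchor belong to it
		}
	}
	return out
}
```

On the row `. . . c a t . .` this yields an anchor at column 2 with budget 2 and one at column 6 with budget 0. In the generator the second anchor has a tile directly to its left, so it takes the `walkPrefix` path rather than `leftPart`.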
|
||||
@@ -0,0 +1,121 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"github.com/iliadenisov/alphabet"
|
||||
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/internal/dictdawg"
|
||||
"scrabble-solver/internal/encoding"
|
||||
"scrabble-solver/internal/wordlist"
|
||||
"scrabble-solver/rack"
|
||||
"scrabble-solver/rules"
|
||||
)
|
||||
|
||||
func makeRack(letters string, blanks int) rack.Rack {
|
||||
r := rack.New(26)
|
||||
for i := range len(letters) {
|
||||
r.Add(letters[i] - 'a')
|
||||
}
|
||||
for range blanks {
|
||||
r.AddBlank()
|
||||
}
|
||||
return r
|
||||
}
|
||||
|
||||
func placeWord(b *board.Board, r, c int, dir Direction, word string) {
|
||||
for i := range len(word) {
|
||||
rr, cc := r, c+i
|
||||
if dir == Vertical {
|
||||
rr, cc = r+i, c
|
||||
}
|
||||
b.Set(rr, cc, encoding.Cell(word[i]-'a', false))
|
||||
}
|
||||
}
|
||||
|
||||
func genMoves(moves []Move) map[string]Move {
|
||||
out := make(map[string]Move, len(moves))
|
||||
for _, m := range moves {
|
||||
out[moveKey(m.Dir, m.Tiles)] = m
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// testWords is a small lexicon with enough overlaps to form across and cross plays.
|
||||
var testWords = []string{
|
||||
"aa", "ace", "act", "arc", "are", "art", "as", "at", "ate",
|
||||
"cab", "cap", "car", "care", "cars", "cart", "cat", "cats", "cot",
|
||||
"oat", "oats", "ta", "tar", "tare", "tat", "tea", "teat",
|
||||
}
|
||||
|
||||
func compareToBrute(t *testing.T, name string, gen Generator, b *board.Board, d dict, rk rack.Rack, mode Mode) {
|
||||
t.Helper()
|
||||
want := bruteForce(b, plainRulesShared, d, rk, mode)
|
||||
got := genMoves(gen.GenerateMoves(b, rk, mode))
|
||||
|
||||
for k, wm := range want {
|
||||
gm, ok := got[k]
|
||||
if !ok {
|
||||
t.Errorf("%s [%s]: %s missing %s (score %d)", name, gen.Name(), gen.Name(), k, wm.Score)
|
||||
continue
|
||||
}
|
||||
if gm.Score != wm.Score {
|
||||
t.Errorf("%s [%s]: %s score %d, want %d", name, gen.Name(), k, gm.Score, wm.Score)
|
||||
}
|
||||
}
|
||||
for k := range got {
|
||||
if _, ok := want[k]; !ok {
|
||||
t.Errorf("%s [%s]: extra move %s", name, gen.Name(), k)
|
||||
}
|
||||
}
|
||||
if len(got) != len(want) {
|
||||
t.Errorf("%s [%s]: %d moves, oracle has %d", name, gen.Name(), len(got), len(want))
|
||||
}
|
||||
}
|
||||
|
||||
func mustPlainRules() *rules.Ruleset {
|
||||
eng := rules.English()
|
||||
rs, err := rules.FromTemplate("plain7", eng.Alphabet, eng.Values, eng.Counts, 2, 7, 50, plain7)
|
||||
if err != nil {
|
||||
panic(err)
|
||||
}
|
||||
return rs
|
||||
}
|
||||
|
||||
var plainRulesShared = mustPlainRules()
|
||||
|
||||
type scenario struct {
|
||||
name string
|
||||
setup func(*board.Board)
|
||||
rack rack.Rack
|
||||
mode Mode
|
||||
}
|
||||
|
||||
func genScenarios() []scenario {
|
||||
return []scenario{
|
||||
{"first move", func(*board.Board) {}, makeRack("cat", 0), Both},
|
||||
{"first move blank", func(*board.Board) {}, makeRack("ca", 1), Both},
|
||||
{"extend cat", func(b *board.Board) { placeWord(b, 3, 1, Horizontal, "cat") }, makeRack("srs", 0), Both},
|
||||
{"cross cat", func(b *board.Board) { placeWord(b, 1, 3, Horizontal, "cat") }, makeRack("aort", 0), Both},
|
||||
{"only horizontal", func(b *board.Board) { placeWord(b, 3, 1, Horizontal, "cat") }, makeRack("aser", 0), OnlyHorizontal},
|
||||
{"only vertical", func(b *board.Board) { placeWord(b, 1, 3, Vertical, "cat") }, makeRack("aser", 0), OnlyVertical},
|
||||
}
|
||||
}
|
||||
|
||||
func TestDAWGGeneratorVsBruteForce(t *testing.T) {
|
||||
rs := plainRulesShared
|
||||
words := wordlist.Encode(testWords, alphabet.Latin(), 2, 15)
|
||||
f, err := dictdawg.Build(alphabet.Latin(), words)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
gen := NewDAWGGenerator(rs, f)
|
||||
d := makeDict(words)
|
||||
|
||||
for _, c := range genScenarios() {
|
||||
b := board.New(rs.Rows, rs.Cols)
|
||||
c.setup(b)
|
||||
compareToBrute(t, c.name, gen, b, d, c.rack, c.mode)
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,18 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/rack"
|
||||
)
|
||||
|
||||
// Generator produces every legal play for a position. The DAWG generator
|
||||
// (Appel-Jacobson) is the implementation; the interface keeps the self-play engine and
|
||||
// the solver decoupled from the concrete type.
|
||||
type Generator interface {
|
||||
// GenerateMoves returns every legal play for rack r on board b in the mode's
|
||||
// orientations. The result is unsorted; callers (or the Solver) rank it.
|
||||
GenerateMoves(b *board.Board, r rack.Rack, mode Mode) []Move
|
||||
|
||||
// Name identifies the generator (e.g. "dawg").
|
||||
Name() string
|
||||
}
|
||||
@@ -0,0 +1,36 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// moveKey is a canonical string identifying a play (direction plus its placed tiles),
|
||||
// used to de-duplicate and compare generated moves.
|
||||
func moveKey(dir Direction, p []Placement) string {
|
||||
ps := append([]Placement(nil), p...)
|
||||
sort.Slice(ps, func(i, j int) bool {
|
||||
if ps[i].Row != ps[j].Row {
|
||||
return ps[i].Row < ps[j].Row
|
||||
}
|
||||
return ps[i].Col < ps[j].Col
|
||||
})
|
||||
var sb strings.Builder
|
||||
sb.WriteByte('0' + byte(dir))
|
||||
for _, pl := range ps {
|
||||
sb.WriteByte(';')
|
||||
sb.WriteString(strconv.Itoa(pl.Row))
|
||||
sb.WriteByte(',')
|
||||
sb.WriteString(strconv.Itoa(pl.Col))
|
||||
sb.WriteByte(',')
|
||||
sb.WriteString(strconv.Itoa(int(pl.Letter)))
|
||||
if pl.Blank {
|
||||
sb.WriteByte('*')
|
||||
}
|
||||
}
|
||||
return sb.String()
|
||||
}
|
||||
|
||||
// Key returns the canonical identifier of the move (direction plus its placed tiles).
|
||||
func (m Move) Key() string { return moveKey(m.Dir, m.Tiles) }
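To make the key format concrete, here is a standalone sketch that follows the same layout as `moveKey`: the direction digit, then one `;row,col,letter` group per tile in (row, col) order, with `*` marking a blank. The `tile` type and `key` function are hypothetical stand-ins so the snippet compiles on its own.

```go
package main

import (
	"sort"
	"strconv"
	"strings"
)

// tile is a hypothetical stand-in for Placement.
type tile struct {
	Row, Col int
	Letter   byte
	Blank    bool
}

// key builds the canonical string for a play in direction dir (0 = across, 1 = down).
func key(dir int, tiles []tile) string {
	ts := append([]tile(nil), tiles...) // sort a copy, leave the caller's slice alone
	sort.Slice(ts, func(i, j int) bool {
		if ts[i].Row != ts[j].Row {
			return ts[i].Row < ts[j].Row
		}
		return ts[i].Col < ts[j].Col
	})
	var sb strings.Builder
	sb.WriteString(strconv.Itoa(dir))
	for _, t := range ts {
		sb.WriteString(";" + strconv.Itoa(t.Row) + "," + strconv.Itoa(t.Col) + "," + strconv.Itoa(int(t.Letter)))
		if t.Blank {
			sb.WriteByte('*')
		}
	}
	return sb.String()
}
```

An across play of `c` at (3, 4) and a blank standing for `a` at (3, 5) produces `0;3,4,2;3,5,0*` regardless of the order in which the tiles are listed, which is what lets the generator and the oracle be compared by key.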
|
||||
@@ -0,0 +1,74 @@
|
||||
// Package scrabble is the public library: it builds a move generator over a dictionary
|
||||
// and a ruleset, generates every legal play for a position ranked by score, and scores
|
||||
// or validates arbitrary plays. The generator is the DAWG algorithm (Appel-Jacobson).
|
||||
package scrabble
|
||||
|
||||
// Direction is the orientation of a play's main word.
|
||||
type Direction uint8
|
||||
|
||||
const (
|
||||
// Horizontal is an across play (left to right along a row).
|
||||
Horizontal Direction = iota
|
||||
// Vertical is a down play (top to bottom along a column).
|
||||
Vertical
|
||||
)
|
||||
|
||||
// String renders the direction for diagnostics.
|
||||
func (d Direction) String() string {
|
||||
if d == Vertical {
|
||||
return "vertical"
|
||||
}
|
||||
return "horizontal"
|
||||
}
|
||||
|
||||
// Mode selects which orientations GenerateMoves produces. Russian "Эрудит" requires a
|
||||
// single orientation per turn, which OnlyHorizontal / OnlyVertical express.
|
||||
type Mode uint8
|
||||
|
||||
const (
|
||||
// Both generates across plays (on the board) and down plays (on its transpose).
|
||||
Both Mode = iota
|
||||
// OnlyHorizontal generates across plays only.
|
||||
OnlyHorizontal
|
||||
// OnlyVertical generates down plays only.
|
||||
OnlyVertical
|
||||
)
|
||||
|
||||
// Includes reports whether the mode produces plays in direction d.
|
||||
func (m Mode) Includes(d Direction) bool {
|
||||
switch m {
|
||||
case Both:
|
||||
return true
|
||||
case OnlyHorizontal:
|
||||
return d == Horizontal
|
||||
case OnlyVertical:
|
||||
return d == Vertical
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// Placement is a single newly-placed tile.
|
||||
type Placement struct {
|
||||
Row, Col int
|
||||
Letter byte // alphabet letter index
|
||||
Blank bool // placed from a blank tile, so it scores 0
|
||||
}
|
||||
|
||||
// Word is a word formed by a play, with its location and score.
|
||||
type Word struct {
|
||||
Row, Col int // square of the word's first letter
|
||||
Dir Direction // orientation of the word
|
||||
Letters []byte // alphabet indices of the whole word (existing + new tiles)
|
||||
Blanks []bool // per letter: true if that tile is a blank (scores 0)
|
||||
Score int // the word's score, with premiums from newly-placed tiles
|
||||
}
|
||||
|
||||
// Move is a complete legal play with a full scoring breakdown.
|
||||
type Move struct {
|
||||
Dir Direction // orientation of the main word
|
||||
Tiles []Placement // the newly-placed tiles, in main-word order
|
||||
Main Word // the main word formed along Dir
|
||||
Cross []Word // perpendicular words formed by the new tiles
|
||||
Bonus int // all-tiles (bingo) bonus included in Score, or 0
|
||||
Score int // total: Main.Score + Σ Cross.Score + Bonus
|
||||
}
|
||||
@@ -0,0 +1,147 @@
|
||||
package scrabble
|
||||
|
||||
import (
|
||||
"scrabble-solver/board"
|
||||
"scrabble-solver/rack"
|
||||
"scrabble-solver/rules"
|
||||
)
|
||||
|
||||
// dict is a membership set of words (alphabet-index strings) for the oracle.
|
||||
type dict map[string]bool
|
||||
|
||||
func makeDict(words [][]byte) dict {
|
||||
d := make(dict, len(words))
|
||||
for _, w := range words {
|
||||
d[string(w)] = true
|
||||
}
|
||||
return d
|
||||
}
|
||||
|
||||
func (d dict) has(letters []byte) bool { return d[string(letters)] }
|
||||
|
||||
func lineCoord(dir Direction, line, axis int) (r, c int) {
|
||||
if dir == Horizontal {
|
||||
return line, axis
|
||||
}
|
||||
return axis, line
|
||||
}
|
||||
|
||||
func cellFilled(b *board.Board, dir Direction, line, axis int) bool {
|
||||
r, c := lineCoord(dir, line, axis)
|
||||
return b.Filled(r, c)
|
||||
}
|
||||
|
||||
func coversCenter(dir Direction, line, start, end, cr, cc int) bool {
|
||||
if dir == Horizontal {
|
||||
return line == cr && start <= cc && cc <= end
|
||||
}
|
||||
return line == cc && start <= cr && cr <= end
|
||||
}
|
||||
|
||||
// bruteForce returns every legal play for the position, keyed by moveKey, found by
|
||||
// exhaustively trying every maximal window and every rack assignment, then validating
|
||||
// against the dictionary, connectivity and the first-move centre rule. It is the slow,
|
||||
// obviously-correct oracle for checking the generator on small inputs.
|
||||
func bruteForce(b *board.Board, rs *rules.Ruleset, d dict, rk rack.Rack, mode Mode) map[string]Move {
|
||||
out := map[string]Move{}
|
||||
var dirs []Direction
|
||||
if mode.Includes(Horizontal) {
|
||||
dirs = append(dirs, Horizontal)
|
||||
}
|
||||
if mode.Includes(Vertical) {
|
||||
dirs = append(dirs, Vertical)
|
||||
}
|
||||
firstMove := b.IsEmpty()
|
||||
cr, cc := rs.Center/rs.Cols, rs.Center%rs.Cols
|
||||
|
||||
for _, dir := range dirs {
|
||||
lines, span := b.Rows(), b.Cols()
|
||||
if dir == Vertical {
|
||||
			lines, span = b.Cols(), b.Rows()
		}
		for line := range lines {
			for start := range span {
				for end := start + 1; end < span; end++ {
					if cellFilled(b, dir, line, start-1) || cellFilled(b, dir, line, end+1) {
						continue // not a maximal window
					}
					var empties []int
					for a := start; a <= end; a++ {
						if !cellFilled(b, dir, line, a) {
							empties = append(empties, a)
						}
					}
					if len(empties) == 0 {
						continue
					}
					assign(b, rs, d, rk.Clone(), dir, line, start, end, empties, 0, nil,
						firstMove, cr, cc, out)
				}
			}
		}
	}
	return out
}

// assign recursively fills the empty cells of the window [start, end] on the given
// line with tiles from rk (a blank may stand for any letter) and passes every complete
// assignment to validate.
func assign(b *board.Board, rs *rules.Ruleset, d dict, rk rack.Rack, dir Direction,
	line, start, end int, empties []int, idx int, placed []Placement,
	firstMove bool, cr, cc int, out map[string]Move) {

	if idx == len(empties) {
		validate(b, rs, d, dir, line, start, end, placed, firstMove, cr, cc, out)
		return
	}
	r, c := lineCoord(dir, line, empties[idx])
	next := placed[:len(placed):len(placed)] // avoid aliasing across siblings

	for l := byte(0); l < byte(rs.Size()); l++ {
		if rk.Has(l) {
			rk.Remove(l)
			assign(b, rs, d, rk, dir, line, start, end, empties, idx+1,
				append(next, Placement{Row: r, Col: c, Letter: l}), firstMove, cr, cc, out)
			rk.Add(l)
		}
	}
	if rk.Blanks() > 0 {
		rk.RemoveBlank()
		for l := byte(0); l < byte(rs.Size()); l++ {
			assign(b, rs, d, rk, dir, line, start, end, empties, idx+1,
				append(next, Placement{Row: r, Col: c, Letter: l, Blank: true}), firstMove, cr, cc, out)
		}
		rk.AddBlank()
	}
}

// validate scores a complete assignment with Evaluate and records it in out only if
// the main word and every cross word are in the dictionary and the play is anchored:
// it covers the centre on the first move, or otherwise touches an existing tile.
func validate(b *board.Board, rs *rules.Ruleset, d dict, dir Direction,
	line, start, end int, placed []Placement, firstMove bool, cr, cc int, out map[string]Move) {

	m, err := Evaluate(b, rs, dir, placed)
	if err != nil {
		return
	}
	if !d.has(m.Main.Letters) {
		return
	}
	for _, cw := range m.Cross {
		if !d.has(cw.Letters) {
			return
		}
	}
	if firstMove {
		if !coversCenter(dir, line, start, end, cr, cc) {
			return
		}
	} else {
		existing := false
		for a := start; a <= end; a++ {
			if cellFilled(b, dir, line, a) {
				existing = true
				break
			}
		}
		if !existing && len(m.Cross) == 0 {
			return // disconnected
		}
	}
	out[moveKey(dir, placed)] = m
}
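The `placed[:len(placed):len(placed)]` expression in `assign` caps the slice's capacity at its length, so each `append` made for a sibling branch must allocate a fresh backing array rather than overwrite one shared with the other branches. The standalone sketch below illustrates that effect only; `branch` and its parameters are hypothetical names, not code from this repository.

```go
package main

import "fmt"

// branch appends one value per sibling to prefix and returns all results.
// When capped is true, prefix is first re-sliced with a full slice expression
// so that its capacity equals its length and every append has to copy.
func branch(prefix []int, siblings []int, capped bool) [][]int {
	if capped {
		prefix = prefix[:len(prefix):len(prefix)]
	}
	var out [][]int
	for _, s := range siblings {
		out = append(out, append(prefix, s))
	}
	return out
}

func main() {
	base := make([]int, 1, 8) // spare capacity is what makes aliasing possible
	base[0] = 7
	fmt.Println(branch(base, []int{1, 2}, false)) // siblings share one backing array
	fmt.Println(branch(base, []int{1, 2}, true))  // each sibling gets its own copy
}
```

Without the cap, the second sibling overwrites the element the first sibling appended, which in a backtracking generator would silently corrupt moves that were already handed on.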
@@ -0,0 +1,206 @@
package scrabble

import (
	"errors"
	"fmt"
	"sort"

	"scrabble-solver/board"
	"scrabble-solver/internal/encoding"
	"scrabble-solver/rules"
)

// coord maps a line coordinate (fixed, axis) to a board (row, col) for direction dir.
// For Horizontal the fixed coordinate is the row and the axis runs along columns; for
// Vertical it is the reverse.
func coord(dir Direction, fixed, axis int) (row, col int) {
	if dir == Horizontal {
		return fixed, axis
	}
	return axis, fixed
}

// fixedAxis is the inverse of coord: it splits a (row, col) into the fixed and axis
// coordinates for direction dir.
func fixedAxis(dir Direction, row, col int) (fixed, axis int) {
	if dir == Horizontal {
		return row, col
	}
	return col, row
}

// perpendicular returns the direction at right angles to d.
func perpendicular(d Direction) Direction {
	if d == Horizontal {
		return Vertical
	}
	return Horizontal
}

// Evaluate computes the words formed and the score for placing tiles on b in direction
// dir under ruleset rs. It validates geometry — the tiles lie on one line, on empty
// squares, and form a single contiguous run together with existing tiles — but does not
// check the dictionary or board connectivity; ValidatePlay layers those on top. tiles
// need not be sorted.
func Evaluate(b *board.Board, rs *rules.Ruleset, dir Direction, tiles []Placement) (Move, error) {
	if len(tiles) == 0 {
		return Move{}, errors.New("scrabble: empty play")
	}

	ts := append([]Placement(nil), tiles...)
	sort.Slice(ts, func(i, j int) bool {
		_, ai := fixedAxis(dir, ts[i].Row, ts[i].Col)
		_, aj := fixedAxis(dir, ts[j].Row, ts[j].Col)
		return ai < aj
	})

	fixed, _ := fixedAxis(dir, ts[0].Row, ts[0].Col)
	prevAxis := 0
	for i, t := range ts {
		f, a := fixedAxis(dir, t.Row, t.Col)
		if f != fixed {
			return Move{}, errors.New("scrabble: tiles are not on one line")
		}
		if !b.InBounds(t.Row, t.Col) {
			return Move{}, fmt.Errorf("scrabble: tile (%d,%d) off board", t.Row, t.Col)
		}
		if !b.Empty(t.Row, t.Col) {
			return Move{}, fmt.Errorf("scrabble: square (%d,%d) is occupied", t.Row, t.Col)
		}
		if i > 0 && a == prevAxis {
			return Move{}, errors.New("scrabble: two tiles on the same square")
		}
		prevAxis = a
	}

	main, err := buildMainWord(b, rs, dir, fixed, ts)
	if err != nil {
		return Move{}, err
	}

	move := Move{Dir: dir, Tiles: ts, Main: main, Score: main.Score}
	for _, t := range ts {
		if cw, ok := crossWord(b, rs, dir, t); ok {
			move.Cross = append(move.Cross, cw)
			move.Score += cw.Score
		}
	}
	if len(ts) == rs.RackSize {
		move.Bonus = rs.Bingo
		move.Score += rs.Bingo
	}
	return move, nil
}

// buildMainWord assembles the word along dir through the (sorted) placements together
// with the existing tiles that extend and bridge them, and scores it. New tiles apply
// their squares' premiums; existing tiles score at face value.
func buildMainWord(b *board.Board, rs *rules.Ruleset, dir Direction, fixed int, ts []Placement) (Word, error) {
	_, minA := fixedAxis(dir, ts[0].Row, ts[0].Col)
	_, maxA := fixedAxis(dir, ts[len(ts)-1].Row, ts[len(ts)-1].Col)

	start := minA
	for {
		r, c := coord(dir, fixed, start-1)
		if !b.Filled(r, c) {
			break
		}
		start--
	}
	end := maxA
	for {
		r, c := coord(dir, fixed, end+1)
		if !b.Filled(r, c) {
			break
		}
		end++
	}

	letters := make([]byte, 0, end-start+1)
	blanks := make([]bool, 0, end-start+1)
	letterSum, wordMult := 0, 1
	ti := 0
	for a := start; a <= end; a++ {
		r, c := coord(dir, fixed, a)
		if ti < len(ts) {
			if _, ta := fixedAxis(dir, ts[ti].Row, ts[ti].Col); ta == a {
				t := ts[ti]
				ti++
				prem := rs.Premium(r, c)
				if !t.Blank {
					letterSum += rs.Values[t.Letter] * prem.LetterMult()
				}
				wordMult *= prem.WordMult()
				letters = append(letters, t.Letter)
				blanks = append(blanks, t.Blank)
				continue
			}
		}
		if b.Filled(r, c) {
			cell := b.At(r, c)
			l, bl := encoding.Letter(cell), encoding.IsBlank(cell)
			if !bl {
				letterSum += rs.Values[l]
			}
			letters = append(letters, l)
			blanks = append(blanks, bl)
			continue
		}
		return Word{}, fmt.Errorf("scrabble: gap in the play at line position %d", a)
	}

	wr, wc := coord(dir, fixed, start)
	return Word{Row: wr, Col: wc, Dir: dir, Letters: letters, Blanks: blanks, Score: letterSum * wordMult}, nil
}

// crossWord builds the perpendicular word formed by a single new tile, if any. It
// returns ok=false when the tile has no perpendicular neighbour.
func crossWord(b *board.Board, rs *rules.Ruleset, dir Direction, t Placement) (Word, bool) {
	cdir := perpendicular(dir)
	fixed, axis := fixedAxis(cdir, t.Row, t.Col)

	start := axis
	for {
		r, c := coord(cdir, fixed, start-1)
		if !b.Filled(r, c) {
			break
		}
		start--
	}
	end := axis
	for {
		r, c := coord(cdir, fixed, end+1)
		if !b.Filled(r, c) {
			break
		}
		end++
	}
	if start == end {
		return Word{}, false
	}

	letters := make([]byte, 0, end-start+1)
	blanks := make([]bool, 0, end-start+1)
	letterSum, wordMult := 0, 1
	for a := start; a <= end; a++ {
		r, c := coord(cdir, fixed, a)
		if a == axis {
			prem := rs.Premium(r, c)
			if !t.Blank {
				letterSum += rs.Values[t.Letter] * prem.LetterMult()
			}
			wordMult *= prem.WordMult()
			letters = append(letters, t.Letter)
			blanks = append(blanks, t.Blank)
		} else {
			cell := b.At(r, c)
			l, bl := encoding.Letter(cell), encoding.IsBlank(cell)
			if !bl {
				letterSum += rs.Values[l]
			}
			letters = append(letters, l)
			blanks = append(blanks, bl)
		}
	}
	wr, wc := coord(cdir, fixed, start)
	return Word{Row: wr, Col: wc, Dir: cdir, Letters: letters, Blanks: blanks, Score: letterSum * wordMult}, true
}
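The scoring rule shared by `buildMainWord` and `crossWord` can be isolated from the board types: newly placed tiles apply their square's letter and word premiums, tiles already on the board score at face value, blanks score zero, and the word multiplier applies to the whole letter sum. The sketch below is a simplified illustration under those assumptions; the `tile` type and `scoreWord` function are hypothetical and do not appear in the package.

```go
package main

import "fmt"

// tile is a hypothetical stand-in for one square of a word being scored.
type tile struct {
	value      int  // face value of the letter
	blank      bool // blanks always score zero
	fresh      bool // placed this turn (premiums apply) or already on the board
	letterMult int  // letter premium of the square (1 if none)
	wordMult   int  // word premium of the square (1 if none)
}

// scoreWord sums the letter values, applying letter premiums only to fresh
// tiles, then multiplies by the product of word premiums under fresh tiles.
func scoreWord(word []tile) int {
	sum, mult := 0, 1
	for _, t := range word {
		lm, wm := 1, 1
		if t.fresh {
			lm, wm = t.letterMult, t.wordMult
		}
		if !t.blank {
			sum += t.value * lm
		}
		mult *= wm
	}
	return sum * mult
}

func main() {
	// "cat" with c on a double-letter square: 3*2 + 1 + 1.
	dl := []tile{{3, false, true, 2, 1}, {1, false, true, 1, 1}, {1, false, true, 1, 1}}
	// "ca" with c on a double-word square: (3 + 1) * 2.
	dw := []tile{{3, false, true, 1, 2}, {1, false, true, 1, 1}}
	fmt.Println(scoreWord(dl), scoreWord(dw)) // 8 8
}
```

Both results match the expectations in `TestEvaluatePremiums`, which uses the same two placements on the standard English board.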
@@ -0,0 +1,138 @@
package scrabble

import (
	"testing"

	"scrabble-solver/board"
	"scrabble-solver/internal/encoding"
	"scrabble-solver/rules"
)

const plain7 = `.......
.......
.......
...+...
.......
.......
.......`

// plainRules is a 7x7 board with no premiums and English tile values, for isolating
// word-assembly and cross-word logic from premium multipliers.
func plainRules(t *testing.T) *rules.Ruleset {
	t.Helper()
	eng := rules.English()
	rs, err := rules.FromTemplate("plain7", eng.Alphabet, eng.Values, eng.Counts, 2, 7, 50, plain7)
	if err != nil {
		t.Fatal(err)
	}
	return rs
}

// indices: a=0 c=2 o=14 t=19 x=23
func TestEvaluateSimpleWord(t *testing.T) {
	rs := plainRules(t)
	b := board.New(7, 7)
	m, err := Evaluate(b, rs, Horizontal, []Placement{
		{Row: 3, Col: 1, Letter: 2}, {Row: 3, Col: 2, Letter: 0}, {Row: 3, Col: 3, Letter: 19},
	})
	if err != nil {
		t.Fatal(err)
	}
	if m.Main.Score != 5 || m.Score != 5 {
		t.Errorf("cat: main=%d total=%d, want 5/5", m.Main.Score, m.Score)
	}
	if len(m.Cross) != 0 || m.Bonus != 0 {
		t.Errorf("cat: cross=%d bonus=%d, want 0/0", len(m.Cross), m.Bonus)
	}
}

func TestEvaluateCrossWord(t *testing.T) {
	rs := plainRules(t)
	b := board.New(7, 7)
	b.Set(2, 3, encoding.Cell(14, false)) // o
	b.Set(3, 3, encoding.Cell(23, false)) // x

	// Play "at" horizontally on row 4; the 'a' on col 3 forms the cross word "oxa".
	m, err := Evaluate(b, rs, Horizontal, []Placement{
		{Row: 4, Col: 3, Letter: 0}, {Row: 4, Col: 4, Letter: 19},
	})
	if err != nil {
		t.Fatal(err)
	}
	if m.Main.Score != 2 {
		t.Errorf("main 'at' = %d, want 2", m.Main.Score)
	}
	if len(m.Cross) != 1 || m.Cross[0].Score != 10 {
		t.Errorf("cross = %+v, want one word scoring 10 (oxa)", m.Cross)
	}
	if m.Score != 12 {
		t.Errorf("total = %d, want 12", m.Score)
	}
}

func TestEvaluatePremiums(t *testing.T) {
	rs := rules.English()

	// (0,3) is a double-letter square: c(3)*2 + a(1) + t(1) = 8.
	b := board.New(15, 15)
	m, err := Evaluate(b, rs, Horizontal, []Placement{
		{Row: 0, Col: 3, Letter: 2}, {Row: 0, Col: 4, Letter: 0}, {Row: 0, Col: 5, Letter: 19},
	})
	if err != nil {
		t.Fatal(err)
	}
	if m.Score != 8 {
		t.Errorf("DL cat = %d, want 8", m.Score)
	}

	// (1,1) is a double-word square: (c(3) + a(1)) * 2 = 8.
	b2 := board.New(15, 15)
	m2, err := Evaluate(b2, rs, Horizontal, []Placement{
		{Row: 1, Col: 1, Letter: 2}, {Row: 1, Col: 2, Letter: 0},
	})
	if err != nil {
		t.Fatal(err)
	}
	if m2.Score != 8 {
		t.Errorf("DW ca = %d, want 8", m2.Score)
	}
}

func TestEvaluateBingo(t *testing.T) {
	rs := plainRules(t)
	b := board.New(7, 7)
	tiles := make([]Placement, 7)
	for c := range 7 {
		tiles[c] = Placement{Row: 0, Col: c, Letter: 0} // seven a's
	}
	m, err := Evaluate(b, rs, Horizontal, tiles)
	if err != nil {
		t.Fatal(err)
	}
	if m.Bonus != 50 || m.Score != 7+50 {
		t.Errorf("bingo: bonus=%d total=%d, want 50/57", m.Bonus, m.Score)
	}
}

func TestEvaluateErrors(t *testing.T) {
	rs := plainRules(t)
	b := board.New(7, 7)
	b.Set(2, 3, encoding.Cell(14, false))

	if _, err := Evaluate(b, rs, Horizontal, nil); err == nil {
		t.Error("empty play: want error")
	}
	if _, err := Evaluate(b, rs, Horizontal, []Placement{{Row: 2, Col: 3, Letter: 0}}); err == nil {
		t.Error("occupied square: want error")
	}
	if _, err := Evaluate(b, rs, Horizontal, []Placement{
		{Row: 3, Col: 1, Letter: 0}, {Row: 4, Col: 2, Letter: 0},
	}); err == nil {
		t.Error("non-collinear: want error")
	}
	if _, err := Evaluate(b, rs, Horizontal, []Placement{
		{Row: 5, Col: 1, Letter: 0}, {Row: 5, Col: 3, Letter: 0},
	}); err == nil {
		t.Error("gap: want error")
	}
}
@@ -0,0 +1,101 @@
package scrabble

import (
	"errors"
	"fmt"
	"sort"

	dawg "github.com/iliadenisov/dafsa"

	"scrabble-solver/board"
	"scrabble-solver/rack"
	"scrabble-solver/rules"
)

// Solver is the high-level entry point: it generates ranked plays and scores or
// validates arbitrary plays for a ruleset over a dictionary.
type Solver struct {
	rules  *rules.Ruleset
	finder dawg.Finder
	gen    *DAWGGenerator
}

// NewSolver returns a Solver for the ruleset over the dictionary finder.
func NewSolver(rs *rules.Ruleset, finder dawg.Finder) *Solver {
	return &Solver{rules: rs, finder: finder, gen: NewDAWGGenerator(rs, finder)}
}

// Rules returns the solver's ruleset.
func (s *Solver) Rules() *rules.Ruleset { return s.rules }

// GenerateMoves returns every legal play for rack r on board b in the requested
// orientations, ranked by descending score (ties broken deterministically by the move's
// canonical key).
func (s *Solver) GenerateMoves(b *board.Board, r rack.Rack, mode Mode) []Move {
	moves := s.gen.GenerateMoves(b, r, mode)
	sort.Slice(moves, func(i, j int) bool {
		if moves[i].Score != moves[j].Score {
			return moves[i].Score > moves[j].Score
		}
		return moves[i].Key() < moves[j].Key()
	})
	return moves
}

// ScorePlay computes the words and score for placing tiles on b in direction dir. It
// checks geometry only (see Evaluate); use ValidatePlay to also check the dictionary and
// connectivity.
func (s *Solver) ScorePlay(b *board.Board, dir Direction, tiles []Placement) (Move, error) {
	return Evaluate(b, s.rules, dir, tiles)
}

// ValidatePlay scores a play and verifies that every word it forms is in the dictionary
// and that it connects to the board (or covers the centre on the first move). It returns
// the scored move; the error is nil exactly when the play is legal.
func (s *Solver) ValidatePlay(b *board.Board, dir Direction, tiles []Placement) (Move, error) {
	m, err := Evaluate(b, s.rules, dir, tiles)
	if err != nil {
		return Move{}, err
	}
	if len(m.Main.Letters) < 2 {
		return m, errors.New("scrabble: play forms no word of length 2 or more")
	}
	if s.finder.IndexOfB(m.Main.Letters) < 0 {
		return m, fmt.Errorf("scrabble: main word is not in the dictionary")
	}
	for _, cw := range m.Cross {
		if s.finder.IndexOfB(cw.Letters) < 0 {
			return m, fmt.Errorf("scrabble: a cross word is not in the dictionary")
		}
	}
	if !s.connected(b, m) {
		return m, errors.New("scrabble: play does not connect to the board")
	}
	return m, nil
}

// connected reports whether the play touches the existing position (or covers the centre
// on the first move).
func (s *Solver) connected(b *board.Board, m Move) bool {
	if b.IsEmpty() {
		cr, cc := s.rules.Center/s.rules.Cols, s.rules.Center%s.rules.Cols
		return wordCovers(m.Main, cr, cc)
	}
	// The main word incorporated an existing tile, or a new tile formed a cross word.
	return len(m.Main.Letters) > len(m.Tiles) || len(m.Cross) > 0
}

// wordCovers reports whether word w occupies the square (r, c).
func wordCovers(w Word, r, c int) bool {
	for i := range w.Letters {
		rr, cc := w.Row, w.Col
		if w.Dir == Horizontal {
			cc += i
		} else {
			rr += i
		}
		if rr == r && cc == c {
			return true
		}
	}
	return false
}
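The ordering applied in `Solver.GenerateMoves` (descending score, ties broken by the canonical key) can be shown on a reduced move type. This is a sketch under assumptions: the `scored` type and the key strings are hypothetical and only mimic the shape of `Move.Score` and `Move.Key()`.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical, reduced move: just a canonical key and a score.
type scored struct {
	Key   string
	Score int
}

// rank orders moves by descending score, then ascending key, so the result
// does not depend on the order in which the generator emitted them.
func rank(moves []scored) {
	sort.Slice(moves, func(i, j int) bool {
		if moves[i].Score != moves[j].Score {
			return moves[i].Score > moves[j].Score
		}
		return moves[i].Key < moves[j].Key
	})
}

func main() {
	moves := []scored{{"H8:cat", 5}, {"V8:act", 5}, {"H8:at", 2}, {"H7:cats", 6}}
	rank(moves)
	fmt.Println(moves)
}
```

Because `sort.Slice` is not stable, the tie-break is what makes the ranking reproducible; it relies on keys being unique per move.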
@@ -0,0 +1,88 @@
package scrabble

import (
	"testing"

	"github.com/iliadenisov/alphabet"

	"scrabble-solver/board"
	"scrabble-solver/internal/dictdawg"
	"scrabble-solver/internal/wordlist"
)

func newTestSolver(t *testing.T) *Solver {
	t.Helper()
	words := wordlist.Encode(testWords, alphabet.Latin(), 2, 15)
	f, err := dictdawg.Build(alphabet.Latin(), words)
	if err != nil {
		t.Fatal(err)
	}
	return NewSolver(plainRulesShared, f)
}

func TestSolverGenerateMovesRanked(t *testing.T) {
	s := newTestSolver(t)
	b := board.New(s.rules.Rows, s.rules.Cols)
	moves := s.GenerateMoves(b, makeRack("cat", 0), Both)
	if len(moves) == 0 {
		t.Fatal("no first moves generated")
	}
	for i := 1; i < len(moves); i++ {
		if moves[i-1].Score < moves[i].Score {
			t.Fatalf("moves not ranked: %d before %d", moves[i-1].Score, moves[i].Score)
		}
	}
}

func TestSolverValidatePlay(t *testing.T) {
	s := newTestSolver(t)
	// indices: c=2 a=0 t=19 z=25
	cat := []Placement{{Row: 3, Col: 2, Letter: 2}, {Row: 3, Col: 3, Letter: 0}, {Row: 3, Col: 4, Letter: 19}}

	// First move through the centre (3,3) is legal.
	if _, err := s.ValidatePlay(board.New(s.rules.Rows, s.rules.Cols), Horizontal, cat); err != nil {
		t.Errorf("valid first move rejected: %v", err)
	}

	// First move that misses the centre is rejected.
	off := []Placement{{Row: 0, Col: 0, Letter: 2}, {Row: 0, Col: 1, Letter: 0}, {Row: 0, Col: 2, Letter: 19}}
	if _, err := s.ValidatePlay(board.New(s.rules.Rows, s.rules.Cols), Horizontal, off); err == nil {
		t.Error("first move off the centre was accepted")
	}

	// A non-word ("caz") is rejected.
	caz := []Placement{{Row: 3, Col: 2, Letter: 2}, {Row: 3, Col: 3, Letter: 0}, {Row: 3, Col: 4, Letter: 25}}
	if _, err := s.ValidatePlay(board.New(s.rules.Rows, s.rules.Cols), Horizontal, caz); err == nil {
		t.Error("non-word 'caz' was accepted")
	}

	// A disconnected play on a non-empty board is rejected.
	b := board.New(s.rules.Rows, s.rules.Cols)
	placeWord(b, 3, 2, Horizontal, "cat")
	disc := []Placement{{Row: 0, Col: 0, Letter: 0}, {Row: 0, Col: 1, Letter: 18}} // "as" far away
	if _, err := s.ValidatePlay(b, Horizontal, disc); err == nil {
		t.Error("disconnected play was accepted")
	}

	// Extending "cat" to "cats" connects and is a word.
	cats := []Placement{{Row: 3, Col: 5, Letter: 18}} // s after cat
	if m, err := s.ValidatePlay(b, Horizontal, cats); err != nil {
		t.Errorf("valid extension rejected: %v", err)
	} else if string(m.Main.Letters) != string([]byte{2, 0, 19, 18}) {
		t.Errorf("main word = %v, want cats", m.Main.Letters)
	}
}

func TestSolverScorePlay(t *testing.T) {
	s := newTestSolver(t)
	b := board.New(s.rules.Rows, s.rules.Cols)
	m, err := s.ScorePlay(b, Horizontal, []Placement{
		{Row: 3, Col: 2, Letter: 2}, {Row: 3, Col: 3, Letter: 0}, {Row: 3, Col: 4, Letter: 19},
	})
	if err != nil {
		t.Fatal(err)
	}
	if m.Score != 5 { // c3 a1 t1, no premiums on the plain board
		t.Errorf("cat score = %d, want 5", m.Score)
	}
}
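The tests above spell each play out as literal alphabet indexes (`c=2 a=0 t=19`). A small helper can derive the same placements from a lowercase word; the sketch below assumes the English `a..z` → `0..25` mapping used throughout, and the `placement` type and `across` function are hypothetical stand-ins rather than part of the package.

```go
package main

import "fmt"

// placement mirrors the fields the tests set on Placement; it is a local
// stand-in, not the package's type.
type placement struct {
	Row, Col int
	Letter   byte
}

// across lays word out left to right from (row, col), encoding each lowercase
// ASCII letter as its alphabet index (a=0 ... z=25).
func across(row, col int, word string) ([]placement, error) {
	out := make([]placement, 0, len(word))
	for i := 0; i < len(word); i++ {
		ch := word[i]
		if ch < 'a' || ch > 'z' {
			return nil, fmt.Errorf("unsupported letter %q", ch)
		}
		out = append(out, placement{Row: row, Col: col + i, Letter: ch - 'a'})
	}
	return out, nil
}

func main() {
	tiles, err := across(3, 2, "cat")
	if err != nil {
		panic(err)
	}
	fmt.Println(tiles) // [{3 2 2} {3 3 0} {3 4 19}]
}
```

The printed placements are the same ones `TestSolverValidatePlay` writes by hand for "cat" through the centre of the 7x7 board.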
Vendored
+38
@@ -0,0 +1,38 @@
#player1 Thomas_Reinke Thomas Reinke
#player2 Charles_Reinke Charles Reinke
>Thomas_Reinke: BCFLPYY 8H FLYBY +40 40
#note Club game, Madison, WI, 3/28/2012. I made a wiseacre comment about not having any vowels here. P.S. check out madisonscrabble.com!
>Charles_Reinke: AEEEEIO -AEEEIO +0 0
>Thomas_Reinke: ?ACEOPT M2 TOECAPs +92 132
#note Scored as 82. The bingo down from the F would have been better.
>Charles_Reinke: DEEEKRV K5 KER.ED +26 26
#note 4K KE(E)VE looks pretty good. If I'm going to play off these letters, might as well do it with L8 YERKED.
>Thomas_Reinke: AEIMRUV I6 VE.ARIUM +68 200
>Charles_Reinke: AEEGISV 13I .AVIE +20 46
#note L1 VIGA is the star play. I looked at the spot, but not hard enough, since I was sort of annoyed at my (mis)fortune at this point.
>Thomas_Reinke: DELQRUU L10 EQU.D +33 233
>Charles_Reinke: DEGGLSS 15H GELDS +41 87
>Thomas_Reinke: AILRRSU 2F RURALIS. +62 295
#note This is the point where Charles is the most pissed.
>Charles_Reinke: GINORST H1 T.OG +21 108
#note I was pretty pissed at this point.
>Thomas_Reinke: CEJNOTZ 4L J.EZ +60 355
>Charles_Reinke: EIINNRS 4A RESININ. +70 178
#note The thought of blocking the O-column with ZINES briefly crossed my mind...
>Thomas_Reinke: ACENNOT O1 CAN.ONET +221 576
#note Scored as 231, this makes up for the 10 point underscore before. Quick quiz: find the other playable bingo.
>Charles_Reinke: ?ADGHIN A1 HoA.DING +158 336
#note Scored as 168. I guess we just like overscoring our triple-triples by 10.
>Thomas_Reinke: AAHMRTU B1 AH +23 599
#note 3A (A)MAH is normally better, but I wanted to keep a scoring tile and leave lines open.
>Charles_Reinke: EIOOOSW C1 WOO. +28 364
>Thomas_Reinke: AMOPRTU 10F POU. +12 611
#note I spent a long time on this, but it's just fine, there's no way to score here.
>Charles_Reinke: BEEIOST M10 BIT. +25 389
#note Probably would have done 3C OBOE had I seen it. I don't think any play wins more than 0%, though.
>Thomas_Reinke: AAMNRTX I2 .X +34 645
>Charles_Reinke: EELOOSW N6 WE +32 421
#note Scored as 31. We were missing a tile in this game so this empties the bag; I made this play knowingly emptying the bag.
>Thomas_Reinke: AAIMNRT 11A MARTIAN +74 719
#note An I was missing from the bag, so Charles' rack of EFLOOST was added to my score. Final recorded score is 739-430.
#rack2 EFLOOST
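The vendored game records use the GCG text format. A placement line reads `>Player: RACK POSITION WORD +SCORE TOTAL`; as far as the records here show, a position written row-first (`8H`) is an across play and column-first (`M2`) is a down play, a lowercase letter is a blank, and `.` marks a square that was already occupied. The sketch below parses only such placement lines under those assumptions; the `play` type and `parsePlay` function are hypothetical and not part of this repository.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// play is a hypothetical record for one tile-placement line of a GCG file.
type play struct {
	Player, Rack, Word string
	Row, Col           int // zero-based
	Across             bool
	Score, Total       int
}

// parsePlay parses lines such as ">Thomas_Reinke: BCFLPYY 8H FLYBY +40 40".
// It returns ok=false for anything that is not a placement: notes, pragmas,
// exchanges, passes, withdrawn plays and end-of-game rack lines.
func parsePlay(line string) (p play, ok bool) {
	f := strings.Fields(line)
	if len(f) != 6 || !strings.HasPrefix(f[0], ">") || !strings.HasSuffix(f[0], ":") {
		return play{}, false
	}
	pos := f[2]
	var rowStr string
	var colCh byte
	switch {
	case len(pos) < 2:
		return play{}, false
	case pos[0] >= '0' && pos[0] <= '9': // "8H": row first means across
		rowStr, colCh, p.Across = pos[:len(pos)-1], pos[len(pos)-1], true
	default: // "M2": column first means down
		rowStr, colCh = pos[1:], pos[0]
	}
	row, err := strconv.Atoi(rowStr)
	if err != nil || row < 1 || colCh < 'A' || colCh > 'Z' {
		return play{}, false
	}
	score, err1 := strconv.Atoi(f[4]) // Atoi accepts the leading '+'
	total, err2 := strconv.Atoi(f[5])
	if err1 != nil || err2 != nil {
		return play{}, false
	}
	p.Player = strings.TrimSuffix(strings.TrimPrefix(f[0], ">"), ":")
	p.Rack, p.Word = f[1], f[3]
	p.Row, p.Col = row-1, int(colCh-'A')
	p.Score, p.Total = score, total
	return p, true
}

func main() {
	for _, line := range []string{
		">Thomas_Reinke: BCFLPYY 8H FLYBY +40 40",
		">Thomas_Reinke: ?ACEOPT M2 TOECAPs +92 132",
		">Charles_Reinke: AEEEEIO -AEEEIO +0 0", // exchange: skipped
	} {
		if p, ok := parsePlay(line); ok {
			fmt.Printf("%+v\n", p)
		}
	}
}
```

Relying on the field count keeps the sketch short: exchanges, passes, challenge lines and final rack lines all have fewer than six fields, so they fall out without separate cases.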
Vendored
+29
@@ -0,0 +1,29 @@
#player1 Christopher_Sykes Christopher Sykes
#player2 John_Dafoe John Dafoe
>Christopher_Sykes: AADHNRV 8G VAH +18 18
#note I was hoping John would challenge this off and give me an E for VERANDAH.
>John_Dafoe: DEHINRS J2 HINDERS +81 81
>John_Dafoe: DEHINRS -- -81 0
>Christopher_Sykes: ?AADENR G8 .ERANDAs +64 82
#note I played this to put an S at G15 that would be there for me after John played a legit bingo.
>John_Dafoe: DEHINRS 7H NERDISH +71 71
>Christopher_Sykes: ACEIKTW 15A WACKIE.T +239 321
>John_Dafoe: AELZ K3 LAZE. +30 101
>Christopher_Sykes: EEILNPU 10E PE.ILUNE +64 385
>John_Dafoe: AEGILRS 4H REG.LIAS +72 173
>John_Dafoe: AEGILRS -- -72 101
>Christopher_Sykes: COOOUVY E7 COY.OU +26 411
>John_Dafoe: AEGILRS 5H GLA.IERS +68 169
>Christopher_Sykes: EINOOTV 13G .EVOTION +82 493
>John_Dafoe: IJOT N10 JOI.T +56 225
>Christopher_Sykes: EGGMOQS O1 MOGG. +33 526
#note Phoney, not deliberate. It's hard to remember those extra-G four letter words sometimes.
>John_Dafoe: EIMT 8L TIME +33 258
>Christopher_Sykes: AAEQSWY 14E AW. +29 555
>John_Dafoe: APX 12J PAX +48 306
>Christopher_Sykes: ABEQSSY 15K ABYSS +52 607
>John_Dafoe: INRT 8A NITR. +18 324
>Christopher_Sykes: EEINOQU B10 QUINO. +70 677
>John_Dafoe: DEFO H1 DEFO. +33 357
>Christopher_Sykes: ?BEERRU 1A dEBURRE. +83 760
>Christopher_Sykes: (DFLT) +16 776
Vendored
+41
@@ -0,0 +1,41 @@
#player1 Mike Mike
#player2 Cheri_Eder Cheri Eder
>Mike: ADKNR H8 DRANK +30 30
>Cheri_Eder: QUY 10F QU.Y +36 36
>Mike: ??AORST 8A ASsORTe. +74 104
#note Could have made a few more points on the J column, but I wanted to keep the board wide open.
>Cheri_Eder: AE I10 .EA +14 50
>Mike: JU F6 JU. +26 130
>Cheri_Eder: NOT G5 TON. +16 66
>Mike: ABEHLOP 13I HOPABLE +100 230
#note I was hopable that she wouldn't challenge, or that it might be good - whichever came first. :-)
>Cheri_Eder: ABCDEFG -ABCDE +0 66
>Mike: OWW 14J WOW +39 269
>Cheri_Eder: EGN O12 G.NE +21 87
>Mike: DDEILMS 15E MIDDLES +94 363
>Cheri_Eder: DEEGIOV - +0 87
#note Challenged POS

>Mike: AGOTV 14B GAVOT +30 393
>Cheri_Eder: ETZ B6 ZE.T +33 120
>Mike: EEF H4 FEE +22 415
>Cheri_Eder: IO 11E OI +15 135
>Mike: DEP I3 PED +18 433
#note Still trying to keep the board open
>Cheri_Eder: ENOS J8 NOSE +28 163
>Mike: AEFLNRU 2G FLANEUR +69 502
>Cheri_Eder: HIT 15A HIT +23 186
>Mike: V L12 V.. +18 520
>Cheri_Eder: EX 12D EX +27 213
>Mike: EGI 1F GIE +23 543
>Cheri_Eder: AACCRRR -AARRR +0 213
>Mike: ACIMNOT M2 .OMANTIC +82 625
#note This is where I started to believe this game could be very special
>Cheri_Eder: ACR C5 CAR. +20 233
>Mike: BY 10A BY +40 665
>Cheri_Eder: IR 4K RI. +10 243
>Mike: ILRSU N1 US +26 691
>Cheri_Eder: AII N6 AI +10 253
>Mike: ILR 8L L.RI +18 709
#note My highest game ever and largest spread. Sorry Cheri!
>Mike: (I) +2 711
Vendored
+37
@@ -0,0 +1,37 @@
#player1 Morris_Greenberg Morris Greenberg
#player2 Ted_Blevins Ted Blevins
>Morris_Greenberg: DEILNRT 8C TENDRIL +68 68
>Ted_Blevins: AEILORV F1 OVERLAI. +72 72
#note Darn, he blocked EXPE(N)dER.
>Morris_Greenberg: ?EEEPRX 1A REEXP.sE +266 334
>Ted_Blevins: ARWZ - +0 72
#note Ted was kind of forced to challenge.
>Morris_Greenberg: CEJMRTU 5D MU.CT +18 352
#note Just to be extra safe, no insane double-doubles.
>Ted_Blevins: ARWZ 3C WAR.Z +54 126
>Morris_Greenberg: DEEGJRU 2H REJUDGE +116 468
#note Misscored as 114.
>Ted_Blevins: AGILNOT - +0 126
#note Challenged again.
>Morris_Greenberg: AEFHINT 1N FA +20 488
#note I didn't really want to block this lane because I was now going for the record, but I couldn't see anything close that didn't block it.
>Ted_Blevins: AGILNOT C6 TO.ALING +72 198
#note (I)NTAGLIO
>Morris_Greenberg: BCEHINT 7H BENTHIC +73 561
>Ted_Blevins: AINP 3K PAIN +31 229
>Morris_Greenberg: EEINOOS 4C OI +19 580
#note I managed to miss OOGENIES.
>Ted_Blevins: AFMO 9I FOAM +23 252
>Morris_Greenberg: EEINOSU L9 .OUE +12 592
>Ted_Blevins: GIW 12A WI.G +24 276
>Morris_Greenberg: DEEINSV 13F ENDIVES +70 662
>Ted_Blevins: AQSU A8 SQUA. +51 327
>Morris_Greenberg: HKNORSY 4J YOK +41 703
#note Maybe I should play HON(E)Y to fish for (SQUAW)KER and create a nice S lane. HY(D)RO is also nice.
>Ted_Blevins: AOTY H11 TO.AY +30 357
>Morris_Greenberg: DHLNRSS M6 L.NS +16 719
#note Blocking I(C)EBOATs. However, the better block is SH(O)RLS 10J, to leave D(E)N/(REEXPOsE)D next turn.
>Ted_Blevins: ?ABEIOT 15B OBEsIT. +12 369
>Morris_Greenberg: DHRS 12K H.RDS +15 734
#note I was only 95% sure of H(U)RDS, and I didn't want to phony on this board. Regardless, if I want to go for straight up high score instead of best endgame, (A)DS/(REJUDGE)D/(PAIN)S O1 gets 29 points.
>Morris_Greenberg: (A) +2 736
Vendored
+36
@@ -0,0 +1,36 @@
#player1 Evans_Clinchy Evans Clinchy
#player2 Walker_Willingham Walker Willingham
>Evans_Clinchy: EELOOTU -EOOU +0 0
>Walker_Willingham: EEIINRU -EIU +0 0
>Evans_Clinchy: AEGILNT 8F ELATING +70 70
>Walker_Willingham: ADEIKNR E6 DARKEN +31 31
>Evans_Clinchy: ?IIRTTU K4 TRIU.ITy +78 148
>Evans_Clinchy: ?EGILOS (challenge) +5 153
>Walker_Willingham: ADEINUZ 6D A.Z +33 64
>Evans_Clinchy: ?EGILOS 12A GLIOSEs +80 233
>Walker_Willingham: DEEINOU G5 OE +16 80
>Evans_Clinchy: AEHMORT A8 ETHO.RAM +176 409
>Evans_Clinchy: ACELOUW (challenge) +5 414
>Walker_Willingham: DEINOUV H1 ENVOID +51 131
>Walker_Willingham: DEINOUV -- -51 80
>Evans_Clinchy: ACELOUW 4I OU.LAW +18 432
>Walker_Willingham: DEINOUV N2 VO.ED +32 112
>Evans_Clinchy: ACDEEMN B2 MENACED +89 521
>Walker_Willingham: CIINOOU A1 COO +21 133
>Evans_Clinchy: AAEIRUV 1L VIAE +46 567
>Walker_Willingham: GIINOSU 10A .I +7 140
>Evans_Clinchy: ABEJLRU M3 J.B +46 613
>Walker_Willingham: GINOSUY O5 YOGI +35 175
>Evans_Clinchy: AAELRUY L11 AYU +13 626
>Walker_Willingham: BDEINSU 14J BUSED +35 210
>Evans_Clinchy: AELNNRS C11 L.N +6 632
>Walker_Willingham: HINQRTX N8 QIN +25 235
>Evans_Clinchy: AEFFNRS 10E .FF +17 649
>Walker_Willingham: EHIRTTX M9 XI +36 271
>Evans_Clinchy: AENRRSW 15F WARREN +33 682
>Walker_Willingham: EHPPRTT M13 H.P +26 297
>Evans_Clinchy: SS 9M ..S +19 701
>Walker_Willingham: EPRTT 3G REP +16 313
>Evans_Clinchy: S 6D ....S +15 716
#note 720! As of this writing, the 13th highest score in tournament Scrabble history. Pretty cool!
>Evans_Clinchy: (TT) +4 720
Vendored
+40
@@ -0,0 +1,40 @@
#player1 Steven Steven Alexander
#player2 Elizabeth Elizabeth Wood
#description 2017-07-04 Lake Oswego OR club game 2
#comment date="2017-07-04" author="Steven Alexander"
#lexicon TWL15
>Steven: DEENTUU 8H UNDUE +14 14
>Elizabeth: IMO 7H MOI +16 16
>Steven: CEHORTY 6I CHERTY +50 64
>Elizabeth: BCU K3 CUB. +16 32
>Steven: EEFIIOO -EIIOOF +0 64
>Elizabeth: FTT 4J T.FT +14 46
>Steven: AEEELUY 3K .EE +19 83
>Elizabeth: AI 5H AI +8 54
>Steven: AEILSUY 4C EASILY +35 118
>Elizabeth: EIRW E2 WI.ER +16 70
>Steven: AAEEPQU 6B QUA.E +36 154
>Elizabeth: AKNS A6 SANK +39 109
>Steven: AEIIOPR 2M POI +15 169
>Elizabeth: GLN O1 L.NG +21 130
>Steven: AEHIORS 7C HI +20 189
>Elizabeth: JOW 3A JOW +36 166
>Steven: AEOORRS A1 RA. +30 219
>Elizabeth: DEGLRT B2 G.T +24 190
>Elizabeth: DEGLRT -- -24 166
>Steven: ?AEOORS B9 AEROSOl +72 291
>Elizabeth: DEGLRT C9 TRED +21 187
>Elizabeth: DEGLRT -- -21 166
>Steven: ABDEERZ 15A B.AZERED +311 602
>Elizabeth: DEGLRT 12A G.RT +14 180
>Elizabeth: DEGLRT -- -14 166
>Steven: ?FIILOO 7L OI +9 611
>Elizabeth: DEGLRT 12A G.LD +16 182
>Steven: ?FGILNO 14G FOILiNG +70 681
>Elizabeth: ETV L12 VE.T +16 198
>Steven: AIMNOTT 2A .M +16 697
>Elizabeth: EVX 11K VEX +34 232
>Steven: AAINOTT 15L .OIT +15 712
>Elizabeth: DNPRS H12 PR.. +10 242
>Steven: AANT N9 ANTA +20 732
>Steven: (DNS) +8 740
Vendored
+29
@@ -0,0 +1,29 @@
#player1 David David Poder
#player2 Bruce Bruce D'Ambrosio
>David: ?DEEEMS 8H SEEDMEn +74 74
>Bruce: OVW 7G VOW +28 28
>David: AILNPSU F6 PAULINS +77 151
>Bruce: HI 7M HI +19 47
>David: BBIJ 12B JIBB. +32 183
>Bruce: CNO 13A CON +23 70
>David: FFO 6I OFF +23 206
>Bruce: IRTUV E2 VIRTU +20 90
>David: EGU 4D G.UE +10 216
>Bruce: LT B12 ..LT +22 112
>David: AZ 6N ZA +62 278
>Bruce: AEEILUY -AEEILUY +0 112
>David: ?AAEERS 15A S.EARAtE +122 400
>Bruce: AD 14F DA +15 127
>David: EILOPRT 3G POLITER +83 483
>Bruce: ADEIMRT 13G READMIT +77 204
>David: IQ 2J QI +64 547
>Bruce: NU 14K UN +8 212
>David: AEKT 15L KETA +51 598
>Bruce: GO H1 GO. +12 224
>David: HLORW 1K WHORL +51 649
>Bruce: Y I5 Y... +10 234
>David: EX 4K EX +43 692
>Bruce: AEINRST 10F .NERTIAS +60 294
>David: ACDEGNO 12L DANG +36 728
>Bruce: INOY D6 YONI +16 310
>Bruce: (CEO) +10 320
Vendored
+29
@@ -0,0 +1,29 @@
#character-encoding UTF-8
#player1 David_Firstman David Firstman
#player2 Zombywoof Zombywoof
>David_Firstman: CEIOPRS 8D COPIERS +78 78
>Zombywoof: AAELORW D8 .LAWER +22 22
>David_Firstman: ?AEGILT H8 .GALITEs +77 155
>Zombywoof: AABDFGO G12 GAB +17 39
>David_Firstman: AIJNOUY J6 JU. +26 181
>Zombywoof: AADFHOR C10 FARAD +34 73
>David_Firstman: AINOSVY 15H .YNOVIAS +101 282
>Zombywoof: EEHKNOO 10F KO. +17 90
>David_Firstman: ?EHINSX E11 EX +40 322
>Zombywoof: EEHNOQT K5 HON +23 113
>David_Firstman: ?HIINPS O8 kINSHIP. +98 420
>Zombywoof: EEEQTTV B13 VET +24 137
>David_Firstman: DMNORUW L4 MOW +34 454
>Zombywoof: EEOOQTT M2 TOOT +18 155
>David_Firstman: ACDENRU 2H UNCRA.ED +84 538
>Zombywoof: EEFLQTZ G7 Q. +21 176
>David_Firstman: AABEMOT 1H MA +26 564
>Zombywoof: EEFLTUZ 1N FE +29 205
>David_Firstman: ABDEORT 3B ABORTED +78 642
>Zombywoof: ELLSTUZ 11J LUTZE. +30 235
>David_Firstman: EENNRUY 4A EYEN +29 671
>Zombywoof: DGIIILS A14 GI +20 255
>David_Firstman: EINRU M11 .IN +24 695
>Zombywoof: DIILS 2A ID +20 275
>David_Firstman: ERU F3 .RUE +6 701
>David_Firstman: (ILS) +6 707
Vendored
+49
@@ -0,0 +1,49 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Samuel_Kaplan Samuel Kaplan
|
||||
#player2 Samuel_Moch Samuel Moch
|
||||
>Samuel_Kaplan: EENSTUZ 8G ZEN +24 24
|
||||
>Samuel_Moch: DIILOOV 7I OVOLI +13 13
|
||||
#note Thanks to Sam for providing his racks. I like OVOLI better than 9I or 7I OVOID since OVOLIS* isn't good and he gives back way fewer points to me.
|
||||
>Samuel_Kaplan: EHOSTUW 8M TOW +20 44
|
||||
#note This play is not great statically, but it sims fairly well. If I'm not going to see M3 OUTW(I)SH for 34, which flushes out all of the bad tiles, I could probably just go with G6 WU(Z) if I really wanted to keep the board bad for him or L4 WHO(L)E for a few more points but keeping a slightly worse leave that does not preserve scoring power. TOW sims slightly worse than WHOLE since the H gets along nicely with the W, but that combination is weakened by keeping the U to go with it. I'd probably do WHOLE in a do-over since WUZ blocks off the bottom left of the board, and OVOLI actually blasted it open nicely.
|
||||
>Samuel_Moch: ADGIILS M4 DIG..ALIS +62 75
|
||||
#note Wow. That is a nice find. I don't think I would have spotted that. Kudos!
|
||||
>Samuel_Kaplan: EHJNRSU 12I JEHU. +30 74
|
||||
#note I did not think about the DIGITALISE hook just because those words are pronounced very differently. 13L JEHU easily reigns supreme. #hookrecognitionlarge
|
||||
>Samuel_Moch: ?ACNOPS 5H CANOP.eS +72 147
|
||||
#note Lol. He's got his bingo bango no matter what. But the only difference is I'm down less if I realized DIGITALISE could have been formed.
|
||||
>Samuel_Kaplan: AEENORS 13E ARENOSE +76 150
|
||||
#note And not seeing DIGITALISE bites me again since ARENOSE would have played for so many more at 13H with DIGITALISES (assuming it does not get blocked)! It's very rare that you seen an 11 on the board. Luckily he didn't see it either.
|
||||
>Samuel_Moch: DEFHIKQ O1 KIEF. +48 195
|
||||
#note This is his best play.
|
||||
>Samuel_Kaplan: EEIIPRT L3 PI. +20 170
|
||||
#note Maybe H1 PIER(C)E?
|
||||
>Samuel_Moch: DEHQRRY 2N Q. +22 217
|
||||
>Samuel_Kaplan: CEEIRTV 12A CIVET +28 198
|
||||
#note This time, I would not have made a play with the DIGITALISE hook even if I had seen it. The problem with making a play there is it doesn't open much and allows him to outrun me easily. CIVET does a nice job of forking vertical lines such that the board will be that much harder for him to block. For this reason, I'm fine with CIVET. 12A VERITE might be 2 more, but CIVET at least opens the board in a way that's not going to give back so many points to him and preserves the board shape that I'm looking for in a better fashion.
|
||||
>Samuel_Moch: DEEHRRY 14B HERD +20 237
|
||||
#note I believe that he made a mistake here. This would be reasonable if he had the last S and didn't sacrifice so many points. Like I mentioned earlier, we both missed the DIGITALISE hook all game and EDH would have played at 13M. That said, I think he should play 11D HERRY here. I am not guaranteed to have a good rack after CIVET.
|
||||
>Samuel_Kaplan: AELNORT 15A TOLE +25 223
|
||||
#note I read Sam's last play as a potential S setup and would like to address this since he could have also played HERD at 11D for 7 more. If he hits a bingo with the TADS hook, I am going to have a hard time clawing my way back. I can see why LONE sims slightly ahead of TOLE despite there being more Ts unseen as opposed to Ns- ART is a bit of a better leave than ANR. Even with an S specified, 9F TALON tops the bunch in the sim, but I was not comfortable doing that when 14B HERD raised alarm bells. I'm fine with my idea here; ART bingoes a little more often than ANR. It's good to have an A in reserve in case I need to use Row 9 next turn.
|
||||
>Samuel_Moch: EGMNRRY 9K GR.Y +13 250
|
||||
#note 13M ERG appears to be worth it here. But that assumes he sees the DIGITALISE hook.
|
||||
>Samuel_Kaplan: AAINNRR H1 ARNI.A +27 250
|
||||
#note Yes, I did consider 9F RAN. I didn't like it for the same reason I didn't like leaving Row 15 open on my previous turn: I thought he kept the last S after his placement of HERD. RAN allows high scoring bingoes on Row 10 ending with -ES which would force me to have to open the left side of the board for cheap depending on what he puts on Row 10. It's another situation where I will probably have a hard time coming back if I make the wrong play. ARNICA not only scores 9 more than RAN, which would make it worth it either way, but he also won't be up by enough to fully commit to shutting down the board. The H1 placement does sim ahead of the A8 placement by just a little. I would think about A8 more if I was ahead, but forking horizontal lines was a key driving force while I was behind.
|
||||
>Samuel_Moch: ABEMNRX 9F MAX +34 284
|
||||
#note I think this looks good if the DIGITALISE hook was missed.
|
||||
>Samuel_Kaplan: ?BEFNOR 14G FOB +27 277
|
||||
#note 2B FReEBO(R)N is not easy to spot, but that's clearly best. The blank was still a clutch draw. #findinglarge
|
||||
>Samuel_Moch: BEGLNRU 10B BUGLE +22 306
|
||||
#note I might do 10E BUG here- since shutting down the board is not possible, he should try putting himself in a position to bingo back if I do that on my next turn since my last turn was a bit more aggressive. BUG wins a good 4% more often than BUGLE does.
|
||||
>Samuel_Kaplan: ?AENRTU 15I sAUNTER +83 360
|
||||
#note As boring as this play looks, it's obviously better than anything else available. Sometimes those are the plays that win you games.
|
||||
>Samuel_Moch: EMNORTT 14N MO +18 324
|
||||
>Samuel_Kaplan: AAIISWY 2F AI.WAY +16 376
|
||||
#note Picking 2 Is would normally be terrible, but given the positioning of the G in BUGLE, this is actually a huge break since there are no -ING threats to worry about. Sam's last play of MO was a great one since he leaves 2 in the bag, and I will have to permute more combinations to block as many bingo threats as possible since I'm not up by enough to outrun. G2 AIS looks tempting at first since I take out the RNI in ARNICA, but the problem is it still allows 1C UNDER(A)TE if DT is in the bag and 1E DER(A)TTED if NU is in the bag. I missed 1C DENUD(A)TE assuming RT is in the bag, but given that I saw more threats after AIS, I still felt it was incorrect. AIRWAY is the best play not just for spread purposes if he does bingo out, but AIRWAY only allows 5C DENDR(I)TE if TU is in the bag. Champ does confirm that this is my winningest move along with WI(R)Y and WA(R)Y, and I'm proud to have gotten this one right.
|
||||
>Samuel_Moch: EENRTTU 3E REN.ET +21 345
|
||||
#note This is a nice play.
|
||||
>Samuel_Kaplan: DDIS G8 ..S +21 397
|
||||
#note I correctly determined that this would be 2 more spread points than 11F DID, but I did miss I5 (A)D(ON)IS which is tricky to spot and is several spread points better than ZAS. #visionmedium
|
||||
>Samuel_Moch: TU 15F UT +8 353
|
||||
#note This is best. I really thought he was going to beat me for the 1st time after he found a 9 and was averaging nearly 40 points a move through 4 turns. It goes to show that anything can happen and I took advantage of his mistake of 14B HERD. He's tough to play against.
|
||||
>Samuel_Moch: (DDI) +10 363
|
||||
Vendored
+35
@@ -0,0 +1,35 @@
|
||||
#character-encoding UTF-8
|
||||
#description Steeltown Scrabble Showdown Round A3 5/31/26
|
||||
#id io.woogles P5s2MwcXZR5c8A3NfTsG3H
|
||||
#lexicon NWL23
|
||||
#game-type classic
|
||||
#player1 JohnHealy John Healy
|
||||
#player2 AnthonyJrAnzaldi Anthony Jr Anzaldi
|
||||
>JohnHealy: FEME 8G FEM +16 16
|
||||
>AnthonyJrAnzaldi: ULU 7I ULU +9 9
|
||||
>JohnHealy: VEI J6 V.EI +24 40
|
||||
>AnthonyJrAnzaldi: DRAPERS L1 DRAPERS +78 87
|
||||
>JohnHealy: MINGLEY 1F MINGLE. +33 73
|
||||
>AnthonyJrAnzaldi: ZOO K3 ZOO +43 130
|
||||
>JohnHealy: VWYITTT -TTT +0 73
|
||||
>AnthonyJrAnzaldi: CHELA 9C CHELA +20 150
|
||||
>JohnHealy: WAIVY E5 WAIV. +22 95
|
||||
>AnthonyJrAnzaldi: AWA D3 AWA +17 167
|
||||
>JohnHealy: BONY C3 BONY +37 132
|
||||
>AnthonyJrAnzaldi: G 3K ..G +26 193
|
||||
>JohnHealy: QUA B1 QUA +29 161
|
||||
>AnthonyJrAnzaldi: CIS N1 CIS +24 217
|
||||
>JohnHealy: FOUNTHE K9 FOUNT +21 182
|
||||
>AnthonyJrAnzaldi: APERIES 14E APERIES +76 293
|
||||
>JohnHealy: JETTYI 15A JETTY +53 235
|
||||
>AnthonyJrAnzaldi: NOI H12 NO.I +15 308
|
||||
>JohnHealy: HOBOTI? 13C BOHO +31 266
|
||||
>AnthonyJrAnzaldi: RIN 10E RIN +27 335
|
||||
>JohnHealy: GEALIT? 12A GEAL +22 288
|
||||
>AnthonyJrAnzaldi: KIT I13 K.T +25 360
|
||||
>JohnHealy: II??ATE 12G I.sI.uATE +73 361
|
||||
>AnthonyJrAnzaldi: TENDDRS O8 TEND.D +24 384
|
||||
>JohnHealy: ODXER 6E .XED +28 389
|
||||
>AnthonyJrAnzaldi: RS E5 ......S +26 410
|
||||
>JohnHealy: OR 13M RO. +14 403
|
||||
>JohnHealy: (R) +2 405
|
||||
Vendored
+36
@@ -0,0 +1,36 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Dustin_Dean Dustin Dean
|
||||
#player2 Judy_Cole Judy Cole
|
||||
>Dustin_Dean: GUV H7 GUV +14 14
|
||||
>Judy_Cole: EFI 9F FI.E +12 12
|
||||
>Dustin_Dean: AHNOSTU 7E HAN.OUTS +64 78
|
||||
#note best, only bingo
|
||||
>Judy_Cole: FO 6E OF +31 43
|
||||
>Dustin_Dean: DEKR L3 DREK. +20 98
|
||||
>Judy_Cole: IOR K5 RO.I +18 61
|
||||
>Dustin_Dean: ACDH 5B CHAD +34 132
|
||||
>Judy_Cole: AIW 3J WA.I +16 77
|
||||
>Dustin_Dean: ADEJ C3 JE.AD +36 168
|
||||
>Judy_Cole: ANV 10E VAN +17 94
|
||||
>Dustin_Dean: TU 3C .UT +10 178
|
||||
>Judy_Cole: ?EEINSW 11G NEWSIEr +71 165
|
||||
#note Best is 8A WISE for 41, keeping EN?. Next is N1 EiSWEIN for 81, putting E in the triple lane.
|
||||
>Dustin_Dean: ?ELMNOT B7 OMENTaL +63 241
|
||||
#note 12H MOoNLET for 92
|
||||
>Judy_Cole: ANQT 12A Q.NAT +46 211
|
||||
>Dustin_Dean: AZ 11D ZA +45 286
|
||||
>Judy_Cole: BDO A7 BOD +39 250
|
||||
>Dustin_Dean: OX 13B .OX +43 329
|
||||
>Judy_Cole: AMRY M9 MA.RY +24 274
|
||||
>Dustin_Dean: GIPR 13I GRIP. +13 342
|
||||
>Judy_Cole: BEGINOS 14E BINGOES +73 347
|
||||
#note best, ahead of N4 BINGOES for 72
|
||||
>Dustin_Dean: CEE F2 CEE +22 364
|
||||
>Judy_Cole: EIPRT 15A TRIPE +34 381
|
||||
>Dustin_Dean: AELORSU 15H ORE +25 389
|
||||
#note There are 6 winning plays: Best is N7 SAUL, which blocks N8 LEY; the others are J9 RA(S)U(RE), N6 EUROS or ROUES, J9 RA(S)U(RE)S, and N1 OUSEL. ORE and N3 SAUL tie.
|
||||
>Judy_Cole: EIILLTY N8 LEY +31 412
|
||||
#note best, all other plays lose
|
||||
>Dustin_Dean: ALSU N3 SAUL +15 404
|
||||
#note SAUL for the tie is best, everything else loses. I had 9.5 seconds left on my clock after SAUL.
|
||||
>Dustin_Dean: (IILT) +8 412
|
||||
Vendored
+43
@@ -0,0 +1,43 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Mad_Palazzo Mad Palazzo
|
||||
#player2 Caleb_Pittman Caleb Pittman
|
||||
>Mad_Palazzo: ILMO 8G LIMO +12 12
|
||||
>Caleb_Pittman: BELPRTY 9H TYPER +31 31
|
||||
#note Why isn't this a word?
|
||||
>Mad_Palazzo: GHIN J7 H..ING +14 26
|
||||
>Caleb_Pittman: BEELOTY 12H BO.EY +28 59
|
||||
#note Just totally missed 8L OBEY. Whoops
|
||||
>Mad_Palazzo: AHIO 8L OHIA +27 53
|
||||
>Caleb_Pittman: EIJLTWZ H12 .IZE +45 104
|
||||
#note Neck and neck with 10L TIZ. Entirely reasonable
|
||||
>Mad_Palazzo: AIQU K4 QUAI +31 84
|
||||
>Caleb_Pittman: JLORTUW 14F WU. +23 127
|
||||
#note Macondo likes 13K WORT, which seems nuts. This is reasonable
|
||||
>Mad_Palazzo: ?AIORSV 15A VARIcOS. +90 174
|
||||
>Caleb_Pittman: AEJLORT N7 R.OJA +28 155
|
||||
#note TOLARJE(V)!!!! I'll never get to play that again as a natural. And for 107 points! Man, oh, man
|
||||
>Mad_Palazzo: ENTW 13K WENT +24 198
|
||||
>Caleb_Pittman: EGLLNTV -GLLNV +0 155
|
||||
#note Never considered 14J VET, which is silly of me. When I exchange, I often exchange too much. I should keep ENT at a minimum
|
||||
>Mad_Palazzo: CGILN L2 CLING +43 241
|
||||
>Caleb_Pittman: AEEINPT 2F PATIEN.E +70 225
|
||||
#note Only bingo
|
||||
>Mad_Palazzo: CDERSTU 3A CRUSTED +83 324
|
||||
#note ..There is a 106 point play here
|
||||
>Caleb_Pittman: AAEFOUX 1G FAX +61 286
|
||||
#note Forced, unfortunately
|
||||
>Mad_Palazzo: ADEKL A1 LA.KED +54 378
|
||||
>Caleb_Pittman: ?ADEEOU 4E ODEA +22 308
|
||||
#note Fine
|
||||
>Mad_Palazzo: DFIRT 13B DRIFT +24 402
|
||||
>Caleb_Pittman: ?ELSTUV N2 SEV +38 346
|
||||
#note I saw my only bingo, and correctly passed it up for the optimal play.
|
||||
>Mad_Palazzo: AGMNO B5 MANGO +19 421
|
||||
>Caleb_Pittman: ?EELSTU A8 aLEE +17 363
|
||||
#note She made a great last play. Every line is dead. I spent 8 minutes looking for 9s starting with TO. Didn't even bother looking at the F... I mean, what kind of word even ends with a 4 point tile????
|
||||
|
||||
Anyway I blocked her only out, guaranteeing an out in 2
|
||||
>Mad_Palazzo: BNOR 12D BO +21 442
|
||||
>Caleb_Pittman: STU 5D UTS +19 382
|
||||
#note Best out
|
||||
>Caleb_Pittman: (NR) +4 386
|
||||
Vendored
+44
@@ -0,0 +1,44 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Caleb_Pittman Caleb Pittman
|
||||
#player2 Ian_Passfield Ian Passfield
|
||||
>Caleb_Pittman: ACEGKOP H4 GECKO +28 28
|
||||
#note Best
|
||||
>Ian_Passfield: GIOV 5D OGIV. +18 18
|
||||
>Caleb_Pittman: AAEHIJP 4B HAJ +44 72
|
||||
#note Obviously
|
||||
>Ian_Passfield: OTY 3A TOY +27 45
|
||||
>Caleb_Pittman: AAADEIP 7H .APA +11 83
|
||||
#note Entirely reasonable
|
||||
>Ian_Passfield: EF 2A EF +30 75
|
||||
>Caleb_Pittman: ADEIMWY 8K YAWED +44 127
|
||||
#note Best
|
||||
>Ian_Passfield: MTX F4 M.XT +29 104
|
||||
>Caleb_Pittman: ADILMNO N7 D.MONIAL +76 203
|
||||
#note I'm stupid. I've had this rack 40 times in Zyzzyva. And it's not like it's been a while; I had it come up 5 days ago!!!! I knew there was something here, but didn't realize it was MELANOID until much later. Sorry, Ian
|
||||
>Ian_Passfield: AGZ M13 ZAG +50 154
|
||||
>Caleb_Pittman: EIORRSS O13 SER +24 227
|
||||
#note Some missed bingoes that I likely should have seen
|
||||
>Ian_Passfield: HILNRU 15G HURLIN. +36 190
|
||||
>Caleb_Pittman: FIIORST 6B FIT +24 251
|
||||
#note Looks good
|
||||
>Ian_Passfield: W A1 W.. +18 208
|
||||
>Caleb_Pittman: AIOPRST 9C AIRPOST +75 326
|
||||
#note Best. Got challenged
|
||||
>Ian_Passfield: AAOSTTU - +0 208
|
||||
#note #unsuccessful-challenge
|
||||
>Caleb_Pittman: AACNORR M3 NARRO. +24 350
|
||||
#note Was gonna play CARROW* before realizing that that was a Harry Potter villain
|
||||
>Ian_Passfield: BDU L2 DUB +22 230
|
||||
>Caleb_Pittman: ?ACDEIS 14A DISCAsE +73 423
|
||||
#note DIsCASE(D) and CAdDISE(D) are the best bingoes. Couldn't avoid floating the 3x3 without taking a sacrifice
|
||||
>Ian_Passfield: EOU E9 .OUE +8 238
|
||||
>Caleb_Pittman: ?EEINQR 12B QuE.NIER +84 507
|
||||
#note I played this instantly. It's a 21 point error, lol
|
||||
>Ian_Passfield: EILN N2 LINE +18 256
|
||||
>Caleb_Pittman: BINOTUV 2I VOI. +10 517
|
||||
#note I wanted to block his only out while retaining an out. It's better to just take the points with VINO and let him go out
|
||||
>Ian_Passfield: EEELST 7A LETS +24 280
|
||||
>Caleb_Pittman: BNTU 11K BUN. +12 529
|
||||
#note Best
|
||||
>Ian_Passfield: EE O8 .EE +10 290
|
||||
>Ian_Passfield: (T) +2 292
|
||||
Vendored
+50
@@ -0,0 +1,50 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Jim_Carlton Jim Carlton
|
||||
#player2 Caleb_Pittman Caleb Pittman
|
||||
>Jim_Carlton: FHIRT 8D FIRTH +30 30
|
||||
>Caleb_Pittman: GLOQRVZ 7C VOG +19 19
|
||||
#note Just totally whiffed on V(I)ZOR. Awful
|
||||
>Jim_Carlton: AEEIRTW H7 W.EATIER +65 95
|
||||
#note Didn't even question this. Damn you, Collins!!!
|
||||
>Caleb_Pittman: DEGLQRZ 10H .DZ +33 52
|
||||
#note Best, probably
|
||||
>Jim_Carlton: CEU K10 ECU +24 119
|
||||
>Caleb_Pittman: EGILMQR 12J Q.IRE +28 80
|
||||
#note Have to do this, unfortunately
|
||||
>Jim_Carlton: ADHS O12 SHAD +43 162
|
||||
>Caleb_Pittman: EGLMRTT E7 ..MLET +18 98
|
||||
#note Big mistake, missed GL(E)ET and GR(E)ET. Cool though
|
||||
>Jim_Carlton: IIII -IIII +0 162
|
||||
>Caleb_Pittman: AGNPRST D11 PANG +22 120
|
||||
#note I considered PR(E)ST! This is good though
|
||||
>Jim_Carlton: AISX 15A AXIS +45 207
|
||||
#note aw man
|
||||
>Caleb_Pittman: AJKNRST N9 JAK.ST +47 167
|
||||
#note Best. Got a challenge!
|
||||
>Jim_Carlton: CGNORUV - +0 207
|
||||
#note #unsuccessful-challenge
|
||||
>Caleb_Pittman: CEIINNR I1 CRININE +67 234
|
||||
#note Feel bad, but this was an intentional phony. CINERIN just didn't play.
|
||||
>Jim_Carlton: MNOO O7 MOON +29 236
|
||||
>Caleb_Pittman: DEEEOOY L7 EYED +28 262
|
||||
#note I thought I got bagged. Anything but! (C)OOEYED is a 20% advantage
|
||||
>Jim_Carlton: AABELRV 3F VAR.ABLE +78 314
|
||||
>Caleb_Pittman: AEIOOOO -IOOOO +0 262
|
||||
#note I deserve this
|
||||
>Jim_Carlton: EPU 1G PU.E +24 338
|
||||
>Caleb_Pittman: ?ABEIOU B11 BEAU. +28 290
|
||||
#note Likely best, sadly
|
||||
>Jim_Carlton: FNNY 5H F.NNY +22 360
|
||||
>Caleb_Pittman: ?DIIOOU F8 .OID +21 311
|
||||
#note If only DOUPIONI were available... This is perfectly good
|
||||
>Jim_Carlton: OW N6 WO +17 377
|
||||
>Caleb_Pittman: ?IILOOU 2M OI +6 317
|
||||
#note Was hoping for something like abOULIA
|
||||
>Jim_Carlton: ALT O1 ALT +12 389
|
||||
>Caleb_Pittman: ?EILOOU F2 O.OLI +12 329
|
||||
#note I can do better, but what's the point, eh?
|
||||
>Jim_Carlton: ?GRST 13A T.R.S +20 409
|
||||
#note I gave him an out, whoops
|
||||
>Caleb_Pittman: ?EU J12 .UEy +14 343
|
||||
#note Best
|
||||
>Caleb_Pittman: (G?) +4 347
|
||||
Vendored
+51
@@ -0,0 +1,51 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Caleb_Pittman Caleb Pittman
|
||||
#player2 Lydia_Keras Lydia Keras
|
||||
>Caleb_Pittman: EEIINSV H7 VIE +12 12
|
||||
#note yup
|
||||
>Lydia_Keras: ARV 10F VAR +21 21
|
||||
>Caleb_Pittman: EGIINRS 7F RE.ISING +66 78
|
||||
#note yup
|
||||
>Lydia_Keras: KNT K5 KN.T +16 37
|
||||
>Caleb_Pittman: ADELNOR I9 LADRONE +70 148
|
||||
#note yup
|
||||
>Lydia_Keras: BLOTT 15D BOTTL. +33 70
|
||||
>Caleb_Pittman: ADDEEEM 12H D.EAMED +26 174
|
||||
#note Missed ADEEMED, rough. Thought of it, too
|
||||
>Lydia_Keras: PUY 11K YUP +34 104
|
||||
>Caleb_Pittman: AEOPUWY 13M POW +33 207
|
||||
#note PREVISING???
|
||||
>Lydia_Keras: X N12 ..X +22 126
|
||||
>Caleb_Pittman: AEMNTUY 15N YA +30 237
|
||||
#note Perfectly good
|
||||
>Lydia_Keras: EJ G13 JE. +18 144
|
||||
>Caleb_Pittman: BEGMNTU G9 B.M +18 255
|
||||
>Lydia_Keras: FO 6F OF +16 160
|
||||
>Caleb_Pittman: ?EGNOTU 14L GO. +19 274
|
||||
#note Boring, but correct
|
||||
>Lydia_Keras: ILO 8M OIL +12 172
|
||||
>Caleb_Pittman: ?ENRTTU O8 .UNT +4 278
|
||||
#note Sorry, FERBAM?
|
||||
>Lydia_Keras: EF 5E EF +16 188
|
||||
>Caleb_Pittman: ?CEGRTU F10 .UG +11 289
|
||||
#note Boring, but good
|
||||
>Lydia_Keras: HINU 12B HUIN. +18 206
|
||||
#note This looked bad to me, but I could bingo for more through the H
|
||||
>Caleb_Pittman: ?ACERTW B8 WATC.ERs +84 373
|
||||
#note Only bingo
|
||||
>Lydia_Keras: EO 8A O.E +18 224
|
||||
>Caleb_Pittman: AIILOQR L4 QI +28 401
|
||||
#note duh
|
||||
>Lydia_Keras: AE A14 AE +11 235
|
||||
>Caleb_Pittman: AHILORS C2 SHOALI.R +79 480
|
||||
#note Yup. Played this as it was more likely to be challenged than AIRHOLES
|
||||
>Lydia_Keras: ACDEEIR - +0 235
|
||||
#note Lo and behold
|
||||
>Caleb_Pittman: CDEINTZ M3 ZIT +42 522
|
||||
>Lydia_Keras: ?AAIRSS N2 pASS +37 272
|
||||
#note The bingoes here are all hard, I'd likely miss them too
|
||||
>Caleb_Pittman: CDEEN B1 NEED +22 544
|
||||
#note Blocked the outs, accidentally gave her another
|
||||
>Lydia_Keras: AIR A1 AR +15 287
|
||||
>Caleb_Pittman: C 4B ..C +12 556
|
||||
>Caleb_Pittman: (I) +2 558
|
||||
Vendored
+47
@@ -0,0 +1,47 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Caleb_Pittman Caleb Pittman
|
||||
#player2 Evan_Bialo Evan Bialo
|
||||
>Caleb_Pittman: AEEIRSY H8 AYE +12 12
|
||||
>Evan_Bialo: AMP G8 AMP +26 26
|
||||
>Caleb_Pittman: DEINRST 11A TINDERS +74 86
|
||||
#note yup
|
||||
>Evan_Bialo: EFHI A11 .HIEF +45 71
|
||||
>Caleb_Pittman: ABCFGTX I9 CAB +28 114
|
||||
#note Looks best
|
||||
>Evan_Bialo: BDGINO C9 BO.DING +28 99
|
||||
>Caleb_Pittman: ?EFGSTX 12C .EX +37 151
|
||||
#note Best
|
||||
>Evan_Bialo: UV 15C .UV +8 107
|
||||
>Caleb_Pittman: ?FGIOST J4 FOGIeST +85 236
|
||||
#note Not a word
|
||||
>Evan_Bialo: IO D8 OI +7 114
|
||||
>Caleb_Pittman: DEKNOOW E4 WOODEN +27 263
|
||||
#note Methinks best, sets up the K
|
||||
>Evan_Bialo: IQ I3 QI +26 140
|
||||
>Caleb_Pittman: IIKMRRT H4 KIR +27 290
|
||||
#note yup
|
||||
>Evan_Bialo: ?EHIOST 12I SHOrTIE +78 218
|
||||
#note 1 of 2 bingoes
|
||||
>Caleb_Pittman: EEIMRTU O12 .MEU +18 308
|
||||
#note Great
|
||||
>Evan_Bialo: AORR 4A ARRO. +18 236
|
||||
>Caleb_Pittman: EEIPRTZ A2 TR.PEZE +54 362
|
||||
#note Yep
|
||||
>Evan_Bialo: AAU C2 AU.A +8 244
|
||||
>Caleb_Pittman: AGILLTV M9 VIT.A +24 386
|
||||
#note Probably better to just play for equity, but this is fine
|
||||
>Evan_Bialo: ARW B6 WAR +40 284
|
||||
>Caleb_Pittman: AEGLLUY 1B GLEY +14 400
|
||||
#note Just tryna block
|
||||
>Evan_Bialo: CEL 2C .LEC +19 303
|
||||
>Caleb_Pittman: ADEJLNU 11K JE. +39 439
|
||||
#note Best
|
||||
>Evan_Bialo: LNNOST K4 SNOT +28 331
|
||||
#note Best
|
||||
>Caleb_Pittman: ADLNU 7E .UD +19 458
|
||||
#note I should threaten an out instead, but this is good
|
||||
>Evan_Bialo: LN 6E .N +8 339
|
||||
>Caleb_Pittman: ALN 10B L. +8 466
|
||||
#note Best
|
||||
>Evan_Bialo: L 7I L.. +4 343
|
||||
>Evan_Bialo: (AN) +4 347
|
||||
Vendored
+41
@@ -0,0 +1,41 @@
|
||||
#character-encoding UTF-8
|
||||
#player1 Su_Edwards Su Edwards
|
||||
#player2 Caleb_Pittman Caleb Pittman
|
||||
>Su_Edwards: AY H7 YA +10 10
|
||||
#note love the vert open!
|
||||
>Caleb_Pittman: ACDEELO I6 COLED +18 18
|
||||
#note Bad, I overvalue AE. Held for awhile
|
||||
>Su_Edwards: ACEILRT G8 ARTICLE +64 74
|
||||
>Caleb_Pittman: AEEGOOW F13 AWE +29 47
|
||||
#note I can do better. Not by that much, though
|
||||
>Su_Edwards: EGV E11 VEG +18 92
|
||||
>Caleb_Pittman: EEFGOOS D12 FEE +27 74
|
||||
#note GE(E)SE is best
|
||||
>Su_Edwards: EFNO J9 FOEN +19 111
|
||||
#note I knew this was phony. But I kept it, for some reason
|
||||
>Caleb_Pittman: ?GOORRS 11J .RGO +10 84
|
||||
>Su_Edwards: GIT 15B GIT +13 124
|
||||
>Caleb_Pittman: ?NORRSZ L11 .ROSZ +50 134
|
||||
#note Best, especially since it got challenged
|
||||
>Su_Edwards: ADEIIPY - +0 124
|
||||
>Caleb_Pittman: ?BIINOR B8 ORBItIN. +82 216
|
||||
#note Best bingo. But (Z)ORI may be better
|
||||
>Su_Edwards: AIKN A5 AKIN +30 154
|
||||
>Caleb_Pittman: DEOORTU N9 OUTRODE +76 292
|
||||
#note Looks good
|
||||
>Su_Edwards: ?AAIMST O3 TAMArIS +79 233
|
||||
>Caleb_Pittman: AAEINTU N4 TUNA +17 309
|
||||
#note bad
|
||||
>Su_Edwards: LUV M3 LUV +23 256
|
||||
>Caleb_Pittman: AEEHIIM O13 HEM +44 353
|
||||
#note Points.
|
||||
>Su_Edwards: IJN M7 JIN +40 296
|
||||
>Caleb_Pittman: AEEIIPY B2 YIPE +20 373
|
||||
#note Defense
|
||||
>Su_Edwards: AHP A1 PAH +34 330
|
||||
>Caleb_Pittman: AEINRWX L6 WAX +48 421
|
||||
#note Obviously
|
||||
>Su_Edwards: EOQTU K2 QUOTE +33 363
|
||||
>Caleb_Pittman: BDEINRS 4B .REBINDS +80 501
|
||||
#note And that's that
|
||||
>Caleb_Pittman: (DLS) +8 509
|
||||
@@ -0,0 +1,154 @@
// Package selfplay drives greedy AI-vs-AI Scrabble games used to validate the move
// generators (the same position is offered to both) and to benchmark them.
package selfplay

import (
	"math/rand"
	"sort"
	"time"

	"scrabble-solver/board"
	"scrabble-solver/rack"
	"scrabble-solver/rules"
	"scrabble-solver/scrabble"
)

// blankTile marks a blank in the bag and in a player's hand.
const blankTile byte = 0xff

// Bag is a shuffled draw pile of tiles.
type Bag struct {
	tiles []byte
}

// NewBag fills a bag from the ruleset's tile counts and blanks and shuffles it with the
// given seed (so games are reproducible).
func NewBag(rs *rules.Ruleset, seed int64) *Bag {
	var tiles []byte
	for i, n := range rs.Counts {
		for range n {
			tiles = append(tiles, byte(i))
		}
	}
	for range rs.Blanks {
		tiles = append(tiles, blankTile)
	}
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(tiles), func(i, j int) { tiles[i], tiles[j] = tiles[j], tiles[i] })
	return &Bag{tiles: tiles}
}

// Len returns the number of tiles left in the bag.
func (b *Bag) Len() int { return len(b.tiles) }

// Draw removes up to n tiles from the bag and returns them.
func (b *Bag) Draw(n int) []byte {
	if n > len(b.tiles) {
		n = len(b.tiles)
	}
	out := b.tiles[len(b.tiles)-n:]
	b.tiles = b.tiles[:len(b.tiles)-n]
	return out
}

// rackOf builds a generation rack from a hand of tiles.
func rackOf(tiles []byte, size int) rack.Rack {
	r := rack.New(size)
	for _, t := range tiles {
		if t == blankTile {
			r.AddBlank()
		} else {
			r.Add(t)
		}
	}
	return r
}

// removeUsed returns the hand with the tiles consumed by m removed.
func removeUsed(tiles []byte, m scrabble.Move) []byte {
	out := append([]byte(nil), tiles...)
	for _, p := range m.Tiles {
		want := p.Letter
		if p.Blank {
			want = blankTile
		}
		for i, t := range out {
			if t == want {
				out = append(out[:i], out[i+1:]...)
				break
			}
		}
	}
	return out
}

// greedy returns the highest-scoring move (ties broken by canonical key for
// reproducibility), or ok=false if there is no legal move.
func greedy(gen scrabble.Generator, b *board.Board, rk rack.Rack, mode scrabble.Mode) (scrabble.Move, int, bool) {
	moves := gen.GenerateMoves(b, rk, mode)
	if len(moves) == 0 {
		return scrabble.Move{}, 0, false
	}
	sort.Slice(moves, func(i, j int) bool {
		if moves[i].Score != moves[j].Score {
			return moves[i].Score > moves[j].Score
		}
		return moves[i].Key() < moves[j].Key()
	})
	return moves[0], len(moves), true
}

// Result summarizes a finished game.
type Result struct {
	Turns          int           // turns taken (plays plus passes)
	Plays          int           // scoring plays made
	Scores         [2]int        // final score per player
	MovesGenerated int           // total legal moves generated across all turns
	GenTime        time.Duration // time spent generating moves
}

// PlayGame plays one greedy AI-vs-AI game with generator gen and returns its result. If
// observe is non-nil it is called before each turn with a clone of the board and the
// player's rack, so a caller can compare generators on identical positions.
func PlayGame(rs *rules.Ruleset, gen scrabble.Generator, mode scrabble.Mode, seed int64,
	observe func(b *board.Board, rk rack.Rack)) Result {

	const maxTurns = 300
	bag := NewBag(rs, seed)
	b := board.New(rs.Rows, rs.Cols)
	hands := [2][]byte{bag.Draw(rs.RackSize), bag.Draw(rs.RackSize)}

	var res Result
	passes := 0
	for turn := range maxTurns {
		p := turn % 2
		rk := rackOf(hands[p], rs.Size())
		if observe != nil {
			observe(b.Clone(), rk.Clone())
		}
		res.Turns++

		t0 := time.Now()
		m, n, ok := greedy(gen, b, rk, mode)
		res.GenTime += time.Since(t0)
		res.MovesGenerated += n
		if !ok {
			if passes++; passes >= 4 {
				break
			}
			continue
		}
		passes = 0
		scrabble.Apply(b, m)
		res.Scores[p] += m.Score
		res.Plays++
		hands[p] = removeUsed(hands[p], m)
		if need := rs.RackSize - len(hands[p]); need > 0 {
			hands[p] = append(hands[p], bag.Draw(need)...)
		}
		if len(hands[p]) == 0 && bag.Len() == 0 {
			break
		}
	}
	return res
}
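The `greedy` helper in the file above picks the top score and breaks ties by a canonical key, so the chosen move does not depend on the order in which a generator emits moves. A minimal sketch of that selection rule follows; the `move` struct with a plain `Key` string field is a hypothetical stand-in for `scrabble.Move` and its `Key()` method, and the key strings are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// move is a hypothetical stand-in for scrabble.Move: a score plus a
// canonical key that identifies the play.
type move struct {
	Key   string
	Score int
}

// best returns the highest-scoring move, breaking ties by the smaller key.
// It sorts a copy so the caller's slice is left untouched.
func best(moves []move) (move, bool) {
	if len(moves) == 0 {
		return move{}, false
	}
	sorted := append([]move(nil), moves...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Score != sorted[j].Score {
			return sorted[i].Score > sorted[j].Score
		}
		return sorted[i].Key < sorted[j].Key
	})
	return sorted[0], true
}

func main() {
	a := []move{{"8H CAT", 10}, {"8D ACT", 10}, {"7A AT", 4}}
	b := []move{{"7A AT", 4}, {"8D ACT", 10}, {"8H CAT", 10}}
	ma, _ := best(a)
	mb, _ := best(b)
	fmt.Println(ma.Key == mb.Key) // same pick regardless of input order
}
```

Since only the first element is needed, a single linear scan with the same comparison would give the same answer without sorting; the sort form is used here because it mirrors the shape of the original function.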
@@ -0,0 +1,33 @@
package selfplay_test

import (
	"testing"

	"github.com/iliadenisov/alphabet"

	"scrabble-solver/internal/dictdawg"
	"scrabble-solver/internal/wordlist"
	"scrabble-solver/rules"
	"scrabble-solver/scrabble"
	"scrabble-solver/selfplay"
)

func TestPlayGameSmoke(t *testing.T) {
	rs := rules.English()
	words := wordlist.Encode([]string{
		"cat", "cats", "car", "care", "cares", "cot", "cap", "cab", "at", "as",
		"tea", "eat", "ear", "era", "are", "oat", "oats", "sat", "set", "sea",
		"tar", "tars", "star", "arts", "rat", "rats", "ace", "aces", "scar", "scare",
	}, alphabet.Latin(), 2, 15)
	f, err := dictdawg.Build(alphabet.Latin(), words)
	if err != nil {
		t.Fatal(err)
	}
	gen := scrabble.NewDAWGGenerator(rs, f)

	res := selfplay.PlayGame(rs, gen, scrabble.Both, 1, nil)
	if res.Turns == 0 || res.Plays == 0 {
		t.Errorf("degenerate game: %+v", res)
	}
	t.Logf("smoke game: turns=%d plays=%d scores=%v", res.Turns, res.Plays, res.Scores)
}