The rising Latin vocabulary wall

New headwords a reader must add at each tier: DCC core 1000 → Cicero, Divinatio in Caecilium → MGH Libelli de Lite.

1 · Dickinson (DCC) Latin Core

foundation · all eras

997 headwords — the starting vocabulary

The roughly 1,000 most frequent words across canonical Latin (997 headwords as counted here). Covers ~85% of the running words in the Divinatio.

↓ add

2 · Cicero, Divinatio in Caecilium

1st c. BCE · classical

+466 new headwords beyond the core

One short forensic speech — 5,818 running words, 73 sections. Mostly judicial vocabulary (accuso, quaestor, accusator).

↓ add

3 · MGH Libelli de Lite

11th–12th c. · medieval

+5,026 new headwords beyond core + Divinatio

764-page multi-author collection — ~378,000 running words. A different, ecclesiastical lexicon (episcopus, ecclesia, papa, concilium, canon). Estimate: ~26% of OCR tokens unrecognised.

New headwords added at each tier

DCC core 997; Divinatio +466; Libelli de Lite +5,026.

Cumulative vocabulary you must know

DCC core · 997
Divinatio · +466 · running total 1,463
Libelli de Lite · +5,026 · running total ≈6,489
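
The running totals above are a plain cumulative sum of the per-tier increments. A minimal sketch that reproduces them from the three figures given here (the variable names are illustrative only):

```python
from itertools import accumulate

# New headwords per tier, as reported in this document.
tiers = [
    ("DCC core", 997),
    ("Divinatio in Caecilium", 466),
    ("Libelli de Lite", 5026),  # an estimate, taken from noisy OCR
]

# Cumulative vocabulary after each tier.
running = list(accumulate(added for _, added in tiers))

for (name, added), total in zip(tiers, running):
    print(f"{name}: +{added:,} -> running total {total:,}")

# Size of the Libelli increment relative to the Divinatio increment.
ratio = tiers[2][1] / tiers[1][1]
print(f"Libelli / Divinatio increment: {ratio:.1f}x")
```

Because the Libelli figure is itself an estimate, the final running total inherits that uncertainty, which is why it is shown with ≈.
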
The Libelli jump is about 11× the Divinatio's (5,026 vs 466), driven by length, many authors, and a medieval church lexicon. It is front-loaded, though: about 30% of the 5,026 new words occur only once, while the ~300 most frequent new headwords already cover half of all new-word occurrences and the top ~1,000 cover ~77%. The active learning core is therefore a few hundred ecclesiastical terms, followed by a long tail of rare words.
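
The two statistics in this paragraph (the share of words occurring once, and the coverage of the top-k headwords) can be computed from a table of headword counts. The sketch below assumes such a table is available as a `Counter`. The function names and the toy counts are hypothetical and are not the real Libelli data.

```python
from collections import Counter

def hapax_share(freqs: Counter) -> float:
    """Fraction of headwords that occur exactly once."""
    return sum(1 for c in freqs.values() if c == 1) / len(freqs)

def top_k_coverage(freqs: Counter, k: int) -> float:
    """Fraction of all occurrences covered by the k most frequent headwords."""
    total = sum(freqs.values())
    covered = sum(count for _, count in freqs.most_common(k))
    return covered / total

# Hypothetical toy counts for illustration only (NOT measured values).
toy = Counter({
    "episcopus": 40, "ecclesia": 30, "papa": 15, "concilium": 10,
    "canon": 3, "simonia": 1, "anathema": 1,
})

print(f"hapax share: {hapax_share(toy):.0%}")
print(f"top-2 coverage: {top_k_coverage(toy, 2):.0%}")
```

On real counts, calling `top_k_coverage` with k = 300 and k = 1000 would give the kind of figures quoted above.
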

Method & caveats. Texts were lemmatised with the Latin-macronizer morphological database (enclitic splitting, frequency disambiguation, capitalisation-based proper-noun routing) and compared against the DCC Latin Core (997 headwords). The Divinatio was taken from clean Perseus XML, so its count is robust to about ±5–10 headwords. The Libelli is noisy Internet Archive OCR in medieval orthography, so its figure is an estimate: read 5,026 as “on the order of 5,000 classical-recognisable headwords”. Apparatus abbreviations, Roman numerals, and medieval spellings (aecclesia = ecclesia) limit precision in both directions. Source: DCC Latin Core Vocabulary.
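
As a rough illustration of the comparison step, the sketch below counts headwords not already known, using a set difference. It is not the Latin-macronizer pipeline: `LEMMAS` and `CORE` are tiny hypothetical stand-ins for the morphological database and the DCC list, only one medieval spelling is normalised, and frequency disambiguation and proper-noun routing are omitted.

```python
import re

# Hypothetical stand-ins for the morphological database and the core list.
LEMMAS = {
    "ecclesia": "ecclesia", "ecclesiae": "ecclesia",
    "episcopus": "episcopus", "episcopi": "episcopus",
    "et": "et", "populus": "populus", "senatus": "senatus",
}
CORE = {"et", "populus", "senatus"}

# Crude Roman-numeral filter. It also drops real words such as "vi",
# one way such filtering can undercount.
ROMAN = re.compile(r"^[ivxlcdm]+$")

def normalise(token: str) -> str:
    """Lower-case and undo one medieval spelling (aecclesia -> ecclesia)."""
    t = token.lower()
    if t.startswith("aeccles"):
        t = "eccles" + t[len("aeccles"):]
    return t

def lemma_of(token: str):
    """Look up a headword, trying an enclitic split (-que, -ne, -ve) as fallback."""
    t = normalise(token)
    if t in LEMMAS:
        return LEMMAS[t]
    for enclitic in ("que", "ne", "ve"):
        if t.endswith(enclitic) and t[: -len(enclitic)] in LEMMAS:
            return LEMMAS[t[: -len(enclitic)]]
    return None  # unrecognised token (OCR noise, abbreviation, unknown word)

def new_headwords(tokens, known):
    """Return (set of headwords not in `known`, count of unrecognised tokens)."""
    found, unrecognised = set(), 0
    for tok in tokens:
        if ROMAN.match(tok.lower()):
            continue  # skip Roman numerals
        lemma = lemma_of(tok)
        if lemma is None:
            unrecognised += 1
        elif lemma not in known:
            found.add(lemma)
    return found, unrecognised

sample = ["Senatus", "populusque", "aecclesiae", "episcopi", "XII", "eccl."]
print(new_headwords(sample, CORE))
```

In this toy run the apparatus-style abbreviation `eccl.` is counted as unrecognised instead of being credited to ecclesia, which shows in miniature why the Libelli figure should be read as an estimate.
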