Matrix Hub · Governance Corpus · Formula Dossier · EpiDoc Workshop · Week 13 / 第十三周

Formulaic Language: The Premise of Prediction — bundled resource 公式化语言:预测的前提 — 总集资源

A consolidated Week 13 module: the lecture, 9 interactive challenges (Latin + Greek), the four full-text editions, a three-engine restoration workbench (DeepSeek · Ithaca-style simulation · Aeneas), and a complete corpus-connectivity dashboard.第十三周整合教学模块:讲义、九项互动练习(拉丁与希腊)、四部全文校勘本、三引擎修复工作台(DeepSeek · Ithaca 式模拟 · Aeneas),以及语料网状导览之全。
Latin · 拉丁 Greek · 希腊 Bundle · 总集

§0 · The four-case mosaic§0 · 四案概览

Four full-text editions, one premise: formula = predictability = restorability.四部全文校勘本,一个前提:套语即可预测性,可预测性即可补全性。

This module argues one claim experientially across four cases. Each is a different way the formulaic-ness of public-facing inscriptions becomes a tool for filling lacunae — and a different way that tool reaches its limit. The cards below open the corresponding full-text edition; the section navigator above takes you to the four challenges that test each formulaic mechanism.

本单元以四案为体,演示同一命题:公共铭文之公式化既是补全残文之利器,亦各自显示其极限。下列卡片导向相应之全文校勘本;上方导航通往检验各类机制之四项练习。

Case A · Imperial Latin案 A · 帝政期拉丁
Laudatio Turiae 图利娅颂辞
CIL VI 1527 = ILS 8393 · c. 8–2 BCE · Augustan Rome
The longest preserved private Latin funerary inscription; type-specimen of the laudatio funebris genre, with an 8-slot formula bank (Mantzilas 2017). Gordon's 1949 fragment supplied right-column lines II 0–9 and proved Mommsen wrong on 7 of 8 testable lines (see §III).现存最长私家拉丁丧葬铭文,丧葬颂辞体裁之典型,八槽位结构(Mantzilas 2017)。Gordon 1949 年发现之残片提供右列 II 0-9 行,并证 Mommsen 八行中误七(参 §III)。
Slot focus: virtue-catalogue · counterfactual closer · proscription-lament
Case B · Classical Greek案 B · 古典期希腊
Segesta decree 塞格斯塔条约
IG I³ 11 = ML 37 · 458/7 or 418/7 BCE — the dating crux
The single most consequential Athenian decree of the 5th c. for Empire chronology — its date pivots on three restored letters. Habr]on (Mommsen-Wade-Gery, three-bar sigma → 458) vs Antiph]on (Chambers, Gallucci & Spanos 1990 laser-scan → 418). See Challenge VIII.五世纪雅典法令对帝国年代学最具决定性者,其纪年仅系于三字之补。Habr]on(Mommsen-Wade-Gery,三划西格玛 → 458 年)对 Antiph]on(Chambers 等 1990 年激光扫描 → 418 年),见挑战八。
Slot focus: prescript-archon · enactment-formula · oath-clause
Case C · Classical Greek案 C · 古典期希腊
Athenian Tribute Lists 雅典贡金表
IG I³ 259–290 · 454/3–415/4 BCE · 39 annual lists
Formula at industrial scale. 40 years of annual aparche records on stoichedon grids; the same prescript-formula reused; cities listed in alphabet-blocks. Year 1 + Year 2 + Year 9 + Year 39 anchor the comparative view; the 425/4 Thoudippos reassessment is the discontinuity bar. Two editorial schools (ATL maximal vs Paarmann minimal) disagree about what a formula entitles an editor to restore — see Challenge VII.工业规模之公式化文本。四十年间方阵网格上之年度初奉(aparche)记录,同一前言套语周而复始,城邦按字母排列。第 1、2、9、39 年构成比较锚点,公元前 425/4 年 Thoudippos 重新评估即比较图中之断裂点。两编辑学派(ATL 最大派对 Paarmann 最小派)就一条套语究竟授权编者写下什么意见相左 — 见挑战七。
Slot focus: year-prescript · alphabet-block · aparche-quota
Case D · Imperial Greek+Latin案 D · 帝政期希腊与拉丁
Persicus on Artemision Persicus 关于阿尔忒弥斯神庙的诏书
IK Ephesos 17–19 · AD 44 · proconsular edict, Ephesos
A proconsul (Paullus Fabius Persicus, cos. AD 34) regulates the finances and personnel of the Artemision in formal bilingual register. The Greek text translates from the Latin original; comparison reveals how proconsular formulae cross the language barrier, and where translation chooses one register over another. Bridges the Latin and Greek tracks of this module.公元 34 年执政官 Paullus Fabius Persicus 以双语正式语气规范阿尔忒弥斯神庙之财政与人员。希腊文译自拉丁本;对照可见地方总督套语如何越语言之界,译文又如何在不同语域中取舍。本案桥接本单元之拉丁与希腊两线。
Slot focus: proconsular-heading · reform-substantive · publication-clause
Where to read each in full. Click any card above. All four editions are part of the Governance Corpus and follow the SCPP standard (line-keyed apparatus, hover-glossary, EpiDoc TEI XML downloads). The Laudatio's §III "Restoration apparatus — Mommsen and after" demonstrates the box-score format used in Challenge V and Challenge IX below. 各案全文。点击上方卡片。四部校勘本皆属公共文书语料库,依 SCPP 标准(行键校勘栏、悬浮词汇、EpiDoc TEI XML 下载)。图利娅颂辞 §III"校勘谱:Mommsen 及其后继"体现下方挑战五与挑战九所用之对照格式。

The Premise前提

Restoration as prediction — and what the assumption costs.作为预测的修复,以及这一前提的代价。

Epigraphy is, to an unusual degree, a predictive discipline. The stones reach us broken, abraded, re-used; the editor's task is to complete them. Completion is possible at all only because Roman inscriptions are formulaic — they repeat. A funerary altar, a votive plaque, a building dedication, an honorific base: each draws on a small, well-worn repertoire of phrases. Where the stone fails, the formula carries on.

The standard reference work states the method without hesitation:

铭文学在相当程度上是一门预测性学科。石头传到我们手中时已残破、磨蚀、被改作他用;编者的任务即是将其补全。补全之所以可能,唯因罗马铭文具有公式化特征,它们不断重复。墓祭坛、还愿牌、建筑题献、荣誉基座:每一类都取材于一套用熟了的、数量有限的套语。石头中断处,套语接续。

标准参考著作毫不犹豫地陈述了这一方法:

"Paying attention to epigraphic patterns is of crucial significance, since formulae are very common in Latin inscriptions. … if the text is carved on a cippus of low-quality stone with the letters M S clearly legible in the first line, it is a safe bet that we are dealing with an epitaph introduced by the common formula [D(is)] M(anibus) s(acrum). … Sound knowledge of epigraphic Latin and the possibility of referring to comparable elements from other inscriptions of similar type will finally allow the epigrapher to complete lacunae in the text by restoring missing letters." “留意铭文的程式至关重要,因为套语在拉丁铭文中极为常见。……若文本刻于一方劣质石料的界碑(cippus)之上,首行M S二字清晰可辨,则几可断定我们面对的是一方以常见套语[D(is)] M(anibus) s(acrum)起首的墓志。……对铭文拉丁语的扎实掌握,加之可参照同类铭文中的可比成分,最终将使铭文学者得以补出残缺、还原失字。” Bruun & Edmondson (eds.), The Oxford Handbook of Roman Epigraphy (Oxford 2015), p. 14 (ch. 1, "The Epigrapher at Work").

This is predictive restoration. It has three moving parts: a corpus of parallels to consult, a repertoire of formulae to recognise, and an editor willing to bet. Digital corpora have made the first part enormous. The Epigraphik-Datenbank Clauss/Slaby (EDCS) holds well over half a million Latin texts; the export analysed throughout this module contains 537,262 inscriptions. A scholar can now ask, in seconds, how a damaged phrase was completed across every comparable inscription, and restore by the weight of parallels. Week 12's C.101 case study — six imperial documents carried on one stele — was an exercise in exactly this formulaic, official language.

This module does not dispute that restoration works. It asks a narrower and harder question: where does it stop working, and how would we know? Our instrument is the most ordinary feature of a Latin inscription — the abbreviation. Abbreviations are where the formulaic system both succeeds most visibly and fails most quietly. And the Oxford Handbook's own Appendix II, its list of epigraphic abbreviations, supplies the evidence — while, as we shall see, quietly contradicting the confident method of page 14.

这便是预测性修复。它由三个部件构成:一个可供查检的平行文本语料库、一套需要辨识的套语谱系,以及一位愿意下注的编者。数字语料库已使第一个部件变得极为庞大。Clauss/Slaby 铭文数据库(EDCS)收录的拉丁文本远超五十万条;本单元通篇分析所用的导出库即含537,262条铭文。学者如今可在数秒之内查得某一残损语句在所有同类铭文中是如何被补全的,并依平行文本之众寡作出修复。第十二周的 C.101 个案,一方石碑承载六份帝国文书,所演练的,正是这种公式化的官方语言。

本单元并不否认修复有效。它所追问的是一个更狭窄、也更棘手的问题:它在何处失效?我们又凭何得知?我们的探针是拉丁铭文中最寻常的特征:缩写。缩写既是公式化系统最显眼的成功之处,也是它最悄无声息的失败之处。而牛津手册自身的附录二(铭文缩写表)恰恰提供了证据,并且,如我们将见,悄然推翻了该书第14页那一套自信的方法。

Thesis under test受检命题

Predictive restoration assumes that epigraphic Latin is predictable enough that a lacuna can be filled from formula and parallel. The five challenges below treat that assumption as a hypothesis, not a given, and test it against two bodies of evidence: the abbreviation menus of OHRE Appendix II, and the attested expansion frequencies of the EDCS corpus. 预测性修复假定铭文拉丁语足够可预测,以致残缺可凭套语与平行文本补足。下列五项挑战将此假定视为一项假说而非既定前提,并以两类证据加以检验:OHRE 附录二所列缩写的菜单,以及 EDCS 语料库中各项展开式的实际频次。

What predictive restoration does预测性修复之运作
[Diagram] The stone, as it survives: ▢ M S (first letter lost) → Predictive restoration: a corpus of parallels (EDCS: 537,262); a repertoire of formulae; an editor willing to bet → The text, as published: [D(is)] M(anibus) s(acrum) (a prediction, printed as fact).

The square bracket is the join between evidence and inference. Challenge I asks what the bracket leaves out; Challenge III asks where step two's "corpus of parallels" actually comes from.方括号是证据与推断的接合处。挑战一追问方括号略去了什么;挑战三追问第二步的“平行文本语料库”究竟从何而来。

The repertoire套语谱系

Before testing the limits, see the system working. These are the formulae a Roman epitaph or votive is built from — the "comparable elements" the Handbook tells the editor to consult. Each card pairs the abbreviation with its expansion and a real inscription in which it survives. Predictive restoration is the claim that, where one of these breaks off, the rest of the formula carries the reading across the gap.

在检验其限度之前,先看这一系统如何运作。下列即罗马墓志或还愿铭赖以构成的套语,手册嘱编者参照的“可比成分”。每张卡片将缩写、其展开式与一方实际存留之铭文并置。预测性修复所主张的是:当其中之一断裂时,套语之余可承载读法、越过缺口。

The five challenges五项挑战

I · Branching factor. Predictability is a measurable gradient, not a property an abbreviation simply has or lacks.分支因子。可预测性是一个可度量的连续梯度,而非缩写或有或无的属性。
II · The flagship's hidden premise. Even D M, the Handbook's own "safe bet", turns ambiguous once context is gone, by the Handbook's own appendix.旗舰的隐含前提。即便是 D M,手册自称的“稳妥之选”,一旦失去语境,按手册自己的附录亦将不确定。
III · Circularity. The database we predict from is itself largely built by prediction.循环论证。我们据以预测的数据库,本身在很大程度上正是由预测构建的。
IV · Context collapse. The lacuna destroys the very context that would disambiguate the abbreviation.语境塌缩。残缺所摧毁的,恰是能够消解缩写歧义的那个语境。
V · The machine. A language model is predictive restoration with the brakes removed: fluent, confident, and never silent.机器。语言模型是卸去了制动器的预测性修复,流畅、自信、且从不沉默。

Work through them in order; each rests on the one before. Challenges II and V can call a live language model (DeepSeek) — open ⚙ API above to add a key — but every challenge is fully usable without one.请依次研习;每一项都建立在前一项之上。挑战二与挑战五可调用在线语言模型(DeepSeek);点击上方 ⚙ API 添加密钥,但每一项挑战在无密钥时亦可完整使用。

The Abbreviation Menu缩写菜单

What Appendix II lists — and what the corpus actually attests.附录二所列者,与语料库的实际见证。

Appendix II of the Oxford Handbook is a list of common abbreviations used in Latin inscriptions. It is indispensable — and it is candid about its own limits. Its opening paragraph is, in effect, the first challenge to predictive restoration, written by the predictors themselves:

牛津手册附录二是一份拉丁铭文常用缩写表。它不可或缺,而且对自身的局限直言不讳。其开篇一段,实际上正是预测性修复所面对的第一项挑战,且出自预测者本人之手:

"This list … lays no claim to completeness. … a reader of epigraphic texts will discover that the ingenuity of the Roman stonecutter and/or his clients surpassed what modern compilers of wordlists have accomplished. Even with the help of an extensive list some puzzles will remain. As illustrated below, a single initial letter can occasionally be used as an abbreviation for a great variety of words." “本表……无意求全。……铭文文本的读者将会发现,罗马石匠及/或其主顾的巧思,超出了现代词表编纂者所能企及。即便借助一份详尽的清单,若干谜题仍将存留。如下文所示,一个单独的首字母,有时可用作种类繁多的词语之缩写。 OHRE (2015), Appendix II, "Epigraphic Abbreviations", p. 787.

To make this measurable we set the Handbook's menu beside the corpus. In EDCS the convention is the standard epigraphic one: the letters actually on the stone are printed plain, and the editor's completion follows in round brackets — D(ecimi), M(anibus), co(n)s(ul). Every such token in 537,262 inscriptions can therefore be harvested mechanically, and the engraved abbreviation tallied against each word the editors expanded it into.

The table below pairs, for each single-letter abbreviation, three figures: the size of its OHRE Appendix II menu (the curated teaching list); its total EDCS attestations; and the Shannon entropy of its expansion distribution, in bits. Entropy measures the ambiguity in that attested distribution: it is low when one expansion dominates (the abbreviation is near-deterministic) and high when probability is spread across many (a guess). Sort by any column; the entropy column is the spine of Challenge I.

为使此点可度量,我们将手册的菜单与语料库并置。EDCS 采用铭文学的标准惯例:石上实刻的字母以正体印出,编者补足的部分随之置于圆括号内,如 D(ecimi)、M(anibus)、co(n)s(ul)。因此,537,262 条铭文中的每一个此类记号都可被机械地采集,并将所刻缩写与编者展开所得的每一个词逐一对勘。

下表为每个单字母缩写并列三项数据:其 OHRE 附录二菜单的规模(经遴选的教学清单);其在 EDCS 中的总见证数;以及其展开式分布的香农熵(以比特计)。熵衡量该实见分布中的歧义程度:当某一展开式占绝对优势时(缩写近乎确定),熵值低;当概率分散于众多展开式之间时(即一次猜测),熵值高。可按任意列排序;熵这一列是挑战一的脊柱。
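The mechanical harvest described above can be sketched in a few lines. This is a minimal illustration, not the pipeline behind the table: the regular expression, the function name `harvest`, and the sample lines are all assumptions made for the example, and a real EDCS export would need extra handling for partly bracketed completions and line-break marks.

```python
import re
from collections import Counter, defaultdict

# A single engraved letter followed at once by a round-bracket completion.
# The lookbehind rejects letters in the middle of a word, so co(n)s(ul)
# yields no single-letter token.
TOKEN = re.compile(r"(?<![A-Za-z)])([A-Za-z])\(([a-z]+)\)")


def harvest(texts):
    """Tally, per engraved letter, the words editors expanded it into."""
    tallies = defaultdict(Counter)
    for text in texts:
        for letter, completion in TOKEN.findall(text):
            tallies[letter.upper()][(letter + completion).lower()] += 1
    return tallies


# Hypothetical sample lines, not drawn from the EDCS export.
sample = [
    "D(is) M(anibus) s(acrum)",
    "D(is) M(anibus) C(aius) Iulius M(arci) f(ilius)",
    "h(ic) s(itus) e(st)",
    "co(n)s(ul)",
]
tallies = harvest(sample)
for letter in sorted(tallies):
    print(letter, dict(tallies[letter]))
```

Folding case together (so that `s(acrum)` and `S(acrum)` count under one letter) is a design choice of this sketch; whether the module's own table does the same is not stated.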

Ambiguity index — OHRE Appendix II × EDCS歧义指数,OHRE 附录二 × EDCS

Click a column header to sort. EDCS frequencies are lightly cleaned (obvious OCR fragments dropped); "EDCS tokens" counts attested X(…) tokens, not distinct inscriptions. The corpus shows hundreds of distinct expansions per letter — the Appendix II menu is a deliberate, pedagogical undercount.点击列首可排序。EDCS 频次经轻度清洗(剔除明显的 OCR 残片);“EDCS tokens”计的是所见证的 X(…) 记号数,而非不同铭文的条数。语料库为每个字母呈现数以百计的不同展开式,附录二的菜单是一种刻意的、为教学计的低估。

The tip of the iceberg — Appendix II menu vs. EDCS attested冰山一角,附录二菜单 对比 EDCS 实际见证

Above the waterline: the count of expansions Appendix II prints for each letter. Below: the distinct expansions the corpus actually attests. The curated teaching list is a handful; the real spread runs to many hundreds — and every one of them is a reading an editor might, in a lacuna, have to choose between.水线之上:附录二为每个字母所印展开式之数。水线之下:语料库实际见证的不同展开式之数。经遴选的教学清单只示数项;真实的散布则多达数百,而其中每一项,都是编者在残缺面前可能须加抉择的一种读法。

Read it this way. The OHRE menu tells you what an abbreviation could mean; the EDCS column tells you how often each meaning did occur. Predictive restoration lives in the gap between the two, and the entropy figure measures how wide that gap is. A low-entropy abbreviation can be restored almost mechanically; a high-entropy one cannot be "restored" at all, only guessed, however many parallels are consulted. 请如此解读。OHRE 菜单告诉你一个缩写可能意指什么;EDCS 这一列告诉你每个含义实际出现的频率。预测性修复就栖身于二者之间的缝隙,而熵值所度量的,正是这道缝隙有多宽。低熵缩写几乎可机械地补出;高熵缩写则根本无法“修复”,无论查检多少平行文本,都只能猜测。

Challenge I · The Branching Factor挑战一 · 分支因子

Predictability is a measurable gradient — not a property an abbreviation has or lacks.可预测性是可度量的连续梯度,而非缩写或有或无的属性。

The Handbook's method has a binary shape. Identify the formula; restore from it. On this picture an abbreviation either belongs to a recognised formula — and is therefore restorable — or it does not. But abbreviations do not divide so cleanly. Predictability is not a switch. It is a dial, and the dial has a number.

That number is entropy. For an abbreviation with attested expansions of probability p₁, p₂, …, the Shannon entropy H = −Σ pᵢ log₂ pᵢ measures, in bits, how much genuine uncertainty remains once you know only the letter. One bit is one yes/no question. Sorted by entropy, the single-letter abbreviations of Appendix II do not fall into two camps. They spread continuously across the whole range:

手册的方法具有二元结构。辨认套语;据以修复。在这一图景中,一个缩写要么归属于某个已识别的套语,因而可被修复,要么不然。但缩写并不如此泾渭分明。可预测性不是一个开关。它是一个旋钮,而旋钮上有一个数值。

这个数值就是熵。对于一个其各展开式概率为 p₁, p₂, … 的缩写,香农熵 H = −Σ pᵢ log₂ pᵢ 以比特度量:在仅知字母的前提下,尚存多少真实的不确定性。一个比特即一个是非问题。按熵排序,附录二的单字母缩写并不分作两营。它们连续地铺展于整个区间:

The ambiguity spectrum — every abbreviation on one axis歧义光谱,诸缩写同置一轴

Each abbreviation sits at its measured entropy. There is no gap and no two camps — only a continuous slide from the near-certain (E) to the near-arbitrary (P). "Restorable" is not a category an abbreviation is in; it is a position on this line.每个缩写都坐落于其实测熵值之上。没有断层,没有两营,只有一道从近乎确定(E)到近乎任意(P)的连续滑移。“可修复”并非缩写所属的类别;它是这条轴线上的一个位置。

The entropy gradient — single-letter abbreviations, EDCS熵梯度,单字母缩写,EDCS

Bars show Shannon entropy (bits) of the expansion distribution; the leading expansion is named. Green ≈ near-deterministic, red ≈ near-uniform.柱形表示展开式分布的香农熵(比特);并标注居首的展开式。绿≈近乎确定,红≈近乎均匀。
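The entropy behind each bar follows directly from the formula H = −Σ pᵢ log₂ pᵢ. The sketch below shows the calculation on expansion counts; the function name and both count tables are hypothetical illustrations, not the EDCS figures reported in this module.

```python
import math


def shannon_entropy(counts):
    """Entropy in bits of an expansion distribution given as {word: count}."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log2(n / total)
        for n in counts.values()
        if n > 0
    )


# HYPOTHETICAL counts, chosen only to contrast a peaked and a flat distribution.
peaked = {"est": 870, "et": 80, "ex": 50}
flat = {"pedes": 100, "publius": 100, "pater": 100, "posuit": 100}

print(f"peaked: {shannon_entropy(peaked):.2f} bits")
print(f"flat:   {shannon_entropy(flat):.2f} bits")
```

A distribution spread evenly over four readings needs exactly two yes/no questions, which is why the flat example comes out at two bits; a peaked one falls well below one.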

At the safe end sits E. In 21,148 attestations the bare letter E expands to est in 87% of cases — overwhelmingly the est of a funerary formula (h(ic) s(itus) e(st)). Its entropy, 1.26 bits, is barely more than a single yes/no question. Restoring a final [E] in h(ic) s(itus) [e(st)] is not a bet; it is a near-certainty. The predictive thesis is, here, simply correct.

At the other end sits P, entropy 5.32 bits. The corpus attests the single letter P expanding to pedes, Publius, pater, posuit, pius, pecunia, populus, plebs, pondo, pars, praetor, proconsul, provincia, perpetuus, parentes — and on. Appendix II's menu for P runs to nineteen words; the corpus runs to hundreds. A bare [P] in a lacuna is not restored from a formula; it is filled by whatever the surviving neighbours happen to license. Strip those away and "restoration" becomes indistinguishable from naming the most frequent option.

The lesson is not that restoration is unreliable. It is that reliability is a continuous quantity that the editorial notation does not record. The Leiden Convention gives us the square bracket: […] means "these letters were restored". It cannot say whether the restoration was a 1.26-bit near-certainty or a 5.32-bit guess. On the printed page [E] and [P…] look identical. The bracket records that a prediction was made; it hides the entropy of the prediction. Challenge I asks the editor to carry, in the mind if not on the page, the number the bracket omits.

稳妥的一端是 E。在 21,148 例见证中,单独的字母 E 有 87% 展开为 est,绝大多数是墓葬套语中的 est(h(ic) s(itus) e(st))。其熵值 1.26 比特,仅略多于一个是非问题。补出 h(ic) s(itus) [e(st)] 中末尾的 [E] 并非下注,而近乎确定。预测命题在此,确然成立。

另一端是 P,熵值 5.32 比特。语料库见证单字母 P 展开为 pedes、Publius、pater、posuit、pius、pecunia、populus、plebs、pondo、pars、praetor、proconsul、provincia、perpetuus、parentes,不一而足。附录二为 P 所列菜单达十九词;语料库则数以百计。残缺中一个孤零零的 [P] 并非依套语补出;它是由残存的邻词所恰好允准者填入的。剥去那些邻词,“修复”便与“指认最高频选项”无从分辨。

其教训不在于修复不可靠,而在于:可靠性是一个连续的量,而编辑符号并不记录它。莱顿规约给了我们方括号:[…] 意为“这里字母系补出”。它无法说明这次补出是 1.26 比特的近乎确定,还是 5.32 比特的猜测。在印本上,[E] 与 [P…] 看来别无二致。方括号记录了曾作出一次预测;却隐去了该预测的熵。挑战一要求编者,即便不落于纸面,也须在心中,补上方括号所略去的那个数值。

Inspect an expansion distribution检视某一展开式分布

Choose an abbreviation. The bars are its attested expansions in EDCS, by frequency.选择一个缩写。柱形为其在 EDCS 中所见证的各展开式,按频次排列。

The branching factor, drawn — one abbreviation, many words画出分支因子,一缩写,多词语

Choose an abbreviation: it fans out to its attested expansions, each branch as thick as that reading is frequent. This is the "branching factor" the challenge is named for; the wider the fan, the less a bare letter in a lacuna can be said to be "restored" at all.选择一个缩写:它向各见证展开式扇形铺开,每一分支之粗细与该读法之频次相称。这便是本挑战所以得名的“分支因子”,扇面愈宽,残缺中一个孤立字母便愈难说是已被“修复”了。

Take-away要点: The first challenge to predictive restoration is not external. It is the entropy figure the editor already owes but never writes down. "Formulaic" is not true or false of an inscription; it is true to a degree, and the degree is measurable. 对预测性修复的第一项挑战并非来自外部。它就是编者本已亏欠、却从不写下的那个熵值。“公式化”对一方铭文而言并非真或假;它在某种程度上为真,而这个程度是可度量的。

Challenge II · The Flagship's Hidden Premise挑战二 · 旗舰的隐含前提

Even D M — the textbook's own "safe bet" — turns ambiguous once context is stripped away, by the textbook's own appendix.即便是 D M,教科书自称的“稳妥之选”,一旦剥离语境,按教科书自己的附录也将变得不确定。

Challenge I was statistical. Challenge II is textual, and sharper, because the witness against predictive restoration is the Oxford Handbook itself. On page 14 the Handbook offers its flagship example of a safe restoration — the funerary D M:

挑战一是统计性的。挑战二是文本性的,且更为锋利,因为指证预测性修复的证人正是牛津手册本身。在第14页,手册给出了它关于稳妥修复的旗舰范例,即墓葬套语 D M:

"… if the text is carved on a cippus of low-quality stone with the letters M S clearly legible in the first line, it is a safe bet that we are dealing with an epitaph introduced by the common formula [D(is)] M(anibus) s(acrum)." “……若文本刻于一方劣质石料的界碑之上,首行 M S 二字清晰可辨,则几可断定我们面对的是一方以常见套语 [D(is)] M(anibus) s(acrum) 起首的墓志。” OHRE (2015), p. 14.

Now turn 773 pages, to the same book's Appendix II. Under D M the Handbook lists seven expansions; under D M S, two:

现在翻过 773 页,到同一本书的附录二。在 D M 条下,手册列出七个展开式;在 D M S 条下,两个:

D M: Dea Magna · decurio municipii · Deum Mater · devotae memoriae · Dis Manibus · dolus malus · Dominus
D M S: Deo Mithrae sacrum · Dis Manibus sacrum

The contradiction is exact. Page 14 calls [D] M S a "safe bet" for a funerary epitaph; page 787 records that the very same three letters are also the standard opening of a Mithraic dedication — Deo Mithrae sacrum. A goddess (Dea Magna), a municipal councillor (decurio municipii), a legal clause warding off fraud (dolus malus), the cult of Mithras: the letters D M open all of them. The "safe bet" is safe only if you already know it is funerary — and what tells you that is not the letters but the context: the cippus, the cheap stone, the layout of an epitaph. Predictive restoration quietly imports that context as a free premise. A lacuna does not grant it for free. (This is the hinge into Challenge IV.)

Below are real inscriptions in which D M is not Dis Manibus. Decide each before revealing the answer.

这一矛盾分毫不差。第14页称 [D] M S 是墓志的“稳妥之选”;第787页却记载,同样这三个字母也是一篇密特拉教题献的标准起首:Deo Mithrae sacrum。一位女神(Dea Magna)、一名市议员(decurio municipii)、一条防范欺诈的法律条款(dolus malus)、密特拉崇拜:字母 D M 为它们一一起首。“稳妥之选”唯有在你已经知道它属丧葬时才稳妥,而告知你这一点的并非字母,而是语境:界碑、廉价石料、墓志的版式。预测性修复悄然将这一语境当作免费的前提引入。残缺却不会免费奉送它。(这里即通往挑战四的枢纽。)

下列是 D M 并非 Dis Manibus 的真实铭文。请先自行判断,再揭晓答案。

Disambiguation cases — what does D M expand to here?消歧个案,这里 D M 展开为何?

Case A · a votive plaque个案甲 · 一方还愿牌

Case B · a clause in a testament个案乙 · 遗嘱中的一项条款

Case C · the trap inside the formula个案丙 · 套语之内的陷阱

This epitaph really does open D(is) M(anibus) s(acrum). But read on: decurio m(unicipii) M(ustitani). The letter M here is municipii; the next M is the town's name. Within one inscription the abbreviation M is resolved three different ways. What licenses each is position and neighbour: context, again.这方墓志确实以 D(is) M(anibus) s(acrum) 起首。但读下去:decurio m(unicipii) M(ustitani)。这里的字母 M 是 municipii;下一个 M 是城镇之名。一方铭文之内,缩写 M 以三种不同方式解读。允准各自读法者,乃位置与邻词,又是语境。

Case D · the same letters, a cult个案丁 · 同样的字母,一种崇拜

A dedication to the Mater deum Magna, the Great Mother, and Attis. Appendix II lists D M = Dea Magna and Deum Mater: were this dedication damaged down to its first line, [D] M would be a perfectly good restoration, and nothing about the funerary frequency of D M would warn you the genre was wrong.一篇献给 Mater deum Magna(伟大之母)与阿提斯的题献。附录二列 D M = Dea Magna 与 Deum Mater:倘此题献残损至仅余首行,[D] M 将是一个完全说得通的修复,而 D M 的丧葬频次,绝不会警示你文类已然错置。

Ask the prediction machine询问预测机器

Give DeepSeek the bare letters, no context, and ask it to expand them. Watch whether it hedges across the seven possibilities — or simply commits.把孤立的字母交给 DeepSeek,不予语境,请它展开。看它是否会在七种可能之间留有余地,抑或径直认定。

Take-away要点: The Handbook's flagship restoration is sound, but not for the reason it gives. D M is not restorable from the letters; it is restorable from the genre, and the genre is read off the context. When a book can call [D] M S a "safe bet" on p. 14 and, on p. 787, list seven expansions of D M and two of D M S without noticing the tension, the predictive method has a blind spot exactly the size of a lacuna. 手册的旗舰修复是稳妥的,但并非出于它所给的理由。D M 并非依字母可修复;它是依文类可修复,而文类需自语境读出。当一本书能在第14页称 [D] M S 为“稳妥之选”、又在第787页为 D M 列出七种展开式、为 D M S 列出两种而浑然不觉时,预测方法便有了一个盲点,其大小,恰好是一处残缺。

Challenge III · Circularity挑战三 · 循环论证

The database we predict from is itself, in large part, built by prediction.我们据以预测的数据库,本身在很大程度上正是由预测构建的。

Challenge II showed that restoring an abbreviation means choosing a genre. Challenge III asks where the editor gets the odds for that choice — and the answer is uncomfortable.

Database-based restoration has an obvious, principled rule: restore the expansion the corpus attests most often. It is the rule the frequency tables of this whole module are built to serve. But the rule hides a circularity. The corpus is not a record of inscriptions as they survive on stone. It is a record of inscriptions as editors have published them — and editors restore. Every […] in EDCS is a sequence of letters that is not on the stone: it is an earlier editor's prediction, printed as text.

How much of the corpus is prediction? In the 537,262-inscription export, the editorial bracket [ opens 1,086,124 times — an average of 2.02 restorations per inscription. The corpus is not a record of what the Romans wrote. It is a record of what the Romans wrote plus two centuries of what editors predicted they wrote — and the two are interleaved in the same strings, often indistinguishable to a frequency count.

挑战二表明,修复一个缩写意味着选择一种文类。挑战三追问:编者从何处获得这一选择的概率,答案令人不安。

基于数据库的修复有一条显而易见、且合乎原则的规则:补出语料库见证频次最高的展开式。这正是本单元全部频次表所要服务的规则。但这条规则隐含着一种循环。语料库并非石上残存的铭文记录,而是编者所刊布的铭文记录,而编者会作修复。EDCS 中的每一个 […] 都是一串并不在石头上的字母:它是前代编者的预测,被当作文本印出。

语料库中有多少是预测?在这份 537,262 条铭文的导出库里,编辑方括号 [ 开启了 1,086,124 次,平均每条铭文 2.02 处修复。语料库并非罗马人所书写者的记录,而是罗马人所书写者加上两个世纪以来编者所预测的书写,二者交织于同一串字符之中,对频次统计而言往往无从分辨。
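The average quoted above is plain division, and the count behind it is a character tally. The sketch below reproduces the arithmetic from the two figures given in the text and shows how such a tally could be taken; the function name and the two sample strings are assumptions. Counting every `[` is itself a simplification: it counts bracket openings rather than restored letters, and any export that also uses doubled brackets for erasures would inflate the figure.

```python
def restorations_per_inscription(texts):
    """Average number of opening editorial brackets per text."""
    openings = sum(text.count("[") for text in texts)
    return openings / len(texts)


# Figures as reported in this module for the EDCS export.
BRACKET_OPENINGS = 1_086_124
INSCRIPTIONS = 537_262
reported_average = BRACKET_OPENINGS / INSCRIPTIONS
print(f"{reported_average:.2f} restorations per inscription")

# Hypothetical two-text sample: three openings across two texts.
toy = ["[D(is)] M(anibus) s(acrum)", "h(ic) [s(itus)] e(st) [- - -]"]
print(restorations_per_inscription(toy))
```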

The loop — predict, publish, re-measure, predict循环,预测、刊布、再度量、预测
[Diagram] ① The EDCS corpus: 537,262 texts, already holding 1,086,124 [ ] restorations → ② The frequency: "D(is) is the commonest reading of D" (59%) → ③ A new restoration: the editor prints a fresh [D(is)] in a new lacuna and publishes → back to ①. The method is graded on a test it helped to write.

Step ③ feeds back into step ① — the new restoration is published, enters EDCS, and is counted the next time the frequency is measured. Right or wrong, it becomes evidence for the restoration after it.第③步回馈于第①步,新的修复一经刊布,便进入 EDCS,并在下一次度量频次时被计入。无论对错,它都成为其后那处修复的证据。

How much of an inscription is the editor? — A. Didius Gallus, EDCS-28500281铭文中有几分出自编者之手?A. Didius Gallus,EDCS-28500281

The circularity is now visible. We restore a damaged D as Dis because D(is) is the corpus's commonest expansion of D. But a large share of those D(is) attestations are themselves [D(is)] — restored, by exactly the reasoning we are now applying, by earlier editors. A frequency search cannot easily separate the D(is) that is on the stone from the [D(is)] that an editor supplied. The frequency that licenses the restoration was produced by restorations. Predict the mode and you do not discover what D meant; you ratify what editors have always assumed it meant — and you ratify their errors with their successes, because a wrong restoration, once printed, is counted as evidence for the next.

This is not a marginal worry. It is the methodological floor of quantitative epigraphy. Kaše, Heřmánková and Sobotková's 2022 study counts occupational terms across Roman cities to measure the division of labour — a genuinely valuable enterprise, and one this course recommends. But it counts them in a dataset (LIRE) derived from EDCS. Every quantitative result drawn from such a corpus inherits its restorations: the numbers describe, in part, the editorial tradition that built the corpus, not only the society that cut the stones. The honest quantitative claim is always conditional — "given the corpus as restored."

循环至此显形。我们将残损的 D 补为 Dis,因为 D(is) 是语料库中 D 最常见的展开式。但那些 D(is) 见证中,有很大一部分本身即是 [D(is)],即由前代编者以我们此刻所用的同一推理补出。频次检索难以将石上实有的 D(is) 与编者补足的 [D(is)] 区分开来。允准这次修复的频次,恰恰由修复所产生。预测众数,你并未发现 D 曾意指什么;你只是认可了编者一向假定它意指的东西,并将他们的谬误与成功一并认可,因为一处错误的修复一经印出,便被计为下一处修复的证据。

这并非边缘性的顾虑,而是定量铭文学的方法论地基。Kaše、Heřmánková 与 Sobotková 2022 年的研究统计罗马各城市的职业用语以衡量劳动分工,这是一项确有价值的工作,也是本课程所推荐者。但它是在一个由 EDCS 派生的数据集(LIRE)中作此统计。凡取自此类语料库的定量结果,都承袭了它的修复:那些数字所描述的,部分地是构建该语料库的编辑传统,而不仅是凿刻石头的那个社会。诚实的定量主张始终是有条件的:“就经过修复的语料库而言。”
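One partial remedy for the circularity is to count the expansions whose engraved letter survives separately from those that sit inside an editor's square brackets. The sketch below does this by tracking bracket balance; the regular expression and the name `split_by_evidence` are assumptions of the example, and tokens with a bracket inside the completion, such as M(anibu[s]), are simply skipped rather than classified.

```python
import re

# Same single-letter token pattern as a plain frequency harvest would use.
TOKEN = re.compile(r"(?<![A-Za-z)])([A-Za-z])\(([a-z]+)\)")


def split_by_evidence(text):
    """Return (on_stone, restored) lists of expansions found in one text."""
    on_stone, restored = [], []
    for match in TOKEN.finditer(text):
        before = text[: match.start()]
        # More '[' than ']' so far means the token lies inside a restoration.
        inside_brackets = before.count("[") > before.count("]")
        word = (match.group(1) + match.group(2)).lower()
        (restored if inside_brackets else on_stone).append(word)
    return on_stone, restored


# Hypothetical example line.
on_stone, restored = split_by_evidence("[D(is)] M(anibus) s(acrum)")
print("on stone:", on_stone)
print("restored:", restored)
```

A frequency table built only from the first list is smaller, but it is no longer graded on readings the method itself supplied.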

Take-away要点: Predictive restoration and the corpus that validates it are not independent. The method is graded on a test it helped to write. This does not make the corpus useless; it makes it a record of a tradition as much as of antiquity, and quantitative work must say so. 预测性修复,与验证它的那个语料库,并非彼此独立。该方法是在一份它自己参与撰写的考卷上受评的。这并不使语料库失去价值,它使语料库既是古代的记录,也同样是一种学术传统的记录,而定量研究必须如实声明这一点。

Challenge IV · Context Collapse挑战四 · 语境塌缩

The lacuna destroys the very context that would disambiguate the abbreviation.残缺所摧毁的,恰是能够消解缩写歧义的那个语境。

Challenge II located the disambiguating work in context. Challenge IV measures what happens when context is removed — which is, precisely, what damage does.

An abbreviation is a compression. It can be decompressed because the reader supplies what the writer omitted — and the reader supplies it from context: the genre of the text, the formula in progress, the words on either side. The cruel structure of epigraphic damage is that it attacks the abbreviation and its context together. The same break that leaves you needing to restore M also removes the lines that would have told you whether M is Manibus, municipii, miles, mater, memoriae, or the praenomen Marcus.

Step through an inscription as the stone breaks. Watch the candidate set widen:

挑战二将消歧之功定位于语境。挑战四则度量:当语境被移除时会发生什么,而移除语境,正是残损之所为。

缩写是一种压缩。它之所以能被解压,是因为读者补足了书写者所省略者,而读者是从语境中补足的:文本的文类、进行中的套语、左右两侧的词语。铭文残损的残酷结构在于:它同时攻击缩写及其语境。那道令你不得不修复 M 的裂口,也一并抹去了本可告知你 M 究竟是 Manibus、municipii、miles、mater、memoriae,抑或前名 Marcus 的那几行。

随石头一步步破碎,逐阶研习一方铭文。看候选集如何扩张:

Context stepper — the abbreviation M, under increasing damage语境步进器,缩写 M,残损渐增

The shape of the result is the heart of Challenge IV. Predictive restoration is most confident when the text is nearly whole — when there is least to restore. It is least reliable when the text is badly broken — when there is most to restore. The method's accuracy is inversely correlated with the situation that calls for it. Where restoration is easy it is barely needed; where it is needed it is barely possible. The intact neighbours that make M obvious are a loan from the part of the inscription that did not need help — and a lacuna is, by definition, the place where that loan is unavailable.

这一结果的形态正是挑战四之要。预测性修复在文本近乎完整时最为自信:此时最无须修复;在文本严重残破时最不可靠:此时最需修复。该方法的准确度,与召唤它出场的那种处境负相关。修复容易处,几乎用不着它;需要它处,它几乎无能为力。使 M 一目了然的那些完好邻词,是向铭文中无须援手的部分借来的,而一处残缺,按定义,正是这笔借款无从取得之地。
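The widening of the candidate set can be made concrete by conditioning the expansion counts on whatever context survives. In the sketch below every number, genre label, and neighbour is HYPOTHETICAL, invented to show the mechanism and not taken from EDCS; passing `None` for a field stands for context the damage has destroyed.

```python
import math
from collections import Counter

# HYPOTHETICAL observations: (genre, left neighbour, expansion of M, count).
OBSERVATIONS = [
    ("epitaph", "d", "manibus", 80),
    ("epitaph", "decurio", "municipii", 5),
    ("epitaph", "(start)", "marcus", 10),
    ("epitaph", "vixit", "menses", 5),
    ("votive", "d", "magna", 3),
    ("votive", "deo", "mithrae", 4),
    ("military", "(start)", "miles", 12),
    ("military", "(start)", "marcus", 6),
]


def candidates(genre=None, left=None):
    """Expansion counts compatible with the context that still survives."""
    tally = Counter()
    for g, l, word, n in OBSERVATIONS:
        if (genre is None or g == genre) and (left is None or l == left):
            tally[word] += n
    return tally


def entropy_bits(tally):
    total = sum(tally.values())
    return -sum((n / total) * math.log2(n / total) for n in tally.values())


for label, tally in [
    ("genre and neighbour known", candidates("epitaph", "d")),
    ("genre only", candidates("epitaph")),
    ("nothing survives", candidates()),
]:
    print(f"{label}: {len(tally)} candidates, {entropy_bits(tally):.2f} bits")
```

Each step of damage removes a filter, so the candidate set and its entropy can only grow: the same counts that make M obvious in a whole text make it a guess in a broken one.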

Take-away要点: "Restore from context" is sound advice that describes the easy case and evades the hard one. The hard case, heavy damage, is the one where context has collapsed, and there the editor is not reading the formula off the stone but projecting it onto the silence. “依语境修复”是一条中肯的建议,但它描述的是容易的情形,回避的是困难的情形。困难的情形,即严重残损,恰是语境已然塌缩之处;在那里,编者并非从石上读出套语,而是将套语投射到沉默之上。

Challenge V · The Prediction Machine挑战五 · 预测机器

A language model is predictive restoration with the brakes removed.语言模型是卸去了制动器的预测性修复。

A large language model is a prediction engine. Trained to continue text, it answers every prompt by emitting the most probable next tokens. Asked to complete a damaged inscription it does exactly what the predictive editor does — consult an internalised store of parallels and produce the likeliest continuation — but with two differences. It has read far more than any editor. And it has no brake.

A human editor completing [D] M S on a cippus carries — even if Challenge I showed the notation hides it — some sense of the entropy of the move. The editor can decline: can print [- - -], can mark a reading (?), can leave the lacuna open and say so. A language model, by default, does none of this. It is fluent by construction. It will expand D M with the same easy confidence whether the answer is a 1.26-bit near-certainty or a 5.32-bit guess. It almost never returns the most honest epigraphic answer — "the surviving letters do not determine this."

Use the tool below. Give DeepSeek a damaged text and read its restoration critically — against the entropy of Challenge I, the genres of Challenge II, the circularity of Challenge III, the collapsed context of Challenge IV.

大型语言模型是一台预测引擎。它受训以续写文本,对每一则提示,都以发出概率最高的后续记号来作答。当被要求补全一方残损铭文时,它所做的,恰是预测型编者所做的,查检一套内化的平行文本储备,产出最可能的续写,但有两点不同。它读过的,远多于任何一位编者。而且它没有制动器。

一位人类编者,在界碑上补出 [D] M S 时,心中怀着(即便挑战一已表明符号系统将其隐去)对这一动作之熵的某种感知。编者可以拒绝:可以印出 [- - -],可以以 (?) 标注某一读法,可以让残缺敞着并明言之。语言模型则默认不作此举。它的流畅是构造使然。无论答案是 1.26 比特的近乎确定,还是 5.32 比特的猜测,它都会以同样轻松的自信展开 D M。它几乎从不给出铭文学上最诚实的回答:“残存字母不足以判定这里。”

请使用下方工具。把一段残损文本交给 DeepSeek,并批判地研读它的修复,对照挑战一的熵、挑战二的文类、挑战三的循环、挑战四塌缩的语境。
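The "brake" the human editor carries can be written down as an explicit rule: restore only when the expansion distribution is peaked enough, and otherwise print the open lacuna. The sketch below is one hypothetical way to state that rule, not a feature of the workbench or of DeepSeek; the 2.0-bit threshold and both count tables are arbitrary assumptions chosen for illustration.

```python
import math


def entropy_bits(counts):
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def restore_or_abstain(counts, max_bits=2.0):
    """Return (reading, entropy); the reading is '[- - -]' above the threshold."""
    bits = entropy_bits(counts)
    if bits > max_bits:
        return "[- - -]", bits  # decline: the letters do not determine this
    best = max(counts, key=counts.get)
    return f"[{best}]", bits


# HYPOTHETICAL distributions: one peaked, one spread evenly over 16 readings.
peaked = {"est": 87, "et": 8, "ex": 5}
spread = {f"reading_{i}": 10 for i in range(16)}

print(restore_or_abstain(peaked))
print(restore_or_abstain(spread))
```

Where to set the threshold is an editorial judgement, not a computation; the point of the sketch is only that abstention has to be built in, because fluency will not supply it.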

Reconstruction workbench — DeepSeek重建工作台,DeepSeek

Pick a graded case or paste your own. Round brackets ( ) = abbreviations already expanded by an editor; underscores ___ = a lacuna.选择一个分级个案,或粘贴你自己的。圆括号 ( ) = 已由编者展开的缩写;下划线 ___ = 残缺。

Take-away要点: The model's restorations are often correct: Latin epigraphy is formulaic, and a formulaic system is exactly what a next-token predictor models well. That is the point, and the warning. The machine does not fail by being wrong. It fails by being equally fluent when right and when wrong. It externalises the predictive method and strips out the one feature that made it scholarship: the editor's hesitation. A restoration's fluency was never evidence of its correctness, neither on the page nor on the screen. The machine only makes that old truth impossible to ignore. 模型的修复往往正确:拉丁铭文是公式化的,而公式化系统恰是下一记号预测器所善于建模者。这正是要点,也是警示。机器并非因出错而失败。它的失败在于正确时与错误时同样流畅。它把预测方法外化出来,却剔除了使其成其为学术的那唯一特征:编者的迟疑。一处修复的流畅,从来不是其正确的证据,纸上不是,屏上亦不是。机器只是让这条古老的真理再难被忽视。

VI · The Grid挑战六 · 网格

Attic Greek brings a different engine — stoichedon geometry.阿提卡希腊文带来另一套引擎,方阵几何。

The five Latin challenges turned on one engine of prediction: the frequency with which an abbreviation takes each expansion. Greek epigraphy supplies a second, quite different engine — and seeing it sharpens what the Latin one was.

Most Attic public inscriptions of the fifth and fourth centuries BC were cut stoichedon: every letter occupies one square of an invisible grid, aligned both across and down. Once the editor establishes how many letters fill one line, the letter-count of every lacuna on the stone is fixed — not estimated, counted. It is the strongest constraint anywhere in this module.

拉丁文的五项挑战都围绕同一套预测引擎:缩写取各展开式的频次。希腊铭文学提供了第二套、相当不同的引擎,看清它,也就看清了拉丁那一套究竟是什么。

公元前五、四世纪的雅典公共铭文,多以方阵式(stoichedon)刻写:每一字母占一格隐形方格,横竖皆对齐。编者一旦确定一行刻几字母,碑上每一处残缺的字母数便随之锁定,不是估算,而是数出来的。这是本单元中最强的约束。
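The counting step can be sketched in a few lines. This is a minimal illustration, assuming a transcription with exactly one character per stoichos and a dot for each lost square; the function name and the two toy lines (in Latin transliteration) are invented and are not the text of any inscription.

```python
import re


def lacunae_in_grid(lines, width, gap="."):
    """Return (line_no, start_col, length) for every lacuna, 1-indexed."""
    pattern = re.compile(re.escape(gap) + "+")
    runs = []
    for line_no, line in enumerate(lines, start=1):
        # In a stoichedon text every line must fill the same number of squares.
        if len(line) != width:
            raise ValueError(
                f"line {line_no} has {len(line)} stoichoi, expected {width}"
            )
        for m in pattern.finditer(line):
            runs.append((line_no, m.start() + 1, len(m.group())))
    return runs


# Invented toy lines, 24 stoichoi wide; "." marks a lost square.
toy_lines = [
    "EDOXSENTEIBOLEIKAITOIDEM",
    "OI......AKAMANTISEPRYTAN",
]
print(lacunae_in_grid(toy_lines, width=24))  # [(2, 3, 6)]
```

The width check is the point of the exercise: once one line fixes the grid, a supplement of the wrong length is excluded by counting, before any argument about sense begins.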

"Once the number of letter-spaces in one line of a stoichedon inscription has been established, the number of spaces in all the lines is known, and the possibilities regarding restorations are circumscribed within exact … limits … many plausible restorations have foundered through disregard of the stoichedon pattern." “方阵式铭文一旦确定了一行的字母格数,则各行的格数皆已知,可能的修复便被圈限于精确……的范围之内……许多看似合理的修复,正因无视方阵格式而告破产。” A. G. Woodhead, The Study of Greek Inscriptions, 2nd ed. (Cambridge 1981), p. 71.
A stoichedon decree — the grid and the gap一份方阵式法令,网格与缺口

An illustrative stoichedon grid, 24 letters wide, after Woodhead's discussion of the decree IG I³ 159 — shown in fifth-century Attic letterforms (the prose normalises them). The principle is general: fix one line's width and every lacuna becomes countable — here, exactly 6 stoichoi.一幅示意性的方阵网格,每行 24 字母,依 Woodhead 对法令 IG I³ 159 的论述而作,以公元前五世纪的阿提卡字形呈现(行文中则用规范拼写)。其理普遍适用:一旦确定一行宽度,每一处残缺便皆可计数,这里恰好六格

What does the gap allow?缺口允许什么?
The lesson. Stoichedon geometry is powerful and exact — but it constrains length, never content. Six stoichoi means six letters and Greek has a great many six-letter words. The gap closes to a single reading only when a second engine — the genre formula — is laid over the geometry. Geometry alone fixes the frame; the formula fills it. This is the Greek counterpart of the Latin abbreviation menu: a count is not yet a word. 要旨。方阵几何强大而精确,但它约束的是长度,从不约束内容。六格即六字母,而希腊文的六字母词数以千计。唯有在几何之上再叠加第二套引擎,文类套语,缺口才会收束为单一读法。几何只定框架,套语方能填充。这正是拉丁缩写菜单的希腊对应物:字数尚不是字。
↔ Latin echo. Compare the Abbreviation Menu and Challenge I: there the menu of possibilities is lexical (which expansion?); here it is geometric (which six-letter string?). Two scripts, the same gap between a constraint and an answer.↔ 拉丁回响。对照缩写菜单与挑战一:彼处可能性的菜单是词汇性的(取哪个展开式?),这里则是几何性的(取哪个六字母串?)。两种文字,同一道横亘于约束与答案之间的缺口。
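The two-engine logic of this challenge can be made concrete with a pair of filters. The sketch is illustrative only: the candidate strings are unaccented capitals chosen for their length, and the one-item "formula bank" is hypothetical, not drawn from any edition.

```python
def fits_gap(candidates, gap_len):
    """Geometry: keep only strings that fill the gap exactly."""
    return [w for w in candidates if len(w) == gap_len]


def fits_formula(candidates, formula_bank):
    """Genre formula: keep only strings the formula bank licenses."""
    return [w for w in candidates if w in formula_bank]


# Illustrative candidates (6, 6, 6, 5 and 9 letters); not attested readings.
candidates = ["ΠΟΛΕΣΙ", "ΣΤΕΛΕΙ", "ΑΘΕΝΑΙ", "ΔΕΜΟΙ", "ΧΣΥΜΜΑΧΟΙ"]
formula_bank = {"ΣΤΕΛΕΙ"}  # hypothetical: what the genre expects at this point

by_geometry = fits_gap(candidates, 6)
by_both = fits_formula(by_geometry, formula_bank)
print(len(by_geometry), "fit the count;", len(by_both), "fit count and formula")
```

Geometry removes the strings of the wrong length and then stops; the narrowing to one reading comes entirely from the second filter, which is only as trustworthy as the formula bank behind it.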

VII · The Tribute List挑战七 · 贡金表

Formula at industrial scale — and two editors, one stone.工业规模的套语,以及一石、两编者。

From 454/3 BC the Athenians inscribed, year by year, the aparche — the "first-fruits", one-sixtieth of the tribute (phoros) paid by each allied city, dedicated to Athena. The first stele, the lapis primus (IG I³ 259–272), is the most relentlessly formulaic Greek corpus there is: a column of entries, each [city] ⫶ [amount], repeated hundreds of times.

That formula, plus stoichedon counting, let the great edition — Meritt, Wade-Gery & McGregor's Athenian Tribute Lists (1939–53) — restore vast stretches of broken stone. Björn Paarmann's 2007 re-edition deliberately reverses the move. Toggle the entry below between the two editorial philosophies.

自公元前 454/3 年起,雅典人逐年刻下 aparche(“初熟之果”),即各盟邦所纳贡金(phoros)的六十分之一,献予雅典娜。第一方石碑,即 lapis primus(IG I³ 259–272),是现存最彻底公式化的希腊语料:一栏栏条目,每条皆作 [城邦] ⫶ [数额],重复数百次。

正是这套套语,加上方阵计数,使那部巨著(Meritt、Wade-Gery 与 McGregor 的《雅典贡金表》,1939–53)得以补全大片残石。Paarmann 2007 年的新校本则有意逆转此举。请在下方条目的两种编辑哲学之间切换。
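Because the aparche is one-sixtieth of the tribute, every quota figure on the stone implies a tribute figure sixty times larger. The sketch below works that arithmetic for a simplified subset of the Attic acrophonic numerals (Χ = 1000, Η = 100, Δ = 10 drachmas); the compound signs, the single-drachma sign and obols are deliberately omitted, and the function names are invented for the example. It relies on the standard equivalence of 6,000 drachmas to the talent.

```python
# Simplified subset of Attic acrophonic numerals, valued in drachmas.
ACROPHONIC_DRACHMAS = {
    "\u03a7": 1000,  # Χ (chi)
    "\u0397": 100,   # Η (eta)
    "\u0394": 10,    # Δ (delta)
}
DRACHMAS_PER_TALENT = 6000
QUOTA_DIVISOR = 60  # the aparche is one-sixtieth of the tribute


def quota_drachmas(figure):
    """Sum an additive acrophonic figure; reject signs outside the subset."""
    total = 0
    for sign in figure:
        if sign not in ACROPHONIC_DRACHMAS:
            raise ValueError(f"unsupported sign: {sign!r}")
        total += ACROPHONIC_DRACHMAS[sign]
    return total


def implied_tribute_drachmas(figure):
    """Tribute implied by a quota figure, in drachmas."""
    return quota_drachmas(figure) * QUOTA_DIVISOR


EXAMPLE_FIGURE = "\u0397" * 3  # ΗΗΗ = 300 drachmas
tribute = implied_tribute_drachmas(EXAMPLE_FIGURE)
print(tribute, "drachmas =", tribute / DRACHMAS_PER_TALENT, "talents")
```

The multiplication is why a restored numeral is never a small matter: each restored Η in a quota adds a full talent to the tribute the edition attributes to that city.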

"The philosophy behind the edition itself has been to keep the restoration restricted to a minimum … a lot of what has been thought acquired knowledge has now been taken away. However, it will attempt to give a truer impression of what we really have." “这部校本的理念,是将修复压缩至最低限度……许多曾被视为既得知识者,如今已被移除。然而,它将力图更真实地呈现我们实际所拥有的东西。” Björn Paarmann, Aparchai and Phoroi: A New Commented Edition of the Athenian Tribute Quota Lists (diss. Fribourg 2007), Preface.
One quota-list entry, two editions同一贡金条目,两种校本
The lesson. The quota-list formula predicts the column structure (that a name is followed by a figure) with near-total reliability. It never predicts which allied city, or what amount. Where the ATL editors read the formula as a licence to restore, Paarmann reads the same formula as a frame that must be left visibly empty. Same stone; the disagreement is about what a formula entitles you to write. 要旨。贡金表的套语能近乎万无一失地预测栏目结构:城邦名之后必有数字。但它从不预测哪一个盟邦、多少数额。ATL 的编者将套语读作修复的许可证,Paarmann 则将同一套语读作一个必须留作可见空白的框架。同一块石头;分歧在于:一条套语究竟授权你写下什么。
↔ Latin echo. This is Challenge II in Greek. There, the genre of an epitaph gave you D M but never the dead person's name; here, the genre of a quota list gives you the column but never the city. Formula supplies the skeleton; the proper nouns are always a bet.↔ 拉丁回响。这是希腊文版的挑战二。彼处,墓志文类给你 D M,却从不给出逝者之名;这里,贡金表文类给你栏目,却从不给出城邦。套语只供骨架;专名永远是一场下注。

VIII · The Dated Letter挑战八 · 纪年之字

When one restored letter dates an empire — the three-bar sigma.当一个补出的字母为一个帝国纪年,三划西格玛。

The Athens–Egesta alliance (IG I³ 11) carries its own date only in the archon's name, line 3 — of which just the final letters ]ΟΝ ΕΡΧΕ ("…on was archon") are securely read. In 1944 Raubitschek restored the rest as Habron (458/7 BC). The early date was then locked in place by a letter-form rule: the "three-bar sigma" was held to have gone out of use by about 446 — and this decree carries it thirteen times.

So a single restored archon-name, propped on a single letter-form, dated one step in the history of the Athenian empire. Harold Mattingly fought the rule for decades. In 1990 a laser beam, fired through the marble, re-read line 3 as Antiphon (418/7) — forty years later. Pick the archon and watch the history fork.

雅典—埃格斯塔同盟铭文(IG I³ 11)的纪年仅藏于第三行的执政官之名,而其中唯有末尾数字 ]ΟΝ ΕΡΧΕ(“……翁任执政官”)可确读。1944 年,Raubitschek 将其余补作 哈布隆(Habron,前 458/7 年)。早期断代随即由一条字形规则锁定:“三划西格玛”据信约在前 446 年后便已弃用,而此铭刻有此形十三次。

于是,一个补出的执政官之名,支于一个字形之上,便为雅典帝国史中的一步纪了年。Mattingly 与此规则缠斗数十年。1990 年,一束激光穿透大理石,将第三行重读为 安提丰(Antiphon,前 418/7 年):晚了整整四十年。请择一执政官,看历史如何分岔。

"Its date depends on the name of the archon for the year, of which only the final two letters … have been securely read." … "in this text there are 13 three-bar sigmas … post quem non for three-bar sigma and tailed rho 446 and 438 B.C. respectively." “其纪年取决于该年执政官之名,而此名仅末二字母……可确读。”……“此铭中有十三个三划西格玛……三划西格玛与带尾若(rho)的下限分别为公元前 446 与 438 年。” Chambers, Gallucci & Spanos, "Athens' Alliance with Egesta in the Year of Antiphon", ZPE 83 (1990), p. 38; A. S. Henry, "Through a Laser Beam Darkly", ZPE 91 (1992), pp. 137–38.
Restore the archon · line 3 reads ]ΟΝ ΕΡΧΕ补出执政官 · 第三行作 ]ΟΝ ΕΡΧΕ
If Habron · 458/7 BC若哈布隆 · 前 458/7
The alliance falls a generation before the Sicilian crisis. The three-bar sigma "canon" holds. A whole series of imperial decrees is dated early, and the empire looks aggressive a generation sooner.同盟早于西西里危机一代人。三划西格玛“定律”得以维持。一整批帝国法令被系于早期,帝国的扩张姿态因而提前了一代。
If Antiphon · 418/7 BC若安提丰 · 前 418/7
The alliance sits right before the Sicilian Expedition — Thucydides' context. The sigma canon collapses; Mattingly's "low" chronology is vindicated; a series of Athenian imperial decrees slides decades later.同盟正处西西里远征前夕,修昔底德的语境。西格玛定律就此崩塌;Mattingly 的“低”年代学得到证实;一系列雅典帝国法令的年代随之后移数十年。

The controversy, in four moves争议之四幕

The circularity循环之所在 The three-bar sigma "canon" was itself induced from inscriptions whose dates were partly fixed by their sigmas — then used to date new inscriptions. The dating rule and the dated corpus each underwrite the other. This is Challenge III exactly: the database you predict from was built by prediction. 三划西格玛“定律”本身,是从一批铭文中归纳出来的,而那批铭文的年代,部分恰恰是由其西格玛字形所定,继而又被用以为新铭文断代。断代规则与被断代的语料,彼此互为担保。这正是挑战三:你据以预测的数据库,本身正是由预测建成的。
↔ Latin echo. Challenge III showed the EDCS corpus part-built by restoration; here the same loop runs through letter-forms instead of words. And the laser is Challenge V's machine in another guise — a new instrument that reads more confidently than the eye, and still has to be doubted.↔ 拉丁回响。挑战三揭示 EDCS 语料库部分由修复建成;这里同一循环只是改由字形而非词语运行。而那束激光,正是挑战五之机器的又一化身,一具读得比肉眼更自信的新仪器,却仍须被怀疑。
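The fork between the two archons can be stated as a consistency check against the letter-form rule. The sketch below encodes only the two cut-off years quoted above (446 for three-bar sigma, 438 for tailed rho) and the fact that the decree carries three-bar sigmas; the function and variable names are invented. It tests whether a proposed date contradicts the canon, which is not the same as testing whether the date is right, since the canon was itself induced from dated inscriptions.

```python
# Latest admissible year BCE for each letter-form, as quoted in this section.
CANON_POST_QUEM_NON = {"three_bar_sigma": 446, "tailed_rho": 438}


def canon_violations(year_bce, letterforms):
    """Letter-forms on the stone that the canon treats as obsolete by year_bce."""
    # BCE years count down: a later date is a SMALLER number.
    return [f for f in letterforms if year_bce < CANON_POST_QUEM_NON[f]]


egesta_forms = ["three_bar_sigma"]  # the decree carries this form thirteen times

for archon, year_bce in [("Habron", 458), ("Antiphon", 418)]:
    clash = canon_violations(year_bce, egesta_forms)
    verdict = "canon holds" if not clash else "canon violated by " + ", ".join(clash)
    print(f"{archon} ({year_bce}/{year_bce - 1} BC): {verdict}")
```

Choosing Antiphon therefore does not merely redate one decree; it returns a violation that can only be resolved by abandoning the rule.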

IX · Two Stones挑战九 · 两石相照

Greek and Latin, side by side — and the machine across both.希腊与拉丁并置,以及横跨二者的机器。

Three predictive engines have now been on the table: the frequency of a Latin abbreviation's expansions, the geometry of a stoichedon grid, and the formula of a genre. They are not the same kind of thing, and a restoration is only ever as strong as the engine beneath it.

至此,三套预测引擎已悉数登场:拉丁缩写各展开式的频次、方阵网格的几何、文类的套语。它们并非同类之物;而一处修复,其可靠程度永远不超过其下那套引擎。

 | Latin (this module)拉丁(本单元) | Greek (Attic)希腊(阿提卡)
Engine引擎 | genre formula + frequency of abbreviation expansion文类套语 + 缩写展开式之频次 | genre formula + stoichedon geometry文类套语 + 方阵几何
It fixes它锁定 | the likeliest expansion最可能的展开式 | the exact letter-count of a gap缺口的精确字母数
It leaves open它留待未决 | every non-modal expansion一切非众数的展开式 | which string fills the count哪个字串填满字数
Fails when失效于 | a rare abbreviation is read as the modal one罕见缩写被读作众数 | a supplement is "necessary" but non-formulaic; circular letter-form dating补字虽称“必然”却非套语;字形断代之循环
The machine机器 | DeepSeek, a general language modelDeepSeek,通用语言模型 | Ithaca, a corpus-trained networkIthaca,以语料训练之网络

History from square brackets出自方括号的历史

"There is a peculiar brand of historical fiction created by those … who build far-ranging historical theories on words … inserted — meaning no harm, and often exempli gratia — between square brackets in a fragmentary text … especially in non-stoichedon texts and non-formulaic phrases … history should not be written with any confidence from what is inside square brackets in published inscriptions." “有一类独特的历史虚构,出自这样一些人之手……他们将宏阔的历史理论,建立在残篇之中、被前人,本无恶意,且往往仅作举例之用,置于方括号之内的词语之上……尤以非方阵、非套语的文句为甚……不应凭已刊铭文方括号内之物,有任何自信地书写历史。” E. Badian, "History from 'Square Brackets'", Zeitschrift für Papyrologie und Epigraphik 79 (1989), pp. 59, 70.

Badian's two specimens are one-word bets that carry whole histories. In IG II² 399 the gap reads either πολ[εμί]ων ("enemies") or Moretti's [ληιστ]ῶν ("pirates") — and on that single word turns the reconstructed history of Athenian involvement in the war of Agis III. In the Philippi letter of Alexander, whether to restore the royal title [βασιλέα] would "settle" a long dispute over Macedonian kingship — and Badian's verdict is that it "does not seem a necessary and inevitable supplement". The bracket is doing the work the evidence cannot.

Badian 的两个标本,都是承载着整段历史的“一词之注”。IG II² 399 中,缺口或读作 πολ[εμί]ων(“敌人”),或读作 Moretti 的 [ληιστ]ῶν(“海盗”),而雅典是否卷入阿基斯三世之战,其重构的历史竟系于这一个词。在亚历山大致腓立比的书信中,是否补出王号 [βασιλέα],将“了结”一桩关于马其顿王权的长久争论,而 Badian 的裁断是:它“似乎并非一处必然而不可避免的补字”。方括号正在做证据无法做的工作。

The Greek machine — Ithaca希腊文的机器,Ithaca

Challenge V's machine has a Greek-trained counterpart. Ithaca (Assael, Sommerschield et al., Nature 2022) is a deep neural network for the restoration, dating, and geographic attribution of Greek inscriptions — trained on the corpus itself, and built to assist rather than replace.

挑战五的机器,有一个以希腊文训练的对应物。Ithaca(Assael、Sommerschield 等,《自然》2022)是一个用于希腊铭文修复、断代与地理归属的深度神经网络,以语料本身训练,且其设计意在辅助而非取代。

62% · restoration, model alone · 单独修复准确率
25→72% · historian accuracy, with Ithaca · 史家借助后之准确率
71% · geographic attribution · 地理归属准确率
<30 yr · dating precision · 断代精度

Crucially, Ithaca returns the top 20 hypotheses, not one answer — predictive restoration with its uncertainty kept visible. That is the honest form of the machine; the Lab below lets you test whether a general model behaves the same way.关键在于:Ithaca 返回前 20 项假设,而非单一答案,这是把不确定性始终保留可见的预测性修复。这才是机器之诚实形态;下方的实验室让你检验:一个通用模型是否也如此行事。
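The difference between one answer and a ranked list is easy to show in miniature. The sketch below is not Ithaca's interface: it is a toy ranking function over invented scores, included only to illustrate what "top 20 hypotheses" means as an output shape, with each hypothesis carrying its share of the total score.

```python
def top_k_hypotheses(scores, k=20):
    """Rank candidate restorations and normalise their scores to shares."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive number")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(text, score / total) for text, score in ranked[:k]]


# Invented scores for three placeholder restorations.
toy_scores = {"restoration A": 6.0, "restoration B": 3.0, "restoration C": 1.0}

for text, share in top_k_hypotheses(toy_scores, k=2):
    print(f"{text}: {share:.0%}")
```

A reader shown only the first row learns what the model prefers; a reader shown the list also learns how far ahead that preference is.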

The Restoration Lab修复实验室

One workbench, three jobs. Generate a restoration; Compare a Greek and a Latin problem; or Critique a restoration — set the model, in Badian's role, against a published text or against its own output. Pick a mode; the prompt loads; edit it freely; run.

一张工作台,三项任务。生成一处修复;对照一组希腊与拉丁的难题;或批判一处修复,让模型扮演 Badian 的角色,去检验一份已刊文本,或检验它自己的产物。择一模式,提示词即载入;可自由编辑,然后运行。

DeepSeek Restoration LabDeepSeek 修复实验室
↔ Latin echo. The Lab is Challenge V widened: Generate is C5's workbench; Critique is C5's "read it critically" turned into a second pass; Compare puts the Latin context collapse of Challenge IV beside its Greek mirror. The brake on all of it is the same square bracket Badian named.↔ 拉丁回响。实验室是被拓宽的挑战五:生成即挑战五的工作台;批判即将挑战五“批判地研读”化为第二道工序;对照则把挑战四的拉丁语境塌缩与其希腊镜像并置。而约束这一切的,正是 Badian 所命名的那个方括号。

Debrief反思

Eight questions to carry out of the module. Open each for a model answer.带出本单元的八个问题。点开各题查看参考答案。
1 · If a restoration is a prediction, what is the editorial bracket [ ] actually telling the reader — and what is it not telling them?1 · 若修复即预测,那么编辑方括号 [ ] 究竟在向读者传达什么,又有什么是它没有传达的?
It tells the reader that a prediction was made: these letters are not on the stone. It does not tell them the entropy of that prediction: whether it was a near-certain 1.26-bit completion (a final [E] in h s [e]) or a 5.32-bit guess (a bare [P…]). Challenge I: the bracket records the act of prediction and conceals its confidence; [E] and [P…] look identical in print.它告诉读者曾作出一次预测:这些字母不在石上。它没有告诉读者该预测的熵:是近乎确定的 1.26 比特补足(h s [e] 中末尾的 [E]),还是 5.32 比特的猜测(孤立的 [P…])。挑战一:方括号记录了预测这一行为,却隐藏了它的把握程度;印本上 [E] 与 [P…] 别无二致。
2 · The Oxford Handbook calls [D] M S a "safe bet" (p. 14) and lists D M S as Deo Mithrae sacrum OR Dis Manibus sacrum (p. 787). Is this a contradiction, or can both be true?2 · 牛津手册称 [D] M S 为“稳妥之选”(第14页),又将 D M S 列为 Deo Mithrae sacrum 或 Dis Manibus sacrum(第787页)。这是矛盾,还是二者可同时成立?
Both are true — and that is the problem. The restoration is safe given the genre; the appendix lists the menu without the genre. The "safe bet" smuggles in a premise (this is funerary) that the letters alone do not supply. The contradiction is not in the facts but in the method's failure to state its own precondition. A lacuna is exactly the situation in which that precondition cannot be assumed.二者皆为真,而这正是问题所在。修复在给定文类的前提下是稳妥的;附录所列的菜单则不含文类。“稳妥之选”夹带了一个前提(这是丧葬),而单凭字母无法提供这一前提。矛盾不在事实,而在方法未能言明它自身的前提条件。一处残缺,恰恰是那个前提无法被假定的处境。
3 · Explain the circularity of "restore the most frequent expansion" in one sentence.3 · 用一句话说明“补出最高频展开式”的循环性。
The frequencies are computed on a corpus that already contains ~2 editorial restorations per inscription, so predicting the mode largely re-confirms what earlier editors predicted — including their errors — rather than discovering what the Romans wrote.频次是在一个平均每条铭文已含约 2 处编者修复的语料库上算得的,故预测众数在很大程度上只是重新确认前代编者的预测,连同其谬误,而非发现罗马人之所书。
4 · Why is predictive restoration least reliable exactly where it is most needed?4 · 为何预测性修复恰在最需要它之处最不可靠?
Because the disambiguating signal is context — genre, formula, neighbouring words — and physical damage removes the abbreviation and its context together. Light damage leaves context intact (restoration easy, barely needed); heavy damage collapses context (restoration needed, barely possible). Accuracy is inversely correlated with the severity that calls for the method (Challenge IV).因为消歧的信号是语境,文类、套语、邻词,而物理残损会同时移除缩写及其语境。轻度残损保全语境(修复容易,几乎用不着);重度残损使语境塌缩(修复必需,却几乎不可能)。准确度与召唤该方法出场的损毁程度负相关(挑战四)。
5 · A language model restores a damaged inscription and the restoration reads perfectly. Why is fluency not evidence of correctness?5 · 语言模型修复了一方残损铭文,读来天衣无缝。为何流畅不构成正确的证据?
Because the model is fluent by construction — it emits the most probable continuation whether the underlying evidence determines it or not. It produces equally smooth text for a 1.26-bit certainty and a 5.32-bit guess. Fluency measures the model's training, not the inscription's evidence. The same is true, less visibly, of a confident human editor.因为模型的流畅是构造使然,无论底层证据是否足以判定,它都发出概率最高的续写。对 1.26 比特的确定与 5.32 比特的猜测,它产出同样顺滑的文本。流畅度衡量的是模型的训练,而非铭文的证据。对一位自信的人类编者而言,此理亦同,只是不那么显眼。
6 · Does Challenge III mean quantitative epigraphy (e.g. Kaše et al. 2022) is invalid?6 · 挑战三是否意味着定量铭文学(如 Kaše 等 2022)不成立?
No. It means its results are conditional. Counting occupational terms across LIRE/EDCS is a legitimate, valuable method; but the corpus is partly an artefact of editorial restoration and of the survival bias of the epigraphic habit. The valid claim is "given the corpus as restored and as it survives." Quantitative epigraphy is not invalidated by Challenge III; it is obliged by it to state its conditions.不。它意味着其结论是有条件的。在 LIRE/EDCS 上统计职业用语是一种正当而有价值的方法;但语料库部分地是编辑修复的产物,也带有铭文习俗(epigraphic habit)的存留偏差。成立的主张是“就经过修复、且如此存留的语料库而言”。挑战三并不推翻定量铭文学;它只是责成定量铭文学言明其条件。
7 · After all five challenges — should an editor still restore lacunae?7 · 历经五项挑战之后,编者是否仍应修复残缺?
Yes. Restoration is indispensable; a corpus of un-restored fragments would be largely unusable. The module's claim is not "stop restoring" but "restore with the entropy in view." State the confidence, distinguish the 1.26-bit completion from the 5.32-bit guess, name the genre premise being assumed, and flag where the corpus evidence is itself restored. The discipline already has the tools for honest doubt: (?), [- - -], the apparatus note. The challenges argue for using them in proportion to the entropy.应当。修复不可或缺;一部全是未修复残片的语料库将几乎无法使用。本单元的主张不是“停止修复”,而是“在熵的注视下修复”。言明把握程度,将 1.26 比特的补足与 5.32 比特的猜测区分开来,点明所假定的文类前提,并标示语料库证据本身何处即是修复。本学科早已具备诚实存疑的工具:(?)、[- - -]、校勘注。诸挑战所主张者,是按熵的大小相称地使用它们。
8 · In one sentence: what is "the premise of prediction"?8 · 用一句话:何谓“预测的前提”?
That epigraphic language is formulaic enough to be predicted — true, but only to a measurable degree, only where context survives to fix the genre, and only on a corpus we must remember we partly wrote ourselves.即铭文语言足够公式化,因而可被预测,此言为真,但仅在一个可度量的程度上为真,仅在语境尚存以锚定文类处为真,且仅就一个我们必须记得有几分出自我们自己之手的语料库而言为真。

Reading & Sources阅读与出处

Set reading for Week 13, and the materials behind this module.第十三周指定阅读,及本单元所据材料。

Set reading指定阅读

Kaše, V., Heřmánková, P., and Sobotková, A. 2022. "Division of labor, specialization and diversity in the ancient Roman cities: a quantitative approach to Latin epigraphy." PLoS ONE 17(6): e0269869. doi:10.1371/journal.pone.0269869

Read it for its method as much as its findings. As you read, hold Challenge III in mind: the study's evidence is occupational vocabulary counted in a Latin-epigraphic dataset (LIRE) derived from EDCS. Ask, at each result: which of these counts could be affected by editorial restoration, and by the survival bias of the "epigraphic habit"? The paper is a model of quantitative method; the challenges of this module are the conditions under which its numbers should be read.

阅读时,应同等地着眼于其方法与其结论。研读之际,请将挑战三存于心中:该研究的证据,是在一个由 EDCS 派生的拉丁铭文数据集(LIRE)中所统计的职业词汇。面对每一项结果,都不妨自问:这些计数中,哪些可能受到编辑修复、以及“铭文习俗”存留偏差的影响?该文是定量方法的典范;本单元的诸项挑战,则是其数字应被据以解读的条件。

Primary reference主要参考

Bruun, C., and Edmondson, J. (eds.) 2015. The Oxford Handbook of Roman Epigraphy. Oxford: Oxford University Press. — esp. ch. 1, "The Epigrapher at Work" (the predictive method, p. 14); Appendix I, "Epigraphic Conventions: the Leiden System"; Appendix II, "Epigraphic Abbreviations" (pp. 787–98).

Data behind this module本单元背后的数据

Frequency and entropy figures were computed for this module from the EDCS text export (EDCS_text_cleaned_2022-09-12.json, 537,262 records; 1,086,124 restoration brackets) held in epidoc/big databases/public edcs. The abbreviation menus are transcribed from OHRE Appendix II. Example inscriptions are quoted from EDCS and cited by EDCS-ID. The wider data ecology referenced in the challenges — LIRE (the Latin Inscriptions of the Roman Empire dataset), the EDCS_ETL and Lat-Epig pipelines, and the kcl_tei EpiDoc corpora (Aphrodisias, Tripolitania, Cyrenaica) — is held in epidoc/inscription_databases.

本单元的频次与熵数据,系据 epidoc/big databases/public edcs 中所藏 EDCS 文本导出库(EDCS_text_cleaned_2022-09-12.json,537,262 条记录;1,086,124 个修复方括号)计算所得。缩写菜单转录自 OHRE 附录二。例铭引自 EDCS,并以 EDCS-ID 标注。诸挑战所涉的更广数据生态,即 LIRE(罗马帝国拉丁铭文数据集)、EDCS_ETL 与 Lat-Epig 处理流程,以及 kcl_tei EpiDoc 语料库(阿芙罗狄西亚、的黎波里塔尼亚、昔兰尼加),藏于 epidoc/inscription_databases。
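The two totals given above yield the "about 2 restorations per inscription" figure used in the Debrief. The sketch below redoes that division and shows one way such brackets might be counted in plain text. The regular expression and function name are assumptions: the field layout of the EDCS export is not described here, so the function takes bare strings, and it is not known whether the module's own count treated unrestored lacunae such as [- - -] the same way.

```python
import re

# One pair of square brackets with no nested brackets inside it.
BRACKET = re.compile(r"\[[^\[\]]*\]")


def count_restoration_brackets(texts):
    """Count bracketed spans across an iterable of inscription texts."""
    return sum(len(BRACKET.findall(text)) for text in texts)


# Totals as reported for the EDCS text export.
RECORDS = 537_262
RESTORATION_BRACKETS = 1_086_124
print(round(RESTORATION_BRACKETS / RECORDS, 2))  # 2.02

# Invented two-line sample, only to exercise the counter.
sample_texts = ["D(is) [M(anibus)] s(acrum)", "[- - -] vixit [a]nn(os)"]
print(count_restoration_brackets(sample_texts))  # 3
```

An average of two brackets per record is the quantitative form of Challenge III: frequencies computed on this export are frequencies of restored text as well as of carved text.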

Continues from承接自

Week 12 · XML and Structured Data for Computational Analysis: the C.101 case study (six imperial documents on one stele) supplied this week's example of formulaic official language. See epidoc-six-papers-week12.html and week12-qa-interactive.html.第十二周 · 《用于计算分析的 XML 与结构化数据》,其 C.101 个案(一方石碑上的六份帝国文书)为本周提供了公式化官方语言的范例。参见 epidoc-six-papers-week12.html 与 week12-qa-interactive.html。

Greek expansion — sources (Challenges VI–IX)希腊扩展,出处(挑战六至九)

The Greek track is built on the works below; quotations are reproduced from them, page-verified.

希腊部分据下列著作建成;所引文句均出于此,并经页码核校。

Badian, E. 1989. "History from 'Square Brackets'." Zeitschrift für Papyrologie und Epigraphik 79: 59–70.
Chambers, M., Gallucci, R., and Spanos, P. 1990. "Athens' Alliance with Egesta in the Year of Antiphon." ZPE 83: 38–63.
Henry, A. S. 1992. "Through a Laser Beam Darkly: Space-age Technology and the Egesta Decree (IG I³ 11)." ZPE 91: 137–146.
McGregor, M. F. 1966. "Method and Manners in Greek Epigraphy." Phoenix 20.3: 210–227.
Meritt, B. D., Wade-Gery, H. T., and McGregor, M. F. 1939–1953. The Athenian Tribute Lists. 4 vols. Cambridge, MA / Princeton.
Paarmann, B. 2007. Aparchai and Phoroi: A New Commented Edition of the Athenian Tribute Quota Lists and Assessment Decrees. Diss., Université de Fribourg.
Woodhead, A. G. 1981. The Study of Greek Inscriptions. 2nd ed. Cambridge: Cambridge University Press.
Assael, Y., Sommerschield, T., et al. 2022. "Restoring and attributing ancient texts using deep neural networks." Nature 603: 280–283. doi:10.1038/s41586-022-04448-z

Greek inscription readings (IG I³ 11; I³ 159; I³ 259–272; II² 399) follow the cited editions. The interactive widgets present them in simplified, illustrative form — verify against Inscriptiones Graecae before scholarly use.希腊铭文之读法(IG I³ 11;I³ 159;I³ 259–272;II² 399)依所引校本。互动控件以简化、示意之形式呈现,学术征引前请核对《希腊铭文集成》(Inscriptiones Graecae)。

§VIII · The four editions — deep-dive§VIII · 四部校勘本 — 深入

Each case unpacked: the anatomy, the formula slots, a verbatim sample, the connecting links.逐案展开:解剖、套语槽位、原文样品、相关链接。

The four editions exist in full at the Governance Corpus with SCPP-standard line-keyed apparatus, hover-glossary, and EpiDoc TEI XML downloads. This section is the deep-dive layer between the §0 mosaic (one paragraph per case) and the full editions (the entire text + apparatus + commentary). It surfaces, for each case, the formula bank the editor consulted, a verbatim passage with restoration brackets visible, and the chain of links into the rest of the family.

四部校勘本以全文形式收于公共文书语料库,配 SCPP 标准行键校勘栏、悬浮词汇、EpiDoc TEI XML 下载。笔者按:本节介于 §0 四案概览与全文校勘本之间:每案展示编者所据之套语谱系、一段含修复方括号之原文,以及通往家族其他页面之链接。

Case A · Laudatio Turiae — the virtue-catalogue slot案 A · 图利娅颂辞 — 美德目录槽位

The Laudatio's left column lines 30–34 carry the virtue-catalogue slot — a fixed eight-virtue inventory the genre uses (pudicitia, obsequium, comitas, facilitas, lanificium, religio sine superstitione, ornatus non conspiciendus, cultus modicus). Mantzilas 2017 finds the same eight-slot bank in five Latin laudationes mulierum; that is the parallel pool the editor consulted to restore the words Sirmond's transcript could no longer read.

图利娅颂辞左列 30-34 行承载美德目录槽位 — 此体裁所用八美德之固定清单(贞洁、顺从、和悦、亲和、纺织、不迷信之虔信、不显露之装饰、节制之妆容)。Mantzilas 2017 在五部拉丁妇女颂辞中皆见此八槽位之套语谱系;即编者据以补 Sirmond 抄本所不能读之处之平行文本池。

(30) Domestica bona pudici[t]iae, opsequi, comitatis, facilitatis,
(31) lanificii stud[i, religionis] sine superstitione, o[r]natus non
(32) conspiciendi, cultus modici cur [memorem? Cur dicam de tuorum cari-]
(33) tate, familiae pietate, [c]um aeque matrem meam ac tuos parentes col[ueris eandemque quietem]
(34) illi quam tuis curaveris, cetera innumerabilia habueris commun[ia cum omnibus] matronis
Wistrand 1976: 21 (text of CIL VI 1527, col. I, lines 30–34). Square brackets enclose editorial restoration.
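The eight-slot bank can be checked against the quoted passage mechanically. The sketch below is an illustration rather than the dossier's own decomposition: the search strings are simple substrings chosen for this example, the passage is lines 30–32 as printed above (reading o[r]natus), and square brackets are stripped so that restored and preserved letters are matched alike.

```python
import re

# Slot name -> substring searched for in the passage (choices made for this sketch).
VIRTUE_SLOTS = {
    "pudicitia": "pudicit",
    "obsequium": "opsequi",
    "comitas": "comitat",
    "facilitas": "facilitat",
    "lanificium": "lanifici",
    "religio sine superstitione": "religionis sine superstitione",
    "ornatus non conspiciendus": "ornatus non conspiciendi",
    "cultus modicus": "cultus modici",
}

PASSAGE = (
    "Domestica bona pudici[t]iae, opsequi, comitatis, facilitatis, "
    "lanificii stud[i, religionis] sine superstitione, o[r]natus non "
    "conspiciendi, cultus modici"
)


def strip_brackets(text):
    """Drop the square brackets but keep the restored letters inside them."""
    return re.sub(r"[\[\]]", "", text)


def slots_attested(text):
    """Map each virtue slot to True if its search string occurs in the text."""
    plain = strip_brackets(text).lower()
    return {slot: stem in plain for slot, stem in VIRTUE_SLOTS.items()}


found = slots_attested(PASSAGE)
print(sum(found.values()), "of", len(found), "slots found")
```

Note what stripping the brackets hides: religionis is found, yet it stands wholly inside a restoration, so the slot is filled by the editor's use of the bank rather than by the stone.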

Open the full Laudatio edition · Case study #21 (8-slot decomposition) · See §III of the Laudatio edition for the Mommsen-vs-stone box-score on the right-column lines Gordon 1949 rediscovered.

Case B · Segesta — the prescript-archon slot案 B · 塞格斯塔 — 前言执政官槽位

The Segesta decree's prescript dates itself by the archon clause: the archon's name in the nominative + ἦρχε ("X was archon"). Only the letters ]ΟΝ ΕΡΧΕ are securely read; the rest of the archon's name is restored. For 46 years the restoration was Habr]on (Raubitschek 1944, upheld on three-bar sigma palaeography → 458/7 BCE). In 1990, Chambers, Gallucci, and Spanos used laser-enhanced photography and read Antiph]on → 418/7 BCE. The 40-year shift cascades into the entire chronology of Athenian imperialism: every decree dated relative to this one moves with it.

塞格斯塔法令之前言以执政官句纪年:执政官之名(主格)+ ἦρχε(“某某任执政官”)。石上仅 ]ΟΝ ΕΡΧΕ 数字可读,执政官之名其余经补出。四十六年间补为 Habr]on(Raubitschek 1944,并据三划西格玛字形维持 → 公元前 458/7 年)。1990 年 Chambers、Gallucci、Spanos 经激光增强照相,读出 Antiph]on → 公元前 418/7 年。四十年之挪移波及整个雅典帝国主义之年代学:以此为基准之每一条法令皆随之移位。

[Ἅβρ]ων ἦρχε / [Ἀντιφ]ῶν ἦρχε ⋯ IG I³ 11 = ML 37, line 3. The restored letters before ]ΟΝ carry the chronology of the Athenian empire.

Open the full Segesta edition · See Challenge VIII above for the interactive three-bar sigma / laser-photograph demonstration.

Case C · Athenian Tribute Lists — the year-prescript and aparche-quota slots案 C · 雅典贡金表 — 年份前言与初奉额槽位

40 years of annual aparche records (454/3–415/4 BCE). Each list opens with a fixed year-prescript (τάδε ἀπὸ τοῦ φόρου τῇ θεῷ ἀπὸ τοῦ ταλάντου), then lists cities in alphabet-blocks with their aparche figure to the right. The formula is so regular that the editors of the ATL (Meritt, Wade-Gery & McGregor 1939–53) routinely restored both the city-name and the quota figure from the surrounding entries. Paarmann 2007 has argued they restored too much: the formula tells you a name preceded a figure, but it does not entitle the editor to write the name. Challenge VII above shows the contrast: ~46% restored (ATL) vs ~9% restored (Paarmann).

四十年间之年度初奉记录(公元前 454/3–415/4 年)。每表以固定年份前言开始(τάδε ἀπὸ τοῦ φόρου τῇ θεῷ ἀπὸ τοῦ ταλάντου),其后按字母排列列城邦及其初奉数额。套语之规整,使 ATL 编者(Meritt, Wade-Gery & McGregor 1939-53)惯于据上下文同时补出城邦名与数额。Paarmann 2007 谓其补得过多:套语告诉你数字之前有一个名字,但并未授权编者写下该名。挑战七已示对照:约 46% 经补(ATL)对约 9% 经补(Paarmann)。

[Καρπά]θιοι [Η]ΗΗ ← ATL maximal (~46% restored)
[. . . . .]θιοι [. .]Η ← Paarmann minimal (~9% restored)
A reconstructed quota-list entry modelled on the lapis primus (IG I³ 259–272), shown in two restoration schools.
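How much of an entry is restoration can be counted directly from the bracketed text. The sketch below is a minimal counter under stated assumptions: letters inside square brackets count as restored, dots inside brackets count as lost and unrestored, all other letters count as preserved, and the share is taken over all letter-spaces. Those definitions are this example's own; the corpus-level figures of ~46% and ~9% come from the editions as a whole and are not expected to match a single entry.

```python
def leiden_counts(entry):
    """Return (preserved, restored, lost) letter-spaces in a bracketed entry."""
    preserved = restored = lost = 0
    inside = False
    for ch in entry:
        if ch == "[":
            inside = True
        elif ch == "]":
            inside = False
        elif ch == "." and inside:
            lost += 1            # a lost square the editor declines to fill
        elif ch.isalpha():       # combining accents are not counted as letters
            if inside:
                restored += 1
            else:
                preserved += 1
    return preserved, restored, lost


def restored_share(entry):
    """Fraction of all letter-spaces that the editor has supplied."""
    preserved, restored, lost = leiden_counts(entry)
    return restored / (preserved + restored + lost)


ATL_ENTRY = "[Καρπά]θιοι [Η]ΗΗ"
PAARMANN_ENTRY = "[. . . . .]θιοι [. .]Η"

for label, entry in [("ATL", ATL_ENTRY), ("Paarmann", PAARMANN_ENTRY)]:
    print(label, leiden_counts(entry), f"{restored_share(entry):.0%}")
```

On this one entry the maximal text is half restoration and the minimal text is none, which is the editorial disagreement reduced to a number.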

Open the full Athenian Tribute List edition (Year 1 + Year 2 + Year 9 + Year 39 + the §VI comparative synthesis showing the 425/4 Thoudippos reassessment shock).

Case D · Persicus — the proconsular-heading and bilingual-pair slots案 D · Persicus — 总督开端与双语对照槽位

Persicus's edict on the Artemision (AD 44) survives in both Greek and Latin. The Greek face translates the Latin: Paullus Fabius Persicus, proconsul becomes Παῦλλος Φάβιος Πέρσικος ἀνθύπατος. The two faces let us watch a formula cross the language barrier: which Latin terms are translated, which become Greek loanwords, which get expanded into glosses. This is the bridge between the Latin track (Challenges I–V) and the Greek track (Challenges VI–IX).

Persicus 关于阿尔忒弥斯神庙之诏书(公元 44 年)希拉两文皆存。希腊面译自拉丁:Paullus Fabius Persicus, proconsul 译为 Παῦλλος Φάβιος Πέρσικος ἀνθύπατος。两面并置可观一条套语如何越语言之界 — 何字得译、何字成为外来词、何字扩为夹注。即拉丁线(挑战一至五)与希腊线(挑战六至九)之桥梁。

Open the full Persicus edition (14 sections; bilingual parallel view).

§IX · The future of prediction — three engines§IX · 预测之未来 — 三引擎

DeepSeek (live LLM) · Ithaca-style (corpus-grounded simulation) · Aeneas / predictingthepast — same lacuna, three epistemologies.DeepSeek(在线大模型)· Ithaca 式(语料驱动模拟)· Aeneas / predictingthepast — 同一缺口,三套认识论。

Each engine treats a lacuna differently. An LLM (DeepSeek, Claude, OpenAI) is trained on internet-scale text including epigraphic editions; it fills lacunae the way it fills any cloze — with the most likely token. Ithaca (Assael et al. 2022, Nature 603) is a transformer trained specifically on the I.PHI Greek epigraphic corpus, producing top-k character predictions plus chronological and geographic attribution. Aeneas / predictingthepast is the DeepMind successor work, structured around contextualised parallels rather than character-level prediction.

Pick a passage below. Each engine returns its top candidates and a tagged basis (formula · stoichedon-geometry · frequency · free-conjecture). The comparison strip beneath the three columns lines up their top-1 outputs side by side. The pedagogical question: when the engines agree, where does the agreement come from? When they disagree, which is the responsible editor's friend?

三引擎对缺口之处置各异。大语言模型(DeepSeek、Claude、OpenAI)受互联网级文本训练(含铭文校勘本),填补缺口之方式与填补任何完形填空相同 — 以最似然之 token 为之。Ithaca(Assael 等 2022,Nature 603)为专以 I.PHI 希腊铭文语料库训练之 Transformer,输出 top-k 字符预测兼年代与地理归属。Aeneas / predictingthepast 为 DeepMind 之后续工作,以语境化平行文本为结构,而非字符级预测。

下方择一段。各引擎返回其首选候选及标记之基础(套语 · 方阵几何 · 频次 · 自由揣测)。下方对照带将三引擎之 top-1 输出并列。教学问题:三引擎一致时,一致从何而来?三引擎相左时,何者为负责编者之友?
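The comparison strip described above amounts to lining up each engine's first candidate and asking whether they coincide, and on what basis. The sketch below is a hypothetical data model, not the page's implementation: the class, the function and the three sample results are invented, and only the four basis tags are taken from the text.

```python
from collections import Counter
from dataclasses import dataclass

BASES = {"formula", "stoichedon-geometry", "frequency", "free-conjecture"}


@dataclass
class EngineResult:
    engine: str
    candidates: list   # ranked restorations, best first
    basis: str         # one of BASES

    def __post_init__(self):
        if self.basis not in BASES:
            raise ValueError(f"unknown basis tag: {self.basis!r}")
        if not self.candidates:
            raise ValueError("an engine must return at least one candidate")


def comparison_strip(results):
    """Line up top-1 outputs and report whether all engines agree."""
    top1 = {r.engine: r.candidates[0] for r in results}
    _, votes = Counter(top1.values()).most_common(1)[0]
    return {
        "top1": top1,
        "unanimous": votes == len(results),
        "bases": {r.engine: r.basis for r in results},
    }


# Invented results for one lacuna.
demo = [
    EngineResult("DeepSeek", ["Dis Manibus"], "frequency"),
    EngineResult("Ithaca-style", ["Dis Manibus", "Deo Mithrae"], "formula"),
    EngineResult("Aeneas", ["Deo Mithrae"], "free-conjecture"),
]
print(comparison_strip(demo))
```

Keeping the basis tags beside the readings is what makes agreement interpretable: three engines agreeing on the same frequency evidence count as one witness, not three.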

Engine 1 · LLM

DeepSeek live API
DeepSeek Chat v3 (api.deepseek.com) · BYO key via ⚙ API · Behaviour governed by the Restoration Lab system-prompts (Challenge IX above).
Pick a passage and click "Run on all three engines" above.

Engine 2 · Ithaca-style

restoration.html · Panel 5 simulation
Empirically-grounded simulation built from the project's own corpus parallels (frequency · regional distribution · decade distribution). Output shape matches Assael et al. 2022 (Nature 603). Live Ithaca via the magalia API is P+2.

Engine 3 · Aeneas

predictingthepast simulation
DeepMind's successor work (Assael et al. 2024, github.com/google-deepmind/predictingthepast) — contextualised parallel retrieval. The simulation here returns the project's three most similar passages from the dossier corpus for the same lacuna.
Methodological caveat. Only Engine 1 is live (a real LLM call); Engines 2 and 3 are simulations built from this project's own corpus statistics, structured to match the shape of the real models' outputs. They are honest about that — the simulation badge is visible. Real Ithaca / Aeneas inference requires GPU hosting via the magalia API (sketched in html_dossier_plan.md §14.9.4; status: P+2). For the pedagogical purpose of contrasting three different epistemologies of prediction, the simulations suffice. For scholarly use, verify against the published models directly. 方法学告诫。仅引擎一为在线调用(真实大模型),引擎二与三为模拟,由本项目语料统计构建,输出结构与真实模型一致。其性质坦白可见,徽章标明"模拟"。真实 Ithaca / Aeneas 推断需经 magalia API 之 GPU 主机(详 html_dossier_plan.md §14.9.4;状态:P+2)。就对照三种预测认识论之教学目的而言,模拟足矣;学术用途请直接核对已发表之模型。
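For orientation, "returns the three most similar passages" can be sketched with the standard library alone. This is not how Aeneas retrieves parallels, and it is not the simulation's code: it is a plain string-similarity ranking using difflib, over an invented four-item corpus, to show the output shape of a parallel-retrieval engine as opposed to a character-predicting one.

```python
from difflib import SequenceMatcher


def most_similar(query, corpus, n=3):
    """Return the n corpus passages most similar to the query, best first."""
    scored = [
        (SequenceMatcher(None, query, passage).ratio(), passage)
        for passage in corpus
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


# Invented mini-corpus of Latin funerary formulae.
toy_corpus = [
    "dis manibus sacrum",
    "dis manibus",
    "hic situs est",
    "sit tibi terra levis",
]

for score, passage in most_similar("dis manibus sacrum", toy_corpus):
    print(f"{score:.2f}  {passage}")
```

An engine of this kind hands the editor evidence to weigh rather than a reading to accept, which is the epistemological difference the three columns are meant to expose.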

§III · The corpus in space — Pleiades-linked map§III · 语料库之空间分布 — Pleiades 古地图

Where each of the four cases sits in the Mediterranean — and what the EDCS provincial distribution looks like geographically.四案于地中海中之分布 — 以及 EDCS 行省分布之地理样貌。

EDCS holds 537,262 Latin inscriptions; their provincial distribution is wildly uneven. The bars in the workshop's Challenge V show this numerically; the map below shows it geographically. Pins for the four Wk13 case studies are emphasised in their period colour; smaller dots show the broader corpus density.

EDCS 收 537,262 条拉丁铭文,其行省分布极不均衡。工坊挑战五之柱图以数字示之;下方地图则以地理示之。四例 Wk13 案以期别色为加重之针标,较小之点示更广语料密度。

[Map · pins: Rome (Laudatio · A), Segesta (decree · B), Athens (ATL · C), Ephesos (Persicus · D). Schematic; geographic accuracy approximate. Click each pin to open Pleiades.]
[Legend: Imperial period · Classical period · Other corpus dots (EDCS provincial density)]

What this layer adds. Challenge V plots provincial counts as bars; that's a histogram of where the corpus was produced. The map shows the same data as geography. The four Wk13 cases sit in: Rome (Latium/Italy), Segesta (Sicilia), Athens (Achaia), Ephesos (Asia). Three are coastal; one is the metropole. None is in Britannia, in Hispania, in Pannonia — the empire's epigraphic density does not mirror the empire's territorial extent. Each pin links to the canonical Pleiades record. The verified Pleiades URI table is at governance-corpus.html and was built under M-PLEIADES.1.

本层之所增。挑战五以柱图示行省数量分布;该图为语料生产地之直方图。本地图以地理形式示同一数据。四例 Wk13 分别落于罗马(拉丁姆/意大利)、塞格斯塔(西西里)、雅典(阿凯亚)、以弗所(亚细亚)。其中三例为沿海,一例为大都。无一在不列颠尼亚、伊斯帕尼亚或潘诺尼亚 — 帝国铭文密度并不反映帝国疆域。每针标皆链至 Pleiades 之标准记录。已校核之 Pleiades URI 表参governance-corpus.html,构建于 M-PLEIADES.1。

§XII · Connect the corpus — every Wk13 link in one place§XII · 网状导览 — Wk13 所有链接汇聚于此

All the Wk13 surfaces, sources, and plan documents, organised by role.

所有 Wk13 页面、来源、规划文件,按角色分类。

Editions you can read in full

Case-study deep-dives (Formula Dossier)

Workshop interactive (student-exercise mode)

Hub surfaces (navigation)

Primary sources (PDFs in folder)

  • Kaše, Heřmánková, Sobotková 2022 (PLOS ONE): methodological touchstone · the corpus-quantitative argument
  • Bruun & Edmondson 2015 (Oxford Handbook of Roman Epigraphy): Appendix II (abbreviation menus) drives Challenges I–V
  • Meritt, Wade-Gery, McGregor 1939–53 (ATL vols I–IV): drives Challenge VII + the ATL edition
  • Paarmann 2007 (Aparchai and Phoroi): the ATL-minimal restoration school
  • Durry 1950 + Wistrand 1976 + Flach 1991 + Gordon 1950: Laudatio Turiae text-and-apparatus tradition
  • Chambers, Gallucci & Spanos 1990 · Henry 1992 · Badian 1989: Segesta dating + "history from square brackets"
  • Assael, Sommerschield et al. 2022 (Nature 603): Ithaca · the engine of §IX middle column

Plan + audit documents

  • WK13-BUILD-PLAN.md: master architecture · §11 integrated Greco-Roman corpus model
  • WK13-FULLTEXT-PLAN.md: per-case full-text plan
  • ATL-EXPANSION-PLAN.md: Year 2 + Year 39 + comparative view
  • week13-greek-expansion-PLAN.md: Greek track + 3 DeepSeek API roles (now built)
  • WK13-CONNECT-PLAN.md: family-wide connectivity plan
  • WK13-BUNDLE-DISCUSSION.md: this bundle's discussion + decisions
  • INSTRUCTOR_WK13.md: 90-min lecture flow, discussion prompts, data sources

Companion modules

  • Week 12 · Six imperial documents on one stele: the C.101 case study, which supplies this week's formulaic-language frame
  • Week 14 · Ithaca lecture: extends §IX Engine 2 from simulation to real Ithaca inference
  • Week 15 · Aeneas lecture: extends §IX Engine 3

Pre-baked engine outputs

  • DeepSeek sample completions on the 4 case-study passages: always works without an API key · click "Try a passage" in §IX
  • Ithaca-style simulation predictions for 12 exercises: embedded in restoration.html · Panel 5
  • EDCS aggregations, 537 262 records by 66 provinces × 27 centuries: outputs/wk13_edcs_aggregations.json
  • Kaše 2022 verbatim quotes: outputs/wk13_kase_quotes.json
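For anyone reusing the EDCS aggregation file, a quick shape check against the figures quoted above (537 262 records, 66 provinces, 27 centuries) is a cheap safeguard. The file's actual schema is not documented here, so the Python sketch below assumes a hypothetical layout: a nested mapping of province → century → count. It also assumes that every record is assigned to exactly one province and one century, which may not hold for undated inscriptions. The function name `check_aggregation` is likewise an illustration, not part of the project.

```python
def check_aggregation(table, n_provinces=66, n_centuries=27, total=537_262):
    """Return a list of problems found; an empty list means all checks pass.

    `table` is assumed (hypothetically) to map province -> {century: count}.
    """
    problems = []

    if len(table) != n_provinces:
        problems.append(f"expected {n_provinces} provinces, found {len(table)}")

    # Collect every century label that appears under any province.
    centuries = set()
    for row in table.values():
        centuries.update(row)
    if len(centuries) != n_centuries:
        problems.append(f"expected {n_centuries} centuries, found {len(centuries)}")

    grand_total = sum(sum(row.values()) for row in table.values())
    if grand_total != total:
        problems.append(f"expected {total} records, found {grand_total}")

    return problems


# Toy table with made-up labels and counts, checked against matching toy targets.
toy = {"A": {"1": 3, "2": 1}, "B": {"1": 2}}
print(check_aggregation(toy, n_provinces=2, n_centuries=2, total=6))
```

With the real file, one would load it with `json.load` and pass the resulting object to the function with its default arguments, adjusting the traversal if the actual layout differs from the one assumed here.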