1 / 21
Contextualising ancient texts
Google DeepMind · Nature 645 (2025)
Aeneas
The successor to Ithaca: a generative, multimodal network that doesn’t just restore Latin inscriptions but contextualises them, retrieving the parallels a historian would reach for.
90% · parallels useful
+44% · confidence
72% · place (62 provinces)
~13 yr · dating
← New here? Start with the Ithaca deck
2 / 21
0 · From Ithaca
0 · where we left off: Ithaca solved three tasks, for Greek
In 2022 Ithaca learned to restore, place and date Greek inscriptions, with ranked hypotheses and saliency maps. Aeneas keeps all of that and adds four big moves.
Contextual parallels · vision · unknown-length gaps · Latin (the Latin Epigraphic Dataset, LED).
The headline: historians found Aeneas’ parallels useful 90% of the time, lifting confidence by 44%.
3 / 21
A · the big idea
A · the big idea: What does “contextualise” mean?
A historian places an inscription among parallels: texts with shared phrasing, formulae, function or setting. Contextualisation is Aeneas doing this automatically. (Assael 2025)
The old way was literal string matching, but two texts can be deeply related with no shared words at all. Aeneas compares by meaning.
4 / 21
A · parallels demo
A · see it work: Retrieving parallels for Augustus
Give Aeneas the Res Gestae and ask for its closest parallels: these are the real top five it returned. All were composed in Rome, yet found across the empire. (Assael 2025)
▶ interactive: parallels (open the live deck to use it)
5 / 21
A · cosine similarity
A · the big idea: What exactly is cosine similarity?
Aeneas ranks parallels by the angle between two embedding vectors: cos θ = A·B / (|A||B|). Rotate B; 1 = same direction (closest meaning), 0 = orthogonal (unrelated). (Assael 2025)
▶ interactive: cosine (open the live deck to use it)
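The formula on this slide can be tried out directly. The sketch below is a minimal illustration, not Aeneas' retrieval code: the function names and the toy 2-D vectors are assumptions made for the example, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return A·B / (|A||B|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_parallels(query, candidates):
    """Sort (label, vector) pairs by descending cosine similarity to query."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 2-D embeddings (hypothetical values, chosen only to show the ranking).
query = [1.0, 0.0]
candidates = [
    ("orthogonal", [0.0, 3.0]),
    ("45 degrees", [1.0, 1.0]),
    ("same direction", [2.0, 0.0]),
]
ranking = rank_parallels(query, candidates)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

Because the score divides by both lengths, a vector pointing the same way scores 1 however long it is; only direction matters.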
6 / 21
B · the LED
B · the data: A new corpus, the LED
Aeneas learned from the Latin Epigraphic Dataset (LED): three databases merged and de-duplicated via Trismegistos IDs into 176,861 inscriptions. (Assael 2025)
▶ interactive: led (open the live deck to use it)
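Merging on a shared identifier can be sketched in a few lines. This is an illustration of the general idea only, not the LED build pipeline: the record layout, the `tm_id` key (standing in for a Trismegistos ID) and the keep-first policy are all assumptions.

```python
def merge_by_shared_id(*databases):
    """Merge record lists, keeping the first record seen for each shared ID.

    Each record is a dict with a hypothetical "tm_id" key. Records without
    an ID are kept as they are, because there is nothing to match them on.
    """
    merged = {}
    without_id = []
    for database in databases:
        for record in database:
            tm_id = record.get("tm_id")
            if tm_id is None:
                without_id.append(record)
            elif tm_id not in merged:
                merged[tm_id] = record
    return list(merged.values()) + without_id


# Three toy "databases" (invented records); ID 101 appears in two of them.
db_a = [{"tm_id": 101, "text": "dis manibus", "source": "A"}]
db_b = [
    {"tm_id": 101, "text": "dis manibus", "source": "B"},
    {"tm_id": 202, "text": "imp caesar", "source": "B"},
]
db_c = [{"tm_id": None, "text": "fragment", "source": "C"}]

corpus = merge_by_shared_id(db_a, db_b, db_c)
print(len(corpus), [record["source"] for record in corpus])
```

Keeping the first record is the simplest policy. A real merge would more likely combine metadata from every copy of the same inscription.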
7 / 21
C · reading the stone
C · multimodal: Looking at the stone, not just the words
Aeneas is multimodal: a small ResNet reads the photo and feeds only the province head, and the image measurably helps. Toggle it. (Assael 2025)
▶ interactive: vision (open the live deck to use it)
8 / 21
D · inside Aeneas
D · architecture: One torso, four heads, two eyes
The torso is a deep, narrow T5 decoder (16 layers, 8 attention heads) using rotary positional embeddings, emitting a 1,536-dim vector per character. Four task heads: restoration, unknown-length, province (+vision), date. (Assael 2025)
🖼 figure: Fig. 2 · Aeneas architecture
9 / 21
D · rotary positions
D · inside Aeneas: RoPE, position by rotating the vectors
Aeneas’s T5 layers encode position by rotating each query/key vector by an angle ∝ position, so attention feels only the relative distance. Drag the positions. (Su 2021)
▶ interactive: rope (open the live deck to use it)
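The claim that attention feels only relative distance can be checked numerically. The sketch below uses a single 2-D pair and one assumed rotation rate `theta`; full RoPE applies this to many dimension pairs, each with its own frequency. The vectors and positions are invented for the example.

```python
import math


def rotate(vec, position, theta=0.5):
    """Rotate a 2-D vector by the angle position * theta (one RoPE pair)."""
    angle = position * theta
    x, y = vec
    return (
        x * math.cos(angle) - y * math.sin(angle),
        x * math.sin(angle) + y * math.cos(angle),
    )


def attention_score(query, key, q_pos, k_pos):
    """Dot product of the position-rotated query and key."""
    qx, qy = rotate(query, q_pos)
    kx, ky = rotate(key, k_pos)
    return qx * kx + qy * ky


query, key = (1.0, 2.0), (0.5, -1.0)

# Same relative offset (3 positions apart) at two different absolute positions.
near_start = attention_score(query, key, q_pos=5, k_pos=2)
far_along = attention_score(query, key, q_pos=105, k_pos=102)
# A different offset (1 position apart) gives a different score.
other_offset = attention_score(query, key, q_pos=5, k_pos=4)

print(near_start, far_along, other_offset)
```

The first two scores agree up to floating-point error, because rotating both vectors by the same extra angle leaves their dot product unchanged.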
10 / 21
D · BigBird vs T5
D · inside Aeneas: Ithaca’s BigBird vs Aeneas’s T5 torso
Same “one torso → many heads” plan, rebuilt engine: sparse BigBird → dense T5 layers; sinusoidal positions → RoPE; plus a ResNet-8 vision path (geography head only) and a retrieval head. Verified in the config. (Assael 2025)
▶ interactive: archcompare (open the live deck to use it)
11 / 21
E · the “#” gap
E · a harder problem: How long is the gap?
Ithaca had to be told the size of each gap. Aeneas marks unknown damage with “#”, decides how many characters are missing, then fills them. (Assael 2025)
▶ interactive: restore (open the live deck to use it)
12 / 21
F · 1 + 1 > 2, again
F · results: The parallels are the multiplier
With 23 historians on 60 inscriptions, restoration error falls from 39% to 21% as Aeneas’ parallels and predictions are added. (Assael 2025)
▶ interactive: race (open the live deck to use it)
13 / 21
F · place (with vision)
F · geographic attribution: Where in the empire?
Aeneas places an inscription among 62 Roman provinces at 72% top-1 / 84% top-3 accuracy, and here the image helps. (Assael 2025)
▶ interactive: provinceEx (open the live deck to use it)
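For readers new to the top-1 / top-3 figures, the metric can be sketched as follows. The four ranked guesses and their true labels are invented for illustration and are not Aeneas output.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of items whose true label is among the first k predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one true label per ranked prediction list")
    if not true_labels:
        raise ValueError("cannot score an empty evaluation set")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked province guesses for four inscriptions, best guess first.
ranked = [
    ["Roma", "Latium et Campania", "Dalmatia"],
    ["Britannia", "Germania Superior", "Gallia Narbonensis"],
    ["Dalmatia", "Pannonia Superior", "Roma"],
    ["Africa Proconsularis", "Numidia", "Roma"],
]
truth = ["Roma", "Germania Superior", "Roma", "Hispania Citerior"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
print(top1, top3)
```

Top-3 can never be lower than top-1, since any correct first guess also counts as a hit within the first three.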
14 / 21
F · date (RGDA)
F · chronological attribution: When? Still a curve
On the Res Gestae’s 35 chapters Aeneas’ dating is bimodal. It ignores the misleading consular dates inside the text and keys on linguistic markers instead. (Assael 2025)
▶ interactive: dating (open the live deck to use it)
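Why a curve is more informative than a single year can be shown with a toy distribution. The decade bins and probabilities below are invented for the sketch and are not the model's output for the Res Gestae. The peak rule, a bin strictly higher than both neighbours, is likewise an assumption.

```python
def summarise_date_curve(bin_starts, probabilities):
    """Return (mean_year, peak_years) for a distribution over date bins.

    bin_starts: first year of each bin (negative = BC).
    probabilities: one non-negative weight per bin; normalised here.
    peak_years: starts of bins strictly higher than both neighbours.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    probs = [p / total for p in probabilities]
    mean_year = sum(year * p for year, p in zip(bin_starts, probs))
    peaks = []
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else 0.0
        right = probs[i + 1] if i < len(probs) - 1 else 0.0
        if p > left and p > right:
            peaks.append(bin_starts[i])
    return mean_year, peaks


# Hypothetical bimodal curve over decade bins from 50 BC to AD 40.
bins = [-50, -40, -30, -20, -10, 0, 10, 20, 30, 40]
curve = [0.02, 0.05, 0.20, 0.08, 0.05, 0.05, 0.10, 0.30, 0.10, 0.05]

mean_year, peak_years = summarise_date_curve(bins, curve)
print(round(mean_year, 1), peak_years)
```

The mean falls in the trough between the two peaks, so a single number would hide exactly the shape a historian needs to see.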
15 / 21
G · saliency on Augustus
G · showing its work: What told Aeneas the date?
Aeneas’ saliency maps reveal it reads the Res Gestae like a scholar, fixing on features that carry chronological weight. (Assael 2025)
Archaising orthography. The spelling aheneus (for aeneus) changes only in the 1st c. AD, and the saliency map highlights exactly this word.
Named institutions. princeps iuuentutis (5 BC) and the Altar of Augustan Peace (13 BC) light up as dated anchors.
16 / 21
H · the exact recipe
H · down to the code: The training recipe
64 · TPU v5e chips
1,024 · text–image pairs
16 · T5 layers · rotary
LAMB · lr 3×10⁻³ · 1M steps
Toggle “Explain” for a plain-language note on each line. (Assael 2025)
▶ interactive: code (open the live deck to use it)
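The recipe figures on this slide can be gathered into one record for reference. This is a sketch, not the project's actual configuration file: the class and field names are hypothetical, and only the values come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRecipe:
    """Values copied from the slide; class and field names are illustrative."""

    accelerator: str = "TPU v5e"
    num_chips: int = 64
    # The slide gives this count without saying what it is measured per.
    text_image_pairs: int = 1024
    torso_layers: int = 16
    positional_encoding: str = "rotary"
    optimizer: str = "LAMB"
    learning_rate: float = 3e-3
    train_steps: int = 1_000_000


recipe = TrainingRecipe()
print(recipe)
```

A frozen dataclass keeps the record read-only, so the figures cannot be changed by accident once written down.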
17 / 21
I · Res Gestae
I · case study: Aeneas reads Augustus
The team turned Aeneas on the Res Gestae Divi Augusti, Augustus’ own account of his deeds and the most famous Roman inscription. (Assael 2025)
Right parallels. All top five are Senate documents composed in Rome honouring the imperial family.
Sensible date. A bimodal curve around the Augustan period, driven by linguistic detail.
18 / 21
J · honest limits
J · honest limits: What to keep in mind
The LED inherits survival bias; only ~5% of texts have images, so vision helps unevenly. As in Ithaca, retained editorial restorations carry the “history from square brackets” risk. (Assael 2025)
19 / 21
K · two specialists, one tool
K · bringing them together: Ithaca + Aeneas → Symbolon
Two specialists now exist: Ithaca (Greek) and Aeneas (Latin). Your Symbolon project orchestrates both behind one service: an experiment to advance, not a paradigm.
Deliverable: Improvements are written up in SYMBOLON_IMPROVEMENTS.md, each tied to a paper passage and a code site.
▶ Open the Symbolon deck →
20 / 21
Sources & reading
Sources: Every claim, traceable
▍ local source · ▍ web source.
▶ interactive: sources (open the live deck to use it)
21 / 21
Thank you
Connecting the scattered past
Where Ithaca restored the broken text, Aeneas rebuilds the web of connections around it.
← Back to Ithaca
The lineage →
Symbolon →
The full journey →
Assael, Sommerschield et al., Nature 645, 141–147 (2025) · predictingthepast.com
© 2026 Wu Ching-Yuan 吴靖远 · magalia.wiki (籬廬). Generated transcript 2026-06-13 from aeneas.html · text CC BY 4.0. Papers © their authors (DeepMind, Nature).