A few weeks ago, I had a fascinating conversation with OpenAI's O1 model about decoding the Indus Valley script - one of the world's oldest and still undeciphered writing systems. What started as a curious experiment led to an intriguing demonstration of how large language models might contribute to archaeological and linguistic research.
Here is a PDF of the entire chat transcript: https://drive.google.com/file/d/1Y3XRtUpKOMEsWjxQgAIPBu-GR-NaWG40/view?usp=sharing
The Indus Valley civilization (c. 3300-1300 BCE) left behind thousands of seals with short inscriptions that have puzzled scholars for decades. Without a "Rosetta Stone" equivalent, these ancient texts remain silent witnesses to a sophisticated Bronze Age culture.
I began by showing O1 some seals and their script, then asked it to analyze patterns and propose theories about what the inscriptions might mean. Its initial response was methodical, breaking the analysis down into distinct categories:
"Sign Repetition and Symbol Clusters: The short strings of symbols on each seal typically feature a set of recurring motifs—forked shapes, plant-like icons, geometric forms, and simple line-and-dot patterns. The tight clustering and consistent size of these signs suggest that the script might be composed of a finite set of standardized symbols, possibly 300–400 in total, as researchers have cataloged."