Ithaca (DeepMind)

What is Ithaca?

Ithaca is a software tool, or more precisely an algorithm, developed by the British company DeepMind that uses machine learning to complete fragmentary ancient texts. It is used primarily in epigraphy, the study of inscriptions on various materials. The algorithm currently targets ancient Greek texts, but support for other languages is planned.

In addition to filling in gaps, Ithaca is also intended to estimate where a text originated and when it was composed. Dating such texts has often not been possible with common methods such as radiocarbon dating, because that method only works when the text is written on a carbon-containing material.

The algorithm was developed by DeepMind (known for AlphaGo, AlphaZero and AlphaFold) in cooperation with companies such as Google and universities such as the University of Oxford, the Athens University of Economics and Business and the Università Ca' Foscari di Venezia. The name Ithaca was chosen in homage to the island of the same name, the home of Odysseus in Homer's epics.

To make the algorithm accessible to as many people as possible, DeepMind has published the source code on the open-source version control platform GitHub, where it can be used and developed further.

What are the functions of DeepMind's new ML model?

The text completion algorithm works, like most algorithms that apply artificial intelligence, on the basis of probabilities. Ithaca draws on the largest digital database of ancient texts, maintained by the Packard Humanities Institute in California, to calculate which words most likely fit into the gaps. The database comprises almost 180,000 inscriptions, all of which carry metadata such as place and time.
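The idea of ranking gap fillers by probability can be illustrated with a deliberately simple sketch. It is not Ithaca's method (Ithaca is a neural network trained on the inscription database); it only counts how often matching words occur in a tiny made-up corpus. The function name `rank_completions`, the use of "-" as the marker for one missing character, the transliterated words and their frequencies are all assumptions made for this illustration.

```python
import re
from collections import Counter

def rank_completions(pattern, corpus_words, gap="-"):
    """Rank corpus words that fit `pattern`; `gap` stands for one missing character."""
    # Turn the damaged word into a regex: each gap matches any single character.
    regex = re.compile("".join("." if ch == gap else re.escape(ch) for ch in pattern))
    counts = Counter(w for w in corpus_words if regex.fullmatch(w))
    total = sum(counts.values())
    if total == 0:
        return []  # nothing in the corpus fits the surviving letters
    # Relative frequency among the matching words serves as a crude probability.
    return [(word, n / total) for word, n in counts.most_common()]

# Toy corpus of transliterated words with invented frequencies (illustration only).
corpus = ["demos"] * 3 + ["demou"] * 1 + ["demosia"] * 2 + ["boule"] * 4

ranked = rank_completions("demo-", corpus)
for word, probability in ranked:
    print(f"{word}: {probability:.2f}")
```

A frequency count like this ignores the surrounding sentence entirely; a learned model can additionally use the context around the gap, which is why the sketch should be read only as an illustration of "most probable candidate first".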

The result is then presented to experts, who assess the proposals against the original text and make the final decision on how to fill the gaps. In an experiment conducted by DeepMind, Ithaca alone recovered single words with an accuracy of 62 %, whereas historians working alone achieved only 25 %. When the experts worked with the programme's suggestions, the accuracy in recovering missing words rose to 72 %.

The "Lexicon of Greek Personal Names" (LGPN), a lexicon of Greek names compiled at the University of Oxford, supports the dating of a text and the identification of the region in which it was produced. The data shows which names were most common in which region at which time, which yields clues about where and when an ancient text was written. According to the published results, Ithaca attributes inscriptions to their region of origin with an accuracy of 71 % and dates them to within less than 30 years of the accepted date of origin.
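How name frequencies can hint at place and time can be sketched in a few lines. The example below is an assumption-laden illustration, not a description of how Ithaca or the LGPN work internally: the attestation records are invented, and the helper names `build_index` and `score_contexts` are hypothetical. Each name found in a text votes for the (region, period) pairs in which it is attested, weighted by its share of attestations there.

```python
from collections import Counter, defaultdict

def build_index(records):
    """Map each name to a Counter of the (region, period) pairs where it is attested."""
    index = defaultdict(Counter)
    for name, region, period in records:
        index[name][(region, period)] += 1
    return index

def score_contexts(names, index):
    """Sum, over the given names, each name's share of attestations per (region, period)."""
    scores = Counter()
    for name in names:
        attestations = index.get(name)
        if not attestations:
            continue  # a name unknown to the index contributes no evidence
        total = sum(attestations.values())
        for context, n in attestations.items():
            scores[context] += n / total
    return scores.most_common()

# Invented attestations (name, region, period), for illustration only.
records = [
    ("Kallias", "Attica", "5th c. BC"),
    ("Kallias", "Attica", "5th c. BC"),
    ("Kallias", "Delos", "3rd c. BC"),
    ("Philon", "Attica", "5th c. BC"),
    ("Philon", "Delos", "3rd c. BC"),
    ("Philon", "Delos", "3rd c. BC"),
    ("Philon", "Delos", "3rd c. BC"),
]

index = build_index(records)
best = score_contexts(["Kallias", "Philon", "Unknown"], index)
for (region, period), score in best:
    print(f"{region}, {period}: {score:.2f}")
```

With so few names the ranking is fragile, which illustrates why such evidence is best treated as one clue among several rather than as a verdict.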

How can researchers access Ithaca from DeepMind?

The algorithm can be accessed in several ways:

  • The source code is available on the open-source version control platform GitHub, where the libraries and other integration options for using Ithaca are also listed.
  • Ithaca was published in an article in the journal Nature. The publicly available article explains the working principle of the neural network, describes the methods used in detail and presents examples. It also draws the connection to Pythia, DeepMind's earlier neural-network-based system for text restoration.
  • The algorithm can also be accessed via DeepMind's Ithaca website, where it is freely and publicly available to researchers. An ancient Greek text can be entered into a text box, and the missing passages are then restored. The site also proposes a date and a place of origin for the text.
