XML-TEI-Bible

This project involves encoding the Spanish Bible in TEI. The main focus is on structure (chapters, verses and pericopes), entities (places, people and groups), and communication (who communicates with whom and how).


This project involves applying some of the annotation possibilities offered by the Text Encoding Initiative (TEI) to the Bible. It contains all 66 books of the Bible in Spanish. The main goal is to manually encode the entities and communication of the Bible.

Entities

For entities, the <rs> element is used together with a reference to an ID in the @key attribute. This ID distinguishes between the main types of entities, such as people (including divine entities, such as angels and gods), places (both terrestrial and non-terrestrial, known and unknown), groups, and other entities. This ID enables the identification of different entities and, for example, the analysis of how frequently entities (such as God, Jesus or Israel) are referred to in the Bible. It also enables the different books of the Bible to be described by looking at the number of entities per verse, as shown in Figure 1.

Figure 1: Bar plot of the relative number of entities per verse in each book of the Bible, by type of entity
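
The entity counts described above can be sketched in a few lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not the project's actual code: the sample text is invented, the `<ab type="verse">` wrapper for verses and the key values (`per1`, `pla1`, `org1`) are hypothetical, and only the use of `<rs>` with `@key` comes from the description above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample. The <ab type="verse"> wrapper and the key values are
# assumptions for illustration, not the project's real encoding or IDs.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <ab type="verse" n="1"><rs key="per1">Dios</rs> oye a <rs key="per2">Abraham</rs> en <rs key="pla1">Egipto</rs>.</ab>
  <ab type="verse" n="2"><rs key="per2">Abraham</rs> ve a <rs key="org1">Israel</rs>.</ab>
</body>"""


def count_entities(xml_text):
    """Return (mentions per entity ID, mean number of mentions per verse)."""
    root = ET.fromstring(xml_text)
    verses = root.findall(".//tei:ab[@type='verse']", TEI_NS)
    # Collect the @key of every <rs> element that carries one.
    keys = [rs.get("key") for rs in root.iterfind(".//tei:rs", TEI_NS) if rs.get("key")]
    per_verse = len(keys) / len(verses) if verses else 0.0
    return Counter(keys), per_verse


by_key, per_verse = count_entities(SAMPLE)
print(by_key.most_common(1), per_verse)
```

Computing the same ratio per book and per entity type would give the kind of values plotted in Figure 1.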

Communication

The communication annotation records who is communicating with whom, and how. To achieve this, the <q> element is used with the attributes @who, @toWhom and @how. In fact, this project provided the basis for proposing a new @toWhom attribute for the TEI, which was accepted after discussion. The @how attribute can take different values, such as 'oral', 'prayer', 'written', and 'song'.
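
As an illustration of this annotation, the following sketch reads the three attributes from `<q>` elements with Python's standard library. The sample verse is invented, and the pointer values `#per1` and `#per2` are hypothetical stand-ins for whatever IDs the project actually uses; only the element and attribute names come from the description above.

```python
import xml.etree.ElementTree as ET

Q_TAG = "{http://www.tei-c.org/ns/1.0}q"

# Invented sample; the "#per1" / "#per2" pointers are hypothetical IDs.
SAMPLE_Q = """<ab xmlns="http://www.tei-c.org/ns/1.0" type="verse" n="1">
  <rs key="per1">Pedro</rs> dijo:
  <q who="#per1" toWhom="#per2" how="oral">Ven conmigo.</q>
</ab>"""


def extract_quotes(xml_text):
    """Return one dict per <q> element with speaker, addressee, mode and text."""
    root = ET.fromstring(xml_text)
    return [
        {
            "who": q.get("who"),
            "toWhom": q.get("toWhom"),
            "how": q.get("how"),
            "text": "".join(q.itertext()).strip(),
        }
        for q in root.iter(Q_TAG)
    ]


for quote in extract_quotes(SAMPLE_Q):
    print(quote)
```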

An interesting aspect of biblical communication is that nested communication is very common. For example, Person 1 tells Person 2 that Person 3 told Person 1 to tell Person 2 that Person 3 would like to tell Person 2 that... In other words, there are many quotations within quotations. Nested communication is especially common in the prophetic books. Figure 2 illustrates how frequently direct communication occurs in the different books and how deeply it is nested, reaching up to five levels.

Figure 2: Bar plot of the number and level of direct quotation in all books of the Bible
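
One way to measure nesting is to walk the XML tree and count how many `<q>` ancestors each quotation has. The sketch below is an assumption about how such a count could be done, not the method used for Figure 2; the sample and its IDs are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

Q_TAG = "{http://www.tei-c.org/ns/1.0}q"

# Invented sample with three levels of nested quotation; IDs are hypothetical.
SAMPLE_NESTED = """<ab xmlns="http://www.tei-c.org/ns/1.0">
  <q who="#per1" toWhom="#per2">Ve y di:
    <q who="#per3" toWhom="#per2">El rey dice:
      <q who="#per4" toWhom="#per2">Volved.</q>
    </q>
  </q>
  <q who="#per2" toWhom="#per1">Voy.</q>
</ab>"""


def quote_levels(element, depth=0, counts=None):
    """Count <q> descendants of `element` by nesting level (1 = outermost)."""
    counts = Counter() if counts is None else counts
    for child in element:
        if child.tag == Q_TAG:
            counts[depth + 1] += 1
            quote_levels(child, depth + 1, counts)
        else:
            # Other elements do not add a level, but may contain quotations.
            quote_levels(child, depth, counts)
    return counts


levels = quote_levels(ET.fromstring(SAMPLE_NESTED))
print(dict(levels), "deepest level:", max(levels))
```

Note that the function counts only descendants of the element it is given, so it should be called on a container such as a chapter or book rather than on a `<q>` itself.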

The information about the communication process can also be represented as a network, as in Figure 3 for the Gospel of Matthew.

Figure 3: Direct communication in the Gospel of Matthew
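
A network like this can be derived by treating each quotation as a directed edge from speaker to addressee and counting repetitions as edge weights. The following is a minimal sketch of that idea, not the code behind Figure 3: the sample and IDs are invented, and it assumes that @who and @toWhom may hold several whitespace-separated pointers, as TEI pointer attributes allow.

```python
import xml.etree.ElementTree as ET
from collections import Counter

Q_TAG = "{http://www.tei-c.org/ns/1.0}q"

# Invented sample; the pointer values are hypothetical IDs.
SAMPLE_TALK = """<ab xmlns="http://www.tei-c.org/ns/1.0">
  <q who="#per1" toWhom="#per2 #per3" how="oral">Venid.</q>
  <q who="#per1" toWhom="#per2" how="oral">Mira.</q>
  <q who="#per2" toWhom="#per1" how="oral">Voy.</q>
</ab>"""


def communication_edges(xml_text):
    """Return a Counter mapping (speaker, addressee) pairs to quotation counts."""
    root = ET.fromstring(xml_text)
    edges = Counter()
    for q in root.iter(Q_TAG):
        speakers = (q.get("who") or "").split()
        addressees = (q.get("toWhom") or "").split()
        # One weighted edge per speaker/addressee combination.
        for speaker in speakers:
            for addressee in addressees:
                edges[(speaker, addressee)] += 1
    return edges


for (speaker, addressee), weight in communication_edges(SAMPLE_TALK).items():
    print(speaker, "->", addressee, weight)
```

The resulting weighted edge list can be loaded into any graph tool for layout and visualisation.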

Publication in the TextGrid Repository and XSLT Transformation

The project aims to take advantage of the TextGrid Repository's features, such as persistent identifiers, import into other platforms and catalogues, use of European public infrastructure, access via APIs and Python libraries, and straightforward transfer to other tools.

As the reading experience is an important aspect of the project, and the TextGrid Repository transforms TEI files into HTML files to enable reading, I use a project-specific XSLT transformation. This transformation highlights some metadata at the beginning of each file and then uses different typographical means to show the main structure (book, chapters, pericopes, verses and verse IDs), as well as the annotation of entities and communication. To achieve this, colours and typographical options such as bold and italics are used together with mouse-over functions that show the exact content of the attributes.

References

The project and annotation were completed by me, Dr José Calvo Tello. Please bear in mind that this was an experiment to see if a more structured approach to the Bible could reveal interesting insights, and that I have no training in theology or religious studies.

This project used the Reina-Valera 1995 translation. The text is in Spanish because it is my mother tongue.

The project was previously published in a GitHub repository, which is rather disorganised but contains much more information, code and output data. I will likely prepare more documents and import them into the TextGrid Repository.

If you want to reference the project in the TextGrid Repository, here is a suggestion:

To gain more insight into the project, check out the following publications for which I created the illustrations shown on this landing page:

  • Calvo Tello, José. 2023. ‘Text Encoding Initiative (TEI) como formato para datos cualitativos a escala cuantitativa: el caso de XML-TEI Bible’. In Literatura, didáctica y humanidades digitales: aportaciones para la docencia y la investigación, edited by Pedro Mármol Ávila. Dykinson. https://doi.org/10.14679/2124.