You can download our Encoding Guidelines (alongside XML files of all texts) here.
Our text encoding decisions were guided by several key aims:
- to produce text editions of documents that are useful for a range of audiences including students, the general public, and subject experts;
- to analyse and demonstrate the main structures and features of different sorts of Persianate documents; and
- to provide ways of understanding the interrelationships between different documents in the corpus and tools for further analysis.
The page for each text and the additional information pages have been produced in accordance with the following schema in order to support these aims.
All documents are transcribed according to the script, style, and spellings of the original texts, with the exception that words have been separated in the transcription of Hindi, Marathi, and Rajasthani texts and hyphens used to indicate where words cross lines for clarity of reading. In the main sections, the transcription follows the line breaks of the original text. Intentional gaps are indicated with a long underscore, while elevated text is rendered in reading order accompanied by a footnote describing the text’s elevated position. Unclear readings are indicated using square brackets, with alternate readings included if applicable, and the expanded forms of abbreviations are provided wherever possible. Gaps, whether due to illegibility or damage, are indicated by ellipses.
As far as possible while retaining coherence, the translations follow the line-by-line reading of the text. In order to achieve idiomatic English word order and greater comprehensibility, the translations of adverbial and prepositional phrases may appear a line below their position in the source text. In some cases, the translations depart slightly from the verbal tense or number of the source text. In particular, where the Persian texts use a plural verb conjugation to mark respect toward a single person, the editors have used singular pronouns and verbs in the English. As in the transcriptions, gaps and uncertain translations are indicated. Where the editor has inserted words for clarification or intelligibility, they are enclosed in square brackets.
Emic Terms and Glossary
Emic terms for official posts and ranks, document types, and units of territory and measure widely used in the expert literature are preserved in the translation. Brief glosses of these terms are provided in hover-over text, while more in-depth commentary is provided as necessary in the glossary. In the abstracts of documents, these terms are provided as parentheticals.
Emic terms are transliterated according to the Library of Congress (LOC) transliteration guidance for the respective language, without the use of diacritics except for the Perso-Arabic script letter ʿain, represented by ʿ. Where an emic term is used across multiple languages, the transliteration follows the pattern for Persian. For emic terms such as diwan that have entered English usage, standard English spellings are used. Names are transliterated without diacritics according to pronunciation and common forms. Thus, Abdullah is used, not ʿAbd Allah; Aurangzeb, not Awrangzib.
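The diacritic rule above can be illustrated with a minimal Python sketch. It is not the project's own tooling: the function name strip_diacritics is hypothetical, and the sketch assumes that ʿain is encoded as U+02BF and that diacritics decompose into Unicode combining marks. It does not handle the separate conventions for names or for terms with established English spellings.

```python
import unicodedata

# Assumption: ʿain is encoded as U+02BF (modifier letter left half ring).
# It is a modifier letter, not a combining mark, so it survives the filter.
AIN = "\u02bf"


def strip_diacritics(term: str) -> str:
    """Remove combining diacritics (macrons, underdots) but keep ʿain."""
    # NFD splits a letter such as "ā" into "a" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", term)
    # Category "Mn" covers the non-spacing combining marks.
    kept = [ch for ch in decomposed if unicodedata.category(ch) != "Mn"]
    return unicodedata.normalize("NFC", "".join(kept))


print(strip_diacritics("farmān"))   # farman
print(strip_diacritics("maḥẓar"))   # mahzar
print(strip_diacritics("ʿāmil"))    # ʿamil
```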
Display Style and Structure
The document text is rendered in reading order. In general, this means from top to bottom of the page recto (the main front side), then any marginalia on the recto, and lastly the verso (the back of the document). The text has been analysed according to its structure; each division of the text is clearly labelled and visually separated in the display. The text divisions in use are invocation, authorization, main text, witness statements, endorsements, particulars, clerical notes, and later additions. The text divisions are also labelled with the main language in use in the section. All divisions except for the main text are also labelled with their position on the page to help orient the reader to their location in the document image.
Many of the texts include visual features such as stamps and seals, logographs (symbols representing words), charts, and drawings. These are encoded as visual features and any text included in a figure or represented by a logograph is recorded both in the transcription and the translation. The translation also includes a description of the appearance of stamps, seals, and drawings. A general discussion of these figures and how to interpret them is presented separately.
Tables and Lists
A number of documents in the collection present particulars in tabular formats, for example of lands granted. Often, they follow well-established formatting norms described in siyaq accountancy manuals. However, they do not fully conform to the modern expectation of data presentation in a table, where quantities and details about a single type of entity are given in labelled columns. Therefore, we have decided to present the information in these ‘tables’ as nested lists instead. For an example see here.
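As a rough illustration of the nested-list approach, the Python sketch below models one such ‘table’ as a tree and renders it as an indented list. The structure, the render function, and all labels and quantities are invented placeholders for illustration; they are not taken from any document in the corpus and do not reflect the project's actual encoding.

```python
def render(node, depth=0):
    """Return the node and its descendants as indented list lines."""
    lines = ["  " * depth + "- " + node["label"]]
    for child in node.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines


# Placeholder data: each entry may carry its own sub-entries, which a
# fixed set of labelled columns could not represent without distortion.
grant = {
    "label": "Lands granted (placeholder)",
    "children": [
        {
            "label": "Village A",
            "children": [
                {"label": "arable: 10 units"},
                {"label": "waste: 2 units"},
            ],
        },
        {
            "label": "Village B",
            "children": [{"label": "arable: 5 units"}],
        },
    ],
}

print("\n".join(render(grant)))
```

A tree of this kind lets each entry have a different number and kind of sub-entries, which is the property that the documents' layouts share with nested lists rather than with modern tables.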
People and Places
Key people in the texts are listed separately and their roles in the document identified. The names of people and places in the texts and text summaries are encoded and rendered in a different colour from the surrounding text. Clickable links lead the interested reader to separate pages where additional information is provided. For individuals, this includes additional names such as titles and patronyms, sex, occupation, and social group, as well as a list of all other documents in which that person occurs. For places, a geo-reference is provided where the place can be accurately identified with a modern-day location and any epithets or alternate names are recorded.
Each text is labelled with its formal and functional document type. The formal type reflects the emic classification of documents in the Persianate world. Often, but not always, this term is included in the document itself. Functional type refers to the editors’ broad classification of the documents by their form and use: imperial and noble decrees, private deeds, tax receipts, administrative records, letters, and other. Links provide further information about each of the formal and functional types.
Each document has been assigned one or more key thematic terms from a controlled vocabulary. These terms provide another way to understand the nature of the documents presented on the site and their interconnections. They also facilitate searches for topics that are addressed in the documents but not expressed directly within the texts.
Each text is assigned to a document collection. While in some cases a collection comes entirely from one repository, in others it brings together documents related to the same family across multiple repositories. This feature allows the user to explore interrelated documents.
Each document is dated according to the date given in the text, with the calendar type specified, together with its conversion to the Gregorian calendar. Additional information on calendars and conversion is provided on the document features page. Where the document does not include a precise date, a date range is suggested based on details such as when document participants were known to be active. On the documents page, the texts are listed chronologically and the Gregorian year is shown.
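To give a sense of what a calendar conversion involves, the following Python sketch converts a Hijri date to the Gregorian calendar using the arithmetic (tabular) Islamic calendar with the civil epoch. This is an assumption made for illustration, not the method used by the editors: the function name hijri_to_gregorian is hypothetical, results can differ by a day or two from dates based on lunar observation, and other calendars that may occur in the documents are not covered.

```python
from datetime import date


def hijri_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a tabular Islamic (Hijri) date to a Gregorian date."""
    # Months alternate 30 and 29 days, starting with a 30-day month.
    days_before_month = 29 * (month - 1) + month // 2
    # Leap days accumulated so far in the 30-year tabular cycle.
    leap_days = (3 + 11 * year) // 30
    # 1948439 is one less than the Julian Day Number of 1 Muharram 1 AH
    # (civil epoch, 16 July 622 in the Julian calendar).
    jdn = day + days_before_month + 354 * (year - 1) + leap_days + 1948439
    # Python's ordinal 1 is 0001-01-01, which is Julian Day Number 1721426.
    # Note: dates before 1582 come out in the proleptic Gregorian calendar.
    return date.fromordinal(jdn - 1721425)


print(hijri_to_gregorian(1, 1, 1))     # 0622-07-19
print(hijri_to_gregorian(1400, 1, 1))  # 1979-11-21
```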