Mission, Research Themes and Guiding Questions

Date: October 18, 2024
Version: 1.0
Status: Work in progress
Authors: Matthias van Rossum and Manjusha Kuruppath
Input received from: Sophie Arnoult, Arno Bosse, Mrinalini Luthra, Brecht Nijman, Kay Pepping, Lodewijk Petram, Merve Tosun, Henrike Vellinga, Stella Verkijk, Leon van Wissen; participants in the GLOBALISE researcher panel meetings


Introduction: Our Position and Contributions

The extensive and detailed archives of the Dutch East India Company (the Vereenigde Oost Indische Compagnie, or VOC) are invaluable for studying not only the VOC world itself but also the often under-documented societies of early modern Africa, Asia, and Australia. They shed light on the many local, regional, and global interactions between these regions (e.g., Kuruppath 2019). Funded to create a research infrastructure for these archives, the GLOBALISE project focuses on the Overgekomen Brieven en Papieren (OBP), which contains copies of reports, letters, and other documents from regions where the VOC was active. We make the OBP’s contents available by identifying entities and events in a full transcription of the archival text, further contextualised with high-quality reference data and documentation. This information is freely accessible in a user-friendly, advanced research environment. With these activities, the GLOBALISE project aims to foster new, diverse, and more representative histories based on VOC archives.

The project is conceived as both an infrastructural and archival contribution, with expected impacts on digital humanities and historiography.

  1. Moving Beyond Searchability to Researchability
    First, we aim to shift the focus of infrastructures for digitised archives from searchability (e.g., keyword search of transcribed text) to researchability. We achieve this by combining natural language processing and digital humanities technologies with ‘historical contextualisation’: we create high-quality, curated, and validated historical knowledge that is incorporated into domain-specific reference data and thesauri (for example, for polities, persons, commodities, and ships). This combination ensures that automatic methods for recognising and identifying events and entities in the digitised archival material produce observations relevant to historical research. Our aim is not merely to enable researchers and the broader public to perform free-text searches on a vast, rich archival corpus; we also provide contextualisation, enabling new ways of conducting historical research.

  2. Innovative, Inclusive Techniques
    Our mission lies in finding responsible and useful methods to combine historical and semantic contextualisation while simultaneously encouraging the ‘democratisation’ of research, data, and knowledge. We strive to keep technical barriers low for potential users, making our infrastructure accessible to academic and societal audiences worldwide. The acronym of our project, derived from its technical approach (the General Letters Ontology-Based Accessibility InfraStructurE [1]), also emphasises that globalisation has historically been unequal. The forces of colonialism and capitalism continuously produced connectedness alongside dispossession and exploitation. This applies to both ‘early globalisation’, which accelerated with European colonial expansion, and ‘high globalisation’, which began in the 1980s and continues in the digital era. The project name reminds us of these historical and contemporary inequalities, and we seek to move away from the exploitative aspects of globalisation by building and publishing our work in an open, reusable, and inclusive way. To this end, we actively reach out to scholars worldwide to incorporate multiple perspectives and scrutinise colonial biases that led to uneven knowledge production and its availability in digital form.

  3. Historiographic Intervention
    The longstanding disregard of the VOC’s colonial, expansionist, and violent character has led to a limited, persistently Eurocentric perspective in VOC historiography. While area studies, non-Western, and world history have used VOC archives to explore histories of local societies, polities, people, and their resilience, the dominant narrative still portrays the VOC as a ‘merchant’ focused on trade and shipping. This ignores the broader historical reality: the VOC’s colonial empire-building, its organisation of international political order (war and diplomacy), its administration of territorial possessions (rule and governance), and its regulation of commodity production (agriculture, mining, etc.). This revised perspective reframes the VOC as a colonial force acting as ‘military’, ‘government’, and ‘producer’. It also raises questions about the VOC’s impact on local people and societies and how these interactions shaped local, regional, and global histories (Van Rossum 2019).

This historiographic shift emphasises using VOC archives to study (a) local, global, and colonial political, economic, cultural, social, and everyday interactions in the early modern world and (b) the transformations many societies experienced as a result of these interactions. These questions on European colonial expansion and globalisation’s effects on local societies are crucial to academic and societal debates on the legacies of colonialism, capitalism, slavery, and racialisation. By utilising the VOC archives, the project aims to ‘unseat’ the VOC as the principal object of study, instead highlighting the politics and societies in the Indian Ocean World as historical subjects in their own right. To support this objective, the project is actively developing, collecting, and curating more data on non-European entities, such as persons, polities, commodities, ships, and communities.

Research Themes Guiding the Infrastructure

Rigorous historical and technical work is required to handle the abundance and heterogeneity of primary source material in the archives. Central to this effort is understanding the historical contexts and relationships between entities. This contextualisation requires a focused, domain-specific approach, but it is clear that GLOBALISE should not be limited to a single thematic research area. Instead, it should cover a well-considered selection of domains to serve a wide range of audiences and research interests.

Shaping our vision based on the three interventions described above, we have formulated the following guiding questions:

  • Why and how did interactions between (European) colonial actors and (non-European) local societies develop according to specific historical patterns?
  • How did expanding European influence impact political, societal, economic, environmental, and cultural developments in the wider Indian Ocean, Indonesian Archipelago, and East Asian Seas regions?
  • How can the VOC archives be used to their fullest potential in writing histories of local, non-Western societies?

To answer these questions, we have identified a series of themes and sub-themes that can stimulate a wide range of research into the social, political, economic, and cultural history of European expansion, non-Western societies, early globalisation, and related interactions. The key thematic domains we have defined are:

  • Political
  • Economic
  • Social
  • Cultural
  • Environmental
  • Other

These themes will guide the development of a research infrastructure allowing extraction of useful historical observations from the Overgekomen Brieven en Papieren. While we recognise that any selection is inherently subjective, we have based our choices on several factors: the content of the source material (what appears most frequently), the relevance of themes for the widest range of research interests (which domains and information are most essential for diverse questions and research types), and the potential for contributing new perspectives and histories (especially for studying local and non-Western societies, globalisation, and histories of colonisation and encounter).

Given the prominent place of political and economic history, it may appear that our work prioritises themes that have traditionally dominated historiography over the last two centuries. However, it is important to note that our focus on seemingly conventional themes is paired with deliberately unconventional efforts: (i) in the case of political history, we aim to highlight lesser-known Asian polities that have received minimal attention in traditional historiography; and (ii) we seek to mitigate the prevailing emphasis on Dutch trade and shipping by also examining the more invasive aspects of European colonialism, as well as the economic, political, and social activities of non-European actors.

From Themes to Research Practice

The GLOBALISE infrastructure enables both simple and sophisticated text searches on the resource corpus. Our temporary, basic transcription viewer already allows for this. In our future user interface, researchers will be able to query combinations of entities and events that yield results with contextual information. Notably, while these queries are descriptive (what, where, and when), researchers must further analyse and interpret the results to answer explanatory research questions (how and why).

Key infrastructure components that allow querying on entities and events, and provide contextualised results, include:

  1. Structured Historical Reference Data
    Reference data provides basic information about entities, such as early modern polities, rulers, locations, and persons, and also information about non-European weights, measures, currencies, calendars, and conversions between them.

  2. Thesauri and Vocabularies
    Thesauri and vocabularies provide definitions for many European and non-European concepts in the VOC archives, such as commodities, occupations, ethnicities, religious groups, and location types.

  3. Annotations
    Annotations link mentions of entities, concepts, and events in the archival materials to the reference data.

  4. Ontology
    The ontology structures relations between entities, concepts, and events.
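
As a rough illustration of how these four components could fit together, the following Python sketch models a reference entity, a thesaurus concept, an annotation, and an ontology-style relation. All class names, field names, identifiers, and the sample sentence are hypothetical and invented for this example; they are not taken from the GLOBALISE data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceEntity:
    """Curated record in the reference data (hypothetical fields)."""
    entity_id: str
    label: str
    entity_type: str  # e.g. "polity", "person", "ship"


@dataclass(frozen=True)
class Concept:
    """Thesaurus entry with an optional broader (parent) concept."""
    concept_id: str
    label: str
    broader: Optional[str] = None


@dataclass(frozen=True)
class Annotation:
    """Links a character span in a transcription to an entity or concept."""
    document_id: str
    start: int
    end: int
    target_id: str


@dataclass(frozen=True)
class Relation:
    """Ontology-style statement: subject --predicate--> object."""
    subject_id: str
    predicate: str
    object_id: str


# Invented sample records, for illustration only.
ENTITIES = {"polity:banten": ReferenceEntity("polity:banten", "Banten", "polity")}
CONCEPTS = {
    "concept:polity": Concept("concept:polity", "Polity"),
    "concept:sultanate": Concept("concept:sultanate", "Sultanate", broader="concept:polity"),
}
RELATIONS = [Relation("polity:banten", "has_type", "concept:sultanate")]

# Invented sentence standing in for a transcribed passage.
SAMPLE_TEXT = "Brief van de koning van Bantam"
_start = SAMPLE_TEXT.index("Bantam")
MENTION = Annotation("doc:example", _start, _start + len("Bantam"), "polity:banten")


def mention_text(text: str, ann: Annotation) -> str:
    """Return the surface form covered by the annotation."""
    return text[ann.start:ann.end]


def resolve(ann: Annotation) -> ReferenceEntity:
    """Follow the annotation to its curated reference record."""
    return ENTITIES[ann.target_id]
```

In this sketch, the annotation covers the surface form `Bantam` in the sample text, while `resolve(MENTION).label` gives the preferred label `Banten`, which is the kind of link between archival mentions and reference data that the Annotations component describes.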

The labour-intensive development of these components, especially high-quality reference data and ontologies, requires strategic choices about the scope and coverage of themes and contextualisation efforts. A set of guiding key research queries for each of the thematic domains helps direct this work, ensuring we provide the necessary elements to support a wide range of research questions.

Responsible Design and Use of the GLOBALISE Infrastructure

The questions outlined in this document guide the development of the infrastructure and are based on our knowledge of the source structure and content, as well as our understanding of how entity annotation, event annotation, and historical contextualisation can facilitate complex, historicised querying.

Accessibility and Project Accountability

We recognise the immense value of creating an infrastructure that allows users to ask precise questions of the archive. However, we are mindful of the challenges this endeavour brings. While the infrastructure can make the archive more accessible, it also influences what is searched and found, which histories are written, and which topics enter public debate. This concern is critical given the archive’s colonial bias. To ensure we remain accountable, we document all our work responsibly and strive to communicate the limitations and implications of our infrastructure clearly to users. Interim publications of curated data, published on our Dataverse, include rich documentation, and the final infrastructure will feature reflective documentation.

Similarly, a critical perspective is required when using the project’s curated historical data. This data is shaped by the worldviews, knowledge, and expertise of the project team and contributors. We strive for transparency about our positionality, adhering to up-to-date ethical standards. However, we acknowledge that neutrality is unattainable. Large-scale digitisation and enrichment efforts carry inherent pitfalls, and given the size of the archives, we must rely on machine learning for event and entity annotations, which we can check manually only in part. Transparency about our methods, our findings, and our efforts to ensure consistency and historical accuracy and to limit gaps and biases remains our priority.

This discussion of the potential drawbacks or shortcomings of our infrastructure is not intended as an excuse, but rather as a reminder and call to action: to engage with the archive responsibly, to design our infrastructure in an open and reflective manner, to check and improve the quality of its content, to instruct, educate, and challenge users to fully explore its potential (while remaining aware of the limitations), and to continue developing our source and methodological criticism.

Harnessing Big Data Responsibly

As researchers engage with the new possibilities of digital archives, we emphasise the need for critical methods when interpreting digital sources. Researchers should reflect on each document’s type, context, authorship, and purpose. The archives do not provide unfiltered historical truths; they offer fragmented observations and interpretations that require careful contextualisation. Advanced search functions enable researchers to extract snippets of information, but these should be understood as representations of historical realities shaped by the document’s context.

While the infrastructure project remains transparent about the processes, tools, and methods used, we cannot anticipate all forms of future usage. We call on users to engage responsibly and contextually, and we will work to provide guidance on how to do so.

Research Questions and Infrastructure Elements

What questions can be asked of the archive? What queries can be facilitated by more advanced forms of querying and research? In the second part of this document, we explain and illustrate the guiding research questions that our infrastructure supports. These guiding research questions are formulated as generalised queries. For instance, a researcher might ask, ‘Which polities were in conflict with the VOC between 1640 and 1690?’, or ‘At which moments was Banten conquered by other polities?’ These are all manifestations of our guiding key research question: ‘When and which polities (or rulers) were in conflict with other polities?’ These questions inform our infrastructure’s components, such as datasets, thesauri, and the event ontology.

For a user to find relevant information about these questions, the GLOBALISE infrastructure needs to provide the following elements:

  1. Reference Data
    Our dataset on early modern polities provides users with chronological information on polities like Banten, including locations, rulers, and periods of rule. If Sultan Agon appears in the archive, users will find contextual information identifying him as Banten’s ruler from 1651 to 1683. We also compile all name and spelling variations (e.g., Banten as Bantam, Banttam, Bantamm), making it easier for users to locate these entities, regardless of spelling differences in their queries.

  2. Thesaurus
    For the research questions mentioned here, we classify polities and rulerships into categories, allowing users to expand or narrow their searches.

  3. Annotations and Ontology
    We tag all mentions of polities in the archive, and our event detection process identifies events that involve actions such as attacking (which includes sub-classes such as starting a war, ending a war, besieging, and invasion), destroying, and relationship change. This enables us to plot all events suggestive of troubled interactions involving actors such as the polities Banten and the VOC, together with other relevant information such as when and where these events occurred. Users will therefore be able to sift through the archive and search for precisely those passages that signal conflict or conquest.
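
To make the interplay of these three elements more concrete, the sketch below shows one way a query such as ‘Which events signal conflict involving Banten between 1640 and 1690?’ could be answered. It is a minimal illustration under stated assumptions: the function names, the event-type identifiers, and the event records are hypothetical and invented for this example (they are placeholders, not findings from the archive or part of the GLOBALISE infrastructure). Only the spelling variants and the attacking sub-classes are taken from the description above.

```python
from dataclasses import dataclass

# Spelling variants mentioned in the text, mapped to one preferred label.
VARIANTS = {
    "banten": "Banten",
    "bantam": "Banten",
    "banttam": "Banten",
    "bantamm": "Banten",
}

# Hypothetical slice of an event-type hierarchy: child -> parent.
EVENT_PARENT = {
    "StartingWar": "Attacking",
    "EndingWar": "Attacking",
    "Besieging": "Attacking",
    "Invasion": "Attacking",
}

# Top-level event types treated as signalling conflict in this sketch.
CONFLICT_TYPES = ("Attacking", "Destroying", "RelationshipChange")


@dataclass(frozen=True)
class Event:
    event_type: str
    actors: frozenset
    year: int


def normalise(name: str) -> str:
    """Map a spelling variant to its preferred label, if one is known."""
    cleaned = name.strip()
    return VARIANTS.get(cleaned.lower(), cleaned)


def is_a(event_type: str, ancestor: str) -> bool:
    """True if event_type equals ancestor or is one of its sub-classes."""
    current = event_type
    while current is not None:
        if current == ancestor:
            return True
        current = EVENT_PARENT.get(current)
    return False


def conflicts(events, actor, start_year, end_year):
    """Select conflict-type events involving an actor within a year range."""
    preferred = normalise(actor)
    return [
        e for e in events
        if preferred in e.actors
        and start_year <= e.year <= end_year
        and any(is_a(e.event_type, t) for t in CONFLICT_TYPES)
    ]


# Invented placeholder records; years and event types are NOT historical data.
EVENTS = [
    Event("Besieging", frozenset({"Banten", "VOC"}), 1650),
    Event("Invasion", frozenset({"Banten", "VOC"}), 1700),
    Event("Trade", frozenset({"Banten", "VOC"}), 1660),
    Event("Destroying", frozenset({"Polity X", "VOC"}), 1670),
]

matches = conflicts(EVENTS, "Banttam", 1640, 1690)
```

With these placeholder records, `matches` contains only the first event: the query spelling `Banttam` resolves to `Banten`, `Besieging` counts as a sub-class of `Attacking`, and the other records are excluded by date, event type, or actor.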

In the following overview, we indicate for each guiding key research question how it relates to the different domains of our historical reference data for entities (as well as conversions and events) and to the relevant sections of the thesauri and event ontology. These guiding questions shape our efforts to design an infrastructure that facilitates targeted historical research, ensuring that the VOC archives are accessible for exploring specific themes and inquiries.

Researchers can use this overview to gain a clearer understanding of the types of research the GLOBALISE infrastructure is built to support. Each question aligns with one of the project’s five core domains—Political, Economic, Social, Environmental, and Cultural—that we identified as essential for studying the VOC’s historical impact. For each domain, we provide a list of ‘research ingredients’—specific data types, document categories, and resources within the infrastructure—that aid in locating relevant information.

The accompanying diagram provides an overview of this structure. The five research domains are represented in colored ellipses, each associated with guiding questions. Surrounding these domains are various components of the GLOBALISE infrastructure, displayed in white boxes. Colored diamonds indicate which components are relevant to each domain, illustrating how the infrastructure links diverse data elements to address a broad range of historical research needs.

GLOBALISE Research Themes diagram

Political

Scope

  1. History of colonial expansion
  2. History of non-European states
  3. History of diplomacy, political (and polity) relations
  4. History of conflict: war, genocide and depopulation

Research Queries

  • When and which polities (or rulers) were in conflict with other polities?
  • Where and when did polities (or rulers) make agreements with other polities (or rulers)? What types of contracts were employed?
  • What were the conditions of these contracts? How did polities (or rulers) negotiate or discuss contracts and terms? Where, when and how did polities enforce imposed duties or obligations from other polities?
  • What were the effects of polity expansion (VOC/other polity)? How, when and where did the VOC (or other polities) politically engage with local populations?
  • Which polities ruled where and when (over what locations or over whom)?

| Guiding Key Research Queries | Reference Data (Entities) | Thesauri and Concepts | Event Types |
| Where and when did a polity (or rulers) come into conflict with other polities? | Polities; Rulers; Rulerships; Polity Relations; Places; Temporal Expressions | Actors; Actor Roles; Relations; Temporal Units; Places | War; Conquest; Vassalage; Alliance; Treaty Events; Alliances; Death; Destruction; Mobilisation of Troops; Deposition; Enthronement; Succession; Imprisonment; Exile; Appointment; Election; Marriage |
| Where and when did polities (or rulers) make agreements with other polities (or rulers)? What kinds of contracts were made? | Polities; Places; Temporal Expressions | Actors; Actor Roles; Relations; Temporal Units; Places; Political Terminology (e.g., Firman, Grant, Contract, Sovereignty, Autonomy, Monarchy) | Treaty Events; Contract Conditions involving Trade, Plantations Crops, Political and Administrative Conditions, Embassies |
| What were the conditions of these contracts? How did polities (or rulers) negotiate or discuss contracts and terms? Where, when and how did polities enforce imposed duties or obligations from other polities? | Polities; Places; Temporal Expressions; Events | Actors; Actor Roles; Relations; Statuses; Temporal Units; Places; Political Terminology | Treaty Events |
| What were the effects of polity expansion (VOC/other polity)? How, where and when did the VOC (or other polities) politically engage with local populations? | Polities; Places; Temporal Expressions; Events | Actors; Actor Roles; Relations; Statuses; Temporal Units; Places; Political Terminology | Conquest; Subjugation; Policy Making; Treaty Events; Treaty Conditions |
| Which polities (or rulers) ruled over what locations or over whom? [Political Governance] | Polities; Persons; Places; Temporal Expressions | Actors; Actor Roles; Relations; Statuses; Temporal Units; Places | Appointments |

Economic

Scope

  1. History of economic production and global commodities (sugar, coffee, spices (pepper, cloves, cinnamon), indigo, etc.)
  2. History of trade (shipping, merchant networks)
  3. History of finance and taxation

Research queries

  • When, where and how were certain commodities produced, and by whom?
  • What was traded? In which quantities? When, where, by whom and by what means?
  • Where and when did merchants/mercantile actors interact? And how?
  • What was the value of certain commodities in specific regions and periods of time? How did the values of commodities develop over time?
  • Where and when were polities or populations taxed? What were the means of taxation (payment, labour) and how much?

| Guiding Key Research Queries | Reference Data (Entities) | Thesauri and Concepts | Event Types |
| When, where and how were certain commodities produced, and by whom? | Places; Temporal Expressions; Polities | Commodities; Actors; Statuses; Actor Roles; Temporal Units; Places; Location Terminology (e.g., Thuynvelden and Specerijperken) | Plantation/Crop Events (e.g., Aanplantingen, Kruidnageloogst, Indigoculture); Production/Manufacturing Events; Mining |
| What was traded? In which quantities? When, where, by whom and by what means? | Measures; Places; Temporal Expressions; Ships | Commodities; Actors; Statuses; Measures; Temporal Units; Places | Purchase; Trade Events; Smuggling (Morsserijen) |
| Where and when did merchants/mercantile actors interact? And how? [Merchant Networks] | Persons; Polities; Temporal Expressions; Places; Ships | Actors; Measures; Commodities; Relations; Temporal Units; Places; Financial Concepts (e.g., Profit, Loss, Interest, Loan, Payment) | Trade Events; Profit; Loss; Interest; Debt; Loan; Payment |
| What was the value of certain commodities in specific regions and periods of time? How did the values of commodities develop over time? | Measures; Temporal Expressions; Places | Commodities; Measures; Temporal Units; Places |  |
| Where and when were polities or populations taxed? What were the means of taxation (payment, labour) and how much? [Taxation] | Persons; Polities; Measures | Actors; Relations; Statuses; Measures; Commodities; Temporal Units; Places; Financial/Governance Concepts (e.g., Revenue Farming, Customs Farming, Customs Duties, Poll Tax, Import and Export Duties) | Revenue Farming; Customs Farming; Levies; Poll Tax; Import and Export Duties |

Social

Scope

  1. History of Slavery and Slave Trade
    Focus on slavery and slave trade within the Indian Ocean, Indonesian Archipelago, and East Asia, exploring themes like enslavement, bondage, and debt.

  2. History of Labour
    Examination of various forms of labor, including the roles of sailors, soldiers, artisans, agrarian workers, and female workers and merchants.

  3. Gender History
    Study of gender roles, gendered labor, and the representation of gender in historical records.

Research Queries

Slave Trade Patterns
  • How did the slave trade evolve across the Indian Ocean and Indonesian Archipelago?
  • Key questions include: transportation routes and timings, individuals and groups involved, and the origins, destinations, and identities of those enslaved.
  • How did enslavement practices vary by location and period?

Labour History
  • What were the roles and tasks associated with labor across time and place?
  • Focuses include: occupations, work locations, social groups engaged in labor, and instances of labor-related resistance or revolt.
  • Key questions: which groups were mobilized, when and where resistance occurred, and the connections between resistance and labor conditions.

Gender History
  • How is gender represented in the archives?
  • Investigates references to gender roles, descriptions of gender, and work or actions associated with specific genders.

| Guiding Key Research Queries | Reference Data (Entities) | Thesauri and Concepts | Event Ontology |
| Where and when were enslaved transported? From where to where? By whom? | Persons; Places; Temporal expressions; Polities; Ships; Measures | Actors; Statuses; Relations; Commodities; Measures; Temporal units; Places | Transport; Slave trade; Changing status; Enslavement; Setting free |
| Who were enslaved, where and when? How were people enslaved? | Persons; Places; Temporal expressions; Polities | Actors; Statuses; Relations; Temporal units; Places | Enslavement; Abduction; Conviction |
| Who worked where and when (occupations, tasks)? | Persons; Temporal expressions; Places; Ships; Polities | Statuses; Actors; Relations; Commodities; Occupations; Temporal units; Places | Employment; Conflict (e.g., riots, uprisings); Marriage (related to labor obligations) |
| Which populations/social groups were mobilized for what work? | Persons; Temporal expressions; Places; Ships; Polities | Statuses; Actors; Relations; Commodities; Temporal units; Places |  |
| Who resisted or revolted where and when? In relation to which work, obligation, polity, or event? | Persons; Temporal expressions; Places; Ships; Polities | Statuses; Actors; Relations; Commodities; Temporal units; Places | Revolt |
| What mentions of gender can be recovered from the archive? How are genders portrayed? What roles, work, or actions are gendered? | Persons; Temporal expressions; Places; Polities (e.g., female rulers, merchants) | Statuses; Relations; Actors; Commodities; Temporal units; Places | Marriage; Widowhood; Gendered labor |

Cultural

Scope

  1. Religious History
    Exploration of the historical spread and practice of religions in various regions.

  2. History of Intercultural Interaction
    Focus on how different cultures interacted, represented each other, and formed biases or stereotypes.

  3. History of Knowledge Circulation
    Study of how knowledge, including scientific information and cultural objects, circulated and was documented.

Research Queries

Intercultural Interaction
  • How were groups labeled and represented in various archival document types?
  • Focus on terms used to describe or label groups, examining potential biases, prejudices, and power dynamics.

Religious History
  • How and where were different religions practiced or spread?
  • Key questions: religious affiliations across regions, roles of religious figures, and religious concepts found in the archives.

Knowledge Circulation
  • How were natural or scientific objects transported or documented?
  • Focuses include: routes and contexts of scientific or botanical object exchange, documentation in various texts, and objects related to knowledge production, including maps, books, and scientific instruments.

| Guiding Key Research Queries | Reference Data (Entities) | Thesauri and Concepts | Event Ontology |
| What labels were imposed on groups where and when (representation, prejudices, biases)? | Polities; Persons; Temporal expressions; Places | Statuses; Documents; Places; Actors; Relations; Temporal units |  |
| Where and when were what religions followed or spread? | Persons; Temporal expressions; Places; Polities | Relations; Actors; Statuses; Places; Temporal units (including religious terms and roles like priests) | Conversion; Preaching; Religious instruction |
| Where and when were natural or scientific objects transported (in gift-giving, knowledge production)? Where and when were these documented? | Persons; Places; Temporal expressions (including specific groups like doctors, priests, artists, etc.) | Actors; Commodities; Documents; Places; Temporal units; Actor roles; Statuses; Occupations | Transportation; Documenting knowledge |

Environmental

Scope

  1. History of environmental change, weather, natural events
  2. History of animals
  3. History of famines

Research queries

  • How, when and where did natural disasters occur? What were their consequences and how were they coped with?
  • How did humans interact with animals?
  • When and where did famines occur? What information can be drawn about related phenomena in the form of war, food scarcity, impairment of food supply chains and extreme weather?
  • What weather patterns and climatic conditions were written about? How were they written about?

| Guiding Key Research Queries | Reference Data (Entities) | Thesauri and Concepts | Event Ontology |
| What weather patterns and climatic conditions were written about? |  | Fog; Mist; Rain; Wind; Heat; Seasons | Mentions of weather and climate; Climatic change; Impact |
| How, when and where did natural disasters occur? What were their consequences and how were they coped with? How did the environment and climate change in the early modern period? | Natural Disasters; Locations; Chronology | Monsoon; Seasons; Winds; Storms | Mentions of weather and climate; Climatic change; Impact |
| How did humans interact with animals? | Commodities; Persons; Groups | Commodities; Persons; Groups | Trade; Diplomacy (gifts); Death; Destruction; Hunting; Domestication; Draught animals |
| When and where did famines occur? What information can be drawn about related phenomena in the form of war, food scarcity, impairment of food supply and extreme weather? | Commodities; Groups; Persons; Locations; Polities | Commodities; Locations; Groups; Professions; Agriculture; Taxation; Finance | Death; Destruction; Scarcity; Shortage; Drought; Pestilence |

Other

Scope

  1. History of science (science)
  2. History of disease (medical)
  3. History of emotions (cultural)
  4. History of olfaction

Current Status

Our goal is to deliver high-quality, relevant data for the five prioritised domains. Datasets have been created or are in the making for polities (and rulers and dynasties), commodities, units of measurement, locations, and persons. We publish datasets on our Dataverse as soon as first versions are ready for the public. The social domain will be next on our list, after which we will explore what we can do for the environmental and cultural domains. The thesaurus for the different domains is currently being developed with colleagues from all over the globe. The entity recognition and event detection models are in full development, and the quality of the first entity recognition output, covering a wide variety of classes, is now being evaluated. For more information, visit our websites:
- GLOBALISE project website
- GLOBALISE GitHub environment

Please note that this document is not a final statement but a plan for developing themes, reference data, and queries over time based on project demands, infrastructure, and user interaction. We welcome any input, advice, and questions to improve our work.


  1. The ‘General Letters’ are summarising reports within the Overgekomen Brieven en Papieren (OBP) archival document series. Early in the project, we widened the scope to include the full OBP series. We think this better fits the aims of the project, as also described in this statement.