Bridging the Gap – Reviewing the Getty / WMF Arches System

This week I participated on a panel (at the Computer Applications and Quantitative Methods in Archaeology Conference) talking about Linked Open Data in the context of the Arches Heritage Inventory and Management System, jointly developed by the Getty Conservation Institute (GCI) and World Monuments Fund (WMF). Designed initially to support immovable cultural heritage, the Arches system takes a new approach to information system design by implementing Open World ontologies at the core of a self-contained data management application. The first implementations of Arches use the CIDOC CRM for the underlying data models (for example, input screens are generated directly from the ontology model), making full use of the rich relationships and semantics that the CIDOC CRM has to offer and showing the full potential of the system. Arches represents a significant milestone in data management design because it confronts the issue of how internal information systems can connect more easily and meaningfully to Open Data environments, a connection that is necessary for collaboration.

We typically use traditional information systems that contain meaning and context in many different layers of the application. Meaning exists in the data, in the business rules and application logic, and in the user interface. Some meaning is also typically left implicit, understood only by the internal users of the system, who ‘fill in’ the semantic gaps.

Research data requires context. Actually, engagement data requires context! Typically, relatively little of this context makes it through to Linked Data publications. Data publication often means “Raw Data” taken straight from database files, while the essential meaning stays locked away in the other application layers. The people who create the models and publish the data are often not the data experts, and they don’t collaborate sufficiently with data ‘knowledge’ experts. As a result, the target LOD schemas and ontologies, and therefore the open data itself, lack context and meaning.
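To make the problem concrete, here is a minimal Python sketch of a “Raw Data” export. Everything in it is hypothetical and for illustration only: the record, the column names, the lookup tables and the two functions are my own assumptions, not taken from any real system.

```python
# Hypothetical record as stored in a traditional database table.
raw_row = {"id": 101, "name": "St Mary's", "type": 2, "cond": "B"}

# Meaning that lives only in the application layer (assumed lookup tables
# of the kind normally buried in business rules or user interface labels).
TYPE_LABELS = {1: "bridge", 2: "church", 3: "fortification"}
CONDITION_LABELS = {"A": "good", "B": "fair", "C": "poor"}


def publish_raw(row):
    """Export the row unchanged, the way a raw database dump would."""
    return dict(row)


def publish_with_context(row):
    """Export the row with the application-layer meaning resolved into the data."""
    return {
        "id": row["id"],
        "name": row["name"],
        "type": TYPE_LABELS[row["type"]],
        "condition": CONDITION_LABELS[row["cond"]],
    }


print(publish_raw(raw_row))
print(publish_with_context(raw_row))
```

In the raw export a reader sees only `type: 2` and `cond: "B"`. What those codes mean is known to the application and its internal users, and it never reaches the published data.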

The Arches system is different in that it is based on graph database principles and creates a user-oriented data management environment based on real-world ontologies like the CIDOC CRM. This means that context and meaning are built into the underlying database at the start (when the need for modelling is generally accepted) rather than interpreted and extracted at the end. Yes, it still means modelling, but all new database applications require data modelling. With Arches, and other CIDOC CRM applications, CRM models (templates) can potentially be published and reused, making the modelling process far more collaborative than traditional artificial Closed World modelling.
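As a rough illustration of what building meaning into the data looks like, the sketch below describes a heritage resource as a small graph of statements. The class and property names (E18 Physical Thing, E41 Appellation, E53 Place, E55 Type, P1, P2, P53) come from the CIDOC CRM; the identifiers, the prefixes and the `objects` helper are hypothetical, and this is not a description of how Arches itself stores its data.

```python
from collections import namedtuple

# A statement in the graph: subject, predicate, object.
Triple = namedtuple("Triple", "subject predicate object")

# Hypothetical identifiers; CRM class and property names are from the CIDOC CRM.
graph = [
    Triple("resource/101", "rdf:type", "crm:E18_Physical_Thing"),
    Triple("resource/101", "crm:P1_is_identified_by", "appellation/101"),
    Triple("appellation/101", "rdf:type", "crm:E41_Appellation"),
    Triple("appellation/101", "rdfs:label", "St Mary's"),
    Triple("resource/101", "crm:P2_has_type", "type/church"),
    Triple("type/church", "rdf:type", "crm:E55_Type"),
    Triple("resource/101", "crm:P53_has_former_or_current_location", "place/7"),
    Triple("place/7", "rdf:type", "crm:E53_Place"),
]


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.object for t in graph
            if t.subject == subject and t.predicate == predicate]


# The type of the resource is an explicit statement, not a coded column.
print(objects(graph, "resource/101", "crm:P2_has_type"))
```

Because each relationship is a named statement in the data, publishing the graph publishes the context along with it, with no lookup table or user interface needed to interpret it.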

By using an underlying data model that natively supports context and semantics, and by not locking that meaning away in other components that cannot easily be (or are not) transformed into data, Arches bridges the gap between Closed World information systems and contextual Open World requirements. Arches now joins a growing portfolio of innovative CIDOC CRM applications.