Data journey - excavation data goes to large dataset

Karin Lund

University of Southampton OCS (beta), CAA 2012

Last modified: 2011-12-18

Abstract

Excavation data have been collected and recorded digitally for decades, with or without prior knowledge of their final destination. The Intrasis documentation system, from the Swedish National Heritage Board, was created without a general storage destination in mind, that is, a place where data from different excavations are joined into a large dataset. The primary aim of the Intrasis data model was a fixed structure that still provides the tools to store any desired data. The final data model is therefore normalized and cannot be changed, but since it is object-oriented, the outcome for the user is fully flexible.

Only recently has a wider discussion started in Sweden about joining Intrasis data into large datasets containing data from every excavation and institution. This could be seen as a “simple” task, since approximately 90% of the archaeological excavations use Intrasis for their data collection. Technically it would, I suppose, be rather easy because of the normalized data model, but archaeologically it will be rather complicated, because collecting data from a site is at the same time an interpretation of the site. Different sites mean different attributes and metadata settings, which are necessary to make the site understandable. Different excavation methods are also reflected in the collected data. So there are hardly any common standard attributes in the Swedish Intrasis databases, apart from a basic feature interpretation (e.g. posthole, pit).

Another example is the finds section, where the situation is better, since the standard metadata template contains the information required by museums. However, there are no standard or fixed alternatives for different types of artifacts, e.g. flakes or ceramics, so the nomenclature used when registering finds varies. The quality of the data lies instead in the closely connected context information, which is available in the exported find tables only as a context ID. The close connection with the complete information about the context and its other relations is lost. It becomes impossible to answer a question like “What is the most common find in an Iron Age longhouse?” from find data stored at museums. This is a question that an Iron Age specialist could easily answer from experience, or that could be put to a single Intrasis database.
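
The loss described above can be illustrated with a small sketch. The record layout below is purely hypothetical (the field names `context_id`, `part_of`, `interpretation`, `type` and `period` are assumptions made for illustration, not the Intrasis schema), and the handful of records is invented example data. It only aims to show that the longhouse question can be answered while finds, contexts and their relations are kept together, and not from an exported find table that carries nothing but a context ID.

```python
from collections import Counter

# Hypothetical, simplified records; NOT the actual Intrasis data model.
structures = {
    1: {"type": "longhouse", "period": "Iron Age"},
}
contexts = {
    101: {"interpretation": "posthole", "part_of": 1},
    102: {"interpretation": "pit", "part_of": 1},
    201: {"interpretation": "pit", "part_of": None},  # not part of any structure
}
# Exported find table: the context survives only as an ID.
finds = [
    {"find_type": "ceramics", "context_id": 101},
    {"find_type": "ceramics", "context_id": 102},
    {"find_type": "flake", "context_id": 102},
    {"find_type": "flake", "context_id": 201},
    {"find_type": "flake", "context_id": 201},
]


def most_common_find(finds, contexts, structures, structure_type, period):
    """Return the most frequent find type within the given kind of structure,
    or None if no find can be linked to such a structure."""
    counts = Counter()
    for find in finds:
        context = contexts.get(find["context_id"])
        if context is None:
            continue  # the context ID cannot be resolved
        structure = structures.get(context["part_of"])
        if (
            structure is not None
            and structure["type"] == structure_type
            and structure["period"] == period
        ):
            counts[find["find_type"]] += 1
    return counts.most_common(1)[0][0] if counts else None


# Inside a single database the relations are still available.
with_context = most_common_find(finds, contexts, structures, "longhouse", "Iron Age")

# At the museum only the find table exists; context and structure data are gone.
without_context = most_common_find(finds, {}, {}, "longhouse", "Iron Age")

# Counting the find table alone answers a different question (all finds on site).
overall = Counter(f["find_type"] for f in finds).most_common(1)[0][0]
```

In this invented example the overall count and the count restricted to the longhouse give different answers, and without the context records the restricted question cannot be answered at all.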

So is it possible for the current Swedish excavation data to take a journey? In my paper I will explore some different paths that could be taken for the data journey from a single Intrasis excavation into large datasets.

Keywords

Intrasis;database