Modeling Experimental Data – basic description

October 22, 2010

The schema used in Tardis is based on the Core Scientific Metadata Model (CSMD) developed for the ICAT project.

At the simplest level, the experimental data is a collection of files (Datafiles), which are grouped into Datasets, which are in turn grouped into Experiments:

Figure: Tardis high-level data model

(Please note that the diagram above shows only part of the schema.)

At the top level, Tardis stores a flat list of Experiments. Each Experiment contains one or more Datasets, and each Dataset contains one or more Datafiles.

At each level (Experiment, Dataset and Datafile), user-defined parameters may be added, grouped into Parameter Sets.
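The hierarchy described above can be sketched in a few lines of Python. This is only an illustration of the containment relationships, not the actual Tardis schema: the class names follow the post, but every field name (title, description, filename, schema, parameters) and the helper count_datafiles are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ParameterSet:
    """A named group of user-defined parameters (hypothetical structure)."""
    schema: str  # identifier of the schema these parameters follow (assumed)
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class Datafile:
    filename: str
    parameter_sets: List[ParameterSet] = field(default_factory=list)


@dataclass
class Dataset:
    description: str
    datafiles: List[Datafile] = field(default_factory=list)
    parameter_sets: List[ParameterSet] = field(default_factory=list)


@dataclass
class Experiment:
    title: str
    datasets: List[Dataset] = field(default_factory=list)
    parameter_sets: List[ParameterSet] = field(default_factory=list)


def count_datafiles(experiment: Experiment) -> int:
    """Total number of Datafiles across all Datasets of an Experiment."""
    return sum(len(dataset.datafiles) for dataset in experiment.datasets)


# Example: datasets grouped as a time sequence (file names are invented).
experiment = Experiment(
    title="Artificial aging of a material",
    datasets=[
        Dataset(
            description="Sample A, 0 hours",
            datafiles=[Datafile("scan_0001.dat"), Datafile("scan_0002.dat")],
        ),
        Dataset(
            description="Sample A, 48 hours",
            datafiles=[Datafile("scan_0003.dat")],
            parameter_sets=[ParameterSet("example:aging", {"aging_hours": "48"})],
        ),
    ],
)
print(count_datafiles(experiment))
```

Parameter Sets appear at all three levels here, which is the only part of the model that is user-extensible in this sketch; the containment itself is fixed.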

Tardis doesn’t impose any interpretation on what is considered an Experiment or Dataset. Datasets may be grouped, for example, by sample, by instrument settings, or as a time sequence (e.g. artificially aging a material and investigating the effects).

In the last post I listed two metadata hierarchies: (1) the Core, Discipline and Project hierarchy from the University of Southampton, and (2) the Core, Instrument and Science hierarchy from STFC. The core metadata schema is hard-coded in Tardis. The Instrument, Science and Project schemas can all be implemented using Parameter Sets.

Metadata strategy

October 6, 2010

The University of Southampton data management project has proposed a three-level metadata strategy (see their blog entry “Metadata strategy”):

  1. Project
  2. Discipline
  3. Core

Tardis is based on the Core Scientific Metadata Model (CSMD) developed within the Science & Technology Facilities Council (STFC). One metadata hierarchy STFC has adopted is (turned upside down to match Southampton’s):

  1. Science Specific
  2. Instrument Specific
  3. Core

(This reminds me of Robert Pirsig’s Intellectual Scalpel)

We’re extending Tardis for use within the Australian Synchrotron and ANSTO, where the STFC model is more appropriate.  However, institutional use of Tardis may also be project based.

Tardis supports configurable schemas (parameter sets) at the experiment, dataset and datafile levels. Appropriate use of the configurable schemas should allow us to handle both models, or a combined model.
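As a rough sketch of what a combined model might look like, the Python below tags each parameter set with a schema identifier and maps that identifier to a level in either hierarchy. Everything here is hypothetical: the schema identifiers, the SCHEMA_LEVELS mapping, the parameter names and the function group_by_level are invented for illustration and do not reflect how Tardis actually stores or names its schemas. The Core level is left out because it is fixed in Tardis itself rather than supplied through parameter sets.

```python
from typing import Dict, List, Tuple

# Hypothetical schema identifiers mapped to the level they belong to.
# Levels from both hierarchies (STFC and Southampton) can coexist.
SCHEMA_LEVELS: Dict[str, str] = {
    "example:instrument/beamline": "Instrument Specific",
    "example:science/crystallography": "Science Specific",
    "example:discipline/materials": "Discipline",
    "example:project/aging-study": "Project",
}


def group_by_level(
    parameter_sets: List[Tuple[str, Dict[str, str]]]
) -> Dict[str, Dict[str, str]]:
    """Merge parameters from (schema, parameters) pairs under their level."""
    grouped: Dict[str, Dict[str, str]] = {}
    for schema, parameters in parameter_sets:
        level = SCHEMA_LEVELS.get(schema, "Unclassified")
        grouped.setdefault(level, {}).update(parameters)
    return grouped


# One dataset carrying parameter sets from both hierarchies at once.
dataset_parameter_sets = [
    ("example:instrument/beamline", {"wavelength_angstrom": "0.95"}),
    ("example:project/aging-study", {"aging_hours": "48"}),
]
levels = group_by_level(dataset_parameter_sets)
print(sorted(levels))
```

Because the level is derived from the schema identifier rather than from a fixed column, a facility could use the instrument and science levels while an institution uses project and discipline levels, without changing the core model.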

Using a Core Scientific Metadata Model in Large-Scale Facilities

July 22, 2010

Thanks to the UKOLN News Feed for pointing to the International Journal of Digital Curation, Vol. 5, No. 1. It contains a paper titled “Using a Core Scientific Metadata Model in Large-Scale Facilities”. The paper provides a good overview of the CSMD schema, which is “a model for the representation of scientific study metadata developed within the Science & Technology Facilities Council (STFC) to represent the data generated from scientific facilities”.