Assessment of core CDISC2RDF schemas

From PhUSE Wiki

News

26 June, 6pm CET, Teleconf/Webex: Intro/overview of the RDF Data Cube and DDI-RDF Discovery Vocabulary. For notes and links, see References below.

11 June, 1pm CET, Teleconf/Webex: Intro/overview of the ISO11179 model/ontology behind the SemanticMDR from the SALUS project. For the recording, slides and related links, see References below.


Background

The first versions of the core CDISC2RDF schemas were intentionally developed to represent a minimal part of the ISO11179 model for metadata registries. Assessing them is an important task, best suited to a team of semantic modeling experts that is smaller than the full FDA/PhUSE team.

The review should also position the core schemas in relation to other semantic web representations of the full ISO model, and in relation to other models for describing research data. The output is a recommendation of potential updates, with their pros and cons, for the next version of the RDF representations of the CDISC standards.

The assets to review are the core schemas in the overview below, that is, the metadata model schema (mms) and the controlled terminology schema (cts). These are available in the CDISC2RDF Google Code project: https://code.google.com/p/cdisc2rdf/

CDISC2RDF Overview.jpg

Figures from http://www.slideshare.net/kerfors/cdisc2-rdf-overveiw


Metadata model schema (mms)
Represents the core Data Description part of the ISO11179 model.

CDISC2RDF metadata model schema.jpg


ISO11179 model, Part 3, Registry metamodel and basic attributes

ISO11179 Data Description metamodel.jpg
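As a rough illustration of what the Data Description part of ISO11179 covers, the sketch below models its main classes as plain Python dataclasses: a data element pairs a data element concept (object class plus property) with a value domain, and both refer to a conceptual domain. The class and field names here are simplifications chosen for this example, not the property names used in the mms schema, and the consistency rules in `check_data_element` are an assumed reading of the metamodel rather than a normative check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualDomain:
    name: str
    value_meanings: tuple = ()  # the meanings that values may express


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str  # e.g. "Subject"
    prop: str          # e.g. "Sex"
    conceptual_domain: ConceptualDomain


@dataclass(frozen=True)
class PermissibleValue:
    value: str    # the code as stored in the data
    meaning: str  # the value meaning it stands for


@dataclass(frozen=True)
class ValueDomain:
    name: str
    conceptual_domain: ConceptualDomain
    permissible_values: tuple = ()


@dataclass(frozen=True)
class DataElement:
    name: str
    concept: DataElementConcept
    value_domain: ValueDomain


def check_data_element(de):
    """Return a list of consistency problems (empty if none were found)."""
    problems = []
    # Assumed rule: concept and value domain share one conceptual domain.
    if de.concept.conceptual_domain != de.value_domain.conceptual_domain:
        problems.append(
            "conceptual domain mismatch between concept and value domain"
        )
    # Assumed rule: every permissible value maps to a known value meaning.
    allowed = set(de.value_domain.conceptual_domain.value_meanings)
    for pv in de.value_domain.permissible_values:
        if pv.meaning not in allowed:
            problems.append(f"value {pv.value!r} has unknown meaning {pv.meaning!r}")
    return problems
```

A hypothetical "SEX" data element, for instance, would combine the concept (Subject, Sex) with a value domain whose permissible values "M" and "F" point to the meanings "Male" and "Female"; `check_data_element` returns an empty list for it.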


Controlled terminology schema (cts)
Uses the metadata model schema (mms) and adds a few properties to represent the existing NCI Thesaurus export.

CDISC2RDF controlled terminology schema.jpg


Project Plan

  • Teleconf./WebEx in June to get an overview of potential additional/alternative schemas and approaches; see References below.
  • Draft list of issues with, and ideas to improve, the core CDISC2RDF schemas in early Sept.
  • Draft version in early Sept. of a recommendation of any potential updates of the schemas together with the pros and cons of these for the next version of RDF representations of the CDISC standards.
  • Final report late Sept.


References

SemanticMDR
https://github.com/srdc/semanticMDR/ is a full implementation of the ISO model as an OWL ontology (direct link to ISO11179 ontology). It was developed by the EU FP7 project SALUS (Scalable, Standard based Interoperability Framework for Sustainable Proactive Post Market Safety Studies).
See also the Convergence meeting across six research projects targeting semantic interoperability for clinical research and patient safety (IMI and FP7 projects: SALUS, EHR4CR, Open PHACTS, Linked2Safety, eTRIKS, EURECA).

From WebEx/TC 11 June
Streaming recording
Download recording
Slides: File:SemanticMDR.pptx
Blog post: Semantic Metadata Registry/Repository, introducing the main capabilities of the Semantic MDR with four screencasts.


The RDF Data Cube
http://www.w3.org/TR/vocab-data-cube/ is an RDF vocabulary for publishing multidimensional data, particularly statistical data. It is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), a widely used ISO standard. The Data Cube Vocabulary brings essential SDMX elements to RDF, providing a standard way for governments to publish statistical information as Linked Data.

DDI-RDF Discovery Vocabulary
A metadata vocabulary for documenting research and survey data http://www.ddialliance.org/Specification/RDF

From WebEx/TC 26 June (only notes, no recording)

RDF Data Cube vocabulary, W3C Government Linked Data (Dave Reynolds)
- Summary-level data
- UK Government; based on SDMX; statistical/observational data; multi-dimensional data; every cell value is an identified resource; values, attributes, slices of data, datasets; candidate recommendation (stable)
- Applications: e.g. Eurostat, COINS (26 dimensions), weather forecasts, World Bank, WHO etc.
- Well-formed data cube, abbreviate, …
- Attributes such as units of measure: the data cube is agnostic; common practice is to use the QUDT ontology
- Standard libraries in e.g. R and for visualization tools (e.g. CubeViz); importers/exporters for e.g. SAS datasets
See also:
Specification: http://www.w3.org/TR/vocab-data-cube/
Implementations: http://www.w3.org/2011/gld/wiki/Data_Cube_Implementations
Presentation: http://www.slideshare.net/der42/linked-data-hypercubes
Use case: Towards Next Generation Health Data Exploration: A Data Cube-based Investigation into Population Statistics, http://www.hicss.hawaii.edu/hicss_46/bp46/hc6.pdf
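To make the "every cell value is an identified resource" point from the notes concrete, here is a minimal sketch in plain Python that builds the triples for a single data cube observation and prints them as N-Triples. The `qb:` namespace, `qb:Observation` and `qb:dataSet` are taken from the W3C specification; the `http://example.org/study/` namespace, the dimension and measure names, and the helper functions are hypothetical and only serve the illustration. A complete cube would also need a `qb:DataStructureDefinition` declaring its dimensions and measures, which is left out here.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
QB = "http://purl.org/linked-data/cube#"  # RDF Data Cube namespace
EX = "http://example.org/study/"          # hypothetical namespace


def observation_triples(dataset_id, obs_id, dimensions, measure, value):
    """Return (subject, predicate, object) triples for one cube cell."""
    obs = EX + obs_id  # the cell itself gets its own IRI
    triples = [
        (obs, RDF_TYPE, QB + "Observation"),
        (obs, QB + "dataSet", EX + dataset_id),
    ]
    # One triple per dimension value that locates the cell in the cube.
    for dim, val in sorted(dimensions.items()):
        triples.append((obs, EX + dim, EX + val))
    # The measured value of the cell.
    triples.append((obs, EX + measure, value))
    return triples


def to_ntriples(triples):
    """Serialize triples; string objects are treated as IRIs in this sketch."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, float):
            obj = f'"{o}"^^<{XSD}double>'
        elif isinstance(o, int):
            obj = f'"{o}"^^<{XSD}integer>'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    cell = observation_triples(
        "ds1", "obs1", {"sex": "F", "arm": "placebo"}, "meanWeight", 70.5
    )
    print(to_ntriples(cell))
```

Attributes such as the unit of measure would be attached to the observation in the same way, with one more triple per attribute.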

Data Documentation Initiative (DDI Alliance)
- “Micro-level data”, code book for surveys
- 10 years, from the ISO11179 community

DDI-RDF Discovery Vocabulary (Arofan Gregory)
- Companion to the RDF Data Cube for micro-level data
- Structure of the data, agnostic to the domain
See also: http://www.ddialliance.org/Specification/RDF#disco

XKOS (Extended SKOS for statistical data)
- Statistical classifications
See also: http://www.ddialliance.org/Specification/RDF#xkos


See also the references to standards to consider, especially for questionnaires and analysis data, listed in this Google+ post from Laurent Lefort: https://plus.google.com/102357888530456587158/posts/QknVEa1F2zC
Laurent has also been working on semantic web representations of CDISC standards and of actual clinical data; see http://ceur-ws.org/Vol-952/paper_7.pdf