Using SPARQL to Access Data Stores in Healthcare

Abstract

Objectives and Motivation

Implementing semantic web technology in healthcare requires access to the existing IT infrastructure and data stores, a problem related to clinical harmonization. A very important standard in healthcare is DICOM (Digital Imaging and Communications in Medicine), which covers diagnostic imaging and workflows for medical devices. PACS (Picture Archiving and Communication System) servers provide access to the data through a network protocol standardized by DICOM. There are plans to store semantic information directly in the DICOM data and to reuse the existing infrastructure for storage and transmission.

This leaves the problem of how a semantic application should access this data, including querying with SPARQL, inferencing with ontologies encoded in OWL, and federation with other semantic data sources.

In principle, there are two different approaches: either the database of the PACS is completely converted into RDF, or SPARQL queries are converted to the query protocols defined by DICOM.

Methods and Tools

The prototype introduced in this presentation is called SeDI (Semantic DICOM). It uses Jena as the RDF framework, Joseki to implement a SPARQL endpoint, and Pellet as the reasoner. As PACS, SeDI uses dcm4chee, which has no built-in support for semantic applications. SeDI makes it possible to query a PACS with SPARQL: it transforms a SPARQL query directly into a DICOM C-FIND or C-MOVE request, as appropriate, and converts the result from the DICOM protocol into a SPARQL result set. Semantic applications can therefore access a PACS directly, without exporting the existing PACS data to a triple store. The DICOM data model is encoded in an ontology, on which SeDI depends to a large extent for transforming the query and for adding semantically meaningful information to the query result.
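The translation idea can be illustrated with a small, self-contained Python sketch. It is not SeDI's implementation (SeDI builds on Jena and an OWL ontology); the namespace `http://example.org/dicom#`, the property names, and both function names are assumptions made up for this illustration. Only the DICOM attribute keywords and the convention that a zero-length C-FIND key requests an attribute without constraining it come from the DICOM standard. Triple patterns are given as tuples rather than parsed from SPARQL text.

```python
# Hypothetical namespace for a DICOM ontology (an assumption for this sketch).
DCM_NS = "http://example.org/dicom#"

# Hypothetical ontology properties mapped to real DICOM attribute keywords.
PROPERTY_TO_KEYWORD = {
    DCM_NS + "patientName": "PatientName",            # (0010,0010)
    DCM_NS + "patientID": "PatientID",                # (0010,0020)
    DCM_NS + "studyDate": "StudyDate",                # (0008,0020)
    DCM_NS + "studyInstanceUID": "StudyInstanceUID",  # (0020,000D)
}


def patterns_to_cfind_keys(patterns):
    """Turn (subject, predicate, object) triple patterns into C-FIND keys.

    Objects starting with "?" are variables and become zero-length return
    keys; all other objects become matching keys. Returns the identifier
    and a mapping from DICOM keyword to variable name.
    """
    identifier = {"QueryRetrieveLevel": "STUDY"}
    variables = {}
    for _subject, predicate, obj in patterns:
        if predicate not in PROPERTY_TO_KEYWORD:
            raise ValueError("no DICOM attribute known for " + predicate)
        keyword = PROPERTY_TO_KEYWORD[predicate]
        if obj.startswith("?"):
            identifier[keyword] = ""      # return key: request the value
            variables[keyword] = obj[1:]  # remember which variable to bind
        else:
            identifier[keyword] = obj     # matching key: constrain the value
    return identifier, variables


def responses_to_bindings(responses, variables):
    """Convert C-FIND responses (one dict per match) to variable bindings."""
    return [
        {var: response.get(keyword, "") for keyword, var in variables.items()}
        for response in responses
    ]


# Triple patterns as they might appear in a query for one patient's studies.
patterns = [
    ("?study", DCM_NS + "patientID", "12345"),
    ("?study", DCM_NS + "studyDate", "?date"),
    ("?study", DCM_NS + "studyInstanceUID", "?uid"),
]
identifier, variables = patterns_to_cfind_keys(patterns)

# Stand-in for what a PACS might return; no network access happens here.
fake_responses = [
    {"PatientID": "12345", "StudyDate": "20100301", "StudyInstanceUID": "1.2.3"},
]
bindings = responses_to_bindings(fake_responses, variables)
```

In a real system the identifier would be sent to the PACS as a C-FIND request, and the ontology, rather than a hard-coded dictionary, would drive the mapping and supply the additional semantic information mentioned above.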

Results

The main advantage of this approach is that the existing IT infrastructure remains unaltered, since SeDI uses the DICOM protocol access that the systems already provide. A semantic application can access the DICOM data as if the PACS were a SPARQL endpoint, use inferencing on the data, and federate it with other data sources. Furthermore, there is no need to synchronize the original data source with a triple store, because the original data source is accessed directly and nothing is exported. A drawback of this approach is that queries run more slowly than against a native triple store, and an ontology containing all DICOM concepts is rather difficult to model.

Conclusions

For the future, a standardized DICOM ontology is desirable so that queries can use concepts from defined namespaces. This could even become part of the DICOM standard itself. Furthermore, unobtrusive access to legacy databases whose data is not encoded in RDF seems to be the natural way to evolve the semantic web. SeDI proves this concept for medical data encoded in DICOM by seamlessly integrating a PACS as a SPARQL endpoint in a semantic web application.