Data Access Recommendations

This section of the framework makes recommendations for describing and documenting policies related to the retrieval of a digital object for research, and subsequent operations performed on that digital object. In the CDIF framework, we use the term ‘access’ to include the activities connected to the initial retrieval of a digital object, and ‘usage’ to include all subsequent operations performed on that digital object. Access and usage policies (herein ‘access policies’), when defined, are typically unstructured and bespoke. Data providers may not make access policies explicit, and when they do, they tend to invent new policies locally. Data users therefore encounter different policy content and structure at every access-related interaction across the science system. Any kind of aggregation or orchestration of data across providers is stymied by a data access policy environment that is incoherent in terms of existence, coverage, content, and machine-actionability.

Objective

The objective of the Data Access Profile in the Cross Domain Interoperability Framework (CDIF) is to progress to a more structured and standardised, machine-actionable approach. The benefits and convenience for data requestors accrue through:

For data custodians, the structured and standardised approach to data asset permissions through a structured ontology (e.g., ODRL) provides:

The goal is interoperable, machine-actionable expression of data access policies. It is important to note that this is not the entirety of ‘Access’ from a plain English or FAIR perspective. Accessing sensitive data is a complicated, multi-faceted, multi-party process involving, for example:

In scope for this recommendation are:

Useful enablers of an access policy but ‘out of scope’ for this recommendation are:

High-Level Recommendation

To promote interoperability and mutual intelligibility around access conditions, CDIF recommends use of the Open Digital Rights Language (ODRL) to describe data asset access policies. ODRL is an RDF-based, ‘widely adopted language for expressing permissions, obligations, and conditions related to digital rights’ (Policy Patterns for Usage Control in Data Spaces, 2023) that can be serialised in JSON and XML. While minimal in terms of classes and relationships, it nonetheless allows sufficient flexibility through the use of constraints, refinements, and descriptive logic. ODRL also allows straightforward extensions to the core model and vocabulary with ODRL Profiles.
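To make the recommendation concrete, the following is a minimal sketch of what a JSON-serialised ODRL policy might look like, built here in Python. It is an illustration under stated assumptions rather than a CDIF-endorsed template: the helper name `build_policy` and all `https://example.org/...` identifiers are hypothetical, while the terms `Set`, `permission`, `target`, `action`, `use`, `constraint`, `leftOperand`, `purpose`, `operator`, `eq`, and `rightOperand` are taken from the ODRL core model and vocabulary.

```python
import json

# JSON-LD context published by W3C for the ODRL vocabulary.
ODRL_CONTEXT = "http://www.w3.org/ns/odrl.jsonld"


def build_policy(policy_id: str, asset_id: str, purpose_id: str) -> dict:
    """Return an ODRL 'Set' policy permitting use of one asset for one purpose.

    Hypothetical helper for illustration; all identifiers are supplied by the caller.
    """
    return {
        "@context": ODRL_CONTEXT,
        "@type": "Set",
        "uid": policy_id,
        "permission": [
            {
                "target": asset_id,
                "action": "use",
                # The constraint narrows the permission to a stated purpose.
                "constraint": [
                    {
                        "leftOperand": "purpose",
                        "operator": "eq",
                        "rightOperand": {"@id": purpose_id},
                    }
                ],
            }
        ],
    }


# Example identifiers are placeholders, not real CDIF or provider IRIs.
policy = build_policy(
    policy_id="https://example.org/policy/1",
    asset_id="https://example.org/dataset/survey-2020",
    purpose_id="https://example.org/purpose/academic-research",
)

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

A data custodian could publish such a document alongside dataset metadata so that a requestor's software can read the permitted action and its purpose constraint without parsing free text. More specialised conditions would be expressed through additional constraints, refinements, or an ODRL Profile.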

Risks and Enablers for CDIF Using ODRL

The existence of ODRL as a well-supported W3C standard is a key ‘enabler’ for CDIF, given CDIF’s stated principle of using existing, well-supported standards wherever possible. The first risk with ODRL is that it doesn’t cover all scenarios. The scenario-based discussion in the following sections aims to tease out what can be done with ODRL immediately and what scenarios might need further extensions or new approaches. Our conclusion is that ODRL and its extension/profiling capability is quite useful as is. The second risk is the barrier to adoption of ODRL. There is an urgent need for tooling to simplify how domain scientists and data curators use this mature W3C standard. The information model, ontology, and linked data implementation are beyond the capability of many scientists and infrastructure service providers who could benefit from standardised access policies at scale. This risk is both high and likely and currently has no mitigation.