Welcome to the GA4GH Data Use and Researcher Identity (DURI) product line!
Vision
There are restrictions on the use of human genomics data that are derived from ethical requirements, including participants’ research consent. For example: “Data can only be used for breast cancer research with non-commercial purpose”. The current process for requesting access to data, which exists to ensure that studies are consistent with these restrictions, is inefficient and slows down science.
We envision a world where biomedical researchers will be able to efficiently discover genomics and health data, and then apply and get access automatically based on their digital identity and, if required, a machine-readable research purpose.
To achieve this, researchers need to have a reliable global electronic identity that is recognised for data access. In collaboration with the data authorities who are responsible for proper data use, this researcher identity would be authorised to access restricted data. We envision that this type of access could be allowed across a federated network of data repositories for researchers who have a trust attribute on their identities. For example, researchers would be able to query a network of repositories across country borders and find the genomes and phenotypes of all females aged 20-40 who have a BRCA mutation and whose data can be used to study cancer. Once they had found that data, they would be able to apply for access easily, and could often get access instantly based on a digital assessment of appropriate claims, trust credentials and agreed policy.
Mission
Our group’s mission is to create the standards that will enable automated data discovery and access, and to drive their evolution. We do this in partnership with driver projects that support the creation of our standards and promote their adoption.
Specifically, we:
- Establish a globally recognised researcher identity standard that human data service providers can rely on, and create a model to decorate these electronic identities with trusted claims/attributes.
- Develop a Data Use Ontology (DUO) that supports algorithm-based automated matching between a research purpose and data use restrictions on datasets.
- Establish a researcher Library Card/Access Passport standard, a digital identity that describes claims about the “bona fide” researcher status of an individual, enabling:
- A user to enter a ‘registered access’ environment in which they can make exploratory discovery queries without stating their research purpose
- An automated data access process where access decisions can be made based on a codified research purpose, the digital identity and ethics approval claims.
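As an illustration only, the automated matching described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the DUO or Library Card specification: the class names, fields and claim names (`bona_fide_researcher`, `ethics_approval`) are hypothetical, and the restriction model is reduced to a disease scope plus a non-commercial flag, mirroring the breast cancer example in the Vision section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUseRestriction:
    """Hypothetical, simplified restriction attached to a dataset."""
    permitted_diseases: frozenset = frozenset()  # empty set: no disease limit
    non_commercial_only: bool = False


@dataclass(frozen=True)
class ResearchPurpose:
    """Hypothetical machine-readable purpose declared by a researcher."""
    disease: str
    commercial: bool


@dataclass(frozen=True)
class ResearcherIdentity:
    """Hypothetical digital identity decorated with trusted claims."""
    subject: str
    claims: frozenset = frozenset()


# Assumed claim names; a real deployment would take these from agreed policy.
REQUIRED_CLAIMS = frozenset({"bona_fide_researcher", "ethics_approval"})


def purpose_matches(purpose, restriction):
    """Return True if the declared purpose is consistent with the restriction."""
    if restriction.non_commercial_only and purpose.commercial:
        return False
    if restriction.permitted_diseases and purpose.disease not in restriction.permitted_diseases:
        return False
    return True


def access_decision(identity, purpose, restriction, required_claims=REQUIRED_CLAIMS):
    """Grant access only if the identity holds every required claim
    and the research purpose matches the dataset restriction."""
    return required_claims <= identity.claims and purpose_matches(purpose, restriction)


# Example: a dataset limited to non-commercial breast cancer research.
restriction = DataUseRestriction(frozenset({"breast cancer"}), non_commercial_only=True)
researcher = ResearcherIdentity("researcher-1", REQUIRED_CLAIMS)

granted = access_decision(researcher, ResearchPurpose("breast cancer", commercial=False), restriction)
denied = access_decision(researcher, ResearchPurpose("breast cancer", commercial=True), restriction)
```

Here `granted` is true and `denied` is false. A real matcher would reason over the ontology hierarchy (for example, treating a specific disease as falling under a broader permitted category) rather than comparing plain strings.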
How to contribute
To become a member please email: melissa.konopko@ga4gh.org
Or CLICK HERE and choose “Apply for membership”.
2019 Proposed Goals
Draft goals (Basel Plenary 2019)
DUO:
- Release DUO V2
- Resolve the GRU vs. GRU-CC controversy based on input from driver project data stewards.
- Input from: dbGaP (NIH Data Sharing Council), TOPMED (Kathy Laurie), AoU (Laura R, Pearl O’Rourke), Sanger DAC for EGA, Australian Genomics, someone from THL?, who else?
- Have 2-3 software products adopt DUO for data discovery (e.g. an algorithm will match DUO terms describing research purpose and restrictions on data)
- Groups that have expressed interest
- DUOS for TOPMED/AnVIL (Moran)
- REMS (Tommi, Mikael)
- Google Cloud data access test bed (Aleksandra, Craig)
- EGA? (Melanie)
- Datadex (Francis Jeanson)
RI:
- Find 2 (or more) driver projects and **data stewards** that would like to enable controlled access or registered access decisions based on a library card
- Develop the RI claims to ensure they address the requirements of the 2 or more stewards
- Implement a test bed for the implementation (Google team expressed interest)
- Implement alpha versions with 2 driver projects that pilot this library card for sharing
- Implement a project where both EU and US identities are accepted (Elixir and eRA Commons?)
- Suggested projects
- TOPMED: automated discovery for NHLBI DataSTAGE pilot with a “data passport” issued on top of eRA identities
- Elixir + THL? (which dataset and stewards can we engage for this pilot?)
- CANDIG
- EGA registered access beacon
- NHGRI AnVIL data environment with DUOS
Current roadmap
Please see the global GA4GH strategic roadmap