4 Oct 2019
On Friday, October 27, the GA4GH Data Use and Researcher Identity (DURI) Work Stream hosted the webinar “Automating access to human genomics datasets: the GA4GH Data Use Ontology in action.” More than 100 individuals tuned in to learn about the Data Use Ontology (DUO), a GA4GH standard for automating access to human genomics data. The webinar featured presentations from eight international speakers who have contributed to DUO’s development or implemented it at their local institutions. Another six implementers attended as panelists to answer audience questions following the presentations. Speaker slides and a recording of the webinar are available online.
The current version of DUO is always available at http://purl.obolibrary.org/obo/duo.owl. The ontology can be browsed online via the Ontology Lookup Service.
Melanie Courtot, metadata standard coordinator at EMBL’s European Bioinformatics Institute, began the webinar with a high-level overview of the DUO standard, describing the current model for depositing data, defining limitations on its use, and accessing it. This process, while common practice in genomic data sharing, is long and strenuous. Further, it does not scale, because access conditions are written in widely varying language and each request must be reviewed manually before access is granted or denied.
“A community of people came together to build, deliver, and deploy a standard, focusing on addressing their existing challenges for data access at scale. DUO builds on their expertise and pre-existing efforts in data access and use,” said Courtot, who leads the DUO development team.
Courtot explained the benefits of using an ontology such as DUO. DUO can be browsed via the Ontology Lookup Service, which renders a human-readable page for each term with a stable ID, a unique label, and an unambiguous definition. “Stability of IDs,” Courtot noted, “provides confidence to DUO users that DUO terms remain available and their meaning does not change over time.” Additionally, automated software can use DUO’s hierarchical tree structure to determine access permissions.
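To make the point about the tree structure concrete, here is a minimal sketch of how software might walk a term hierarchy to decide whether a requested use falls under a dataset's permitted use. The three labels and their parent links are a simplified assumption for illustration, not a copy of duo.owl, and the function names are hypothetical. A real check would also need to handle modifiers and, for disease-specific terms, the disease itself.

```python
# Toy parent links between data use permission terms.
# ASSUMPTION: labels and hierarchy are simplified for illustration only.
PARENT = {
    "DS": "HMB",   # disease-specific research, treated as a kind of health/medical/biomedical research
    "HMB": "GRU",  # health/medical/biomedical research, treated as a kind of general research use
    "GRU": None,   # root of this toy tree
}


def ancestors_and_self(term):
    """Return the term followed by each of its ancestors up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def is_permitted(dataset_term, requested_term):
    """Permit a request whose purpose equals the dataset's term or is a descendant of it."""
    return dataset_term in ancestors_and_self(requested_term)


if __name__ == "__main__":
    # A broadly consented dataset covers a narrower request, but not the reverse.
    print(is_permitted("GRU", "DS"))
    print(is_permitted("DS", "GRU"))
```

In this sketch a dataset tagged with a broad term covers any narrower request beneath it, which is the property that lets a machine answer the question without reading free-text conditions.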
Jonathan Lawson, Software Product Manager at the Broad Institute and co-lead of the product team, provided an outline of DUO in action using Broad’s Data Use Oversight System (DUOS) software. Lawson explained how the DUOS algorithm uses DUO to evaluate whether a data access request is compatible with the restrictions imposed on the data. Lawson further demonstrated how a researcher can submit a data access request (DAR) via DUOS using either a standard data access agreement or the forthcoming “Library Card,” a unique permission granted to a researcher by their institution’s signing official (ISO) to submit DARs to a Data Access Committee (DAC), eliminating the need for ISOs to approve each DAR individually.
Once the DAR is submitted, the DAC responds by either granting or denying access to the data via DUOS, with decision support from the DUOS algorithm, which uses DUO to evaluate the DAR alongside the human DAC.
The DUOS implementation at the Broad Institute has already succeeded with a pilot DAC: the human DAC and the DUOS algorithm agreed on whether to grant or deny all 38 data access requests submitted via DUOS.
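As a rough illustration of how such a pilot comparison could be tallied, the sketch below pairs an algorithmic decision with a human DAC decision for each request and counts how often they match. The `Review` record, the request IDs, and the sample decisions are all hypothetical; they do not reproduce the DUOS data model or the pilot's actual data.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One data access request with both decisions (hypothetical record layout)."""
    request_id: str
    algorithm_grants: bool
    dac_grants: bool


def agreement(reviews):
    """Return (number of matching decisions, total number of reviews)."""
    agreed = sum(1 for r in reviews if r.algorithm_grants == r.dac_grants)
    return agreed, len(reviews)


def disagreements(reviews):
    """Return the IDs of requests where the algorithm and the DAC differed."""
    return [r.request_id for r in reviews if r.algorithm_grants != r.dac_grants]


# Made-up sample data, for illustration only.
sample = [
    Review("DAR-1", True, True),
    Review("DAR-2", False, False),
    Review("DAR-3", True, False),
]

if __name__ == "__main__":
    print(agreement(sample))
    print(disagreements(sample))
```

Listing the disagreements separately is the useful part in practice, since those are the requests a committee would want to look at again.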
Soichi Ogishima, a DUO implementer from ToMMo, Tohoku University, explained how the Japan Agency for Medical Research and Development (AMED) Biobank Network is implementing DUO to promote the use of data and of biospecimens stored in biobanks. Researchers using the biobank will encode their intended research use using DUO, then find and apply to access the appropriate datasets using the AMED Biobank cross-search system.
GEM Japan has benefitted from the standard because DUO provides a framework that simplifies and shortens the data access process for biobank users, particularly in industry research, Ogishima explained.
Tiffany Boughtwood of Australian Genomics gave webinar participants a view of DUO from a plain-language, participant-oriented perspective. In explaining how Australian Genomics has utilized DUO in their participant portal, CTRL (“control”), Boughtwood showed a use for DUO in research prior to data collection.
CTRL is an online platform that research participants use for dynamic, granular choice and consent around the use of their data for future research. The DUO system provides a framework for the patient portal, allowing participants to determine who can be granted access to their data.
“Applying DUO gives us confidence around the future proofing of data access,” said Boughtwood. “We really value the opportunity to contribute to the development and piloting of GA4GH standards because it allows us to make sure the outcomes will fit our research.”
Aina Jene, bioinformatician at the Center for Genomic Regulation, gave an overview of the implementation and use of DUO at the European Genome-phenome Archive (EGA). EGA has started implementing DUO codes within its policy structure and will eventually build DUO into the submitter portal so users can add these codes themselves.
Jene also gave webinar attendees high-level instructions for data discovery using DUO codes in the EGA, either by searching for datasets tagged with DUO codes or by searching for specific DUO codes.
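The two kinds of search described here can be pictured with a small in-memory catalog. This is only a sketch: the dataset IDs, the code labels, and the function names are invented for illustration, and it does not use or represent the EGA's actual search interface.

```python
# Hypothetical catalog: each dataset carries a set of data use codes.
# ASSUMPTION: IDs and code labels are placeholders, not real EGA records.
DATASETS = [
    {"id": "DATASET-A", "duo_codes": {"GRU"}},
    {"id": "DATASET-B", "duo_codes": {"DS", "NPU"}},
    {"id": "DATASET-C", "duo_codes": set()},
]


def tagged(datasets):
    """Return IDs of datasets that carry at least one data use code."""
    return [d["id"] for d in datasets if d["duo_codes"]]


def with_code(datasets, code):
    """Return IDs of datasets tagged with one specific code."""
    return [d["id"] for d in datasets if code in d["duo_codes"]]


if __name__ == "__main__":
    print(tagged(DATASETS))
    print(with_code(DATASETS, "NPU"))
```

Because the codes are drawn from a fixed vocabulary, both searches reduce to set membership, with no need to interpret free-text policy wording.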
Hayley Clissold, a policy officer for the Data Access Committee (DAC) at the Wellcome Sanger Institute (WSI), discussed how her team has applied DUO codes to nearly 300 cancer datasets, a third of the datasets in WSI’s data archive. Applying DUO to the remaining datasets, Clissold explained, will be more time-consuming because their usage restrictions were entered manually as free text by research administrators and must be carefully reviewed to ensure the correct DUO terms are used.
In the future, WSI plans to train research administrators to use the DUO tags to describe usage restrictions upon submission, so the datasets are properly tagged as soon as they are available for sharing. Using DUO at WSI increases the findability of datasets, thereby promoting their reuse.
Mikael Linden of ELIXIR AAI presented the full data access request process, leveraging both Researcher Passports and DUO for authentication and data access authorization. Researcher Passports, another product in development from the DURI Work Stream, are used to describe an authenticated researcher’s properties and help data holders manage data access.
Linden compared the role of DUO to that of Researcher Passports through three phases: discovery, Data Access Committee oversight (DACO), and data access/use, and highlighted how they play complementary roles in streamlining access to data.
Adrian Thorogood, who manages the GA4GH Regulatory and Ethics Work Stream, spoke about aligning consent language with DUO. Thorogood announced the recently updated GA4GH Consent Policy and described its role in respecting the autonomous decisions of data subjects and increasing transparency regarding how data is shared.
Thorogood discussed the importance of adopting consent language that maps to DUO, and how this will impact data collection and sharing in the future. Aligning consent language to DUO encourages researchers to prepare and clarify in advance how the participant’s data will be shared and accessed. In turn, such mapping reduces the burden faced by Institutional Review Boards (IRBs) and DACs when trying to interpret consent at the point of data release.
“Actively exploring how to improve the consent process now that DUO is here will really enable the effectiveness of DUO to ensure data are shared maximally while respecting commitments made to participants,” said Thorogood.
Thorogood stressed the importance of exercising care when drafting consent forms, both for future and for previously collected data. Being too restrictive with consent drafting or interpretation restricts legitimate data sharing and slows research, while being too liberal can lead to data leakage that harms participant privacy and trust.
The presentations were followed by a brief session in which presenters and panelists took questions from webinar viewers. Melissa Konopko, who manages the DURI Work Stream, invited viewers to get involved with the development and implementation of DUO and future GA4GH standards, either by accessing the DUO GitHub repository or by reaching out to Konopko directly.
Panelists:
Pinar Alper — Data Steward, Luxembourg Center for Systems Biomedicine (LCSB)
Anthony Brookes — Professor of Genomics and Bioinformatics, University of Leicester
Francis Jeanson — Founder, Datadex Inc
Giselle Kerry — EGA Project Coordinator & Senior Helpdesk Officer, EMBL-EBI (EGA)
Kathy Reinold — Principal Data Modeler, Broad Institute of MIT and Harvard
Heidi Sofia — Program Officer, NIH National Human Genome Research Institute (NHGRI)