GA4GH TES API: bringing compatibility to task execution across HPC systems, the cloud and beyond

9 Mar 2021

Recently approved by the GA4GH Standards Steering Committee, the Task Execution Service (TES) API v1 provides a standard mechanism for orchestrating complex genomic analyses across different compute environments.

Federated analysis of data distributed across the world can make genomics research more powerful by connecting multiple large-scale datasets for simultaneous analysis.

Such investigations utilize complex methods, such as aligning multiple sequences to the human reference genome to identify potentially pathogenic variants. These analyses often involve up to hundreds of thousands of computational tasks, which can take considerable time and compute power to execute. Recently approved by the GA4GH Standards Steering Committee, the Task Execution Service (TES) API v1 provides a standard mechanism for orchestrating these complex analyses across different compute environments.

To support large-scale federated analysis, institutions and organizations employ queueing systems that send tasks out to high-performance computing (HPC) systems or cloud environments. But each compute system is unique, and each cloud vendor uses incompatible APIs for running batch tasks. Because of these discrepancies, researchers carrying out federated analysis must write separate code for each environment.

The TES API adds to the suite of standards produced by the GA4GH Cloud Work Stream, whose mission is to help the genomics and health community take full advantage of modern cloud environments by bringing algorithms to data that cannot be moved due to various regulatory limitations.

“By building analysis software with the TES API, researchers can quickly move from a university cluster, to Amazon, to Microsoft Azure, without changing their code,” said Kyle Ellrott, Assistant Professor of Computational Biology at Oregon Health & Science University and co-lead of the TES API development team. “With the TES API, moving large-scale batch computing between private computers and the cloud becomes seamless.”

On the backend, the TES API wraps around an institution’s HPC system or cloud environment, and then manages the deployment, scheduling, running, and clean-up of tasks while providing status updates and logging information back to the researcher.

For example, if a researcher is running genomic analysis pipelines, they may send out a thousand task requests, usually with the help of a workflow engine. The workflow engine, which may be custom-made or from an existing software project, needs a way to talk to the local compute resources. The TES server accepts the requests, communicates with the local job queueing system, and tracks progress and output. This is all done in a single API that looks the same no matter what infrastructure manages the computational resources. Thus, the TES API provides a flexible and standardized approach to connect complex workflow engines to new compute systems, saving time and resources.
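To make the request flow above more concrete, here is a minimal Python sketch of how a client might describe one task and prepare the HTTP request that creates it. The field names (`inputs`, `outputs`, `resources`, `executors`) and the state names follow the TES v1 task schema as we understand it. The `/ga4gh/tes/v1` path prefix, the server URL, the container image, and the storage URLs are assumptions for illustration, and the helper function names are hypothetical. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request

# Assumed path prefix; some servers expose TES under a different prefix,
# so check the documentation of the server you are targeting.
TES_BASE_PATH = "/ga4gh/tes/v1"

# Task states after which no further progress is expected.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def build_task(name, image, command, input_url, output_url):
    """Return a TES task document as a plain dict (hypothetical helper)."""
    return {
        "name": name,
        # Files the server stages into the container before it runs.
        "inputs": [{"url": input_url, "path": "/data/input.txt"}],
        # Files the server copies out of the container afterwards.
        "outputs": [{"url": output_url, "path": "/data/output.txt"}],
        "resources": {"cpu_cores": 1, "ram_gb": 2.0},
        # Each executor is one container invocation; stdout is captured to a file.
        "executors": [
            {"image": image, "command": command, "stdout": "/data/output.txt"}
        ],
    }


def build_create_request(server_url, task):
    """Prepare (but do not send) the POST request that creates a task."""
    return urllib.request.Request(
        url=server_url.rstrip("/") + TES_BASE_PATH + "/tasks",
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_finished(state):
    """True once a task has reached a terminal state."""
    return state in TERMINAL_STATES
```

A workflow engine would typically send the prepared request with `urllib.request.urlopen`, read the task ID from the response, and then poll the task until `is_finished` returns true. The same task document can be sent unchanged to any TES server, whether it fronts an HPC scheduler or a cloud batch service.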

Furthermore, the TES API can help extend systems that provide the Workflow Execution Service (WES) API, another GA4GH Cloud standard. While the WES API orchestrates a series of steps in a workflow, the TES API can connect the workflow to a compute backend to execute specific steps. So when a researcher takes their WES-enabled workflow engine to a new computational environment, they can plug into the local TES API without having to write new adaptors.

“This concept of pluggable compute backends is key to the TES API,” said Ania Niewielska, Lead Software Engineer at EMBL-EBI and co-lead of the TES API development team. “Since many existing workflow engines have already implemented the TES API, adding support for a new compute backend, such as a new cloud provider, can be achieved through a single TES implementation—instead of writing separate implementations for each workflow engine. Additionally, TES backends can be implemented in the technology of choice, independent from the tech stack used for the workflow engines.”
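The pluggable-backend idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular workflow engine is built: `TaskBackend`, `InMemoryBackend`, and `run_steps` are hypothetical names, and the in-memory class stands in for a real TES server by marking every task `COMPLETE` immediately. The point is that the engine-side code depends only on the interface, so swapping the backend requires no change to it.

```python
from abc import ABC, abstractmethod


class TaskBackend(ABC):
    """Hypothetical interface mirroring two TES operations: create and get."""

    @abstractmethod
    def create_task(self, task: dict) -> str:
        """Submit a task and return its ID."""

    @abstractmethod
    def get_state(self, task_id: str) -> str:
        """Return the current state of a task."""


class InMemoryBackend(TaskBackend):
    """Stand-in for a real TES server; marks every task COMPLETE at once."""

    def __init__(self):
        self._states = {}

    def create_task(self, task):
        task_id = f"task-{len(self._states) + 1}"
        self._states[task_id] = "COMPLETE"
        return task_id

    def get_state(self, task_id):
        return self._states[task_id]


def run_steps(backend, steps):
    """Submit each step in order and return {step name: reported state}."""
    results = {}
    for step in steps:
        task_id = backend.create_task(step)
        results[step["name"]] = backend.get_state(task_id)
    return results
```

Calling `run_steps(InMemoryBackend(), [{"name": "align"}, {"name": "call"}])` exercises the engine logic without any network access. A backend that talks to an actual TES server over HTTP could implement the same two methods and be passed in instead.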

“The European life science infrastructure is very fragmented,” said Alexander Kanitz, co-lead of the ELIXIR Cloud project, a GA4GH Driver Project. “The TES API offers a means to abstract over different compute backends in an effort to federate the execution of computational workflows across the various nodes, from hospitals to research centers. This is one of the key reasons why we chose to implement the TES API.”

Joris Vankerschaver, Manager of Strategic Technologies and Life Science Solutions at Enthought, said, “The TES API allows us to ‘code against TES’ rather than against a particular environment. This is important when working with clients who may have a mixture of on-premise servers and cloud resources, or who are interested in moving to the cloud.”

The TES API was also designed with real-world constraints in mind. “The complexity of moving health-generated data for secondary research purposes is immense, due to patient privacy, security, and legal considerations,” said Leslie Glass, Project Manager at EMBL-EBI where she leads the CINECA project. “We chose to implement the TES API to help ensure that these data do not become siloed and inaccessible for research.”

Many workflow engines, including Cromwell, Nextflow, and Snakemake, have already begun to support the TES API. In the future, the team plans to expand support for the API and to focus on compatibility with other GA4GH standards, including the Data Repository Service (DRS) API and the GA4GH Passports Specification to manage authentication and authorization.
