Overview

The workflows package supports running Big Data Genomics tools using Toil, either in local mode or in distributed mode on AWS, on an autoscaled or statically provisioned cluster.

Currently, we support the following workflow:

Additional workflows are in development, and alpha versions of them are packaged and made available in this repository. Because they are alpha, these workflows are undocumented and we do not currently support them.

Preparing to Run on AWS

If you plan to run the workflows on your local machine, you only need to ensure that Toil is installed. If you are running on Amazon Web Services (AWS), you need to install Toil with the AWS extra and then use Toil to launch a cluster in AWS. Once the cluster is up, SSH into the cluster leader and install the bdgenomics.workflows package in a virtualenv on the leader. After these steps, you can run your desired workflow.
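The AWS preparation steps above can be sketched as a short Python helper that only assembles the commands, without running them. This is a rough sketch under stated assumptions, not the project's own tooling: the helper name `aws_setup_commands`, the cluster name, zone, key pair name, leader node type, and virtualenv directory are all hypothetical placeholders, and the Toil options shown (`--provisioner`, `--zone`, `--keyPairName`, `--leaderNodeType`) may differ between Toil versions, so check `toil launch-cluster --help` for your installation. The package name is passed in as a parameter and should be checked against the name published for this repository.

```python
import shlex


def aws_setup_commands(cluster_name, zone, key_pair, package,
                       leader_node_type="t2.medium"):
    """Assemble (but do not run) the commands for the AWS setup steps.

    All argument values are placeholders chosen by the caller; the
    default leader node type is an illustrative assumption.
    """
    # Commands to run on your own machine.
    local = [
        # Install Toil with the AWS extra.
        ["pip", "install", "toil[aws]"],
        # Launch the cluster leader in AWS (flags may vary by Toil version).
        ["toil", "launch-cluster", cluster_name,
         "--provisioner", "aws",
         "--zone", zone,
         "--keyPairName", key_pair,
         "--leaderNodeType", leader_node_type],
        # Open a shell on the cluster leader.
        ["toil", "ssh-cluster", "--provisioner", "aws",
         "--zone", zone, cluster_name],
    ]
    # Commands to run on the cluster leader after connecting.
    leader = [
        # Create a virtualenv that can still see the leader's Toil install.
        ["virtualenv", "--system-site-packages", "venv"],
        # Install the workflows package into that virtualenv.
        ["venv/bin/pip", "install", package],
    ]
    return {"local": local, "leader": leader}


def render(argv):
    """Quote an argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical values; replace them with your own.
    commands = aws_setup_commands(
        cluster_name="my-cluster",
        zone="us-west-2a",
        key_pair="my-key",
        package="bdgenomics.workflows",  # assumed package name
    )
    for stage in ("local", "leader"):
        print("# run on:", stage)
        for argv in commands[stage]:
            print(render(argv))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere, and lets you review the exact command lines before launching billable AWS resources.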