Persona: A High-Performance Bioinformatics Framework
Persona is an open source framework for executing bioinformatics computations on the Aggregate Genomic Data format. Persona is fast, efficient, and scalable.
This repo contains the source implementation of the Persona dataflow operators, all of which are built on top of TensorFlow.
The separate Persona repo contains the Persona Python layer, which lets you access common functions through a command-line interface. You will need both repos to use Persona.
The Persona system is built and installed in much the same way as TensorFlow is installed from source. We recommend installing into a Python virtual environment to avoid conflicts with any existing TensorFlow installation.
First, clone this repo with --recurse-submodules and prepare your environment.
Disregard any setup for GPUs.
In addition, you will need to install the following dependencies via your package manager:
- liblttng-ust-dev
- librados-dev
- libboost-system-dev
- libboost-timer-dev
- libsparsehash-dev
For example, on a system that uses apt-get:
sudo apt-get install liblttng-ust-dev librados-dev libboost-system-dev libboost-timer-dev libsparsehash-dev
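To see which of these packages are still missing before running the install, a small check along the following lines may help. This is only a sketch: it assumes a Debian or Ubuntu system where `dpkg-query` is available, and `missing_packages` is a hypothetical helper name used for this example, not something Persona ships.

```shell
# Sketch: print every package from the argument list that is not installed.
# Assumes a dpkg-based system; missing_packages is a hypothetical helper.
missing_packages() {
    for pkg in "$@"; do
        # A fully installed package reports the status "install ok installed".
        # Unknown packages (or a missing dpkg-query) produce no match.
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null \
                | grep -q 'install ok installed'; then
            printf '%s\n' "$pkg"
        fi
    done
}

# Check the dependencies listed above; no output means nothing is missing.
missing_packages liblttng-ust-dev librados-dev libboost-system-dev \
    libboost-timer-dev libsparsehash-dev
```

Any names it prints can be passed straight to `sudo apt-get install`.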
The following may need to be installed from source:
Next, configure your environment using:
cd persona-system
./default-configure.sh
This configures TensorFlow with the minimum requirements for Persona. Next, create the virtual environment using the script that Persona provides:
./setup-dev.sh
Enter the environment:
source python-dev/bin/activate
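Before compiling, it can be worth confirming that the environment is actually active in the current shell. The sketch below assumes that python-dev is a standard virtualenv, whose activate script exports `VIRTUAL_ENV`; `in_virtualenv` is a hypothetical helper name introduced for this example.

```shell
# Sketch: succeed only if a virtual environment is active in this shell.
# Assumes a standard activate script, which exports VIRTUAL_ENV.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    printf 'Active environment: %s\n' "$VIRTUAL_ENV"
else
    printf 'No virtual environment is active; run the activate step first.\n'
fi
```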
Compile the pip package:
./compile.sh
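The configure, setup, activate, and compile steps above can also be collected into one function that stops at the first failure. This is a minimal sketch, not a script provided by Persona: `run`, `build_persona`, and the `DRY_RUN` switch are hypothetical names introduced here, and the function assumes it is called from the directory that contains the persona-system checkout. It uses `.` as the POSIX spelling of `source`.

```shell
# Sketch: run the build steps from this README in order.
# run, build_persona, and DRY_RUN are hypothetical names for this example.

run() {
    # Print the command, then execute it unless DRY_RUN is set to 1.
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

build_persona() {
    # Each step runs only if the previous one succeeded.
    run cd persona-system &&
    run ./default-configure.sh &&
    run ./setup-dev.sh &&
    run . python-dev/bin/activate &&
    run ./compile.sh
}

# To preview the steps without executing them: DRY_RUN=1; build_persona
# To run the build for real:                   build_persona
```

Because the steps are chained with `&&`, a failed configure or setup step prevents the later steps from running against a half-prepared environment.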
Next, head on over to the Persona repo to see how to use the Persona framework.