ClusterMerge

Bottom-up merging to cluster similar protein sequences.

Install dependencies:

$ sudo apt-get install bazel

Bazel will use the default C++ compiler on your system.

Building

Linux (dbg): bazel build -c dbg src:clustermerge

Mac (dbg): bazel build -c dbg --spawn_strategy=standalone src:main_dsym

Mac/Linux (opt): bazel build -c opt src:clustermerge

Note: I'm not sure how to increase the default stack size limit on Mac, so clustermerge currently runs only on Linux.

Running

First, set the thread stack limit to 64MB:

$ ulimit -s 65532000

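This is to accommodate some alignment routines that stack-allocate a lot of data. Note that bash interprets the ulimit -s argument in units of 1024 bytes, so 64MB corresponds to 65536; the value above requests a far larger limit (roughly 62GB).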
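To confirm the limit took effect before starting a long run, the current stack limit can be read back programmatically. The following is a minimal sketch, not part of clustermerge: the function name `stack_limit_ok` and the 64MB threshold (taken from the figure quoted above) are assumptions. It uses Python's standard `resource` module, which reports the limit in bytes rather than kilobytes.

```python
import resource

# Assumed threshold: the 64MB figure quoted in the text above.
REQUIRED_STACK_BYTES = 64 * 1024 * 1024


def stack_limit_ok(required=REQUIRED_STACK_BYTES):
    """Return True if the soft stack limit is at least `required` bytes."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    # RLIM_INFINITY means "unlimited", which always satisfies the check.
    return soft == resource.RLIM_INFINITY or soft >= required
```

A process started from the same shell inherits that shell's limit, so running this check in the shell where `ulimit -s` was set reflects what clustermerge would see.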

Clustermerge usage:

$ ./bazel-bin/src/clustermerge file1.fa file2.fa

$ ./bazel-bin/src/clustermerge -h  # view help

By default, clustermerge uses as many threads as are available on your system. To avoid typing many file names on the command line, you can optionally provide a JSON file listing the datasets to cluster:

# data.json
[
  "file1.fa",
  "file2.fa",
  "fileN.fa"
]

and run with

$ ./bazel-bin/src/clustermerge -i data.json
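For a directory with many FASTA files, the dataset list can be generated rather than written by hand. This is a minimal sketch under stated assumptions: the helper name `write_dataset_list`, the `*.fa` pattern, and the default output name are illustrative and not part of clustermerge. It writes a JSON array of paths in the same shape as the data.json example above.

```python
import json
from pathlib import Path


def write_dataset_list(fasta_dir, out_path="data.json", pattern="*.fa"):
    """Write a JSON array of the FASTA file paths found in fasta_dir."""
    # Sort so the list is stable across runs.
    files = sorted(str(p) for p in Path(fasta_dir).glob(pattern))
    if not files:
        raise ValueError(f"no files matching {pattern!r} in {fasta_dir}")
    with open(out_path, "w") as fh:
        json.dump(files, fh, indent=2)
        fh.write("\n")
    return files
```

For example, `write_dataset_list("path/to/fasta_dir")` would produce a data.json in the current directory that can then be passed with -i.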

PAM matrices for optimal alignments using SWPS3 are provided in the repo under data/matrices/json. If you run the tool elsewhere, you will need to use the -d / --data_dir option to provide the path.
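When launching the tool from a script outside the repo, the options mentioned above can be assembled into an argument list. The sketch below is illustrative only: `build_command` is a hypothetical helper, and it assumes that option values are passed as separate arguments and that options may precede the input files. Check `clustermerge -h` for the actual syntax.

```python
def build_command(inputs, binary="./bazel-bin/src/clustermerge",
                  dataset_list=None, data_dir=None):
    """Return an argv list for clustermerge (flag syntax is assumed).

    inputs: FASTA paths passed positionally.
    dataset_list: optional JSON dataset list, passed with -i.
    data_dir: optional matrix directory, passed with --data_dir.
    """
    cmd = [binary]
    if data_dir is not None:
        cmd += ["--data_dir", str(data_dir)]
    if dataset_list is not None:
        cmd += ["-i", str(dataset_list)]
    cmd += [str(p) for p in inputs]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; building a list rather than a single string avoids shell quoting problems with paths that contain spaces.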

If you have complete all-vs-all results that you want to compare against, the tools directory has a comparison script:

python2 tools/compare_results.py path/to/AllAll/ path/to/clustermerge/output_matches/

Distributed Version

bazel build -c opt //src/dist:dist_cluster
