VPsearch is a package for indexing and querying a sequence database for fast nearest-neighbor search by means of vantage-point trees. For reasonably large databases, such as RDP, this results in sequence lookups that are typically 5-10 times faster than other alignment-based lookup methods.
Vantage-point tree search compares sequences by global-to-global alignment, rather than by approximate seed-and-extend methods such as those used by BLAST.
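To illustrate the idea, here is a minimal sketch of a vantage-point tree in pure Python. It is not VPsearch's implementation or API: the names `edit_distance`, `build_vp_tree`, and `nearest` are hypothetical, and plain Levenshtein distance stands in for whatever alignment scoring VPsearch actually uses. The sketch only assumes that the distance is a true metric, so that the triangle inequality can be used to skip parts of the tree.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance: a unit-cost global alignment of a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def build_vp_tree(items, rng):
    """Recursively split items around a randomly chosen vantage point."""
    items = list(items)
    if not items:
        return None
    vantage = items.pop(rng.randrange(len(items)))
    if not items:
        return {"vantage": vantage, "radius": 0, "inside": None, "outside": None}
    dists = sorted(((edit_distance(vantage, s), s) for s in items),
                   key=lambda pair: pair[0])
    radius = dists[len(dists) // 2][0]  # median distance to the vantage point
    inside = [s for d, s in dists if d <= radius]
    outside = [s for d, s in dists if d > radius]
    return {
        "vantage": vantage,
        "radius": radius,
        "inside": build_vp_tree(inside, rng),
        "outside": build_vp_tree(outside, rng),
    }


def nearest(node, query, best=None):
    """Return (distance, sequence) for an item closest to query."""
    if node is None:
        return best
    d = edit_distance(query, node["vantage"])
    if best is None or d < best[0]:
        best = (d, node["vantage"])
    # Search the side of the boundary that contains the query first.
    if d <= node["radius"]:
        first, second = node["inside"], node["outside"]
    else:
        first, second = node["outside"], node["inside"]
    best = nearest(first, query, best)
    # By the triangle inequality, the other side can only hold a closer
    # item if the query lies within best[0] of the boundary.
    if abs(d - node["radius"]) <= best[0]:
        best = nearest(second, query, best)
    return best


# Toy database of made-up sequences, for illustration only.
database = ["ACGTACGT", "ACGTTCGT", "TTGACCAA", "GGGTACGA", "ACGAACGT"]
tree = build_vp_tree(database, random.Random(0))
print(nearest(tree, "ACGTACGA"))
```

Because the search prunes only subtrees that provably cannot contain a closer sequence, the result has the same distance as a brute-force scan over the whole database, while usually computing far fewer alignments.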
VPsearch can be installed and updated through pip:
pip install -U vpsearch
This will install a standalone command-line utility vpsearch into your environment, which can be used to build and query a sequence database. For more information on how to do so, see the documentation.
If you use vpsearch, please cite our paper:
- Joris Vankerschaver, Steven J. Kern, Robert Kern. VPsearch: fast exact sequence similarity search for genomic sequences. Journal of Open Source Software, 7(78), 4236, 2022. https://doi.org/10.21105/joss.04236
This package is licensed under the 3-clause BSD license.