A Review on GPU Accelerated Bioinformatics Tool

Subrata Sinha, Abinash Hazarika, Gopal C Hazarika

Abstract


Traditional computational methods and software tools developed in Bioinformatics, Computational Biology, and Systems Biology share a common property: they are computationally demanding on Central Processing Units (CPUs), which limits their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining increasing attention worldwide, as they can considerably reduce the running time required by standard CPU-based software and allow more intensive investigations of biological systems. The latest generation of GPUs has also democratized the use of high-performance computing (HPC), as these devices push desktop computers toward cluster-level performance. In this review, we present a collection of GPU-enhanced bioinformatics tools developed over the past two to three decades to perform computational analysis in various life science disciplines, emphasizing the advantages and drawbacks of using these parallel architectures.





Published by:

 Indian Science and Technology Foundation (ISTF)

 C-1/31, Yamuna Vihar, New Delhi-110053 

Email: contact@isto-india.org

www.isto-india.org