In the experiments, the proposed FastSearch algorithm, the GGSEARCH algorithm, and the Needleman-Wunsch (NW) algorithm are each run to find the best-aligned database sequence for every query sequence. We then compare FastSearch and GGSEARCH by their mean error relative to the NW result. FastSearch can adjust its filtering ratios alpha and beta according to the size of the database. The datasets are obtained from the National Center for Biotechnology Information (NCBI) site. We select a highly similar dataset (proteins from COVID-19), a divergent-identity dataset (EUMAT), and a general dataset (Human) to compare efficiency and accuracy. The experiments are run on a computer with the Windows 7 Ultimate 64-bit operating system, an Intel(R) Core(TM) i5-4570 CPU @ 3.20 GHz, and 16 GB of RAM.
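The evaluation protocol described here can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments. It assumes a simple match/mismatch scoring scheme with a linear gap penalty (real protein searches typically use a substitution matrix and affine gaps), and it assumes that "mean error" means the average absolute difference between the best score a heuristic reports and the NW reference score for the same query. The function names `nw_score`, `best_nw_score`, and `mean_error` are hypothetical.

```python
from typing import Sequence


def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring parameters are illustrative assumptions, not the paper's values.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of `a` against each prefix of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of `b`.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


def best_nw_score(query: str, database: Sequence[str]) -> int:
    """Reference result: the highest NW score of `query` over the whole database."""
    return max(nw_score(query, subject) for subject in database)


def mean_error(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Mean absolute difference between reference and candidate scores.

    Assumed definition: one score per query, in the same query order.
    """
    if len(reference) != len(candidate):
        raise ValueError("score lists must have the same length")
    if not reference:
        raise ValueError("score lists must not be empty")
    return sum(abs(r - c) for r, c in zip(reference, candidate)) / len(reference)
```

In use, one would compute `best_nw_score` for every query to obtain the reference list, collect the best scores reported by FastSearch and by GGSEARCH for the same queries, and call `mean_error` once per heuristic. The exhaustive NW pass is quadratic per sequence pair, which is why it serves only as the accuracy baseline here.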
The COVID-19 protein dataset.
The EUMAT protein dataset.
The Human protein datasets.
Each identity level contains 1000 sequence pairs, taken from the Human protein dataset, with identity determined by the NW algorithm.