Fast databank searching with a reduced amino-acid alphabet
Abstract
Fast sequence databank search algorithms generally rely on hash tables and look for exactly matching words. For proteins, sensitivity can be increased, at the expense of selectivity, by using a reduced amino acid alphabet. We propose here an alphabet reduced to 10 symbols, which we used in modified versions of the FASTP and SCAN programs. An application to the aminoacyl-tRNA synthetases shows that this technique may be useful in detecting distant relationships between proteins.
© Oxford University Press
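The idea of hash-table word matching over a reduced alphabet can be illustrated with a minimal sketch. The 10-group mapping below is a hypothetical grouping chosen only for illustration; it is not the alphabet proposed in the paper, and the function names (`reduce_seq`, `index_words`, `word_hits`) are likewise assumptions rather than parts of FASTP or SCAN.

```python
from collections import defaultdict

# HYPOTHETICAL reduction of the 20 amino acids to 10 symbols.
# Illustrative only: this is NOT the alphabet proposed in the paper.
GROUPS = ["AG", "C", "DE", "FWY", "H", "ILMV", "KR", "NQ", "P", "ST"]
REDUCE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def reduce_seq(seq):
    """Translate a protein sequence into the reduced 10-symbol alphabet."""
    return "".join(REDUCE[aa] for aa in seq.upper())


def index_words(seq, k=2):
    """Hash table mapping each reduced k-letter word to its start positions."""
    table = defaultdict(list)
    reduced = reduce_seq(seq)
    for i in range(len(reduced) - k + 1):
        table[reduced[i:i + k]].append(i)
    return table


def word_hits(query, target, k=2):
    """Return (query_pos, target_pos) pairs whose reduced k-words match exactly."""
    table = index_words(target, k)
    reduced_query = reduce_seq(query)
    hits = []
    for i in range(len(reduced_query) - k + 1):
        for j in table.get(reduced_query[i:i + k], []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # "LKDE" and "IRED" share no exact 2-letter word in the full alphabet,
    # but under this illustrative grouping every word matches.
    print(word_hits("LKDE", "IRED"))
```

Under this illustrative grouping, the two toy sequences share no exact two-letter word in the 20-letter alphabet, yet all three of their reduced words match. That is the gain in sensitivity; the corresponding loss in selectivity comes from unrelated sequences also matching more often once residues are merged into groups.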