ABSTRACT

Step 2: The accession number of the representative homolog is used to retrieve the corresponding GO terms from the UniProtKB-GOA database. Let the total number of GO terms retrieved be κ.
Step 3: The query sequence P is represented by 20 AAC features and the m GO terms (called AAC-GO), i.e., P = [p1, p2, …, p20, …, pm+20].
Step 4: Check which of the m GO terms appear among the κ retrieved GO terms. For each of the m GO terms, indexed j = 21, …, m+20, set pj = 1 if that term is one of the κ retrieved GO terms; otherwise set pj = 0.
Step 5: The vector P = [p1, p2, …, p20, …, pm+20] is the input to an SVM predictor. The output label is one of five classes: non-virulent, adhesion, toxin, secretion, or biofilm.
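Steps 3 and 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the amino acid ordering, the treatment of AAC as normalized residue frequencies, and the GO identifiers in the example are all assumptions made for illustration only.

```python
from collections import Counter

# Assumed fixed ordering of the 20 standard amino acids for p1..p20.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac_features(sequence):
    """Return 20 amino acid composition values (assumed: residue fractions)."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def go_indicator(retrieved_terms, reference_terms):
    """Return p21..p(m+20): 1 if a reference GO term was retrieved, else 0."""
    retrieved = set(retrieved_terms)
    return [1 if term in retrieved else 0 for term in reference_terms]


def aac_go_vector(sequence, retrieved_terms, reference_terms):
    """Concatenate the 20 AAC values with the m binary GO indicators."""
    return aac_features(sequence) + go_indicator(retrieved_terms, reference_terms)


# Hypothetical inputs: m = 4 reference GO terms, kappa = 3 retrieved terms.
reference_terms = ["GO:0009405", "GO:0005576", "GO:0007155", "GO:0090729"]
retrieved_terms = ["GO:0005576", "GO:0090729", "GO:0016020"]

vector = aac_go_vector("MKKLAAGT", retrieved_terms, reference_terms)
print(len(vector))   # 20 + m
print(vector[20:])   # binary GO part of the vector
```

A retrieved GO term that is absent from the m reference terms (the third retrieved term in the example) does not contribute to the vector. The resulting list of length m + 20 is what Step 5 would pass to the SVM predictor.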