ABSTRACT

Long non-coding RNAs (lncRNAs) are an integral part of the gene regulatory machinery of plants. While mRNAs encode proteins essential for the functioning of a biological system, lncRNAs play crucial roles in regulating gene expression during growth, development and stress response. Several machine-learning tools have been developed for discriminating lncRNAs from mRNAs. However, comprehensive annotation of lncRNAs in plants remains a significant challenge owing to the lack of sequence conservation and experimental data, limited knowledge of their structural attributes, and the non-availability of high-quality reference genomes and transcriptomes. Here, we review a selection of 15 tools whose models were trained on either plant data only or a mix of plant and animal data. The tools are categorized according to whether they use shallow or deep machine-learning algorithms for classification. The tools most commonly used for benchmarking them are also reviewed. We then provide an overview of the key strengths and limitations of the available tools and conclude by highlighting the opportunities and challenges associated with high-quality annotation of lncRNAs in plants.