ABSTRACT
This research aims to improve the diagnosis of autism spectrum disorder (ASD) using machine learning methods, including Decision Trees (DT), Support Vector Machines (SVM), Artificial Neural Networks (ANN), and Recurrent Neural Networks (RNN). We build on a diverse dataset of 47,500 entries drawn from medical records and reports, doctors' notes, clinical assessments, and various web sources; this breadth of provenance is a distinguishing feature of the work. We adopt a 70–30 training-testing split, so that every model is evaluated on data it has not seen during training, which helps expose overfitting and gives a more realistic estimate of model performance. We apply a range of feature extraction methods, such as Bag-of-Words, TF-IDF, word embeddings, sentiment analysis, and topic modeling, to condense the rich structure of the text into meaningful patterns. We train the models and evaluate them using accuracy, precision, recall, and F1 score. DT provides interpretability, SVM is well suited to high-dimensional problems, ANN can model complex patterns, and RNN can handle sequential (temporal) structure. By breaking down the confusion matrices into true positives, true negatives, false positives, and false negatives, we can identify where each model performs well and where it needs further work. The results highlight the importance of taking a holistic view and of understanding the contextual nuances that determine which model is best suited to ASD diagnosis. We contribute to the ongoing discussion of early ASD detection by presenting strong results from a wide variety of ML models.
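
As an illustration of how the four reported metrics relate to the confusion-matrix counts, the following is a minimal sketch rather than the code used in this study. The function name `classification_metrics` and the example counts are hypothetical and chosen only for demonstration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives. Ratios with a zero denominator
    are reported as 0.0 (an assumption made for this sketch).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts for illustration only; these are not results from the study.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a diagnostic setting, recall indicates how many true ASD cases a model identifies, while precision indicates how many of its positive predictions are correct, which is why the two are reported alongside accuracy.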
