ABSTRACT

Machine learning is a core area of big data analysis. It has been widely used in robotics, driverless cars, space exploration, web search, e-commerce, and finance, and it is also a powerful tool for disease risk prediction, diagnosis, management, treatment selection, and precision medicine. Machine learning is increasingly woven into our daily lives, and a new machine learning revolution is emerging.

This chapter covers the core of machine learning and data science. Machine learning includes unsupervised, supervised, and semisupervised learning; this chapter focuses on supervised learning. It discusses discriminant analysis and penalized discriminant analysis; support vector machines, including kernel, sparse, multitask, and multiclass support vector machines; classical penalized and network-penalized logistic regression; low rank models with both generalized cost functions and generalized constraints; generalized canonical correlation analysis; and canonical correlation analysis for classification. To implement machine learning efficiently, this chapter develops multilevel representations of genomic data. Feature selection is a key component of successful machine learning. This chapter also introduces unsupervised and supervised dimension reduction and sufficient dimension reduction. Finally, this chapter discusses applications of machine learning and feature selection to disease risk prediction and precision medicine.