ABSTRACT

Statistics is the branch of mathematics that deals with the collection, tabulation and interpretation of numerical data. Specialist statistical software packages are available, including GraphPad Prism, Minitab, Statistical Analysis Software (SAS), Statistical Package for the Social Sciences (SPSS) and R. This chapter shows the reader how Python can be used to solve a wide range of statistical problems; the same techniques are also useful in Data Science and Machine Learning. The first section covers simple linear regression and shows the linear relationship between carbon dioxide emissions and gross domestic product in the USA from 2000 to 2020. The second section is concerned with stochastic processes and Markov chains: directed graphs are plotted, and convergence to steady states is illustrated with plots. Student's t-test is discussed in the third section, and a simple two-sample t-test is applied to the heights of female and male students in a lecture. The final section introduces Monte Carlo simulation, a powerful technique, through a simple example of gambling on a European roulette wheel. The example demonstrates that casinos and online betting companies win in the long term, and why none of us should gamble.
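As a rough illustration of the roulette example mentioned above, the following is a minimal Monte Carlo sketch, not the chapter's own program. It assumes a European wheel with 37 pockets (0 to 36) and an even-money bet on red, which wins on 18 of the 37 pockets. The function name `simulate_red_bets`, the unit stake, the seed and the number of spins are all illustrative choices.

```python
import random


def simulate_red_bets(n_spins, stake=1.0, seed=None):
    """Return the net winnings after n_spins even-money bets on red."""
    rng = random.Random(seed)
    net = 0.0
    for _ in range(n_spins):
        pocket = rng.randrange(37)  # pockets 0..36; 0 is the green zero
        # Assumption: pockets 1..18 stand in for the 18 red pockets.
        # Only the count (18 of 37) matters for an even-money bet.
        if 1 <= pocket <= 18:
            net += stake
        else:
            net -= stake
    return net


# Theoretical expected return per unit staked: 18/37 - 19/37 = -1/37.
expected_per_spin = (18 / 37) * 1 + (19 / 37) * (-1)

n = 500_000
average = simulate_red_bets(n, seed=1) / n

print(f"theoretical mean return per spin: {expected_per_spin:.4f}")
print(f"simulated mean return per spin:   {average:.4f}")
```

Over many spins the simulated average return should settle near the theoretical value of -1/37 per unit staked, which is the sense in which the casino wins in the long term.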