ABSTRACT

The exponential growth of industrial data generated by sensors, modern equipment and devices is pushing the service sector to adopt more sophisticated analytics tools that can extract useful knowledge and predict certain events, especially where losses can be reduced through preventive maintenance. This work presents an application of big data analytics and machine learning to a problem faced by a railway company, using one of the most powerful tools for large-scale data management: the open-source Apache Spark platform. The practical implication is a reliable prediction of the condition of trains before they are loaded and sent to their destination.