ABSTRACT

Now that we have an understanding of the technologies that support data mining and how they contribute to it, let us discuss what data mining is and how we go about doing it. Recall that data mining is the process of posing queries and extracting useful, previously unknown information, patterns, and trends from large quantities of data, possibly stored in databases. That is, it is not enough to obtain patterns and trends; they must also be useful. Otherwise we may obtain irrelevant results that could be harmful or cause problems with the actions taken. For example, suppose an agency incorrectly concludes that an employee has carried out fraudulent acts and begins to investigate that employee's behavior. If the employee learns of the investigation, it could harm him. Such an incorrect finding is called a false positive. We also do not want false negatives; that is, we do not want the data miner to report that the employee was well behaved when he has in fact committed fraud. Data mining therefore has serious implications, which is why it is critical that we have good data to mine and that we understand the limitations of data mining techniques.