Predictive Analysis in Data Mining

Whether it is called data mining, predictive analytics, sense making, knowledge discovery, or data science, the rapid development and increased availability of advanced computational techniques have changed our world in many ways.

Good analysts are like sculptors. They can look at a data set and see the underlying form and structure. Data mining tools can function as the chisels and hammer, allowing the analysts to expose the hidden patterns and reveal the meaning in a data set so that others can enjoy its composition and beauty.

Data mining is a highly intuitive, visual process that builds on an accumulated knowledge of the subject matter, something also known as domain expertise. While training in statistics generally is not a prerequisite for data mining, understanding a few basic principles is important.

Throughout the data mining and modeling process, there is a fair amount of user discretion. There are some guidelines and suggestions; however, there are very few absolutes. As with data and information, some concepts in modeling are important to understand, particularly when making choices regarding accuracy, generalizability, and the nature of acceptable errors.
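To make the trade-off between overall accuracy and the nature of the errors concrete, here is a minimal sketch in Python. The data are invented for illustration (100 cases, 2 of which are events), and the function names are hypothetical, not taken from any particular tool. It compares a "model" that never flags anything with one that flags more liberally.

```python
def confusion_counts(actual, predicted):
    """Count outcomes for binary labels, where 1 means the event occurred."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # events caught
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # non-events left alone
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarms
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # missed events
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of actual events that were caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Invented data: 2 events followed by 98 non-events.
actual = [1, 1] + [0] * 98

# Model A never flags anything.
never_flags = [0] * 100

# Model B catches both events but raises 10 false alarms.
flags_liberally = [1, 1] + [1] * 10 + [0] * 88

for name, predicted in [("never flags", never_flags),
                        ("flags liberally", flags_liberally)]:
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    print(name,
          "accuracy:", accuracy(tp, tn, fp, fn),
          "recall:", recall(tp, fn),
          "false alarms:", fp,
          "missed events:", fn)
```

On these made-up numbers the model that never flags anything has the higher overall accuracy yet misses every event, which is why the type of error often matters more than a single accuracy figure.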

It really is true with predictive analytics and modeling that if it looks too good to be true, it probably is; there is almost certainly something very wrong with the sample, the analysis, or both. Errors can come from many sources, and a few pitfalls are especially common.
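One way a result can look too good to be true is a model that has simply memorized its sample. The sketch below is a deliberately contrived Python illustration, not a recommended method: the "model" is a hypothetical lookup table keyed on a case ID, the labels are invented, and all names are assumptions. It scores perfectly on the cases it was built from and falls back to guessing the most common outcome on cases it has not seen.

```python
from collections import Counter


def fit_memorizer(case_ids, labels):
    """'Train' by storing every case; remember the most common label as a fallback."""
    lookup = dict(zip(case_ids, labels))
    default = Counter(labels).most_common(1)[0][0]
    return lookup, default


def predict(model, case_ids):
    """Return the stored label for known cases, the fallback for unseen ones."""
    lookup, default = model
    return [lookup.get(case_id, default) for case_id in case_ids]


def accuracy(actual, predicted):
    """Share of cases where the prediction matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def label_for(case_id):
    """Invented rule that generates the labels: every third case is an event."""
    return 1 if case_id % 3 == 0 else 0


# Build the model on cases 0-59 and hold out cases 60-119 for checking.
train_ids = list(range(60))
holdout_ids = list(range(60, 120))
train_labels = [label_for(i) for i in train_ids]
holdout_labels = [label_for(i) for i in holdout_ids]

model = fit_memorizer(train_ids, train_labels)
train_accuracy = accuracy(train_labels, predict(model, train_ids))
holdout_accuracy = accuracy(holdout_labels, predict(model, holdout_ids))

print("accuracy on the sample used to build the model:", train_accuracy)
print("accuracy on held-out cases:", holdout_accuracy)
```

A large gap between the two figures is a warning sign that the model will not generalize, however impressive the first number looks.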

This point also highlights the importance of working with the operational personnel, the ultimate end users of most analytical products, throughout the analytical process. While they might be somewhat limited in terms of their knowledge and understanding of the particular software or algorithm, their insight and perception regarding the ultimate operational goals can significantly enhance the decision-making process when cost/benefit and error management issues need to be addressed.
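The cost/benefit side of error management can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the risk scores, the outcomes, the candidate thresholds, and the relative costs of a false alarm versus a missed event, which in practice are the kind of input operational personnel are best placed to supply.

```python
def total_cost(scores, actual, threshold, cost_false_alarm, cost_missed_event):
    """Sum the cost of all errors made when flagging scores at or above threshold."""
    cost = 0.0
    for score, outcome in zip(scores, actual):
        flagged = score >= threshold
        if flagged and outcome == 0:
            cost += cost_false_alarm      # resources spent on a non-event
        elif not flagged and outcome == 1:
            cost += cost_missed_event     # an event that went unaddressed
    return cost


def best_threshold(scores, actual, candidates, cost_false_alarm, cost_missed_event):
    """Pick the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, actual, t,
                                 cost_false_alarm, cost_missed_event),
    )


# Invented model scores and the outcomes that actually occurred (1 = event).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1]
actual = [1, 0, 1, 0, 1, 0, 0]
candidates = [0.25, 0.5, 0.75]

# When a missed event is ten times worse than a false alarm.
print(best_threshold(scores, actual, candidates,
                     cost_false_alarm=1, cost_missed_event=10))

# When a false alarm is ten times worse than a missed event.
print(best_threshold(scores, actual, candidates,
                     cost_false_alarm=10, cost_missed_event=1))
```

The same scores lead to different flagging thresholds depending on which error is considered more costly, so the choice is an operational judgment rather than a purely technical one.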

Some events are so unique or rare that they are referred to as “Black Swans.” “True” Black Swans cannot be predicted or anticipated and, by extension, cannot be prevented or thwarted. More recently, though, the concept of “Anticipatory Black Swans” has been introduced. In contrast to a True Black Swan, an Anticipatory Black Swan “can be known beforehand” as compared to “what truly is a surprise.” [6] Like data mining generally, the proposed approach to both types of Black Swans is to “ask the right question at the right time (and be wise enough to understand the response or appreciate the signs)…[which] could lead to success in the matter of anticipating what can be anticipated, and at least understanding sooner the impact of what cannot be anticipated.”

It is important that we clearly recognize what these tools can and cannot accomplish, though. They are all about increasing the likelihood that a desired outcome will occur, at the right time, the first time. These concepts of increased likelihood and timeliness are what make applying predictive analytics to decision making so enticing. These are very worthy, yet attainable, goals for operational public safety and security analysis. With that objective in mind, I wish you well, and encourage you to go forward and do good.
