Key difference: Data mining is the computer-assisted analysis of very large data sets, whether the data was collected by the computer or entered into it. Data warehousing is the process of compiling data into a data warehouse, which is a database used to store data.
Data mining is the analysis of data. It is the computer-assisted process of digging through and analyzing enormous sets of data that have either been collected by the computer or entered into it. In data mining, the computer analyzes the data and extracts meaning from it. It also looks for hidden patterns within the data and tries to predict future behavior. Data mining is mainly used to find and show relationships among the data.
The purpose of data mining, also known as knowledge discovery, is to let businesses see these behaviors, trends and relationships and factor them into their decisions. This allows businesses to make proactive, knowledge-driven decisions.
The term ‘data mining’ comes from an analogy: searching for relationships in data is similar to mining for precious materials. Data mining tools use artificial intelligence, machine learning, statistics, and database systems to find correlations in the data. These tools can help answer business questions that were traditionally too time-consuming to resolve.
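As a rough illustration of what "finding correlations" can mean in practice, the sketch below computes a Pearson correlation coefficient between pairs of columns. The figures (ad_spend, units_sold, store_temp) are invented for this example and do not come from any real data set; real data mining tools apply far more elaborate techniques to far larger data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures, invented for illustration only.
ad_spend = [10, 20, 30, 40, 50]
units_sold = [12, 24, 33, 46, 55]
store_temp = [21, 19, 22, 20, 21]

# A value near +1 or -1 suggests a strong linear relationship;
# a value near 0 suggests little or no linear relationship.
r_ads = pearson(ad_spend, units_sold)
r_temp = pearson(store_temp, units_sold)
print(f"ad spend vs. units sold:   {r_ads:.3f}")
print(f"temperature vs. units sold: {r_temp:.3f}")
```

With these made-up numbers, advertising spend tracks sales closely while store temperature does not, which is the kind of relationship a business might then factor into a decision. A correlation alone does not show that one variable causes the other.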
Data mining involves more than the raw analysis step. It also covers database and data management aspects, data preprocessing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
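To make "interestingness metrics" a little more concrete, here is a minimal sketch of three commonly used measures for an association rule: support, confidence, and lift. The shopping baskets and the helper names (support, rule_metrics) are hypothetical and chosen only for illustration; this is one possible way to score a pattern, not the only one.

```python
# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    ante = support(antecedent, transactions)
    cons = support(consequent, transactions)
    confidence = both / ante   # how often the rule holds when it applies
    lift = confidence / cons   # above 1: items co-occur more than by chance
    return {"support": both, "confidence": confidence, "lift": lift}

metrics = rule_metrics({"bread"}, {"butter"}, transactions)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A rule with high support and confidence but a lift close to or below 1 may look impressive yet add little, because the two items would appear together about that often anyway.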
Data warehousing, in contrast, is a different activity, although the two are interrelated. Data warehousing is the process of compiling information or data into a data warehouse. A data warehouse is a database used to store data. It is a central repository in which data from various sources is stored. The data warehouse is then used for reporting and data analysis. For example, it can be used to create trend reports for senior management, such as annual and quarterly comparisons.
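The idea of a central repository that feeds reports can be sketched on a very small scale with Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the two source systems (online_orders, store_sales), the sales table and its columns, and the figures themselves. A real data warehouse would involve far more data, cleaning and modeling.

```python
import sqlite3

# Hypothetical records from two separate source systems: (date, amount).
online_orders = [("2023-01-15", 120.0), ("2023-04-02", 80.0), ("2023-05-20", 50.0)]
store_sales = [("2023-02-10", 200.0), ("2023-04-18", 70.0)]

# An in-memory database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (source TEXT, sale_date TEXT, amount REAL)")

# Compile data from both sources into one common table.
conn.executemany("INSERT INTO sales VALUES ('online', ?, ?)", online_orders)
conn.executemany("INSERT INTO sales VALUES ('store', ?, ?)", store_sales)

# Quarterly comparison: map month 1-12 to quarter 1-4 and total the amounts.
query = """
    SELECT (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,
           SUM(amount) AS total
    FROM sales
    GROUP BY quarter
    ORDER BY quarter
"""
quarterly_totals = conn.execute(query).fetchall()
for quarter, total in quarterly_totals:
    print(f"Q{quarter}: {total:.2f}")
```

Once the data sits in one place, the same table can serve both this kind of report and later analysis, which is why warehousing and mining are so often discussed together.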
The purpose of a data warehouse is to give users flexible access to the data. Data warehousing generally refers to combining many different databases from across an entire enterprise.
The main difference between data warehousing and data mining is that data warehousing is the process of compiling and organizing data into one common database, whereas data mining is the process of extracting meaningful information from that database. Data mining is therefore usually carried out after the data has been warehoused.