Data Mining Oz Assignments
Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of extracting implicit, previously unknown and potentially useful information and knowledge from large data collections held in databases or data warehouses. The extraction is carried out with automated data analysis techniques, which sort through data sets to identify patterns and establish relationships. Data mining tools then use these patterns to predict future trends. The essential difference between data mining and traditional data analysis (such as querying, reporting and online analysis applications) is that data mining uncovers information and knowledge without starting from a clearly defined assumption (Sahu, Shrma & Gondhalakar, 2011).
Data mining stages: They are as follows.
1) Preparation of data (or pre-processing of data): Data preparation is the manipulation of data into a form suitable for further processing and analysis. It consists of tasks such as collecting, integrating, structuring, translating and validating data. These tasks cannot be fully automated, and many of them are tedious and time-consuming. The main purpose of data preparation is to ensure that the information supplied for analysis is consistent and accurate, which is a prerequisite for successful data mining (Provost & Fawcett, 2013).
2) Data Mining: This stage is the core of the overall process. It applies the chosen mining techniques and tools to the prepared data and involves the following activities.
Data mining method collection: The main methods are as follows.
Clustering: It places data elements into related groups without prior knowledge of the group definitions.
Classification: It generalises a known structure and applies it to new data.
Regression: It takes a numerical data set and fits a mathematical formula to the observed results.
Association rule learning (or market basket analysis): It searches for and discovers relationships between variables (Witten et al., 2016).
Data mining algorithm collection: A data mining algorithm is divided into components such as the model or pattern structure, the score function, the search procedure and the data management strategy.
3) Information expression: This stage uses visualisation technology to present the discovered information and knowledge.
4) Analysis and decision-making: The final objective of data mining is to support the decision-making process. Decision-makers analyse the data mining outcomes and adjust their decision-making methods in light of the real situation (Romero & Ventura, 2013).
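The stages above can be illustrated with a small sketch. It is only an assumed, minimal example and not a method taken from the cited sources: the purchase records, the function names (prepare, fit_line, rule_stats) and the choice of ordinary least squares are all hypothetical. It shows data preparation (removing duplicates and invalid values), a regression that fits a formula to numerical data, and an association rule measured by support and confidence.

```python
# Hypothetical raw records: (customer id, items bought, amount spent as text).
raw_records = [
    ("c1", ["bread", "milk"], "4.0"),
    ("c2", ["bread", "butter", "milk"], "7.5"),
    ("c2", ["bread", "butter", "milk"], "7.5"),    # exact duplicate
    ("c3", ["milk"], "not available"),            # invalid amount
    ("c4", ["bread", "butter"], "5.0"),
    ("c5", ["bread", "milk", "butter", "jam"], "10.0"),
]


def prepare(records):
    """Stage 1: drop exact duplicates and rows whose amount is not numeric."""
    seen, clean = set(), []
    for customer, items, amount in records:
        key = (customer, tuple(items), amount)
        if key in seen:
            continue
        seen.add(key)
        try:
            clean.append((customer, frozenset(items), float(amount)))
        except ValueError:
            continue  # validation failed, so the row is discarded
    return clean


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rule_stats(baskets, antecedent, consequent):
    """Association rule antecedent -> consequent: (support, confidence)."""
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    ante = sum(1 for b in baskets if antecedent in b)
    return both / len(baskets), (both / ante if ante else 0.0)


clean = prepare(raw_records)
baskets = [items for _, items, _ in clean]

# Fit "amount spent" as a linear function of "number of items in the basket".
slope, intercept = fit_line([len(b) for b in baskets],
                            [amount for _, _, amount in clean])

# How often is butter bought together with bread?
support, confidence = rule_stats(baskets, "butter", "bread")

print(len(clean), round(slope, 3), round(intercept, 3), support, confidence)
```

In a real project the fitted formula and the rule statistics would then be visualised (stage 3) and passed to decision-makers (stage 4); a library such as scikit-learn would normally replace the hand-written functions.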
Advantages of data mining: They are as follows.
Marketing companies can use data mining to develop models that predict who will respond to a new marketing campaign. These predictions help marketers to approach targeted customers when selling profitable products.
Banks can use data mining to detect fraudulent credit card transactions and so prevent losses to credit card owners.
Government agencies can use data mining to analyse financial transaction records and build patterns that can reveal criminal activity or money laundering.
Manufacturers can apply data mining techniques to operational data to detect malfunctioning machinery or devices and to find the best control guidelines.
Data mining tools and techniques can be used to handle database security problems such as intrusion detection and database auditing (Sahu, Shrma & Gondhalakar, 2011).
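As an illustration of the fraud detection use case, the following sketch flags a card transaction whose amount is far from the cardholder's usual spending. It is a deliberately simple assumption and not how any particular bank works: the function name is_suspicious, the threshold of three standard deviations and the sample amounts are all hypothetical.

```python
from statistics import mean, stdev


def is_suspicious(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` sample standard
    deviations away from the mean of the past amounts in `history`."""
    if len(history) < 2:
        return False  # too little history to judge
    spread = stdev(history)
    if spread == 0:
        return amount != history[0]
    return abs(amount - mean(history)) / spread > threshold


# Hypothetical past transaction amounts for one cardholder.
history = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0]

print(is_suspicious(history, 23.0))   # an ordinary purchase
print(is_suspicious(history, 500.0))  # an unusually large purchase
```

Real fraud detection systems combine many more signals (location, merchant, time of day) and learned models; the sketch only shows the basic idea of comparing new data against a mined pattern.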
Drawbacks of data mining:
1) Issues of privacy: The data mining process relies heavily on data preparation, which can disclose information or patterns that compromise privacy and confidentiality obligations. This commonly occurs through data aggregation, in which data are compiled from various sources and put together for analysis. After compilation, anybody who mines the data and has access to the newly accumulated data set can identify particular individuals, even when the data were originally anonymous. This can threaten an individual's privacy (Wu et al., 2014).
2) Issues of security: The application of data mining techniques can cause security problems. Data mining makes it possible for unethical people to gather a significant amount of information about individuals, such as their buying habits and preferences, from routine business transactions (Jaseena & David, 2014).
3) Inaccurate information / misuse of information: The tools and techniques of data mining are not always perfectly accurate, so if inaccurate information is supplied for decision-making it can have serious consequences. In addition, information collected through data mining can be misused. Unethical people can use it to exploit vulnerable people (Sahu, Shrma & Gondhalakar, 2011).
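The privacy risk of data aggregation described above can be made concrete with a short sketch. All records, field names and the function reidentify are hypothetical and invented for illustration. An "anonymous" purchase data set is combined with a public register that shares the same quasi-identifiers (postcode, birth year, gender), and any record that matches exactly one person is re-identified.

```python
# Hypothetical "anonymised" purchase records: names removed,
# but quasi-identifiers kept.
anonymised = [
    {"postcode": "2000", "birth_year": 1985, "gender": "F", "purchase": "medication A"},
    {"postcode": "2000", "birth_year": 1990, "gender": "M", "purchase": "book"},
    {"postcode": "3000", "birth_year": 1985, "gender": "F", "purchase": "laptop"},
]

# Hypothetical public register that contains names and the same attributes.
public = [
    {"name": "Alice", "postcode": "2000", "birth_year": 1985, "gender": "F"},
    {"name": "Bob", "postcode": "2000", "birth_year": 1990, "gender": "M"},
    {"name": "Carol", "postcode": "3000", "birth_year": 1985, "gender": "F"},
    {"name": "Dana", "postcode": "3000", "birth_year": 1985, "gender": "F"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def reidentify(anon_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Link anonymous rows to names when the quasi-identifiers match
    exactly one person in the public data."""
    index = {}
    for person in public_rows:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    linked = []
    for row in anon_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # unique match, so the individual is exposed
            linked.append((names[0], row["purchase"]))
    return linked


linked = reidentify(anonymised, public)
print(linked)
```

In this sketch two of the three anonymous records are linked to a named person, while the third stays ambiguous because two people share the same attributes. This is why removing names alone does not guarantee anonymity once data sets are aggregated.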
From the above discussion, it can be concluded that data mining is an application-oriented technology which prepares, integrates, validates, transforms and analyses data, in addition to searching and querying it, in order to provide appropriate solutions to real-life problems, find connections between events and forecast future activities. The discussion also shows how data mining benefits society, government and business. Further, it shows that issues such as security, privacy and misuse of data and information can become serious problems if they are not addressed appropriately and in time.
1. Jaseena, K. U., & David, J. M. (2014). Issues, challenges, and solutions: Big data mining. Computer Science & Information Technology (CS & IT), 131-140.
2. Provost, F., & Fawcett, T. (2013). Data Science for Business: What you need to know about data mining and data-analytic thinking. O'Reilly Media.
3. Romero, C., & Ventura, S. (2013). Data mining in education. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 3(1), 12-27.
4. Sahu, H., Shrma, S., & Gondhalakar, S. (2011). A brief overview on data mining survey. International Journal of Computer Technology and Electronics Engineering (IJCTEE), 1, 114-121.
5. Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2016). Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann.
6. Wu, X., Zhu, X., Wu, G. Q., & Ding, W. (2014). Data mining with big data. IEEE Transactions on Knowledge and Data Engineering, 26(1), 97-107.