Interrogate a nominated data set and prepare a Knowledge Discovery Report. The report must summarise your insights into the patterns in the data for people in each of the following annual income categories:
- More than $50,000.00.
- $50,000.00 or less.
Knowledge gained from Topics 1 to 6 will be useful for preparing this report.
The Knowledge Discovery Report must include:
- An explanation and justification of your choice of suitable data mining algorithm/s and software tool/s used to extract knowledge from the nominated data set.
- At least 5 interesting rules you have identified using your chosen algorithm/s. Use the nine properties of interesting rules as a basis for nominating your interesting rules and include a justification of the interestingness of the rules discovered.
- Your insight into the patterns of the data set as a result of a comprehensive 360-degree analysis using the software tool/s of your choice. This must be supported by statistics and presented with appropriate tables, figures and graphs.
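As an illustration of how a candidate rule might be scored before you argue for its interestingness, the sketch below computes support, confidence and lift for a rule of the form "antecedent implies consequent". These three measures are common objective measures only; they are not a substitute for the nine properties of interesting rules covered in the subject. The function names, the record layout (one dictionary per person) and the four sample records are assumptions made up for this example, not values from the nominated data set.

```python
from typing import Dict, List

Record = Dict[str, str]


def matches(record: Record, conditions: Record) -> bool:
    """Return True if the record satisfies every attribute=value condition."""
    return all(record.get(attr) == value for attr, value in conditions.items())


def rule_metrics(records: List[Record], antecedent: Record, consequent: Record) -> Dict[str, float]:
    """Compute support, confidence and lift for the rule antecedent -> consequent."""
    n = len(records)
    n_a = sum(1 for r in records if matches(r, antecedent))
    n_c = sum(1 for r in records if matches(r, consequent))
    n_ac = sum(1 for r in records if matches(r, antecedent) and matches(r, consequent))

    support = n_ac / n if n else 0.0            # share of all records matching both sides
    confidence = n_ac / n_a if n_a else 0.0     # share of antecedent records that also match the consequent
    base_rate = n_c / n if n else 0.0           # share of all records matching the consequent
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical records, invented purely to exercise the function.
records = [
    {"education": "Bachelors", "income": ">50K"},
    {"education": "Bachelors", "income": ">50K"},
    {"education": "HS-grad", "income": "<=50K"},
    {"education": "Bachelors", "income": "<=50K"},
]

metrics = rule_metrics(records, {"education": "Bachelors"}, {"income": ">50K"})
print(metrics)
```

A lift above 1 suggests the antecedent occurs with the consequent more often than the base rate alone would predict. Whether that makes a rule interesting still has to be justified against the properties named in the requirement above.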
The nominated data set is contained in the UCI Machine Learning Repository. You will be using the Adult Data Set (donated by Ronny Kohavi and Barry Becker) and accessing the adult.names and adult.data files.
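A minimal sketch of reading the data file is given here as a starting point, assuming you work in Python rather than a dedicated mining tool. The adult.data file has no header row, separates values with a comma followed by a space, and marks unknown values with "?". The attribute names below follow adult.names, except that the label column name "income" is an assumption of this example. The three sample rows are invented to show the layout and are not taken from the file.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, List

# Attribute order as documented in adult.names; "income" is this example's
# own name for the class label column.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]


def load_adult(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse adult.data-style lines into one dictionary per person."""
    rows = []
    for fields in csv.reader(lines, skipinitialspace=True):
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(COLUMNS, (f.strip() for f in fields))))
    return rows


def income_counts(rows: List[Dict[str, str]]) -> Counter:
    """Count how many records fall in each income category."""
    return Counter(row["income"] for row in rows)


def missing_counts(rows: List[Dict[str, str]]) -> Counter:
    """Count '?' (unknown) values per attribute."""
    return Counter(col for row in rows for col, value in row.items() if value == "?")


# Invented rows in the same layout as adult.data, including a trailing blank line.
SAMPLE = """\
39, Private, 120000, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
52, Self-emp-not-inc, 98000, HS-grad, 9, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K
28, ?, 210000, Some-college, 10, Never-married, ?, Own-child, Black, Female, 0, 0, 30, United-States, <=50K

"""

rows = load_adult(io.StringIO(SAMPLE))
print(income_counts(rows))
print(missing_counts(rows))
```

To run this against the real file, pass an open file handle instead of the sample text, for example `load_adult(open("adult.data", newline=""))`, assuming the file has been downloaded to the working directory.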
The Knowledge Discovery Report must be formatted as a business report and comply with the details contained in the Presentation and Requirements sections.
SUBJECT LEARNING OUTCOMES
This assessment task will assess the following learning outcome/s:
- be able to compare and evaluate various knowledge discovery techniques.
- be able to identify and design approaches for knowledge discovery from data for making critical business decisions.