The coursework is an individual piece of assessment, requiring you to analyse the ORGANICS dataset within SAS Enterprise Miner, using the directed data mining techniques covered in the IMAT3613 module, and detailing your results, interpretations, conclusions and recommendations in a well-structured technical report.  You are provided with:

  1. This brief
  2. The ORGANICS dataset, which contains 10,000 observations and 13 variables (shown in Appendix B)
  3. The marking grid against which the coursework will be assessed (Appendix C)
  4. The Self/Peer Assessment Rubric (Appendix D)
  5. A template report (Appendix E)

Lab Journal and Reflection

To help you produce this report in a timely manner, the report is built up from four biweekly activities.  You have an opportunity to modify your work from each activity in light of your own reflection and self-assessment feedback.  Ten percent of the marks are awarded for a reflection on how you have developed your report over the term.  This reflection should be at least 200 words long and is to be included in your appendix. To help you produce it, you may use the journal feature on Blackboard and the self-assessment grid to record your progress.

The fifth and final activity, which you will complete independently, is to produce an integrated report with conclusions and recommendations.

In the odd weeks it is suggested that you upload your answer to the activity on the lab journal.

In the even weeks you are expected to comment on your work using the self-assessment rubric and assign a grade A, B, C, D or F. At the beginning of each even week a rubric will be released for the activity to guide your self-assessment.  These self-assessed grades do not count towards the final mark; however, your engagement with the process does.  This has been designed to help you structure your work and pace the development of the report over the term.

You may modify your weekly contributions through your engagement in the lab journal.  In fact, you are encouraged to do so.  You should treat the lab journal as a notebook of your activities for the week.

In the final exercise you will integrate all four activities and the final activity into a report.

In each activity you are expected to produce a piece of writing of between 200 and 400 words, building towards a final report of at most 2000 words, excluding the table of contents, diagrams and appendices. You are provided with a template report to complete; the existing words in the template do not count towards the report maximum.

This type of assessment is known as a patchwork assessment and is more UDL-friendly than traditional forms of report assessment.

The patchwork assessment gives you an opportunity to improve your work over the term and reduces the stress of having to produce one piece of writing at the last minute.

The final report is summative and is marked by your tutor according to the attached marking grid.

Individual Data Set

Each of you will generate a unique model set that is personal to you.

You will work on your own random sample of the data, typically generated by entering the last 5 figures of your DMU student id number as the random seed in the Data Partition node.  You will be shown how to do this in the labs.

Note: If any of the models produce spurious output, enter the last 4 figures (or the last 3 figures) of your DMU student id number as the random seed instead, so that you generate sensible output that you can interpret.
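The seeding rule above can be illustrated outside SAS Enterprise Miner. The following Python sketch is only an analogy and is not the Data Partition node itself: the student id `P12345678`, the helper names and the 60/40 split are hypothetical assumptions, and Python's random generator will not select the same rows as SAS. It is meant to show why a fixed seed gives each student a repeatable, personal sample.

```python
import random


def seed_from_student_id(student_id: str, digits: int = 5) -> int:
    """Return the last `digits` figures of a student id as an integer seed."""
    numeric = "".join(ch for ch in student_id if ch.isdigit())
    if len(numeric) < digits:
        raise ValueError("student id has fewer figures than requested")
    # Note: leading zeros are dropped, e.g. "00123" becomes 123.
    return int(numeric[-digits:])


def partition(n_rows: int, seed: int, train_percent: int = 60):
    """Shuffle row indices reproducibly and split them into two sets.

    The 60/40 split is an assumed default, not a value taken from the brief.
    """
    rng = random.Random(seed)          # same seed -> same shuffle every run
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = n_rows * train_percent // 100
    return indices[:cut], indices[cut:]


if __name__ == "__main__":
    seed = seed_from_student_id("P12345678")      # hypothetical id
    train, validate = partition(10_000, seed)      # 10,000 rows, as in ORGANICS
    print(f"seed={seed}, train={len(train)}, validate={len(validate)}")
```

Calling `seed_from_student_id("P12345678", digits=4)` would give the shorter fallback seed described in the note.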

In SAS Enterprise Miner

Submission

You will need to submit a copy of your report using the Turnitin link in the assessments section of the Data Mining module shell on Blackboard (to be made available prior to the coursework deadline).

SCENARIO: THE ORGANICS DATASET

  1. A supermarket is beginning to offer a line of organic products. The supermarket’s management would like to determine which customers are likely to purchase these products.
  2. The supermarket has a customer loyalty program. As an initial buyer incentive plan, the supermarket provided coupons for the organic products to all of its loyalty program participants and has now collected data that includes whether or not these customers have purchased any of the organic products.

You are a data miner and have been commissioned by the supermarket’s manager to analyse the ORGANICS data and to provide the manager with the best model that s/he should use to identify the customers who are likely to buy the supermarket’s new line of organic products.

The analysis you are conducting will represent the first flow of the virtuous cycle of data mining.

You will be assessed on producing a technical, well-structured, comprehensive but concise report to the manager of the supermarket.  This report is broken up into five activities.  You are encouraged to complete the first four biweekly and to self-assess your work using the lab journal.  The final activity integrates the pieces into one report.  The activities are:

Activity 1: Week 3 – Week 4

  1. Develop a description of the business problem and appropriate data mining problem and describe a data mining framework that is appropriate for your brief.  Identify the target variable.
  2. Make appropriate use of Exploratory Data Analysis on your data set to develop insights that will inform your data mining process. Suggest any transformations which might be appropriate.

Activity 2: Week 5 – Week 6

  1. Apply regression analyses to your dataset, including the full model and the Selection Methods: Forward, Backward and Stepwise.  Develop a regression equation which includes only parameters that are significant at the 95% confidence level (that is, the 5% significance level).
  2. Conduct a Decision Tree analysis on the data set, vary the default parameters and present an interpretation of your results. Identify the target path(s) and critical path.
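As a reminder of the form a regression equation takes for a binary target such as a purchase indicator, the general logistic regression model can be written as below. The symbols are generic placeholders chosen for illustration and are not ORGANICS variable names: $\hat{p}$ is the estimated probability of the target event, $x_j$ are the retained inputs and $\hat{\beta}_j$ their estimated coefficients. Reading the 95% level as a 5% significance threshold, only inputs whose p-value is below 0.05 would be kept in the set $S$.

```latex
\operatorname{logit}(\hat{p})
  = \ln\!\left(\frac{\hat{p}}{1-\hat{p}}\right)
  = \hat{\beta}_0 + \sum_{j \in S} \hat{\beta}_j x_j ,
\qquad
S = \{\, j : \text{p-value}_j < 0.05 \,\}

\hat{p} = \frac{1}{1 + \exp\!\bigl(-(\hat{\beta}_0 + \sum_{j \in S} \hat{\beta}_j x_j)\bigr)}
```

The second line is the same model solved for the probability, which is usually the more convenient form when explaining a prediction to a non-technical reader.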

Activity 3: Week 7 – Week 8

  1. Conduct a Neural Network analysis on the data set, vary the default parameters and present an interpretation of your results. 
  2. Try different neural network architectures.  Identify the most important weights and include a diagram showing the neural network architecture.

Activity 4: Remaining time

  1. Justification of your final selected model, by considering appropriate data mining strategies: Cumulative Lift Charts, Non-Cumulative Lift Charts and Diagnostic Charts.
  2. Conclusions
  3. Recommendations on how to improve the quality of the supermarket’s data collection process in the future, to give you, as a data miner, the opportunity to improve the accuracy of the data mining model in further flows of the data mining cycle.
  4. Develop and integrate your activities into a full technical report.
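For the lift charts named in Activity 4, a small sketch may help fix the definitions. It assumes the usual definition of lift: the response rate in a group of cases ranked by predicted probability, divided by the overall response rate. The function name, the toy scores and the five-bin grouping are illustrative assumptions, not output from SAS Enterprise Miner or values from the ORGANICS data.

```python
def lift_by_bin(scores, actuals, n_bins=10):
    """Return (cumulative_lift, non_cumulative_lift), one value per bin.

    scores  : predicted probabilities of the target event
    actuals : observed outcomes, 1 for the event and 0 otherwise
    Cases are ranked from highest to lowest score and cut into equal bins.
    """
    if len(scores) != len(actuals):
        raise ValueError("scores and actuals must have the same length")
    n = len(scores)
    if n < n_bins:
        raise ValueError("need at least one case per bin")
    overall_rate = sum(actuals) / n
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no events")

    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    cumulative, per_bin = [], []
    seen = hits = 0
    for b in range(n_bins):
        start, end = b * n // n_bins, (b + 1) * n // n_bins
        bin_hits = sum(actual for _, actual in ranked[start:end])
        # Non-cumulative lift: this bin's response rate vs the overall rate.
        per_bin.append((bin_hits / (end - start)) / overall_rate)
        # Cumulative lift: response rate of all bins so far vs the overall rate.
        hits += bin_hits
        seen += end - start
        cumulative.append((hits / seen) / overall_rate)
    return cumulative, per_bin


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    toy_actuals = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    cum, non_cum = lift_by_bin(toy_scores, toy_actuals, n_bins=5)
    for i, (c, b) in enumerate(zip(cum, non_cum), start=1):
        print(f"bin {i}: cumulative lift {c:.2f}, non-cumulative lift {b:.2f}")
```

Under this definition the cumulative lift always falls to 1 once every case is included, so the useful comparison between models is how far above 1 each stays in the top-ranked bins.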

Appendix

In the Appendix of the report, you need to include:

  1. A table of the model roles and measurement levels of the variables (to produce sensible analyses).
  2. A view of the random seed generator illustrating the digits of your DMU student id number that you have used (to produce sensible analyses).
  3. A copy of the process flow diagram.
  4. A reflection of at least 200 words describing how your interaction with the discussion board modified or shaped the development of your report during the patchwork process.

Check List for Written Report (not all of the below will be relevant to your report)

  1. Title page: Does this include the: Title? Author’s name? Module/course details?
  2. Acknowledgements: Have you acknowledged all sources of help?
  3. Contents: Have you listed all the main sections in sequence? Have you included a list of illustrations?
  4. Abstract or summary: Does this state: The main task? The methods used? The conclusions reached? The recommendations made?
  5. Introduction: Does this include: Your terms of reference? The limits of the report? An outline of the method? A brief background to the subject matter?
  6. Methodology: Does this include: The form your enquiry took? The way you collected your data?
  7. Reports and findings: Are your diagrams clear, labelled and simple? Do they relate closely to the text? If you have used colour keys in your diagrams, have you made provision that these keys are understandable if you have submitted a black and white report?
  8. Discussion: Have you identified key issues? Have you suggested explanations for your findings? Have you outlined any problems encountered? Have you presented a balanced view?
  9. Conclusions and recommendations: Have you drawn together all of your main ideas? Have you avoided any new information? Are any recommendations clear and concise?
  10. References: Have you listed all references? Have you included all the necessary information for locating each reference? Are your references accurate? Are your references in Harvard Notation?
  11. Appendices: Have you only included supporting information? Does the reader need to read these sections?
  12. Writing style: Have you used clear and concise language? Are your sentences short and jargon free? Are your paragraphs tightly focused? Have you used the active or the passive voice?

Notes

Report Guidance:

Your contribution to the report should be no longer than 2000 words.  Use a minimum font size of 12.  You are given a report template, with the first steps of the work already written and a recommended structure and table of contents.  You are free to modify the layout to suit your own style.  The aim of the template is to give you guidance as to the level of presentation that is expected in a technical report.  To get marks for this section of the work you must complete the blanks.  You should also grey out the template text to indicate that these are not your words; the greyed words do not count towards the total word count.

Marks are awarded for technical correctness, descriptions of models, appropriate justification of node and parameter choices, appropriate actions to guard against overfitting, indications of model robustness, model limitations, data insights, analysis supported by appropriate charts.

Reports are expected to be written to a professional standard: clear and concise, with text supported by a relevant choice of diagrams and tables to summarise data; avoidance of repetition and redundancy; appropriate use of appendices, a table of contents, page numbers, table numbering and figure numbering; and an informative abstract or executive summary.

All diagrams must be legible and appropriately labelled; if you are short of space, use appendices. If you use coloured diagrams to illustrate or contrast points then you must provide a key for the colour.  Assume that the audience of the report is senior management with no knowledge of the technical details of data mining.

I do not expect you to describe everything that you attempted; poor models or models of no consequence can be summarised in a table in the appendix.  You should make your report concise by providing only one model description each for logistic regression, decision tree and neural network, i.e. the best-performing model for each type of classifier.  To preserve the word limit, report only surprising or contrasting details or exceptional model performance.

Time Management Guidance:

You should budget half your time on using the SAS Enterprise Miner software to generate informative models and to explore nodes and appropriate non-default options. The other half of your time should be budgeted on producing a clear, well-structured report which describes your work and addresses the assignment brief.

The SAS software is a professional data mining product full of features and options.  You are free to explore these options; however, if you use something not covered in the course you must justify its use to receive credit.  You will be penalized for inappropriate use of features you cannot adequately justify or explain.

There will come a point of diminishing returns where, no matter how much effort you put into the software, you cannot improve on model performance.  At that point you should focus on the report. You should start work on the assignment immediately, and make allowance for any difficulties you might face.  Leaving the work until the last minute will result in a poor-quality report.  You should not underestimate the time it takes to become fluent in the use of the software.  If you have followed all the labs attentively and with understanding you should not face too many hurdles.  The completion of the assignment represents a capstone moment which will integrate everything you have learnt on the course.
