MINING FAMILY LEARNING DATA: NASA FAMILY SCIENCE NIGHT EXPERIMENT
Rochester Institute of Technology (UNITED STATES)
About this paper:
Appears in: EDULEARN10 Proceedings
Publication year: 2010
Pages: 4342-4348
ISBN: 978-84-613-9386-2
ISSN: 2340-1117
Conference name: 2nd International Conference on Education and New Learning Technologies
Dates: 5-7 July, 2010
Location: Barcelona, Spain
Abstract:
When mining survey data, it helps to be able to check your own research instincts and hunches. The data set collected from a year's worth of NASA Family Science Nights is one of the first of its kind, gathered with the intent of digging deeper into the educational dynamics within families. As a first step in the knowledge mining effort, we have developed Associator, an association rule mining tool for testing rules that represent those initial research hunches. The tool builds on the basic knowledge mining tools provided by Weka, the open source data mining software.

Association mining is used to find inferences within data. It can be used in large data sets to determine the probable values for missing data within a dataset. This ability to determine probable missing values extends to the ability to predict certain patterns from a data set. These predictions are expressed using rules of the form: "If this (the antecedent) then that (the consequent)" with a certain level of probability based on available data. For this reason, association mining fits best with the desire to learn and predict more about family learning patterns.

Associator is written in Java, the natural choice given that Weka is itself a Java project whose source is available online. Java also makes the tool platform independent for anyone looking to test their rule hunches. This portability benefits the classroom as well, since the application can be used with any existing infrastructure. The source for Associator is available for distribution upon request.

In addition to being a useful aid for researchers, this tool can also be used to teach the basics of association mining in a data mining course. The interface sets aside the fixed, structured approach of the Apriori algorithm and lets students observe the consequences of their own rules. Users load their own nominal data sets in ARFF format; Associator then picks apart the schema and presents tables of selectable antecedent and consequent features and values.
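
As an illustration of the schema step described above, the following is a minimal sketch of reading the nominal attributes and their values from an ARFF header. It is not Associator's code, which is not shown here and presumably relies on Weka's own ARFF loading. The class name `ArffSchemaSketch`, the method `nominalSchema`, and the sample attribute names are hypothetical, and the sketch assumes simple, unquoted declarations of the form `@attribute name {v1,v2,...}`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Associator or Weka.
final class ArffSchemaSketch {

    /** Maps each nominal attribute name to its declared values, in file order. */
    static Map<String, List<String>> nominalSchema(List<String> arffLines) {
        Map<String, List<String>> schema = new LinkedHashMap<>();
        for (String raw : arffLines) {
            String line = raw.trim();
            String lower = line.toLowerCase();
            // Only the header describes the schema, so stop at the data section.
            if (lower.startsWith("@data")) {
                break;
            }
            if (!lower.startsWith("@attribute")) {
                continue; // relation name, comments, blank lines
            }
            int open = line.indexOf('{');
            int close = line.lastIndexOf('}');
            if (open < 0 || close < open) {
                continue; // non-nominal attribute; ignored in this sketch
            }
            String name = line.substring("@attribute".length(), open).trim();
            List<String> values = new ArrayList<>();
            for (String value : line.substring(open + 1, close).split(",")) {
                values.add(value.trim());
            }
            schema.put(name, values);
        }
        return schema;
    }

    public static void main(String[] args) {
        // Invented survey attributes, for illustration only.
        List<String> header = Arrays.asList(
            "@relation family_night",
            "@attribute attended_before {yes,no}",
            "@attribute child_grade {6,7,8}",
            "@data",
            "yes,7");
        nominalSchema(header).forEach((name, values) ->
            System.out.println(name + " -> " + values));
    }
}
```

Each entry of the resulting map corresponds to one row of a selection table: an attribute together with the values a user could pick for the antecedent or the consequent.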

The antecedent of the rule determines which instances in the data set are examined for relationships. The consequent then determines which of those instances are related to the antecedent and with what confidence, i.e. the fraction of the antecedent-matching instances that also satisfy the consequent and so follow the rule entirely. Associator calculates the confidence of each rule without regard to support, which is the number of instances matching the rule divided by the total number of instances. Because students choose the antecedent and the consequent themselves, they can better observe the relationship between them in a data set with which they are familiar.
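
The two measures above can be made concrete with a short sketch. It is not Associator's implementation: the class `RuleMetricsSketch`, its methods, and the sample attributes are hypothetical, and confidence is computed here under the standard definition (instances matching both antecedent and consequent, divided by instances matching the antecedent), which is assumed to be what the tool reports.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Associator or Weka.
final class RuleMetricsSketch {

    /** True when the instance has every attribute=value pair in the condition. */
    static boolean matches(Map<String, String> instance, Map<String, String> condition) {
        for (Map.Entry<String, String> pair : condition.entrySet()) {
            if (!pair.getValue().equals(instance.get(pair.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Returns {confidence, support} for the rule "antecedent => consequent". */
    static double[] evaluate(List<Map<String, String>> data,
                             Map<String, String> antecedent,
                             Map<String, String> consequent) {
        int antecedentCount = 0; // instances selected by the antecedent
        int ruleCount = 0;       // of those, instances that also satisfy the consequent
        for (Map<String, String> instance : data) {
            if (matches(instance, antecedent)) {
                antecedentCount++;
                if (matches(instance, consequent)) {
                    ruleCount++;
                }
            }
        }
        double confidence = antecedentCount == 0 ? 0.0 : (double) ruleCount / antecedentCount;
        double support = data.isEmpty() ? 0.0 : (double) ruleCount / data.size();
        return new double[] {confidence, support};
    }

    /** Builds an instance from alternating attribute and value strings. */
    static Map<String, String> row(String... keyValues) {
        Map<String, String> instance = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            instance.put(keyValues[i], keyValues[i + 1]);
        }
        return instance;
    }

    public static void main(String[] args) {
        // Invented survey responses, for illustration only.
        List<Map<String, String>> data = Arrays.asList(
            row("attended_before", "yes", "interest", "high"),
            row("attended_before", "yes", "interest", "high"),
            row("attended_before", "yes", "interest", "low"),
            row("attended_before", "no", "interest", "high"));
        double[] result = evaluate(data,
            row("attended_before", "yes"),
            row("interest", "high"));
        System.out.printf("confidence=%.2f support=%.2f%n", result[0], result[1]);
    }
}
```

In the sample data, three instances match the antecedent and two of those also match the consequent, so the rule has confidence 2/3, while its support is 2 of the 4 instances overall.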
Keywords:
research project, data mining, apriori, rule mining.