TEACHING STATISTICS FOR COMPUTER SCIENCES WITH R: THE IMPORTANCE OF THE VISUALIZING THE VARIABILITY
1 Universidad de Oviedo (SPAIN)
2 European Centre for Soft Computing (SPAIN)
About this paper:
Appears in: EDULEARN10 Proceedings
Publication year: 2010
Pages: 4898-4904
ISBN: 978-84-613-9386-2
ISSN: 2340-1117
Conference name: 2nd International Conference on Education and New Learning Technologies
Dates: 5-7 July, 2010
Location: Barcelona, Spain
Abstract:
Recent trends in the teaching of Statistics stress the importance of practice over the more traditional abstract presentation of theory and results. Problem-Based Learning (PBL) of Statistics, combined with basic worked examples, is showing promising results in many fields. To capture the students' attention, challenging real-life situations have to be considered. In our opinion, the structure of statistical reasoning is easy to identify when students are presented with a sequence of situations that lead to different kinds of experimental data but share similar aims. The students are required to participate actively, to collaborate, and to conjecture about the common features of the experiments and the ways to analyze them, focusing on the characteristics suggested by the teacher. In this way, transversal competences such as proactive problem solving, communication and cooperation are addressed in addition to the subject-specific ones.

One of our main goals is to introduce Statistics as the science that handles the variability of experimental information, and the way in which this information has to be collected, stored and analyzed in order to gain knowledge. For this purpose, PBL can easily be complemented with Computer-Based Learning (CBL) in degrees such as Computer Science and Engineering. The free software environment R provides extremely useful tools to simulate and visualize stochastic variability. Apart from the usual descriptive measures (such as the mean and the standard deviation) provided by most statistical software, R makes it possible to obtain and design graphical displays that help to understand such measures better. We consider that comparing different situations is essential to recognize the role of these summary measures. The presentation of the most common probability models is more informative if R is used to show the distributions and how they change as their parameters do. We propose to ease the teaching of probabilistic results (such as the Central Limit Theorem or the Strong Law of Large Numbers) by allowing the students to experiment: self-made simulations of different realistic distributions help them to understand the nature of the theoretical results with a considerable saving of time. Moreover, the variety of tools for Statistical Inference provided by R makes it possible to focus the teaching on the methods and the results, avoiding tedious computations. At the same time, the students come into contact with a new programming language and start to learn how to implement their own statistical algorithms. Thus, valuable transferable competences in Computer Science and Engineering are also taken into account.
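
As an illustration of the kind of self-made simulation mentioned above for the Central Limit Theorem, the following minimal sketch may be helpful. It is not taken from the paper, which works in R; it is written in Python with the standard library only, and the exponential distribution, the sample sizes, the number of replications, the seed and the function names `sample_means` and `summarize` are all assumptions chosen for illustration. It draws repeated samples from a skewed distribution and shows how the variability of the sample mean shrinks roughly as 1/sqrt(n).

```python
import random
import statistics


def sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from n Exponential(1) draws."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def summarize(sizes=(1, 5, 30, 100), reps=2000, seed=2010):
    """For each sample size, compare the observed spread of the sample mean
    with the theoretical value sigma / sqrt(n), where sigma = 1 here."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    rows = []
    for n in sizes:
        means = sample_means(n, reps, rng)
        centre = statistics.fmean(means)   # should be close to the true mean, 1
        spread = statistics.stdev(means)   # observed variability of the mean
        theory = 1.0 / n ** 0.5            # sigma / sqrt(n)
        rows.append((n, centre, spread, theory))
    return rows


if __name__ == "__main__":
    print("   n   mean of means   sd of means   1/sqrt(n)")
    for n, centre, spread, theory in summarize():
        print(f"{n:4d}   {centre:13.3f}   {spread:11.3f}   {theory:9.3f}")
```

Running the script prints one row per sample size, so the observed and theoretical spreads can be compared side by side. In a classroom setting the list returned by `sample_means` would typically be drawn as a histogram for each n, so that the students can see the distribution of the sample mean becoming narrower and more symmetric.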
Keywords:
Problem-Based Learning, statistical software, Computer Science and Engineering.