THE USE OF AN INTELLIGENT FORUM CRAWLER FOR DATA RETRIEVAL FROM E-LEARNING PORTALS
University of Belgrade, School of Electrical Engineering (SERBIA)
About this paper:
Appears in: EDULEARN14 Proceedings
Publication year: 2014
Pages: 2441-2449
ISBN: 978-84-617-0557-3
ISSN: 2340-1117
Conference name: 6th International Conference on Education and New Learning Technologies
Dates: 7-9 July, 2014
Location: Barcelona, Spain
Abstract:
This paper introduces a solution that eases data retrieval from forums hosted on university portals and e-learning sites. A Web forum is an interactive, lively form of communication that lets its users ask questions, reply, and contribute user-created content in a spontaneous and intuitive way. We developed the system to assist students in classes where effective online learning communities already exist, alongside established LMSs (Learning Management Systems) that allow students to download required teaching material.

However, much of the information is exchanged on Web forums, where students ask for help while preparing for exams or doing homework, or request additional materials. University domains often contain multiple forums, so all of them have to be searched to find a particular piece of information. Students are also sometimes unsure how accurate their search queries are and often have to repeat the search, so the search process has to be quick and efficient.

The first phase of this method is to search and index all the information within all forums of one university domain. To automate this process, we use a specialized search engine for indexing and parsing forums, FCbRE (Forum Crawler based on Regular Expressions). When this search engine encounters an educational portal, it automatically generates the regular expressions required for later forum crawls within that portal. In this way, the crawler can be used for wide-scale search and can be extended to all types of educational portals, whatever technologies they are built on. The data collected in this way is analyzed and grouped to make user searches easier. To analyze the information gathered from the forums, we use an SVM (Support Vector Machine). From the collected data and documents, a sample is selected that serves as the training set for the SVM. Based on this set, the SVM constructs classifiers that are used to detect similarity between documents and pieces of information. References to similar documents are grouped together in a database, so that several links can be displayed and selected when the user makes a query. The same approach is applied to thread names and forums.
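The abstract does not say how FCbRE derives its regular expressions, so the following is only a minimal sketch of one plausible approach: generalizing a few sample thread URLs by replacing their numeric IDs with `\d+`. The function names, the sample URLs, and the phpBB-style URL layout are all assumptions made for illustration, not part of the described system.

```python
import re

def generalize_url(url: str) -> str:
    """Turn one sample URL into a regex source by replacing digit runs with \\d+."""
    parts = re.split(r"(\d+)", url)
    # With a capturing group, re.split puts the digit runs at odd indices.
    return "".join(r"\d+" if i % 2 else re.escape(p) for i, p in enumerate(parts))

def build_thread_pattern(sample_urls):
    """Combine the generalized samples into one anchored alternation."""
    alternatives = sorted({generalize_url(u) for u in sample_urls})
    return re.compile(r"^(?:" + "|".join(alternatives) + r")$")

# Hypothetical thread URLs observed on a first visit to a portal.
samples = [
    "/forum/viewtopic.php?f=3&t=120",
    "/forum/viewtopic.php?f=7&t=98",
]
pattern = build_thread_pattern(samples)

# Links found on a later crawl; only thread pages should be kept.
links = [
    "/forum/viewtopic.php?f=12&t=4051",
    "/forum/memberlist.php?mode=viewprofile&u=5",
    "/forum/viewtopic.php?f=1&t=2",
    "/index.php",
]
thread_links = [link for link in links if pattern.match(link)]
print(thread_links)
```

Generating the pattern once per portal and reusing it on later crawls mirrors the workflow described above; a real crawler would also need patterns for board pages, pagination, and post bodies.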

To make search results more descriptive when they are displayed to the user, we developed a special module offering advanced search features that can be integrated into different e-learning systems. Additionally, we introduced system monitoring: when a user tries to create a new topic, the system checks whether the same or a similar issue already exists and informs the user, thus preventing duplicate topics.
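The duplicate-topic check described above can be illustrated with a short sketch. The system in the paper relies on SVM-based classifiers for similarity; the example below deliberately swaps in a much simpler measure, Jaccard overlap of title tokens, only to show the shape of the check. The function names, the threshold of 0.5, and the sample titles are hypothetical.

```python
import re

def tokens(title: str) -> set:
    """Lowercase a title and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def find_similar_topics(new_title, existing_titles, threshold=0.5):
    """Return existing titles whose similarity meets the threshold, best first."""
    new_tokens = tokens(new_title)
    scored = [(jaccard(new_tokens, tokens(t)), t) for t in existing_titles]
    return [t for score, t in sorted(scored, reverse=True) if score >= threshold]

# Hypothetical topics already present on a forum.
existing = [
    "Exam preparation materials for Algorithms",
    "Homework 3 deadline extension",
    "Lab schedule for the spring semester",
]

matches = find_similar_topics("Materials for Algorithms exam preparation", existing)
print(matches)
```

A non-empty result would be shown to the user before the new topic is created; an empty result lets the topic through unchanged.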

Our results showed that duplication of the same information and documents at multiple locations on e-learning portals and forums is quite common. Aggregating identical or similar information made use of the e-learning portals faster and more efficient, and reduced the time required for searching. We also found that, because of badly organized forum sections, users re-ask questions that have already been answered elsewhere. Our solution helps avoid this situation. In addition, the model, thanks to its modularity, can be applied to a variety of e-learning systems and their forums, regardless of the technologies and designs they implement.