Download English ISI article No. 22098
Persian translation of the article title

Educational data mining: A survey from 1995 to 2005

English title
Educational data mining: A survey from 1995 to 2005
Article code : 22098
Publication year : 2007
Length of English article : 17 pages (PDF)
Source

Publisher : Elsevier - Science Direct

Journal : Expert Systems with Applications, Volume 33, Issue 1, July 2007, Pages 135–146

Translated keywords
Data mining - Educational systems - Web mining - Web-based educational systems
English keywords
Data mining, Educational systems, Web mining, Web-based educational systems

English abstract

Interest in data mining and in educational systems is currently growing, making educational data mining a new and expanding research community. This paper surveys the application of data mining to traditional educational systems, particular web-based courses, well-known learning content management systems, and adaptive and intelligent web-based educational systems. Each of these systems has different data sources and objectives for knowledge discovery. After preprocessing the available data in each case, data mining techniques can be applied: statistics and visualization; clustering, classification and outlier detection; association rule mining and pattern mining; and text mining. Building on the success of the plentiful existing work, much more specialized work is needed for educational data mining to become a mature area.

English introduction

During the past decades, the most important innovations in educational systems have been related to the introduction of new technologies (Ha, Bae, & Park, 2000) such as web-based education. This is a form of computer-aided instruction that is virtually independent of a specific location and of any specific hardware platform (Brusilovsky & Peylo, 2003). It has gained considerably in importance, and thousands of web courses have been deployed in the past few years. However, many current web-based courses are based on static learning materials that do not take into account the diversity of students. Adaptive and intelligent web-based educational systems have been seen as a way to provide individually richer learning environments. These systems try to offer learners personalized education by building a model of the individual’s goals, preferences, and knowledge. Data mining, or knowledge discovery in databases (KDD), is the automatic extraction of implicit and interesting patterns from large data collections (Klosgen & Zytkow, 2002). KDD can be used not only to learn the model for the learning process (Hamalainen, Suhonen, Sutinen, & Toivonen, 2004) or for student modeling (Tang & McCalla, 2002), but also to evaluate and improve e-learning systems (Zaïane & Luo, 2001) by discovering useful learning information from learning portfolios (Hwang, Chang, & Chen, 2004).
In conventional teaching environments, educators are able to obtain feedback on student learning experiences in face-to-face interactions with students, enabling a continual evaluation of their teaching programs (Sheard, Ceddia, Hurst, & Tuovinen, 2003). Decision making about classroom processes involves observing a student’s behavior, analyzing historical data, and estimating the effectiveness of pedagogical strategies. However, when students work in electronic environments, this informal monitoring is not possible; educators must look for other ways to obtain this information.
Organizations that run distance education sites collect large volumes of data, automatically generated by web servers and stored in server access logs. Web-based learning environments are able to record most of the students’ learning behaviors, and are hence able to provide a huge amount of learning profile data. Recently, there has been growing interest in the automatic analysis of learner interaction data from web-based learning environments (Muehlenbrock, 2005). In order to provide a more effective learning environment, data mining techniques can be applied (Ingram, 1999). Data mining is one step in the overall KDD process, which consists of preprocessing, data mining and postprocessing. Data mining has already been successfully applied in e-commerce (Srivastava, Cooley, Deshpande, & Tan, 2000), and it has begun to be used in e-learning with promising results. Although the discovery methods used in both areas (e-commerce and e-learning) are similar (Hanna, 2004), there are some important differences between them:
• Domain. The purpose of e-commerce is to guide clients in purchasing, while the purpose of e-learning is to guide students in learning (Romero, Ventura, & Bra, 2004).
• Data. In e-commerce the data used are normally simple web server access logs, but in e-learning there is more information about a student’s interaction (Pahl & Donnellan, 2003). The user model is also different in the two kinds of system.
• Objective. The objective of data mining in e-commerce is to increase profit, which is tangible and can be measured in terms of amounts of money, number of customers and customer loyalty. The objective of data mining in e-learning is to improve learning, a goal that is more subjective and more subtle to measure.
• Techniques. Educational systems have special characteristics that require a different treatment of the mining problem.
As a consequence, some specific data mining techniques are needed to address the learning process in particular (Li & Zaïane, 2004; Pahl & Donnellan, 2003). Some traditional techniques can be adapted, some cannot.
The application of knowledge extraction techniques to educational systems in order to improve learning can be viewed as a formative evaluation technique. Formative evaluation (Arruabarrena, Pérez, López-Cuadrado, & Vadillo, 2002) is the evaluation of an educational program while it is still in development, with the purpose of continually improving the program. Examining how students use the system is one way to evaluate the instructional design in a formative manner, and it may help the educator to improve the instructional materials (Ingram, 1999). Data mining techniques can discover useful information that can be used in formative evaluation to help educators establish a pedagogical basis for decisions when designing or modifying an environment or teaching approach. The application of data mining in educational systems is an iterative cycle of hypothesis formation, testing, and refinement (see Fig. 1). Mined knowledge should enter the loop of the system and guide, facilitate and enhance learning as a whole. The aim is not only to turn data into knowledge, but also to filter mined knowledge for decision making.
As we can see in Fig. 1, educators and academic staff are responsible for designing, planning, building and maintaining the educational systems. Students use and interact with them. Starting from all the available information about courses, students, usage and interaction, different data mining techniques can be applied in order to discover useful knowledge that helps to improve the e-learning process. The discovered knowledge can be used not only by providers (educators) but also by the users themselves (students).
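As an illustration of the preprocessing step in the KDD process described above, the following minimal Python sketch groups raw server access log entries into per-student sessions. It assumes entries in Common Log Format with an authenticated user name, and a 30-minute inactivity threshold; the sample entries, the `SESSION_GAP` value and the function names are hypothetical and are not taken from any of the surveyed systems.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
SESSION_GAP = timedelta(minutes=30)  # assumed inactivity threshold


def parse_line(line):
    """Return (user, timestamp, path) for a successful, authenticated request."""
    match = LOG_PATTERN.match(line)
    if match is None or match["status"] != "200" or match["user"] == "-":
        return None  # drop malformed, failed or anonymous requests
    timestamp = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
    return match["user"], timestamp, match["path"]


def sessionize(lines):
    """Group each student's requests into sessions split by long inactivity."""
    by_user = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            user, timestamp, path = record
            by_user[user].append((timestamp, path))

    sessions = defaultdict(list)
    for user, visits in by_user.items():
        visits.sort()  # chronological order per student
        last_time = None
        for timestamp, path in visits:
            if last_time is None or timestamp - last_time > SESSION_GAP:
                sessions[user].append([])  # start a new session
            sessions[user][-1].append(path)
            last_time = timestamp
    return dict(sessions)


# Hypothetical log entries, invented for this example.
sample_log = [
    '10.0.0.1 - alice [10/Oct/2005:13:55:36 +0000] "GET /course/unit1.html HTTP/1.0" 200 2326',
    '10.0.0.1 - alice [10/Oct/2005:13:58:02 +0000] "GET /course/quiz1.html HTTP/1.0" 200 1045',
    '10.0.0.2 - bob [10/Oct/2005:14:01:10 +0000] "GET /course/unit1.html HTTP/1.0" 200 2326',
    '10.0.0.1 - alice [10/Oct/2005:15:10:00 +0000] "GET /course/unit2.html HTTP/1.0" 200 3100',
    '10.0.0.3 - - [10/Oct/2005:15:11:00 +0000] "GET /favicon.ico HTTP/1.0" 404 0',
]
sessions = sessionize(sample_log)
```

Here `sessions` maps each student to a list of page sequences, a form that later mining steps (for example pattern mining) could take as input.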
So, the application of data mining in educational systems can be oriented to different actors, each with a particular point of view (Zorrilla, Menasalvas, Marin, Mora, & Segovia, 2005):
• Oriented towards students (Heraud et al., 2004; Farzan, 2004; Lu, 2004; Tang & McCalla, 2005; Zaïane, 2002). The objective is to recommend to learners activities, resources and learning tasks that would favour and improve their learning; to suggest good learning experiences; and to suggest path pruning and shortening, or simply links to follow, based on the tasks already done by the learner and their successes, and on tasks done by other similar learners, etc.
• Oriented towards educators (Ha et al., 2000; Hamalainen et al., 2004; Merceron & Yacef, 2004; Minaei-Bidgoli & Punch, 2003; Mor & Minguillon, 2004; Muehlenbrock, 2005; Pahl & Donnellan, 2003; Romero et al., 2004; Silva & Vieira, 2002; Talavera & Gaudioso, 2004; Tang et al., 2000; Ueno, 2004b; Zaïane & Luo, 2001). The objective is to get more objective feedback on instruction; evaluate the structure of the course content and its effectiveness in the learning process; classify learners into groups based on their needs for guidance and monitoring; find learners’ regular as well as irregular patterns; find the most frequently made mistakes; find activities that are more effective; discover information to improve the adaptation and customization of courses; restructure sites to better personalize courseware; organize the contents efficiently according to the progress of the learner; and adaptively construct instructional plans, etc.
• Oriented towards responsible academic staff and administrators (Becker et al., 2000; Grob et al., 2004; Luan, 2002; Ma et al., 2000; Peled & Rashty, 1999; Sanjeev & Zytkow, 1995; Urbancic et al., 2002). The objective is to obtain parameters for improving site efficiency and adapting it to the behavior of its users (optimal server size, network traffic distribution, etc.); to obtain measures for better organizing institutional resources (human and material) and the educational offer; to enhance the educational programs offered; and to determine the effectiveness of the new computer-mediated distance learning approach.
There are many general data mining tools that provide mining algorithms, filtering and visualization techniques. Some examples of commercial and academic tools are DBMiner, Clementine, Intelligent Miner, Weka, etc. (Klosgen & Zytkow, 2002). However, these tools are not specifically designed and maintained for pedagogical purposes, and they are cumbersome to use for an educator who does not have extensive knowledge of data mining (Zaïane, Xin, & Han, 1998). To solve this problem, some specific educational data mining, statistical and visualization tools have been developed to help educators analyze different aspects of the learning process (see Table 1).
We have divided this paper into the following sections. We first review different types of educational systems and how data mining can be applied in each of them. We then describe the data mining techniques that have been applied in educational systems, grouping them by task. Finally, we summarize the main conclusions and outline some future research.
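To make one of the educator-oriented objectives above more concrete (finding mistakes that tend to occur together), the following sketch computes the support and confidence of simple one-to-one association rules. The student records, the thresholds and the `mine_rules` helper are invented for illustration; a real study would more likely rely on an established algorithm such as Apriori from a mining toolkit.

```python
from itertools import permutations

# Hypothetical data: the set of topics each student answered incorrectly.
mistakes_by_student = {
    "s1": {"loops", "recursion"},
    "s2": {"loops", "recursion", "pointers"},
    "s3": {"loops"},
    "s4": {"recursion", "pointers"},
    "s5": {"loops", "recursion"},
}


def mine_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return rules (a, b, support, confidence) meaning 'missed a -> missed b'."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_ab = sum(1 for t in transactions if a in t and b in t)
        support = count_ab / n           # share of students missing both
        confidence = count_ab / count_a  # share of those missing a who also miss b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


rules = mine_rules(list(mistakes_by_student.values()))
for a, b, support, confidence in rules:
    print(f"missed {a} -> missed {b} (support={support:.2f}, confidence={confidence:.2f})")
```

The thresholds are assumed parameters that an educator would have to tune: lowering them yields more rules, many of which may be uninteresting.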

English conclusion

Educational data mining is an emerging field related to several well-established areas of research, including e-learning, adaptive hypermedia, intelligent tutoring systems, web mining, data mining, etc. The application of data mining in educational systems has specific requirements not present in other domains, mainly the need to take into account pedagogical aspects of the learner and the system. Although educational data mining is a very recent research area, a significant number of contributions have been published in journals, international congresses, specific workshops and some books in progress (Romero & Ventura, 2006), which show that it is a promising new area.
One of the most promising lines of work is the use of e-learning recommendation agents (Lu, 2004; Zaïane, 2002). These recommender agents observe what a student is doing and recommend actions (activities, shortcuts, contents, etc.) that they think would be beneficial to the student. Recommender agents can also be integrated into evolving e-learning systems in which materials are automatically found on the web and integrated into the system (Tang & McCalla, 2005). In this way, they help educators to detect which parts of existing materials from heterogeneous sources such as the Internet are the best to use for composing new courses. In addition, recommenders can be integrated with domain knowledge and ontologies, combining web mining and the semantic web in semantic web mining (Markellou et al., 2005). Semantic web mining is a successful integration of ontological knowledge at every stage of the knowledge discovery process (Becker, Vanzin, & Ruiz, 2005).
Educational data mining is a young research area, and more specialized work oriented to the educational domain is needed in order to reach a level of application success similar to that of other areas, such as medical data mining, mining e-commerce data, etc.
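The recommender idea discussed above can be sketched, under strong simplifying assumptions, as a neighbour-based ranking: activities completed by learners with similar histories are suggested first. The learner records, the Jaccard similarity measure and the `recommend` function are illustrative choices, not the method of any cited system.

```python
# Hypothetical records: activities each learner has completed successfully.
completed = {
    "ana": {"intro", "variables", "loops", "functions"},
    "ben": {"intro", "variables", "loops"},
    "cho": {"intro", "html", "css"},
    "dev": {"intro", "variables"},
}


def jaccard(a, b):
    """Similarity of two activity sets: shared items over all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(learner, records, top_n=2):
    """Rank activities the learner has not done by similar learners' history."""
    done = records[learner]
    scores = {}
    for other, activities in records.items():
        if other == learner:
            continue
        weight = jaccard(done, activities)
        for activity in activities - done:
            # Each similar learner votes for an activity with their similarity.
            scores[activity] = scores.get(activity, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [activity for activity, score in ranked[:top_n] if score > 0]


suggestions = recommend("dev", completed)
```

A set-overlap measure is used here only because it needs no grades or ratings; a system with richer student models could substitute any other similarity function without changing the rest of the sketch.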
We believe that some future research lines are:
• Mining tools that are easier to use for educators and other users who are not experts in data mining. Data mining tools are normally designed more for power and flexibility than for simplicity. Most current data mining tools are too complex for educators to use, and their features go well beyond the scope of what an educator may want to do. So, these tools must have a more intuitive and easy-to-use interface, with parameter-free data mining algorithms to simplify configuration and execution, and with good visualization facilities to make their results meaningful to educators and e-learning designers.
• Standardization of methods and data. Current tools for mining data from a specific course may be useful only to their developers. There are no general or reusable tools or techniques that can be applied to any educational system. So, a standardization of data and of the preprocessing, discovery and postprocessing tasks is needed.
• Integration with the e-learning system. The data mining tool has to be integrated into the e-learning environment as another authoring tool. All data mining tasks (preprocessing, data mining and postprocessing) have to be carried out in a single application. Feedback and results obtained with data mining can then be applied directly to the e-learning environment.
• Specific data mining techniques. More effective mining tools are needed that integrate educational domain knowledge into data mining techniques. Education-specific mining techniques can do much more to improve instructional design and pedagogical decisions. Traditional mining algorithms need to be tuned to take the educational context into account.