Data Mining: A prediction for Student's Performance Using Classification Method

World Journal of Computer Application and Technology 2(2): 43-47, 2014
http://www.hrpub.org
DOI: 10.13189/wjcat.2014.020203

Abeer Badr El Din Ahmed (1), Ibrahim Sayed Elaraby (2,*)

(1) Lecturer at Sadat Academy, Computer Science Department, Cairo, Egypt
(2) Demonstrator at Higher Institute for Specific Studies, Management Information System Department, Cairo, Egypt
*Corresponding Author: [email protected]

Copyright © 2014 Horizon Research Publishing. All rights reserved.

Abstract

Educational databases currently store huge amounts of data, and these databases contain useful information for predicting students' performance. The most useful data mining technique for educational databases is classification. In this paper, the classification task is used to predict the final grade of students; of the many approaches available for data classification, the decision tree (ID3) method is used here.

Keywords: Educational Data Mining (EDM), Classification, Knowledge Discovery in Database (KDD), ID3 Algorithm.

1. Introduction

The advent of information technology in various fields has led to large volumes of data being stored in various formats such as records, files, documents, images, sound, videos, scientific data and many new data formats. The data collected from different applications require a proper method of extracting knowledge from large repositories for better decision making. Knowledge discovery in databases (KDD), often called data mining, aims at the discovery of useful information from large collections of data [1]. The main function of data mining is to apply various methods and algorithms in order to discover and extract patterns from stored data [2].

The main objective of this paper is to use data mining methodologies to study students' performance in the final general appreciation (final grade). Data mining provides many tasks that could be used to study student performance. In this research, the classification task is used to evaluate students' performance; of the many approaches available for data classification, the decision tree method is used here.

2. Related Work

Han and Kamber (1996) [3] describe data mining software that allows users to analyze data from different dimensions, categorize it, and summarize the relationships identified during the mining process.

Brijesh Kumar Baradwaj and Saurabh Pal (2011) [1] state that the main objective of higher education institutions is to provide quality education to their students. One way to achieve the highest level of quality in a higher education system is to discover knowledge for prediction regarding the enrolment of students in a particular course, the detection of abnormal values in students' result sheets, prediction of students' performance, and so on. They use the classification task to evaluate students' performance and, of the many approaches available for data classification, apply the decision tree method.

Alaa El-Halees (2009) [4] used educational data mining, which is concerned with developing methods for discovering knowledge from data that come from educational environments, to analyze learning behavior. Student data were collected from a database course. After preprocessing the data, data mining techniques were applied to discover association, classification, clustering and outlier detection rules. In each of these four tasks, knowledge describing students' behavior was extracted.

Mohammed M. Abu Tair and Alaa M. El-Halees (2012) [5] used educational data mining to improve graduate students' performance, to overcome the problem of low grades among graduate students, and to extract useful knowledge from graduate student data collected from the College of Science and Technology. The data cover a fifteen-year period (1993-2007). After preprocessing the data, data mining techniques were applied to discover association, classification, clustering and outlier detection rules. For each of these four tasks, the authors present the extracted knowledge and describe its importance in the educational domain.

Sonali Agarwal, G. N. Pandey, and M. D. Tiwari (2012) [6] describe educational organizations as one of the important parts of society, playing a vital role in the growth and development of any nation. Data mining is an emerging technique with which one can efficiently learn from historical data and use that knowledge to predict future behavior in the areas of concern. The growth of the current education system is enhanced if data mining is adopted as a forward-looking strategic management tool. A data mining tool can facilitate better resource utilization in terms of student performance, course development, and finally the development of the nation's education-related standards.

Monika Goyal and Rajan Vohra (2012) [7] applied data mining techniques to improve the efficiency of higher education institutions. If data mining techniques such as clustering, decision trees and association are applied to higher education processes, they would help to improve students' performance, their life cycle management and selection of courses, and to measure their retention rate and the grant fund management of an institution. This is an approach to examining the effect of using data mining techniques in higher education.

Surjeet Kumar Yadav, Brijesh Bharadwaj, and Saurabh Pal (2012) [11] studied decision tree classifiers and conducted experiments to find the best classifier for retention data in order to predict the possibility of student drop-out.

Brijesh Kumar Baradwaj and Saurabh Pal (2011) [12] used the classification task on a student database to predict the students' division on the basis of a previous database.

K. Shanmuga Priya and A. V. Senthil Kumar (2013) [13] applied a classification technique in data mining to improve students' performance and to help achieve that goal by extracting knowledge from the end-of-semester marks.

Bhise R. B., Thorat S. S. and Supekar A. K. (2013) [14] applied the data mining process to a student database, using the K-means clustering algorithm to predict students' results.

Varun Kumar and Anupama Chadha (2013) [15] used a data mining technique called association rule mining to enhance the quality of students' performance at the postgraduate level.

Pallamreddy Venkatasubbareddy and Vuda Sreenivasarao (2010) [16] explained that decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal; another use of decision trees is as a descriptive means for calculating conditional probabilities.

3. Data Mining Definition and Techniques

Data mining refers to extracting or "mining" knowledge from large amounts of data [3]. Data mining techniques are used to operate on large volumes of data to discover hidden patterns and relationships helpful in decision making [1]. The sequence of steps identified in extracting knowledge from data is shown in Figure 1.

Figure 1. The Steps of Extracting Knowledge from Data

Various algorithms and techniques such as classification, clustering, regression, artificial intelligence, neural networks, association rules, decision trees, genetic algorithms and the nearest neighbor method are used for knowledge discovery from databases. Some of these techniques are briefly described below for better understanding.

3.1. Classification

Classification is the most commonly applied data mining technique. It employs a set of pre-classified examples to develop a model that can classify the population of records at large. This approach frequently employs decision tree or neural network-based classification algorithms. The data classification process involves two steps, learning and classification. In learning, the training data are analyzed by a classification algorithm. In classification, test data are used to estimate the accuracy of the classification rules. If the accuracy is acceptable, the rules can be applied to new data tuples [1]. In our case study we used the ID3 decision tree to represent logical rules for the students' final grade.

3.2. Clustering

Clustering is finding groups of objects such that the objects in one group are similar to one another and different from the objects in another group. In educational data mining, clustering has been used to group students according to their behavior, so that the clusters distinguish students' performance according to their behavior and activities. In [8], students are clustered into three groups according to their academics, punctuality, exams and so on.

3.3. Association rule

Association analysis is the discovery of association rules showing attribute-value conditions that occur frequently together in a given set of data. Association analysis is widely used for market basket or transaction data analysis [9].
3.4. Decision Trees

Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal [10].

4. Data Mining Process

4.1. Data Preparations

The data set used in this study was obtained from a student database used in one of the educational institutions, sampled from the Information Systems department for sessions 2005 to 2010. The initial size of the data is 1547 records. In this step, data stored in different tables were joined into a single table; after the joining process, errors were removed.

4.2. Data selection and transformation

In this step only those fields required for data mining were selected. A few derived variables were selected, while some of the information for the variables was extracted from the database. All the predictor and response variables derived from the database are given in Figure 2.

Figure 2. Student Related Variables

4.3. Decision Tree

A decision tree is a flow-chart-like tree structure in which each internal node is denoted by a rectangle and each leaf node is denoted by an oval. All internal nodes have two or more child nodes. All internal nodes contain splits, which test the value of an expression of the attributes. Arcs from an internal node to its children are labeled with distinct outcomes of the test. Each leaf node has a class label associated with it [11].

4.4. The ID3 Decision Tree

The basic idea of the ID3 algorithm is to construct the decision tree by employing a top-down, greedy search through the given sets to test each attribute at every tree node. In order to select the attribute that is most useful for classifying a given set, we introduce a metric: information gain. To find an optimal way to classify a learning set, we need a function that provides the most balanced splitting. The information gain metric is such a function. Given a data table that contains attributes and the class of each record, we can measure the homogeneity of the table based on the classes. The index used to measure the degree of impurity is entropy [2].

The entropy is calculated as follows, where P_j is the proportion of examples belonging to class j:

$$\mathrm{Entropy}(S) = \sum_{j} -P_j \log_2 P_j$$

The splitting criterion used for splitting the nodes of the tree is information gain. To determine the best attribute for a particular node in the tree we use this measure. The information gain, Gain(S, A), of an attribute A relative to a collection of examples S is defined as:

$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, \mathrm{Entropy}(S_v)$$

where S_v is the subset of S for which attribute A has value v.

5. Results and Discussion

The data set used in this study was obtained from a student database used in one of the educational institutions, sampled from the Information Systems department for sessions 2005 to 2010. The initial data set of 1547 records is given in Figure 3.

Figure 3. Data Set

To work out the information gain for A relative to S, we first need to calculate the entropy of S. Here S is a set of 1547 examples, of which 292 are "Excellent", 536 "Very Good", 477 "Good", 188 "Acceptable" and 54 "Fail".

$$\mathrm{Entropy}(S) = -P_{Excellent}\log_2(P_{Excellent}) - P_{VeryGood}\log_2(P_{VeryGood}) - P_{Good}\log_2(P_{Good}) - P_{Acceptable}\log_2(P_{Acceptable}) - P_{Fail}\log_2(P_{Fail})$$

To determine the best attribute for a particular node in the tree we use the information gain, Gain(S, A), of an attribute A relative to the collection of samples S. For the Midterm attribute:

$$\mathrm{Gain}(S, Midterm) = \mathrm{Entropy}(S) - \frac{|S_{Excellent}|}{|S|}\mathrm{Entropy}(S_{Excellent}) - \frac{|S_{VeryGood}|}{|S|}\mathrm{Entropy}(S_{VeryGood}) - \frac{|S_{Good}|}{|S|}\mathrm{Entropy}(S_{Good}) - \frac{|S_{Acceptable}|}{|S|}\mathrm{Entropy}(S_{Acceptable}) - \frac{|S_{Fail}|}{|S|}\mathrm{Entropy}(S_{Fail})$$

Midterm has the highest gain, therefore it is used as the root node, as shown in Figure 4.

Figure 4. Midterm as root node

This process goes on until all data are classified perfectly or the attributes run out. The knowledge represented by the decision tree can be extracted and represented in the form of IF-THEN rules, as shown in Table 1. Table 1 covers 8 cases:

Case 1 - If Midterm Mark = Excellent, Lab Test Grade = Good, Student Participate = No, Homework = No, Seminar Performance = Good, Department = Scientific Mathematics then Final Grade = Very Good.
Case 2 - If Midterm Mark = Excellent, Lab Test Grade = Good, Student Participate = No, Attendance = Good, Homework = No, Department = Secondary Technical Commercial then Final Grade = Very Good.
Case 3 - If Midterm Mark = Excellent, Lab Test Grade = Good, Student Participate = No, Attendance = Good, Homework = No, Department = Secondary Industrial Technical then Final Grade = Very Good.
Case 4 - If Midterm Mark = Excellent, Lab Test Grade = Poor, Attendance = Good then Final Grade = Very Good.
Case 5 - If Midterm Mark = Excellent, Lab Test Grade = Average, Attendance = Good then Final Grade = Excellent.
Case 6 - If Midterm Mark = Excellent, Lab Test Grade = Average, Attendance = Poor then Final Grade = Very Good.
Case 7 - If Midterm Mark = Very Good, Lab Test Grade = Good, Homework = No, Seminar Performance = Good, Student Participate = No then Final Grade = Very Good.
Case 8 - If Midterm Mark = Very Good, Lab Test Grade = Good, Homework = No, Seminar Performance = Good, Student Participate = No, Department = Scientific Mathematics then Final Grade = Very Good.
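The entropy and information gain calculations described in Sections 4.4 and 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `entropy` and `information_gain` are hypothetical, and the four `toy_rows` records are invented solely to show how an attribute is chosen as the root. Only the five class counts are taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Entropy(S) = sum_j -P_j * log2(P_j), computed from class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(rows, attribute, target):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    total = len(rows)
    base = entropy(list(Counter(r[target] for r in rows).values()))
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        # Class labels of the subset S_v where the attribute equals `value`.
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / total * entropy(list(Counter(subset).values()))
    return base - remainder


# Class distribution of the 1547 records reported in Section 5.
grade_counts = {"Excellent": 292, "Very Good": 536, "Good": 477,
                "Acceptable": 188, "Fail": 54}
entropy_s = entropy(list(grade_counts.values()))

# Hypothetical toy records (NOT from the paper's data set), used only to
# illustrate how ID3 picks the attribute with the highest gain as the root.
toy_rows = [
    {"Midterm": "Excellent", "Attendance": "Good", "FinalGrade": "Excellent"},
    {"Midterm": "Excellent", "Attendance": "Poor", "FinalGrade": "Excellent"},
    {"Midterm": "Good", "Attendance": "Good", "FinalGrade": "Good"},
    {"Midterm": "Good", "Attendance": "Poor", "FinalGrade": "Good"},
]
gains = {a: information_gain(toy_rows, a, "FinalGrade")
         for a in ("Midterm", "Attendance")}
root = max(gains, key=gains.get)  # attribute with the highest gain
```

With the reported class counts, `entropy_s` evaluates to roughly 2.05 bits, below the maximum of log2(5) ≈ 2.32 bits for five equally likely classes. On the toy records, Midterm separates the final grades perfectly while Attendance does not, so Midterm would be selected as the root, mirroring the choice the paper reports for its own data.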
Table 1. Rule Set generated by Decision Tree

1. IF Midterm='Excellent' AND LG='Good' AND SP='No' AND HW='No' AND SEM='Good' AND Dep='Scientific Mathematics' THEN FG='Very Good'
2. IF Midterm='Excellent' AND LG='Good' AND SP='No' AND ATT='Good' AND HW='No' AND Dep='Secondary Technical Commercial' THEN FG='Very Good'
3. IF Midterm='Excellent' AND LG='Good' AND SP='No' AND ATT='Good' AND HW='No' AND Dep='Secondary Industrial Technical' THEN FG='Very Good'
4. IF Midterm='Excellent' AND LG='Poor' AND ATT='Good' THEN FG='Very Good'
5. IF Midterm='Excellent' AND LG='Average' AND ATT='Good' THEN FG='Excellent'
6. IF Midterm='Excellent' AND LG='Average' AND ATT='Poor' THEN FG='Very Good'
7. IF Midterm='Very Good' AND LG='Good' AND HW='No' AND SEM='Good' AND SP='No' THEN FG='Very Good'
8. IF Midterm='Very Good' AND LG='Good' AND HW='No' AND SEM='Good' AND SP='No' AND Dep='Scientific Mathematics' THEN FG='Very Good'

(LG = Lab Test Grade, SP = Student Participate, HW = Homework, SEM = Seminar Performance, ATT = Attendance, Dep = Department, FG = Final Grade.)

6. Conclusion

In this paper, the decision tree method is used on a student database to predict students' performance. A set of attributes collected from the student database is used to predict the students' final grade. This study will help students to improve their performance, and will help to identify those students who need special attention, to reduce the failure ratio, and to take appropriate action at the right time.

REFERENCES

[1] Brijesh Kumar Baradwaj, Saurabh Pal, Data mining: machine learning, statistics, and databases, 1996.
[2] Nikhil Rajadhyax, Rudresh Shirwaikar, Data Mining on Educational Domain, 2012.
[3] Jiawei Han, Micheline Kamber, Data Mining: Concepts and Techniques, 2nd edition, 2006.
[4] Alaa El-Halees, Mining Students Data to Analyze Learning Behavior: A Case Study, 2008.
[5] Mohammed M. Abu Tair, Alaa M. El-Halees, Mining Educational Data to Improve Students’ Performance: A Case Study, 2012.
[6] Sonali Agarwal, G. N. Pandey, and M. D. Tiwari, Data Mining in Education: Data Classification and Decision Tree Approach, 2012.
[7] Monika Goyal, Rajan Vohra, Applications of Data Mining in Higher Education, 2012.
[8] P. Ajith, M. S. S. Sai, B. Tejaswi, Evaluation of Student Performance: An Outlier Detection Perspective, 2013.
[9] Varun Kumar, Anupama Chadha, An Empirical Study of the Applications of Data Mining Techniques in Higher Education.
[10] Hongjie Sun, Research on Student Learning Result System based on Data Mining, 2010.
[11] Surjeet Kumar Yadav, Brijesh Bharadwaj, and Saurabh Pal, Mining Education Data to Predict Student’s Retention: A Comparative Study, 2012.
[12] Brijesh Kumar Baradwaj, Saurabh Pal, Mining Educational Data to Analyze Students’ Performance, 2011.
[13] K. Shanmuga Priya, A. V. Senthil Kumar, Improving the Student’s Performance Using Educational Data Mining.
[14] Bhise R. B., Thorat S. S., Supekar A. K., Importance of Data Mining in Higher Education System, 2013.
[15] Varun Kumar, Anupama Chadha, Mining Association Rules in Student’s Assessment Data, 2012.
[16] Pallamreddy Venkatasubbareddy, Vuda Sreenivasarao, The Result Oriented Process for Students Based On Distributed Data Mining, 2010.
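As an illustration of how the IF-THEN rules of Table 1 might be applied to a new student record, the following Python sketch encodes three of the eight rules (rules 4 to 6) as data and checks them in order. The names `RULES` and `predict_final_grade`, the dictionary representation of a student, the first-match evaluation order and the `None` fallback for uncovered records are all assumptions made for this example; the paper does not describe how the rules are executed.

```python
# Each rule pairs a set of attribute conditions with a predicted final grade.
# Attribute abbreviations follow Table 1 (LG = Lab Test Grade, ATT = Attendance).
# Only rules 4-6 of Table 1 are transcribed here to keep the sketch short.
RULES = [
    ({"Midterm": "Excellent", "LG": "Poor", "ATT": "Good"}, "Very Good"),
    ({"Midterm": "Excellent", "LG": "Average", "ATT": "Good"}, "Excellent"),
    ({"Midterm": "Excellent", "LG": "Average", "ATT": "Poor"}, "Very Good"),
]


def predict_final_grade(student, rules=RULES, default=None):
    """Return the grade of the first rule whose conditions all hold.

    `student` maps attribute names to values. If no rule matches, `default`
    is returned (an assumption: the paper does not say what happens then).
    """
    for conditions, grade in rules:
        if all(student.get(attr) == value for attr, value in conditions.items()):
            return grade
    return default


# Hypothetical student records used only to exercise the rules.
grade_a = predict_final_grade({"Midterm": "Excellent", "LG": "Average", "ATT": "Good"})
grade_b = predict_final_grade({"Midterm": "Excellent", "LG": "Average", "ATT": "Poor"})
grade_c = predict_final_grade({"Midterm": "Fail", "LG": "Poor", "ATT": "Poor"})
```

Keeping the rules as data rather than nested `if` statements makes it straightforward to add the remaining rules from Table 1 or to regenerate the list whenever the tree is rebuilt.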

World Journal of Computer Application and Technology, Feb 1, 2014


Publisher: Unpaywall
ISSN: 2331-4990
DOI: 10.13189/wjcat.2014.020203
Publisher site
See Article on Publisher Site

Abstract

World Journal of Computer Application and Technology 2(2): 43-47, 2014 http://www.hrpub.org DOI: 10.13189/wjcat.2014.020203 Data Mining: A prediction for Student's Performance Using Classification Method 1 2,* Abeer Badr El Din Ahmed , Ibrahim Sayed Elaraby Lecturer at Sadat Academy, Computer Science Department, Cairo, Egypt Demonstrator at Higher Institute for Specific Studies, Management Information System Department, Cairo, Egypt *Corresponding Author: [email protected] Copyright © 2014 Horizon Research Publishing All rights reserved. Abstract Currently the amount huge of data stored in dimensions, categorize it and summarize the relationships educational database these database contain the useful which are identified during the mining process. information for predict of students performance. The most Brijesh Kumar Baradwaj and Saurabh Pal (2011) [1] useful data mining techniques in educational database is describes the main objective of higher education institutions classification. In this paper, the classification task is used to is to provide quality education to its students. One way to predict the final grade of students and as there are many achieve highest level of quality in higher education system is approaches that are used for data classification, the decision by discovering knowledge for prediction regarding tree (ID3) method is used here. enrolment of students in a particular course, detection of abnormal values in the result sheets of the students, Keywords Educational Data Mining (EDM), prediction about students’ performance and so on, the Classification, Knowledge Discovery in Database (KDD), classification task is used to evaluate student’s performance ID3 Algorithm. and as there are many approaches that are used for data classification, the decision tree method is used here. Alaa El-Halees (2009) [4] applied the educational data mining concerns with developing methods for discovering knowledge from data that come from educational 1. 
Introduction environment. used educational data mining to analyze learning behavior. Student’s data has been collected from The advent of information technology in various fields has Database course. After preprocessing the data, we applied lead the large volumes of data storage in various formats like data mining techniques to discover association, classification, records, files, documents, images, sound, videos, scientific clustering and outlier detection rules. In each of these four data and many new data formats. The data collected from tasks, we extracted knowledge that describes students' different applications require proper method of extracting behavior. knowledge from large repositories for better decision making. Mohammed M. Abu Tair and Alaa M. El-Halees (2012) [5] Knowledge discovery in databases (KDD), often called data applied the educational data mining concerns with mining, aims at the discovery of useful information from developing methods for discovering knowledge from data large collections of data [1]. The main functions of data that come from educational domain. used educational data mining are applying various methods and algorithms in order mining to improve graduate students’ performance, and to discover and extract patterns of stored data [2]. overcome the problem of low grades of graduate students The main objective of this paper is to use data mining and try to extract useful knowledge from graduate students methodologies to study student’s performance in end data collected from the college of Science and Technology. General appreciation. Data mining provides many tasks that The data include fifteen years period [1993-2007]. After could be used to study the student performance. 
In this preprocessing the data, we applied data mining techniques to research, the classification task is used to evaluate student's discover association, classification, clustering and outlier performance and as there are many approaches that are used detection rules. In each of these four tasks, we present the for data classification, the decision tree method is used here. extracted knowledge and describe its importance in educational domain. Sonali Agarwal, G. N. Pandey, and M. D. Tiwari (2012) [6] 2. Related Work describes the educational organizations are one of the Han and Kamber (1996) [3] describes data mining important parts of our society and playing a vital role for software that allow the users to analyze data from different growth and development of any nation. Data Mining is an 44 Data Mining: A prediction for Student's Performance Using Classification Method emerging technique with the help of this one can efficiently Clustering, Regression, Artificial Intelligence, Neural learn with historical data and use that knowledge for Networks, Association Rules, Decision Trees, Genetic predicting future behavior of concern areas. Growth of Algorithm, Nearest Neighbor method etc., are used for current education system is surely enhanced if data mining knowledge discovery from databases. These techniques and has been adopted as a futuristic strategic management tool. methods in data mining need brief mention to have better The Data Mining tool is able to facilitate better resource understanding. utilization in terms of student performance, course development and finally the development of nation's education related standards. Monika Goyal and Rajan Vohra (2012) [7] applied data mining techniques to improve the efficiency of higher education institution. 
If data mining techniques such as clustering, decision tree and association are applied to higher education processes, it would help to improve students’ performance, their life cycle management, selection of courses, to measure their retention rate and the grant fund management of an institution. This is an approach to examine the effect of using data mining techniques in higher education. Surjeet Kumar Yadav, Brijesh Bharadwaj, and Saurabh Figure 1. The Steps of Extracting Knowledge from Data Pal (2012) [11] used decision tree classifiers are studied and the experiments are conducted to find the best classifier for 3.1. Classification retention data to predict the student’s drop-out possibility. Brijesh Kumar Baradwaj and Saurabh Pal (2011) [12] Classification is the most commonly applied data mining Used the classification task on student database to predict the technique, which employs a set of pre-classified examples to students division on the basis of previous database. develop a model that can classify the population of records at K.Shanmuga Priya and A.V.Senthil Kumar (2013) [13] large. This approach frequently employs decision tree or applied a Classification Technique in Data Mining to neural network-based classification algorithms. The data improve the student's performance and help to achieve the classification process involves learning and classification. In goal by extracting the discovery of knowledge from the end Learning the training data are analyzed by classification semester mark. algorithm. In classification test data are used to estimate the Bhise R.B, Thorat S.S and Supekar A.K. (2013) [14] used accuracy of the classification rules. If the accuracy is data mining process in a student’s database using K-means acceptable the rules can be applied to the new data tuples [1]. clustering algorithm to predict students result. 
In our case study we used ID3 decision tree to represent Varun Kumar and Anupama Chadha (2013) [15] used of logical rules of student final grade. one of the data mining technique called association rule mining in enhancing the quality of students’ performances at 3.2. Clustering Post Graduation level. Clustering is finding groups of objects such that the Pallamreddy.venkatasubbareddy and Vuda Sreenivasarao objects in one group will be similar to one another and (2010) [16] explained the Decision trees are commonly used different from the objects in another group. In educational in operations research, specifically in decision analysis, to data mining, clustering has been used to group students help identify a strategy most likely to reach a goal and use of according to their behavior. According to clustering, clusters decision trees is as a descriptive means for calculating distinguish student‘s performance according to their conditional probabilities. behavior and activates. In this paper, students are clustered into three groups according to their academics, punctuality, exams and soon [8]. 3. Data Mining Definition and Techniques 3.3. Association rule Data mining refers to extracting or "mining" knowledge Association analysis is the discovery of association rules from large amounts of data [3]. Data mining techniques are showing attribute-value conditions that occur frequently used to operate on large volumes of data to discover hidden together in a given set of data. Association analysis is widely patterns and relationships helpful in decision making [1]. used for market basket or transaction data analysis [9]. The sequences of steps identified in extracting knowledge from data are: shown in Figure 1. Various algorithms and techniques like Classification, 3.4. 
Decision Trees World Journal of Computer Application and Technology 2(2): 43-47, 2014 45 Decision trees are commonly used in operations research, The basic idea of ID3 algorithm is to construct the specifically in decision analysis, to help identify a strategy decision tree by employing a top-down, greedy search most likely to reach a goal [10]. through the given sets to test each attribute at every tree node. In order to select the attribute that is most useful for classifying a given sets, we introduce a metric - information gain. To find an optimal way to classify a learning set we 4. Data Mining Process need some function which provides the most balanced splitting. The information gain metric is such a function. 4.1. Data Preparations Given a data table that contains attributes and class of the attributes, we can measure homogeneity of the table based on The data set used in this study was obtained from a the classes. The index used to measure degree of impurity is student's database used in one of the educational institutions, Entropy [2]. on the sampling method of Information system department The Entropy is calculated as follows: from session 2005 to 2010. Initially size of the data is 1547 records. In this step data stored in different tables was joined Entropy ∑−P log P j 2 j in a single table after joining process errors were removed. Splitting criteria used for splitting of nodes of the tree is 4.2. Data selection and transformation Information gain. To determine the best attribute for a particular node in the tree we use the measure called In this step only those fields were selected which were Information Gain. The information gain, Gain (S, A) of an required for data mining. A few derived variables were attribute A, relative to a collection of examples S, is defined selected. While some of the information for the variables as: was extracted from the database. 
All the predictor and response variables which were derived from the database are given in Figure 2. ∑ Gain (S , A ) Entropy (S)− Entropy (S ) v∈Values () A 5. Results and Discussion The data set used in this study was obtained from a student's database used in one of the educational institutions, on the sampling method of Information system department from session 2005 to 2010. Initially size of the data is 1548 records are given in Figure 3. Figure 2. Student Related Variables 4.3. Decision Tree Figure 3. Data Set A decision tree is a flow-chart-like tree structure, where each internal node is denoted by rectangles, and leaf nodes To work out the information gain for A relative to S, we are denoted by ovals. All internal nodes have two or more first need to calculate the entropy of S. Here S is a set of 1547 child nodes. All internal nodes contain splits, which test the examples are 292 " Excellent ", 536 "Very Good", 477 value of an expression of the attributes. Arcs from an internal "Good", 188 "Acceptable" and 54 "Fail". node to its children are labeled with distinct outcomes of the test. Each leaf node has a class label associated with it [11]. Entropy () S = −P log (PP )− log (P ) Excellent Excellent VeryGood VeryGood −− P log (PP ) log (P ) Good 22 Good Acceptable Acceptable 4.4. The ID3 Decision Tree −P log (P ) Fail 2 Fail = 46 Data Mining: A prediction for Student's Performance Using Classification Method To determine the best attribute for a particular node in the Performance = Good, Department = Scientific Mathematics tree we use the measure called Information Gain. The then Final Grade = Very Good. information gain, Gain (S, A) of an attribute A, relative to a Case 2 – If Midterm Marks = Excellent, Lab Test Grade = collection of sample S. Good, Student Participate = No, Attendance = Good, Homework = No, Department = Secondary Technical Gain (, S Midterm ) Entropy (S )− Commercial then Final Grade = Very Good. 
VeryGood Case 3 - If Midterm Marks = Excellent, Lab Test Grade = Excellent Entropy () S − Entropy (S ) − Excellent VeryGood Good, Student Participate = No, Attendance = Good, Excellent VeryGood Homework = No, Department = Secondary Industrial Technical then Final Grade = Very Good. Acceptable Good Entropy () S −− Entropy (S ) Case 4 - If Midterm Mark = Excellent, Lab Test Grade = Good Acceptable S S Good Acceptable Poor, Attendance = Good then Final Grade = Very Good. Case 5 - If Midterm Mark = Excellent, Lab Test Grade = Fail Entropy () S Average, Attendance = Good then Final Grade = Excellent. Fail Fail Case 6 - If Midterm Mark = Excellent, Lab Test Grade = Average, Attendance = Poor then Final Grade = Very Good. Midterm has the highest gain, therefore it is used as the Case 7 - If Midterm Mark = Very Good, Lab Test Grade = root node as shown in figure 4. Good, Homework = No, Seminar Performance = Good, This process goes on until all data classified perfectly or Student Participate = No then Final Grade = Very Good. run out of attributes. The knowledge represented by decision Case 8 - If Midterm Mark = Very Good, Lab Test Grade = tree can be extracted and represented in the form of Good, Homework = No, Seminar Performance = Good, IF-THEN rules as shown in Table 1. Student Participate = No, Department = Scientific The Table 1 discusses 8 cases: Mathematics then Final Grade = Very Good. Case 1 - If Midterm Mark = Excellent, Lab Test Grade = Good, Student Participate = No, Homework = No, Seminar Figure 4. Midterm as root node = World Journal of Computer Application and Technology 2(2): 43-47, 2014 47 Table 1. 
Table 1. Rule Set generated by Decision Tree

IF Midterm='Excellent' AND LG='Good' AND SP='No' AND HW='No' AND SEM='Good' AND Dep='Scientific Mathematics' THEN FG='Very Good'
IF Midterm='Excellent' AND LG='Good' AND SP='No' AND ATT='Good' AND HW='No' AND Dep='Secondary Technical Commercial' THEN FG='Very Good'
IF Midterm='Excellent' AND LG='Good' AND SP='No' AND ATT='Good' AND HW='No' AND Dep='Secondary Industrial Technical' THEN FG='Very Good'
IF Midterm='Excellent' AND LG='Poor' AND ATT='Good' THEN FG='Very Good'
IF Midterm='Excellent' AND LG='Average' AND ATT='Good' THEN FG='Excellent'
IF Midterm='Excellent' AND LG='Average' AND ATT='Poor' THEN FG='Very Good'
IF Midterm='Very Good' AND LG='Good' AND HW='No' AND SEM='Good' AND SP='No' THEN FG='Very Good'
IF Midterm='Very Good' AND LG='Good' AND HW='No' AND SEM='Good' AND SP='No' AND Dep='Scientific Mathematics' THEN FG='Very Good'

6. Conclusion

In this paper, the decision tree method is applied to a student database to predict students' performance. We use attributes collected from the student database to predict the students' final grades.

This study will help students to improve their performance, and will help to identify those students who need special attention, so as to reduce the failing ratio and take appropriate action at the right time.

REFERENCES

[1] Brijesh Kumar Baradwaj, Saurabh Pal, Data mining: machine learning, statistics, and databases, 1996.
[2] Nikhil Rajadhyax, Rudresh Shirwaikar, Data Mining on Educational Domain, 2012.
[3] Jiawei Han, Micheline Kamber, Data Mining: Concepts and Techniques, 2nd edition, 2006.
[4] Alaa El-Halees, Mining Students Data to Analyze Learning Behavior: A Case Study, 2008.
[5] Mohammed M. Abu Tair, Alaa M. El-Halees, Mining Educational Data to Improve Students' Performance: A Case Study, 2012.
[6] Sonali Agarwal, G. N. Pandey, and M. D. Tiwari, Data Mining in Education: Data Classification and Decision Tree Approach, 2012.
[7] Monika Goyal, Rajan Vohra, Applications of Data Mining in Higher Education, 2012.
[8] P. Ajith, M.S.S.Sai, B. Tejaswi, Evaluation of Student Performance: An Outlier Detection Perspective, 2013.
[9] Varun Kumar, Anupama Chadha, An Empirical Study of the Applications of Data Mining Techniques in Higher Education.
[10] Hongjie Sun, Research on Student Learning Result System based on Data Mining, 2010.
[11] Surjeet Kumar Yadav, Brijesh Bharadwaj, and Saurabh Pal, Mining Education Data to Predict Student's Retention: A comparative Study, 2012.
[12] Brijesh Kumar Baradwaj, Saurabh Pal, Mining Educational Data to Analyze Students' Performance, 2011.
[13] K.Shanmuga Priya, A.V.Senthil Kumar, Improving the Student's Performance Using Educational Data Mining.
[14] Bhise R.B, Thorat S.S, Supekar A.K, Importance of Data Mining in Higher Education System, 2013.
[15] Varun Kumar, Anupama Chadha, Mining Association Rules in Student's Assessment Data, 2012.
[16] Pallamreddy.venkatasubbareddy, Vuda Sreenivasarao, The Result Oriented Process for Students Based On Distributed Data Mining, 2010.

Journal

World Journal of Computer Application and Technology

Published: Feb 1, 2014
