diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..5b6a065 --- /dev/null +++ b/.gitignore @@ -0,0 +1,4 @@ +.Rproj.user +.Rhistory +.RData +.Ruserdata diff --git a/Finalupload.Rproj b/Finalupload.Rproj new file mode 100644 index 0000000..8e3c2eb --- /dev/null +++ b/Finalupload.Rproj @@ -0,0 +1,13 @@ +Version: 1.0 + +RestoreWorkspace: Default +SaveWorkspace: Default +AlwaysSaveHistory: Default + +EnableCodeIndexing: Yes +UseSpacesForTab: Yes +NumSpacesForTab: 2 +Encoding: UTF-8 + +RnwWeave: Sweave +LaTeX: pdfLaTeX diff --git a/HUDK4050_Notes-JoonyoungPark.csv b/HUDK4050_Notes-JoonyoungPark.csv new file mode 100644 index 0000000..83f49e9 --- /dev/null +++ b/HUDK4050_Notes-JoonyoungPark.csv @@ -0,0 +1 @@ +Key Item Type Publication Year Author Title Publication Title ISBN ISSN DOI Url Abstract Note Date Date Added Date Modified Access Date Pages Num Pages Issue Volume Number Of Volumes Journal Abbreviation Short Title Series Series Number Series Text Series Title Publisher Place Language Rights Type Archive Archive Location Library Catalog Call Number Extra Notes File Attachments Link Attachments Manual Tags Automatic Tags Editor Series Editor Translator Contributor Attorney Agent Book Author Cast Member Commenter Composer Cosponsor Counsel Interviewer Producer Recipient Reviewed Author Scriptwriter Words By Guest Number Edition Running Time Scale Medium Artwork Size Filing Date Application Number Assignee Issuing Authority Country Meeting Name Conference Name Court References Reporter Legal Status Priority Numbers Programming Language Version System Code Code Number Section Session Committee History Legislative Body HH7FVM4V journalArticle 2010 "Bowers, Alex J." 
"Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping out and Hierarchical Cluster Analysis" "Practical Assessment, Research & Evaluation" 1531-7714 "School personnel currently lack an effective method to pattern and visually interpret disaggregated achievement data collected on students as a means to help inform decision making. This study, through the examination of longitudinal K-12 teacher assigned grading histories for entire cohorts of students from a school district (n=188), demonstrates a novel application of hierarchical cluster analysis and pattern visualization in which all data points collected on every student in a cohort can be patterned, visualized and interpreted to aid in data driven decision making by teachers and administrators. Additionally, as a proof-of-concept study, overall schooling outcomes, such as student dropout or taking a college entrance exam, are identified from the data patterns and compared to past methods of dropout identification as one example of the usefulness of the method. Hierarchical cluster analysis correctly identified over 80% of the students who dropped out using the entire student grade history patterns from either K-12 or K-8. (Contains 5 figures.)" 2010-05 9/13/16 16:06 9/13/16 16:06 9/24/14 19:31 7 15 Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students en ERIC "" http://eric.ed.gov/?id=EJ933686 data; data analysis; Decision Making; Dropouts; Elementary School Students; Grades (Scholastic); Identification; MULTIVARIATE analysis; School Districts; Secondary School Students UPSHGMV9 journalArticle 2014 "Grunspan, Daniel Z.; Wiggins, Benjamin L.; Goodreau, Steven M." 
Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research CBE-Life Sciences Education ", 1931-7913" 10.1187/cbe.13-08-0162 http://www.lifescied.org/content/13/2/167 "Social interactions between students are a major and underexplored part of undergraduate education. Understanding how learning relationships form in undergraduate classrooms, as well as the impacts these relationships have on learning outcomes, can inform educators in unique ways and improve educational reform. Social network analysis (SNA) provides the necessary tool kit for investigating questions involving relational data. We introduce basic concepts in SNA, along with methods for data collection, data processing, and data analysis, using a previously collected example study on an undergraduate biology classroom as a tutorial. We conduct descriptive analyses of the structure of the network of costudying relationships. We explore generative processes that create observed study networks between students and also test for an association between network position and success on exams. We also cover practical issues, such as the unique aspects of human subjects review for network studies. Our aims are to convince readers that using SNA in classroom environments allows rich and informative analyses to take place and to provide some initial tools for doing so, in the process inspiring future educational studies incorporating relational data." 6/20/14 9/13/16 16:06 9/13/16 16:06 8/20/14 20:21 167-178 2 13 CBE Life Sci Educ Understanding Classrooms through Social Network Analysis en www.lifescied.org "

Which would yield better SNA results and better achievement on final projects: groups assigned by the professor or groups formed by students? I suspect student-formed groups would show more extreme performance, while professor-assigned groups would show more average performance across groups.

Actor=nodes

Unipartite = one-mode: one type of actor / bipartite = two-mode: actors and the groups they belong to

ties between actors / bidirectional = undirected

density = how connected the network is: the more of the possible ties that exist, the denser the network

how many ties/who is connected with whom?

homophily = actors with similar attributes are disproportionately connected (arises from social selection, e.g. seeking out an A student, and from social influence, e.g. studying with an A student)

triad and transitivity

Centrality (degree, closeness, betweenness, eigenvector)

Degree centrality - number of connections

No significant result turned out, but one may be found with different methods in the future
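A minimal Python sketch of the density and degree centrality ideas above, on a made-up undirected costudying network. The student names and ties are hypothetical and are not taken from the article.

```python
from itertools import combinations

# Hypothetical undirected costudying ties between four students.
ties = {('ana', 'ben'), ('ana', 'cal'), ('ben', 'cal'), ('cal', 'dee')}
actors = sorted({name for tie in ties for name in tie})

# Density: ties that exist divided by all possible pairs of actors.
possible_pairs = len(list(combinations(actors, 2)))
density = len(ties) / possible_pairs

# Degree centrality: number of ties each actor takes part in.
degree = {actor: sum(actor in tie for tie in ties) for actor in actors}

print(density)
print(degree)
```

Here cal has the highest degree because cal appears in three of the four ties.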

" http://www.lifescied.org/content/13/2/167 Week 2 JXPD425U blogPost 2014 "Young, Jeffrey R." Why Students Should Own Their Educational Data The Chronicle of Higher Education Blogs: Wired Campus http://chronicle.com/blogs/wiredcampus/why-students-should-own-their-educational-data/54329 8/21/14 9/13/16 16:06 9/13/16 16:06 8/23/14 21:32 "

 Especially in a context like a MOOC, it becomes meaningless to look for the average learner, because students enter the program at so many different levels that averaging them would not be accurate. To some it would be too easy, and to others it would be too hard. But since MOOCs have to show results, in their initial stages they might have to use averaged data (which would mean nothing).

" http://chronicle.com/blogs/wiredcampus/why-students-should-own-their-educational-data/54329 P9N8HC2I journalArticle 1994 "Corbett, Albert T.; Anderson, John R." Knowledge tracing: Modeling the acquisition of procedural knowledge User Modeling and User-Adapted Interaction "0924-1868, 1573-1391" 10.1007/BF01099821 http://link.springer.com.ezp-prod1.hul.harvard.edu/article/10.1007/BF01099821 "This paper describes an effort to model students' changing knowledge state during skill acquisition. Students in this research are learning to write short programs with the ACT Programming Tutor (APT). APT is constructed around a production rule cognitive model of programming knowledge, called theideal student model. This model allows the tutor to solve exercises along with the student and provide assistance as necessary. As the student works, the tutor also maintains an estimate of the probability that the student has learned each of the rules in the ideal model, in a process calledknowledge tracing. The tutor presents an individualized sequence of exercises to the student based on these probability estimates until the student has ‘mastered’ each rule. The programming tutor, cognitive model and learning and performance assumptions are described. A series of studies is reviewed that examine the empirical validity of knowledge tracing and has led to modifications in the process. Currently the model is quite successful in predicting test performance. Further modifications in the modeling process are discussed that may improve performance levels." 12/1/94 9/13/16 16:06 9/13/16 16:06 4/21/13 21:21 253-278 4 4 User Model User-Adap Inter Knowledge tracing en link.springer.com.ezp-prod1.hul.harvard.edu "

How to achieve mastery learning

1) domain knowledge is analyzed and stored in a hierarchical way

2) Learning experiences are prerequisites for tackling higher-level concepts

cognitive model of skill acquisition, with the goal of mastery learning in mind

  1. The ACT programming tutor
  2. The cognitive model
  3. Knowledge tracing and mastery learning
  4. Empirical evaluation of Knowledge tracing
  5. Future directions
" "Education (general); empirical validity; individual differences; intelligent tutoring systems; Learning; Management of Computing and Information Systems; mastery learning; Multimedia Information Systems; procedural knowledge; Psychology, general; student modeling; User Interfaces and Human Computer Interaction" ZM5MZSWN conferencePaper 2012 "Siemens, George; Baker, Ryan S. J. d." Learning Analytics and Educational Data Mining: Towards Communication and Collaboration Proceedings of the 2Nd International Conference on Learning Analytics and Knowledge 978-1-4503-1111-3 10.1145/2330601.2330661 http://doi.acm.org/10.1145/2330601.2330661 "Growing interest in data and analytics in education, teaching, and learning raises the priority for increased, high-quality research into the models, methods, technologies, and impact of analytics. Two research communities -- Educational Data Mining (EDM) and Learning Analytics and Knowledge (LAK) have developed separately to address this need. This paper argues for increased and formal communication and collaboration between these communities in order to share research, methods, and tools for data mining and analysis in the service of developing both LAK and EDM fields." 2012 9/13/16 16:06 9/13/16 16:06 1/16/15 3:15 252–254 Learning Analytics and Educational Data Mining ACM "New York, NY, USA" ACM Digital Library "

After reading this article, I felt that LAK should be applied more to children's learning (through elementary school) and EDM models to adolescent and adult learning. Adolescents can at least identify what they know and what they do not know, so they may be better at using the resources than elementary school children, who are still doing fundamental learning.

;

difference between LAK and EDM

EDM=rooted in tech field, LA=ed field

EDM= Automation/tech driven, LA= using tech as a tool to enhance human judgement

methodological difference

EDM = machine learning, LA = text & SNA

EDM = individual, LA = holistic

Challenge of integrating the tech and ed

" Collaboration; educational data mining; learning analytics and knowledge 8UQJXKBN journalArticle 2008 "Baker, Ryan S. J. d; Corbett, Albert T.; Roll, Ido; Koedinger, Kenneth R." Developing a generalizable detector of when students game the system User Modeling and User-Adapted Interaction "0924-1868, 1573-1391" 10.1007/s11257-007-9045-6 http://link.springer.com.ezp-prod1.hul.harvard.edu/article/10.1007/s11257-007-9045-6 "Some students, when working in interactive learning environments, attempt to “game the system”, attempting to succeed in the environment by exploiting properties of the system rather than by learning the material and trying to use that knowledge to answer correctly. In this paper, we present a system that can accurately detect whether a student is gaming the system, within a Cognitive Tutor mathematics curricula. Our detector also distinguishes between two distinct types of gaming which are associated with different learning outcomes. We explore this detector’s generalizability, and find that it transfers successfully to both new students and new tutor lessons." 8/1/08 9/13/16 16:06 9/13/16 16:06 1/16/15 16:33 287-314 3 18 User Model User-Adap Inter en link.springer.com.ezp-prod1.hul.harvard.edu "

Students who are gaming the system: students who do not work to learn from the course material itself and instead try to succeed by ""gaming the system"": searching through other aspects of the tutorial, quickly asking for help, quickly entering answers

-> System can detect this!

accurately detects which students game the system

gaming behavior does not happen abruptly: it appears when students are bored or the material is hard - possibility of designing a learning intervention

expands our knowledge about the behavioral constructs of gaming the system

can generalize across contexts (new students and new tutor lessons)

" http://link.springer.com.ezp-prod1.hul.harvard.edu/article/10.1007/s11257-007-9045-6 Behavior detection; Cognitive tutors; Education (general); Gaming the system; Generalizable models; Interactive learning environments; Latent response models; Machine learning; Management of Computing and Information Systems; Multimedia Information Systems; student modeling; User Interfaces and Human Computer Interaction GN9JQVJW book 2015 "Zheng, Alice" Evaluating Machine Learning Models http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp?intcmp=il-data-free-lp-lgen_free_reports_page "Data science today is a lot like the Wild West: there’s endless opportunity and excitement, but also a lot of chaos and confusion. If you’re new to data science and applied machine learning, evaluating a machine-learning model can seem pretty overwhelming..." 2015-09 9/13/16 16:06 9/13/16 16:06 12/15/15 18:26 O'Reily Media "Sebastopol, CA" "

Evaluation metrics

There are different metrics for the tasks of classification, regression, ranking, clustering, topic modeling, etc.

Classification, regression, and ranking - examples of supervised learning

Classification metrics

    Accuracy

accuracy = # of correct predictions / # of total data points

    Confusion Matrix

False positive: a doctor diagnoses cancer in an individual who does not actually have it

False negative: a diagnosis of no cancer for an individual who actually has it
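A small Python sketch of the confusion matrix counts described above, using the cancer example (1 = has cancer, 0 = does not). The label lists are made up for illustration.

```python
def confusion_counts(actual, predicted):
    '''Return (TP, FP, FN, TN) for binary labels where 1 is the positive class.'''
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)  # cancer, diagnosed
    fp = sum(a == 0 and p == 1 for a, p in pairs)  # no cancer, diagnosed anyway
    fn = sum(a == 1 and p == 0 for a, p in pairs)  # cancer, missed
    tn = sum(a == 0 and p == 0 for a, p in pairs)  # no cancer, not diagnosed
    return tp, fp, fn, tn

# Hypothetical ground truth and diagnoses for six patients.
actual = [1, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
print(tp, fp, fn, tn, accuracy)
```

Accuracy alone hides which kind of error was made; the four counts keep false positives and false negatives apart.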

    Per-Class Accuracy

average of the accuracy for each class

if accuracy differs widely between classes, the overall accuracy hides it -> problem

    Log-loss

If the classifier outputs a numeric probability instead of a 0/1 class label, we can use log-loss. It is a gauge of confidence / decision boundary: 0.5 / an information-theoretic measure of the extra noise
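A minimal Python sketch of binary log-loss, assuming the standard formula (the mean of the negative log of the probability assigned to the true label). The clipping constant eps is an assumption added to avoid taking log of zero.

```python
import math

def log_loss(labels, probs, eps=1e-15):
    '''Mean binary log-loss for true labels (0 or 1) and predicted probabilities of class 1.'''
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)

# Confident and correct predictions give a lower loss than unsure ones.
confident = log_loss([1, 0], [0.9, 0.1])
unsure = log_loss([1, 0], [0.5, 0.5])
print(confident, unsure)
```

A prediction sitting exactly at the 0.5 decision boundary costs log(2) per data point, which is why confident correct answers score better.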

    AUC

area under the curve / the curve is the ROC (receiver operating characteristic) curve = shows the sensitivity of the classifier by plotting the true positive rate against the false positive rate / how many correct positive classifications can be gained as more false positives are allowed

Ranking metrics

Binary classification/ used for Internet search & personalized recommendation

    Precision-recall

Also popular for classification tasks

Actually two different metrics, but often used together. Correct answers are the overlap between the items returned by the ranker/classifier and the relevant items

    Precision recall curve and the F1 score

if either precision or recall is small then F1 score is small
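A short Python sketch of precision, recall, and F1 computed from the overlap of returned and relevant items. The item names are hypothetical.

```python
def precision_recall_f1(returned, relevant):
    '''Return (precision, recall, F1) for a set of returned items and a set of relevant items.'''
    returned, relevant = set(returned), set(relevant)
    correct = returned & relevant  # overlap of returned and relevant
    precision = len(correct) / len(returned) if returned else 0.0
    recall = len(correct) / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean, so a small precision or recall drags it down.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

precision, recall, f1 = precision_recall_f1(['a', 'b', 'c', 'd'], ['a', 'b', 'e'])
print(precision, recall, f1)
```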

    NDCG

making the top search results meaningful! Normalized discounted cumulative gain
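A minimal Python sketch of NDCG, assuming the common variant that uses the raw relevance as the gain and a log2(rank + 1) discount. Other variants exist (for example an exponential gain), so treat this as one possible definition rather than the one the report uses.

```python
import math

def dcg(relevances):
    '''Discounted cumulative gain: relevance at each rank, discounted by log2(rank + 1).'''
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    '''DCG divided by the DCG of the ideal (best possible) ordering.'''
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance scores in the order the ranker returned the items.
print(ndcg([3, 2, 1]))  # best items first
print(ndcg([1, 2, 3]))  # best items last scores lower
```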

Regression metrics

Predict numeric scores (predict stock, predict user's rating for the item)

    RMSE

root mean square error
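A short Python sketch of RMSE on made-up ratings (the numbers are illustrative only).

```python
import math

def rmse(actual, predicted):
    '''Root mean square error between two equal-length lists of numbers.'''
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical true ratings and predicted ratings for four items.
print(rmse([4, 3, 5, 2], [3.5, 3, 4, 4]))
```

Because the errors are squared before averaging, one large miss moves RMSE a lot, which is why the quantiles of errors are also worth checking.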

    Quantiles of errors

catches the outliers

    Precision- recall curve and F1 Score

    Almost correct predictions

in the range?!?!

Caution: The difference between training metrics and evaluation metrics

Avoid the mismatch as much as possible / do not make models do work they were not trained to do

Caution: Skewed Datasets - Imbalanced classes, outliers, and rare data

imbalanced data, data skew, outlier -> conversation with pros maybe?

  

" http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp?intcmp=il-data-free-lp-lgen_free_reports_page 9GG36PBC blogPost 2015 "Leong, B; Polonetsky, J" Why Opting Out of Student Data Collection Isn’t the Solution EdSurge https://www.edsurge.com/news/2015-03-16-why-opting-out-of-student-data-collection-isn-t-the-solution "In every privacy debate across every industry, the same questions arise about the rights of individuals to “opt-out” of their data being collected or used. So it should come as no surprise that the “when” and “how” of parent and student opt-outs of education data collection or use has become a robust" 3/16/15 9/13/16 16:06 9/13/16 16:06 1/16/16 16:31 "

First of all, it is great that California took the initiative to

" https://www.edsurge.com/news/2015-03-16-why-opting-out-of-student-data-collection-isn-t-the-solution ARS28RVG videoRecording 2015 Educause Why Is Measuring Learning So Difficult? https://www.youtube.com/watch?v=_iv8A1pHNYA Several higher education learning and assessment professionals discuss the difficulties of measuring learning. 8/17/15 9/13/16 16:06 9/13/16 16:06 1/17/16 18:50 YouTube "

Don't know where a student is starting from

can't measure all the ways that people learn

based on proxies, but proxies change based on context not just individual

we have to simplify but may throw away the signal and keep the noise

 

;

To get pure data we define and restrict the setting too much, so the resulting data may not be generalizable

defining learning experience is difficult (what we are trying to measure) ; even defining what we are measuring can be difficult.

learning is a personal experience and individuals need to engage in it themselves; motivation is the key

competency before the learning experience can differ; especially in MOOC settings

learning as achievement or learning as self-efficacy

 

" Assessment; Education; educational assessment; EDUCAUSE; Higher Education; learners; Learning; Teaching and learning 470 seconds QAIHSVGC webpage 2016 "Weinersmith, Zach" Saturday Morning Breakfast Cereal http://www.smbc-comics.com/index.php?id=3978 1/5/16 9/13/16 16:06 9/13/16 16:06 1/18/16 18:17 "

The SMBC comic reminded me of what could happen, and the consequences, if we force our students to learn how to code. But aren’t we already living in a world like this? We assume everything is run on accurate data, yet we also live in a society where a divorce therapist’s marketing email gets sent on one’s wedding anniversary based on “big data.”

 

" http://www.smbc-comics.com/index.php?id=3978 A9Z2DGUR conferencePaper 2014 "Clow, Doug" Data wranglers: human interpreters to help close the feedback loop Proceedings of the Fourth International Conference on Learning Analytics And Knowledge 2014 9/13/16 16:06 9/13/16 16:06 49–53 ACM "

Wrangle means to attempt to deal with or understand something; to contend or struggle.

(Contend means to strive in opposition or against difficulties (such as a race, competition, or debate))

Data wranglers should be actively engaged in analyzing data to help students who are taking the course in real time, not afterwards. Like the ""New classroom"" we visited at the start of this month!

The article keeps mentioning substantial organizational change driven by real-time input from data wranglers, but for now this would be a bit hard to achieve because many people do not know the importance of data wranglers! Many possible job opportunities out there?!?!

" EHIHJ586 magazineArticle 2015 "Kucirkova, Natalia; FitzGerald, Elizabeth" Zuckerberg is ploughing billions into 'personalised learning' – why? The Conversation http://theconversation.com/zuckerberg-is-ploughing-billions-into-personalised-learning-why-51940 "Zuckerburg wants to plough billions into personalised learning, but his way may not be the right way." 12/9/15 9/13/16 16:06 9/13/16 16:06 1/18/16 19:14 "

I think if we were to implement personalized learning through technology, we should redefine the role of schools and teachers. It is great that every child gets personalized learning, but I think we should consider what the end point of giving students different learning materials would be. One of the main roles of school is to teach students until they reach a certain level of achievement, not to make all students do exceedingly well in every subject. If personalized learning is implemented, teachers have to be prepared for every level of expertise in their subject. This would not replace human work unless there is a complex algorithm to pinpoint each student's weaknesses.

" https://theconversation.com/zuckerberg-is-ploughing-billions-into-personalised-learning-why-51940 QAK2RGH6 videoRecording 2015 Georgia Tech Feature Selection https://www.youtube.com/watch?v=8CpRLplmdqE 2/23/15 9/13/16 16:06 9/13/16 16:06 1/18/16 19:18 Youtube "

Interpretability and insight of feature - thinking about the user

curse of dimensionality: more features need more data; fewer features make everyone's life easier

conclusion: fewer features, more data

 

" https://www.youtube.com/watch?v=8CpRLplmdqE Udacity 3:13 ZQRA7ET6 bookSection 2016 "Hanneman, R.A.; Riddle, M." Chapter 1: Social Network Data Introduction to Social Network Methods http://faculty.ucr.edu/~hanneman/nettext/C1_Social_Network_Data.html 1/18/16 9/13/16 16:06 9/13/16 16:06 1/18/16 20:17 "

Social network data

Introduction: What is different about social network data?

Nodes

    Populations, samples, and boundaries

    Modality and level analysis

Relations

    Sampling ties

    Multiple relations

Scales of measurement

A note on statistics and social network data

" http://faculty.ucr.edu/~hanneman/nettext/C1_Social_Network_Data.html IWXICZM8 webpage 2014 "Groelmund, Garrett" RStudio Cheat Sheets RStudio https://www.rstudio.com/resources/cheatsheets/ 8/1/14 9/13/16 16:06 9/13/16 16:06 1/19/16 21:17 http://shiny.rstudio.com/articles/rm-cheatsheet.html VUQED3Z6 conferencePaper 2013 "san Pedro, Maria Ofelia; Baker, Ryan; Bowers, Alex; Heffernan, Neil" Predicting college enrollment from student interaction with an intelligent tutoring system in middle school Educational Data Mining 2013 2013 9/13/16 16:06 9/13/16 16:06 "

Aiming high starting in middle school helps students achieve more (they need to care about their own grades in order to go to college)

The researchers used ASSISTments system data from 04~05 and 06~07

(seems like ASSISTments uses Vygotsky's theory, to scaffold!)

Student knowledge estimated by Bayesian Knowledge Tracing
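A minimal Python sketch of one Bayesian Knowledge Tracing update, assuming the standard four-parameter form (prior knowledge, learn, slip, guess). The parameter values below are made up for illustration and are not the ones fitted in the paper.

```python
def bkt_update(p_known, correct, p_learn, p_slip, p_guess):
    '''Update the probability that a skill is known after one observed answer.'''
    if correct:
        # Correct answer: either the skill is known and no slip, or a lucky guess.
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        # Wrong answer: either the skill is known and a slip, or unknown and no guess.
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # Chance of learning the skill at this practice opportunity.
    return posterior + (1 - posterior) * p_learn

# Hypothetical parameters: prior 0.3, learn 0.2, slip 0.1, guess 0.2.
p = 0.3
for answer in [True, True, False, True]:
    p = bkt_update(p, answer, p_learn=0.2, p_slip=0.1, p_guess=0.2)
    print(round(p, 3))
```

The estimate rises after correct answers and drops after a wrong one, which is the per-skill knowledge signal such models feed into later predictions.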

the model could predict students' college enrollment 68.6% of the time (boredom, confusion, and slip/carelessness are the significant determinants of college enrollment)

Possibility of interaction log?

" M369Z5EF journalArticle 2012 "Greller, Wolfgang; Drachsler, Hendrik" Translating Learning into Numbers: A Generic Framework for Learning Analytics Journal of Educational Technology & Society 1176-3647 http://www.jstor.org/stable/jeductechsoci.15.3.42 "ABSTRACT With the increase in available educational data, it is expected that Learning Analytics will become a powerful means to inform and support learners, teachers and their institutions in better understanding and predicting personal learning needs and performance. However, the processes and requirements behind the beneficial application of Learning and Knowledge Analytics as well as the consequences for learning and teaching are still far from being understood. In this paper, we explore the key dimensions of Learning Analytics (LA), the critical problem zones, and some potential dangers to the beneficial exploitation of educational data. We propose and discuss a generic design framework that can act as a useful guide for setting up Learning Analytics services in support of educational practice and learner guidance, in quality assurance, curriculum development, and in improving teacher effectiveness and efficiency. Furthermore, the presented article intends to inform about soft barriers and limitations of Learning Analytics. We identify the required skills and competences that make meaningful use of Learning Analytics data possible to overcome gaps in interpretation literacy among educational stakeholders. We also discuss privacy and ethical issues and suggest ways in which these issues can be addressed through policy guidelines and best practice examples." 2012 9/13/16 16:06 9/13/16 16:06 9/3/16 18:55 42-57 3 15 Journal of Educational Technology & Society Translating Learning into Numbers JSTOR

It is a how-to for Learning Analytics

6 dimensions to do Learning Analytics correctly

Stakeholders

Objective

Data

Instruments

External Limitations

Internal limitations

TEL: Technology Enhanced Learning

We should be aware that not all students put full effort into TEL programs, which might distort the data analysis. Or does that lack of effort itself turn out to mean something?

 

 

UPNATQ2U bookSection 2006 "Kay, Judy; Maisonneuve, Nicolas; Yacef, Kalina; Reimann, Peter" The Big Five and Visualisations of Team Work Activity Intelligent Tutoring Systems 978-3-540-35159-7 978-3-540-35160-3 http://link.springer.com/chapter/10.1007/11774303_20 "We have created a set of novel visualisations of group activity: they mirror activity of individuals and their interactions, based upon readily available authentic data. We evaluated these visualisations in the context of a semester long software development project course. We give a theoretical analysis of the design of our visualizations using the framework from the “Big 5” theory of team work as well as a qualitative study of the visualisations and the students’ reflective reports. We conclude that these visualisations provide a powerful and valuable mirroring role with potential, when well used, to help groups learn to improve their effectiveness." 6/26/06 9/13/16 16:06 9/13/16 16:06 9/3/16 19:10 197-206 Springer Berlin Heidelberg en ©2006 Springer-Verlag Berlin Heidelberg link.springer.com "

Computer-supported collaborative learning (CSCL) / teamwork is not visualized in online learning settings

Big five components of team work

leader assigns how many ticket to use and people interact with ticket activity

Facilitate team problem solving

provide performance expectations and acceptable interaction patterns

Synchronize and combine individual team member contributions see and evaluate information that affects team functioning

Measured with interaction networks

low level of interaction, low level of monitoring

Identifying mistakes and lapses in other team members' actions

Shift workload in the high pressure settings

Recognition of a workload distribution problem in the team

shifting work to under utilized members

If it is not there, that does not mean the team is not successful

Identify cues of change, assign meaning to it, develop a new plan to deal with it.

Degree of involvement, completion of milestones

Increase task involvement, information sharing, strategising and goal setting

Coordinating mechanisms

#feedback

Visualization tools

" https://link.springer.com/chapter/10.1007/11774303_20 Artificial Intelligence (incl. Robotics); Computers and Education; Information Systems Applications (incl. Internet); Multimedia Information Systems; User Interfaces and Human Computer Interaction "Ikeda, Mitsuru; Ashley, Kevin D.; Chan, Tak-Wai" X839V4H2 journalArticle 2015 "Konstan, Joseph A.; Walker, J. D.; Brooks, D. Christopher; Brown, Keith; Ekstrand, Michael D." Teaching Recommender Systems at Large Scale: Evaluation and Lessons Learned from a Hybrid MOOC ACM Trans. Comput.-Hum. Interact. 1073-0516 10.1145/2728171 http://doi.acm.org/10.1145/2728171 2015-04 9/13/16 16:06 9/13/16 16:06 9/3/16 20:38 10:1–10:23 2 22 Teaching Recommender Systems at Large Scale ACM Digital Library "" learning assessment; Massively Online Open Course (MOOC) GKQK5NA4 bookSection 2013 "Desmarais, Michel C.; Naceur, Rhouma" A Matrix Factorization Method for Mapping Items to Skills and for Enhancing Expert-Based Q-Matrices Artificial Intelligence in Education 978-3-642-39111-8 978-3-642-39112-5 http://link.springer.com/chapter/10.1007/978-3-642-39112-5_45 "Uncovering the right skills behind question items is a difficult task. It requires a thorough understanding of the subject matter and of the cognitive factors that determine student performance. The skills definition, and the mapping of item to skills, require the involvement of experts. We investigate means to assist experts for this task by using a data driven, matrix factorization approach. The two mappings of items to skills, the expert on one side and the matrix factorization on the other, are compared in terms of discrepancies, and in terms of their performance when used in a linear model of skills assessment and item outcome prediction. Visual analysis shows a relatively similar pattern between the expert and the factorized mappings, although differences arise. 
The prediction comparison shows the factorization approach performs slightly better than the original expert Q-matrix, giving supporting evidence to the belief that the factorization mapping is valid. Implications for the use of the factorization to design better item to skills mapping are discussed." 7/9/13 9/13/16 16:06 9/13/16 16:06 9/3/16 20:44 441-450 Springer Berlin Heidelberg en ©2013 Springer-Verlag Berlin Heidelberg link.springer.com "

Mapping items to the skill is really hard (as we already know)

Using ALS (alternating least-squares factorization), compare the expert-defined Q-matrix with a factorized Q-matrix

2 or 3 changes in the initial matrix lead to changes in the overall ALS result

(The article was hard to understand (T.T))

" https://link.springer.com/chapter/10.1007/978-3-642-39112-5_45 alternating least squares matrix factorization; Artificial Intelligence (incl. Robotics); Cognitive modeling; Computers and Education; Educational Technology; Information Systems Applications (incl. Internet); latent skills; Pedagogic Psychology; skills assessment; Student models; User Interfaces and Human Computer Interaction "Lane, H. Chad; Yacef, Kalina; Mostow, Jack; Pavlik, Philip" 7ICJGX2R book 2015 "Matsuda, Noboru; Furukawa, Tadanobu; Bier, Norman; Faloutsos, Christos" Machine Beats Experts: Automatic Discovery of Skill Models for Data-Driven Online Course Refinement http://eric.ed.gov/?id=ED560513 "How can we automatically determine which skills must be mastered for the successful completion of an online course? Large-scale online courses (e.g., MOOCs) often contain a broad range of contents frequently intended to be a semester's worth of materials; this breadth often makes it difficult to articulate an accurate set of skills and knowledge (i.e., a skill model, or the QMatrix). We have developed an innovative method to discover skill models from the data of online courses. Our method assumes that online courses have a pre-defined skill map for which skills are associated with formative assessment items embedded throughout the online course. Our method carefully exploits correlations between various parts of student performance, as well as in the text of assessment items, to build a superior statistical model that even outperforms human experts. To evaluate our method, we compare our method with existing methods (LFA) and human engineered skill models on three Open Learning Initiative (OLI) courses at Carnegie Mellon University. The results show that (1) our method outperforms human-engineered skill models, (2) skill models discovered by our method are interpretable, and (3) our method is remarkably faster than existing methods. 
These results suggest that our method provides a significant contribution to the evidence-based, iterative refinement of online courses with a promising scalability. [For complete proceedings, see ED560503.]" 2015-06 9/13/16 16:06 9/13/16 16:06 9/3/16 20:48 Machine Beats Experts International Educational Data Mining Society en ERIC "

Online course programming: Who is better? Computer or human?

Tested: LFA (Learning Factors Analysis) vs. human-engineered model vs. eEpiphany

eEpiphany was the best even when using bag of words (F matrix for collected items)

eEpiphany seems to be a very effective tool for revising courses and advising students on which course to take, but is it the only tool that helps with advising on course content?

" http://eric.ed.gov/?id=ED560513 Automation; Comparative Analysis; Correlation; data; Formative Evaluation; models; Online Courses; Skills 83JIGF4U conferencePaper 2008 "Cortez, Paulo; Silva, Alice Maria Gonçalves" Using data mining to predict secondary school student performance Proceedings of 5th Annual Future Business Technology Conference 978-90-77381-39-7 http://repositorium.sdum.uminho.pt/handle/1822/8024 "Although the educational level of the Portuguese population has improved in the last decades, the statistics keep Portugal at Europe’s tail end due to its high student failure rates. In particular, lack of success in the core classes of Mathematics and the Portuguese language is extremely serious. On the other hand, the fields of Business Intelligence (BI)/Data Mining (DM), which aim at extracting high-level knowledge from raw data, offer interesting automated tools that can aid the education domain. The present work intends to approach student achievement in secondary education using BI/DM techniques. Recent real-world data (e.g. student grades, demographic, social and school related features) was collected by using school reports and questionnaires. The two core classes (i.e. Mathematics and Portuguese) were modeled under binary/five-level classification and regression tasks. Also, four DM models (i.e. Decision Trees, Random Forest, Neural Networks and Support Vector Machines) and three input selections (e.g. with and without previous grades) were tested. The results show that a good predictive accuracy can be achieved, provided that the first and/or second school period grades are available. Although student achievement is highly influenced by past evaluations, an explanatory analysis has shown that there are also other relevant features (e.g. number of absences, parent’s job and education, alcohol consumption). 
As a direct outcome of this research, more efficient student prediction tools can be be developed, improving the quality of education and enhancing school resource management." 2008-04 9/13/16 16:06 9/13/16 16:06 9/4/16 1:23 EUROSIS "Porto, Spain" eng openAccess repositorium.sdum.uminho.pt "

Education levels in Portugal have improved, but the country is still at the tail end of Europe in achievement.

Student achievement in Mathematics and Portuguese is a serious problem; the study analyzes student data with Business Intelligence and data mining techniques

Goal: predict student achievement and find the key variables that make students pass or fail

Used classification and regression (binary classification, five-level classification, and regression) with the RMiner package in R (decision trees, random forests, neural networks, and support vector machines), with and without past grades

Predicted well (past failures and the G1 and G2 grades were the strongest predictors)

" http://repositorium.sdum.uminho.pt/handle/1822/8024 5th Annual Future Business Technology Conference NSVPSAKH book 2014 "Baker, R" Big Data in Education 2014 9/13/16 16:06 9/13/16 16:06 "New York, NY" "

1.1 Introduction

Explore big data in education

Called educational data mining or learning analytics

Joint goal of exploring the big data now available on learners and learning

To promote new scientific discoveries and to advance learning sciences

To promote better assessments of learners along multiple dimensions

To promote better real-time support for learners

A few words for data miners

may find some classic algorithms, like neural networks, aren't well represented - they haven't been all that heavily used in EDM; one reason is that overfitting is a plague in the highly context-based, not-that-big data sets we use

The data is big, but not Google-level big

Types of EDM/LA method

Prediction

Develop a model which can infer a single aspect of the data (predicted variables) from some combination of other aspects of the data (predictor variables)

Structure discovery

Find structure and patterns in the data that emerge naturally

Relationship mining

Discover relationships between variables in a dataset with many variables

Discovery with models

A preexisting model is applied to data and used as a component in another analysis

Why not popular until now?

not enough data; hard to scale

now: MOOCs!

PSLC Datashop - Actions: entering an equation, manipulating a vector, typing a phrase, requesting help

-responses: error feedback, strategic hints

-Annotations: correctness, time, skill/ concept

 

 

 

 

 

;

1.3 Classifiers, Part 1

Classification

there is something you want to predict (""the label"")

the thing you want to predict is categorical

-answer is one set of categories not a number

-like 0,1

-help request/ worked example request/ attempt to solve

-will dropout/ won't drop out

-will enroll in MOOC A, B, C, D, E, F, or G

Classification: Associated with each label is a set of ""features"", which you may be able to use to predict the label

the basic idea of a classifier is to determine which features, in which combination, can predict the label

Domain specificity

Specific algorithms work better for specific domains and problems

step regression: not stepwise regression, used for binary classification

fits a linear regression function with an arbitrary cut-off/ selects parameters, assigns a weight to each parameter, computes a numerical value

not preferred by statisticians, but it is okay in EDM

Logistic regression

Given a specific set of values of predictor variables

Fits a logistic function to data to find out the frequency/ odds of a specific value of the dependent variable

good for cases where changes in value of predictor variables have predictable effects on probability of predicted variable class

Logistic and step regression are good when interactions are not particularly common

can be given interaction effects through automated feature distillation

When interactions are common, use a decision tree instead! It can be adjusted to split based on more or less evidence and to prune based on more or less predictive power
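A minimal sketch of the difference between step regression and logistic regression described above. The weights, intercept, feature values, and the 0.5 cut-off are made-up illustration values, not numbers from the lecture.

```python
import math

def linear_score(features, weights, intercept):
    # Weighted sum of the predictor variables.
    return intercept + sum(w * x for w, x in zip(weights, features))

def step_regression_predict(features, weights, intercept, cutoff=0.5):
    # Step regression: take the linear regression output, then apply an
    # arbitrary cut-off to turn the number into a 0/1 label.
    return 1 if linear_score(features, weights, intercept) >= cutoff else 0

def logistic_regression_probability(features, weights, intercept):
    # Logistic regression: pass the same weighted sum through the
    # logistic function to get a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-linear_score(features, weights, intercept)))

# Hypothetical weights for two predictor variables.
weights = [0.2, -0.1]
intercept = 0.3
student = [2.0, 1.0]

label = step_regression_predict(student, weights, intercept)
prob = logistic_regression_probability(student, weights, intercept)
```

Neither function models interactions between features; each feature only adds its own weighted contribution to the score.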

;

1.4 Classifiers, Part 2

Classifiers

Decision rules - a set of if-then rules which you check in order (if, else if)

Many algorithms

Differences are in terms of how rules are generated and selected/ the most popular subcategory repeatedly creates decision trees and distills the best rules

Generating rules from decision tree

1. Create decision tree

2. If there is at least one path that is worth keeping go to 3 else go to 6

3. Take the best single path from root to leaf and make that path a rule

4. Remove all datapoints classified by that rule from the data set

5. Go to step 1

6. Take all remaining datapoints

7. find the most common value for those datapoints

8. make an ""otherwise"" rule using that

Relatively conservative - leads to simpler model than most decision trees

Very interpretable model
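A rough sketch of the rule-generation loop above. Assumption: instead of growing a full decision tree each round, it scores only one-condition paths (feature index == value), so it is a simplified stand-in and not the actual algorithm from the lecture. The data, min_coverage, and min_accuracy are hypothetical.

```python
from collections import Counter

def best_path(rows, labels, min_coverage=2):
    # Stand-in for steps 1-3: score every one-condition path and keep the
    # most accurate one that covers at least min_coverage data points.
    best = None
    for f in range(len(rows[0])):
        for value in sorted(set(row[f] for row in rows)):
            covered = [lab for row, lab in zip(rows, labels) if row[f] == value]
            if len(covered) < min_coverage:
                continue
            label, count = Counter(covered).most_common(1)[0]
            accuracy = count / len(covered)
            if best is None or accuracy > best[0]:
                best = (accuracy, f, value, label)
    return best

def learn_rules(rows, labels, min_accuracy=1.0):
    rows, labels = list(rows), list(labels)
    rules = []
    while rows:
        path = best_path(rows, labels)
        # Step 2: stop when no path is worth keeping.
        if path is None or path[0] < min_accuracy:
            break
        _, f, value, label = path
        rules.append((f, value, label))  # step 3: the path becomes a rule
        # Step 4: remove the data points classified by the new rule.
        kept = [(r, l) for r, l in zip(rows, labels) if r[f] != value]
        rows = [r for r, _ in kept]
        labels = [l for _, l in kept]
    # Steps 6-8: the 'otherwise' rule is the most common remaining label.
    default = Counter(labels).most_common(1)[0][0] if labels else None
    return rules, default

def classify(rules, default, row):
    # Check the rules in order (if, else if, ..., otherwise).
    for f, value, label in rules:
        if row[f] == value:
            return label
    return default

# Hypothetical data: (asked for help, prior grade) -> outcome.
rows = [('yes', 'low'), ('yes', 'high'), ('no', 'low'), ('no', 'high'), ('no', 'low')]
labels = ['drop', 'drop', 'stay', 'stay', 'drop']
rules, default = learn_rules(rows, labels)
```

On this toy data the loop keeps one rule and then falls through to the otherwise rule, which is the kind of small, readable model the notes describe.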

K*

Predicts a data point from neighboring data points - weights points more strongly if they are nearby

good when data is very divergent; may be the only tool that works in some cases (e.g., detecting emotions from log files)

but you need the whole data set to use this method

bagged stumps - lots of trees with only the first feature; relatively conservative; a close variant is random forest

~~~the classifiers so far are conservative: they find simple models and don't overfit/ these algorithms are more suitable for educational data mining because educational data has a lot of noise~~~
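A small sketch of the neighbor-weighting idea. Assumption: it uses inverse squared Euclidean distance as the weight, which is a simple stand-in and not the actual distance measure K* uses. The points and labels are made up.

```python
from collections import defaultdict

def weighted_neighbor_predict(train_points, train_labels, query):
    # Every training point votes for its label; closer points get larger
    # weights (inverse squared distance, an assumed weighting scheme).
    votes = defaultdict(float)
    for point, label in zip(train_points, train_labels):
        d2 = sum((a - b) ** 2 for a, b in zip(point, query))
        votes[label] += 1.0 / (d2 + 1e-9)  # small constant avoids dividing by zero
    return max(votes, key=votes.get)

# Hypothetical feature vectors taken from log data.
train_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
train_labels = ['bored', 'bored', 'engaged', 'engaged', 'engaged']
prediction = weighted_neighbor_predict(train_points, train_labels, (1.0, 0.0))
```

The function loops over every training point for each prediction, which is why the whole data set has to be kept around.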

SVM: support vector machines

Conducts dimensionality reduction on the data space and then fits a hyperplane which splits the classes

creates very sophisticated models

great for text mining

great for sensor data

not optimal for most other educational data

Genetic algorithms

uses mutation, combination, and natural selection to search space of possible models/ can produce inconsistent answers

Neural networks

composes extremely complex relationships through combining perceptrons

SVMs, genetic algorithms, and neural networks are great for some problems but not for all

 

;

6.1 Learning curves

visualization

Displaying information in a meaningful fashion

induce the viewer to think about what it means

avoid distorting what the data have to say

Encourage the eye to compare different pieces of data

Reveal the data at several levels

Visualization

A big area

Assumptions

The student is practicing the same skill several times in approximately the same fashion/ similar methods and considerations apply to situations where the student is recalling the same knowledge several times

LISP learning curve

Learning curves- power law of learning; performance (both speed and accuracy) improves with a power function

Called power law - speed and accuracy both follow a power curve; radical improvement at first, which slows over time towards an asymptote/ passing the asymptote usually involves developing an entirely new strategy

e.g., the Fosbury flop in the high jump
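A quick sketch of what a power-law learning curve looks like in numbers. Assumptions: the form time = asymptote + gain * trial^(-rate) and all three parameter values are illustrative, not taken from the lecture.

```python
def power_law_time(trial, asymptote=2.0, gain=10.0, rate=0.5):
    # Time per problem on a given practice opportunity (1, 2, 3, ...).
    return asymptote + gain * trial ** (-rate)

# Time on each of the first ten practice opportunities.
times = [power_law_time(n) for n in range(1, 11)]

# How much faster the student gets from one opportunity to the next.
improvements = [earlier - later for earlier, later in zip(times, times[1:])]
```

The improvements shrink with every trial and the times never drop below the asymptote, which matches the shape described above: radical gains at first, then a plateau.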

Making inference from learning curves

via visual inspection of the curve form

Corbett & Anderson - two skills going on at the same time

uses

to study and refine item-skill mappings in educational software

 

 

;

6.3 Scatter plots

visualization

scatter plots

Heat Maps

Parameter Space Maps

Scatter plots - don't scale to large data sets

Heat map - by density, intensity

Parameter space maps - special case of heat maps

used to look at the goodness of various parameters, particularly for BKT

Average graph- literally average graph

;

6.4 State Space Diagrams

Visualizations of all the states that the learning system can have during a problem

State= complete characterization of situations

Also referred to as student learning pathways or interaction networks

game

stage1 stage2 stage3 (Refraction)

keep moving between states until reaching the success state

Uses

study specific student trajectories

see which paths end up being productive

see which paths are rare (despite being productive)

make recommendation hints to students based on their path

;

7.1 Clustering

Clustering - a type of structure discovery algorithm

want to find what structure there is among the data points without knowing anything a priori about the structure

group together!

k means clustering algorithms - simple

centroids (randomly selected)

Voronoi diagram - assign each point to its nearest centroid

refit centroids

convergence - repeat until the centroids stop moving
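The k-means steps above can be sketched in plain Python as follows. Assumptions: two-dimensional toy points, a fixed random seed, and an empty cluster keeping its old centroid; none of these details come from the lecture.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centroids):
    # Index of the closest centroid (the Voronoi cell the point falls in).
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # randomly selected starting centroids
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Refit each centroid as the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # empty cluster: keep old centroid
        if new_centroids == centroids:  # convergence: no centroid moved
            break
        centroids = new_centroids
    return centroids, [nearest(p, centroids) for p in points]

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignments = k_means(points, k=2)
```

Which cluster gets index 0 depends on the random start, so only the grouping is meaningful, not the cluster numbers.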

 

;

7.2 Validation and selection

How do we choose a value for K?

Distortion (Mean squared deviation)

Take each point P

Find the centroid of P's cluster C

Find the distance D from C to P

Square D to get D'

Sum all D' to get distortion

Distance from A to B in two dimensions
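A small sketch of the distortion steps above, using plain Euclidean distance for the two-dimensional case. The points and centroids are made-up toy values. Note that it sums the squared distances, as in the steps; dividing by the number of points would give the mean squared deviation.

```python
import math

def distance(a, b):
    # Euclidean distance from A to B in two dimensions.
    return math.sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2)

def distortion(points, centroids, assignments):
    total = 0.0
    for point, cluster in zip(points, assignments):
        d = distance(centroids[cluster], point)  # distance D from C to P
        total += d ** 2                          # square D and add it to the sum
    return total

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
one_cluster = distortion(points, [(5.0, 1.0)], [0, 0, 0, 0])
two_clusters = distortion(points, [(0.0, 1.0), (10.0, 1.0)], [0, 0, 1, 1])
```

On this toy data the two-cluster solution has a much lower distortion than the one-cluster solution.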

Distortion works for comparing randomized restarts/ does not work for choosing the number of clusters

Distance to nearest cluster center should almost always be smaller with more clusters

It only isn't when you have bad luck in your randomization

Cross validation cannot solve this problem - Different problem than prediction modeling; you are determining whether any center is close to a given point

Solution - Penalize models with more clusters, according to how much extra fit would be expected from the additional clusters

Using an information criterion

Assess how much fit would be spuriously expected from a random N centroids (without allowing the centroids to move)

Assess how much fit you actually had

Find the difference

So how many clusters?

Try several values of K

Find best fitting set of clusters for each value of K

choose best point

 

 

" 1 SE5DIAI6 videoRecording 2016 Georgia Tech Cross Validation https://www.youtube.com/watch?v=sFO2ff-gTh0 9/9/16 9/13/16 16:06 9/13/16 16:06 9/9/16 19:37 Youtube

Predicting values in the testing set

if the model is fit too closely to the training set, it will not generalize, since we assume the test set is representative of our world

IID (independent and identically distributed) - a fundamental assumption for a lot of algorithms

take a sample from the training data, treat it as if it were test data, and run the model on it to predict

training data is divided into folds; pick the model with the lowest error
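A minimal sketch of the fold idea above. Assumptions: the data values, the two candidate models (predict the training mean vs. always predict zero), and the choice of three folds are hypothetical; folds are taken in order, which only makes sense if the data are already in random order.

```python
def k_fold_splits(n_items, k):
    # Yield (train_indices, test_indices) for each of the k folds.
    indices = list(range(n_items))
    fold_size = n_items // k
    for fold in range(k):
        start = fold * fold_size
        end = start + fold_size if fold < k - 1 else n_items
        yield indices[:start] + indices[end:], indices[start:end]

def cross_validated_error(values, fit, k=3):
    # Average squared error on the held-out folds.
    errors = []
    for train, test in k_fold_splits(len(values), k):
        prediction = fit([values[i] for i in train])
        errors.extend((values[i] - prediction) ** 2 for i in test)
    return sum(errors) / len(errors)

def fit_mean(train_values):
    # Candidate model 1: always predict the training mean.
    return sum(train_values) / len(train_values)

def fit_zero(train_values):
    # Candidate model 2: always predict zero.
    return 0.0

values = [4.0, 5.0, 6.0, 5.0, 4.0, 6.0]
scores = {'mean': cross_validated_error(values, fit_mean),
          'zero': cross_validated_error(values, fit_zero)}
best_model = min(scores, key=scores.get)  # pick the lowest-error one
```

Each value is held out exactly once, so the error is always measured on data the model did not see while fitting.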

https://www.youtube.com/watch?v=sFO2ff-gTh0 8CJ6XRPB webpage 2016 Passing the Privacy Test as Student Data Laws Take Effect (EdSurge News) EdSurge https://www.edsurge.com/news/2016-01-12-passing-the-privacy-test-as-student-data-laws-take-effect "On January 1, 2016, “ SOPIPA”—the recently passed California student data privacy law that defines how edtech companies can use student data became effective. About 25 other states have passed similar laws that are already in effect, or will become effective. At the same time, more than 200 sc" 1/12/16 9/20/16 15:24 9/20/16 15:24 9/20/16 15:24 "" /Users/JoonyoungPark/Library/Application Support/Zotero/Profiles/dasdr1oy.default/zotero/storage/9IW3G4JV/2016-01-12-passing-the-privacy-test-as-student-data-laws-take-effect.html G53DSCAQ webpage Cluster https://www.cs.uic.edu/~wilkinson/Applets/cluster.html 10/21/16 4:56 10/21/16 4:56 10/21/16 4:56 "

K-means clustering

assigns n data points to k clusters by grouping points around the nearest mean

its goal can be defined geometrically: k(k-1)/2 cutting planes

identify cluster centroids and divide: first pick the centers randomly, then update each center using the coordinates of the points newly assigned to it

Points are reassigned every time, so the centroids need to be updated on every iteration

even if the center is chosen randomly at first, it needs to move toward a more central point

We need to know K to determine K

one color for normal data only!

downside of k-means clustering: it finds center points even if the data is silly and wrong

" /Users/JoonyoungPark/Library/Application Support/Zotero/Profiles/dasdr1oy.default/zotero/storage/ZD9IKMGK/cluster.html CGAS5FTV webpage data-wrangling-cheatsheet - data-wrangling-cheatsheet.pdf https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf 12/22/16 1:40 12/22/16 1:40 12/22/16 1:40 "

This Cheatsheet really saved our semester.

Several notes to myself

1. The data set the cheat sheet uses is about flowers, so its code cannot be used as-is/ it uses a data set that comes preloaded in R.

2.Gather...Spread...

Need to load dplyr/ tidyr with library() first; then you can gather or spread the data into the shape you want. If there is a list of student and teacher data and you want to look at it by grade, you can use the spread function/ Gather goes the other way: it gathers columns into rows so the data can be analyzed in one place (the histogram data in assignment 7)

" /Users/JoonyoungPark/Library/Application Support/Zotero/Profiles/dasdr1oy.default/zotero/storage/CHS3U2UU/data-wrangling-cheatsheet.html \ No newline at end of file diff --git a/Stack overflow one question.png b/Stack overflow one question.png new file mode 100644 index 0000000..86a1f07 Binary files /dev/null and b/Stack overflow one question.png differ