Sunday, December 21, 2014

DALMOOC Episode 10: Is that binary for 2? We've reached recursion!

Hey! We've made it! It's the final blog post about #dalmooc... well... the final blog post with regard to the paced course on edX anyway :) Since we're now in vacation territory, I've decided to combine Weeks 9 and 10 of DALMOOC into one post. These last two weeks have been a little light on the DALMOOC side, at least for me. Work, and other work-related pursuits, made my experimentation with LightSIDE a little light (no pun intended). I did go through the videos for these two weeks, and I did pick out some interesting things to keep in mind as I move through this field.

First, the challenges with this sort of endeavor, starting with data preparation. This part is important since you can't just dump data from a database into programs like LightSIDE. Data needs some massaging before we can do anything with it. I think this was covered in a previous week, but it needs to be mentioned again since there is no magic involved, just hard work!

The other challenge mentioned this week was labelling the data. Sometimes you get the labels from the provider of the data, as was the case with the poll example used in one of the videos for week 9. The rule of thumb, at least according to DALMOOC, is that you need at least 1000 instances of labelled data to do some machine learning, although you may need more or less than that depending on individual circumstances. For those of you keeping track at home, Carolyn recommends the following breakdown:
200 pieces of labelled data for development
700 pieces of labelled data for cross-validation
100 pieces of labelled data for final testing
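
For the code-minded, here is a rough sketch of what that breakdown could look like in Python. This is just my own illustration, not something from the course or from LightSIDE: the function name `split_labelled_data`, the made-up posts and labels, and the fixed random seed are all assumptions on my part.

```python
import random


def split_labelled_data(instances, n_dev=200, n_cv=700, n_test=100, seed=0):
    """Shuffle labelled instances and split them into development,
    cross-validation, and final-testing sets (hypothetical helper)."""
    needed = n_dev + n_cv + n_test
    if len(instances) < needed:
        raise ValueError(f"need at least {needed} labelled instances, got {len(instances)}")

    # Shuffle a copy so the caller's list is left untouched;
    # the seed is arbitrary and only there to make the split repeatable.
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)

    dev = shuffled[:n_dev]
    cv = shuffled[n_dev:n_dev + n_cv]
    test = shuffled[n_dev + n_cv:needed]
    return dev, cv, test


# Made-up labelled data: (text, label) pairs standing in for real forum posts.
data = [(f"post {i}", "on_topic" if i % 2 == 0 else "off_topic") for i in range(1000)]

dev, cv, test = split_labelled_data(data)
print(len(dev), len(cv), len(test))  # prints: 200 700 100
```

The one thing I'd point out is the shuffle before slicing. If your export is sorted by date or by label, slicing it as-is could leave the final test set looking nothing like the rest of the data.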

Another thing to keep in mind, and I think I've mentioned this in previous weeks, is that machine learning won't do the analysis for you (silly human ;-) ). The important thing here is that you need to be prepared to do some work, some interpretation, and, of course, to have a sense of what your data is. If you don't know what your data is, and if you don't have a frame through which you are viewing it, you are not going to get results that are useful. I guess the old saying "garbage in, garbage out" is something we need to be reminded of.

So, DALMOOC is over, and where do we go from here? Well, my curiosity is a bit more piqued. I've been thinking about what to do a dissertation on (entering my second semester as a doctoral student) and I have all next summer to do some work on the literature review. I am still thinking about something MOOC-related, but some of my initial topics seem to already be topics of current inquiry and of recent publications, so I am not sure where my niche will be. The other fly in the ointment is that the course I regularly teach seems to have fewer students in it, so a Design-Based Research study on that course (that course as a MOOC, I should say) may not be an option in a couple of years. Thus, there is a need for Plan B: I am actually thinking of going back to my roots (in a sense) and looking at interactions in a MOOC environment. The MRT and I have written a little about this, looking at tweets and discussion forums, so why not do something a little more encompassing? I guess I'll wait until the end of EDDE 802 to start to settle on a topic.

What will you use your newly found DALMOOC skills on?




