Friday, February 12, 2010

Can LiveMocha Do This?

OK, so I'm picking on LiveMocha in this post, but it might as well be any of the companies providing language training software. Why does nobody provide us with even some basic raw data on language learning? I'm sure they've got some.

As someone with a statistics background, I'm a huge fan of just about any statistics site. I follow Nate Silver's political postings just to wade through the numbers. More recently, I followed a link to OkTrends, the statistics blog for OK Cupid. On the one hand, I'm impressed by the level of online interaction OK Cupid creates. On the other, like most dating sites, it just strikes me as terribly shallow. (I'm sure people can point to much shallower out there.) Despite all this, the people who run it include a bunch of statistics geeks who just enjoy figuring out how they can run the numbers on postings and messages on their site.

It should be fascinating, either for anyone who enjoys statistics, or for someone who has a strange voyeuristic interest in young people checking each other out. Among the subjects discussed are The Correlation of Race and Messages Received and 4 Myths of Profile Pictures. The latter taught me that the intensely twee overhead photo is typically called the "MySpace Shot", something I guess everyone else already knew.

And if that isn't sufficiently devoid of applicability, there's always floatingsheep, which engages in odd but entertaining map mashups. While I'm not sure that any of this is particularly scientific or provides surprising insights, it does show what you can do if you happen to be sitting on top of a large collection of datasets and have a reasonable amount of free time and imagination.

And this brings me back to the question about LiveMocha. One of the big promises of computer-aided language learning (small letters - as opposed to formal CALL) was that we would finally have large amounts of data regarding second language acquisition. And we're still waiting. Up to now, so much analysis of language instruction has been based on ad hoc case studies with small, inconsistent datasets. I admit, nobody is going to be able to put together a PhD thesis on what's available from online language learning now, but there ought to be something measurable, and at least of general interest.

How many people start an online language learning course and finish it? How quickly do people usually finish a lesson? What is the correlation between course completion and number of online social connections, or frequency of online messaging? If course completion isn't a good measure of language learning, then how quickly do users upgrade their language knowledge from "beginner" to "intermediate"? How many people intentionally start a course in a language just to see what it's like, but don't pursue it? (I know I did.)
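To make a couple of those questions concrete, here is a minimal Python sketch of how a completion rate and a completion-versus-connections correlation might be computed. Everything in it is an assumption for illustration: the `Learner` record, its field names, and the sample numbers are made up, and nothing here reflects LiveMocha's actual data or schema.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Learner:
    """Hypothetical record for one person who started a course."""
    connections: int  # assumed field: number of online social connections
    completed: bool   # assumed field: whether the course was finished


def completion_rate(learners):
    """Fraction of learners who finished the course they started."""
    return sum(1 for l in learners if l.completed) / len(learners)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / (spread_x * spread_y)


def connections_vs_completion(learners):
    """Correlation between connection count and completion (coded 0 or 1)."""
    xs = [l.connections for l in learners]
    ys = [1 if l.completed else 0 for l in learners]
    return pearson(xs, ys)


# Made-up sample data, purely for illustration.
sample = [
    Learner(0, False),
    Learner(1, False),
    Learner(2, False),
    Learner(3, False),
    Learner(5, True),
    Learner(8, True),
]

if __name__ == "__main__":
    print("completion rate:", completion_rate(sample))
    print("correlation:", connections_vs_completion(sample))
```

A correlation like this says nothing about cause, of course. People who were going to finish anyway may simply make more connections along the way.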

The point is that the folks who manage the LiveMocha website should be able to answer these questions. Possibly I'm the only person who really cares, but I'd be willing to bet that others out there would like to know which learning strategies are showing more success than others, what patterns there are in ways to learn a language, and just simply how other people use online language learning tools.

2 comments:

  1. Very interesting post, Eric. Yes indeed, we do draw some very interesting conclusions from our data. The number of variables is daunting at times but also thrilling for a data junkie (like me!)

    Tough for us to disclose much publicly, as it is core to our business, but it's cool to see someone else share my excitement vicariously.

    -Clint

  2. I agree. I have waded through a ton of language learning sites, but have always wondered whether any statistics are run on these sites to actually improve them. What sort of correlations can be drawn from native speakers of different languages, which languages are easier to learn, etc.? I mean, I know a lot of data is meant to be proprietary, but at least some of it should be shared for the benefit of the community, no?

    - Patrick
