OK, so I'm picking on LiveMocha in this post, but it might as well be any of the companies providing language training software. Why does nobody provide us with even some basic raw data on language learning? I'm sure they've got some.
As someone with a statistics background, I'm a huge fan of just about any statistics site. I follow Nate Silver's political postings just to wade through the numbers. More recently, I followed a link to OkTrends, the statistics blog for OkCupid. On the one hand, I'm impressed by the level of online interaction OkCupid creates. On the other, like most dating sites, it just strikes me as terribly shallow. (I'm sure people can point to much shallower ones out there.) Despite all this, the people who run it include a bunch of statistics geeks who just enjoy figuring out how they can run the numbers on postings and messages on their site.
It should be fascinating, either for anyone who enjoys statistics, or for someone who has a strange voyeuristic interest in young people checking each other out. Among the subjects discussed are The Correlation of Race and Messages Received and 4 Myths of Profile Pictures. The latter taught me that the intensely twee overhead photo is typically called the "MySpace Shot", something I guess everyone else already knew.
And if that isn't sufficiently devoid of applicability, there's always floatingsheep, which engages in odd but entertaining map mashups. While I'm not sure that any of this is particularly scientific or provides surprising insights, it does show what you can do if you happen to be sitting on top of a large collection of datasets and have a reasonable amount of free time and imagination.
And this brings me back to the question about LiveMocha. One of the big promises of computer-aided language learning (small letters - as opposed to formal CALL) was that we would finally have large amounts of data regarding second language acquisition. And we're still waiting. Up to now, so much analysis of language instruction has been based on ad hoc case studies with small, inconsistent datasets. I admit, nobody is going to be able to put together a PhD thesis on what's available from online language learning now, but there ought to be something measurable, and at least of general interest.
How many people start an online language learning course and finish it? How quickly do people usually finish a lesson? What is the correlation between course completion and number of online social connections, or frequency of online messaging? If course completion isn't a good measure of language learning, then how quickly do users upgrade their language knowledge from "beginner" to "intermediate"? How many people intentionally start a course in a language just to see what it's like, but don't pursue it? (I know I did.)
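To make a couple of these questions concrete, here is a rough sketch of how a completion rate and a completion-versus-connections correlation could be computed. Everything in it is an assumption for illustration: the `learners` records, their field names, and the numbers are made up, and nothing here reflects what data LiveMocha actually stores or how it stores it.

```python
from statistics import mean

# Hypothetical per-learner records. The fields and values are invented
# for illustration and are not real LiveMocha data.
learners = [
    {"finished": True, "connections": 12},
    {"finished": True, "connections": 8},
    {"finished": True, "connections": 5},
    {"finished": False, "connections": 1},
    {"finished": False, "connections": 0},
    {"finished": False, "connections": 3},
]


def completion_rate(records):
    """Fraction of learners who started a course and finished it."""
    return sum(1 for r in records if r["finished"]) / len(records)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rate = completion_rate(learners)

# Treat completion as 0/1 so it can be correlated with connection counts.
finished_flags = [1 if r["finished"] else 0 for r in learners]
connection_counts = [r["connections"] for r in learners]
r_value = pearson(finished_flags, connection_counts)

print(f"completion rate: {rate:.2f}")
print(f"correlation (completion vs. connections): {r_value:.2f}")
```

With real data, a correlation like this would only say that the two things move together, not that making friends on the site causes anyone to finish a course. Even so, it is the kind of basic number I'd like to see published.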
The point is that the folks who manage the LiveMocha website should be able to answer these questions. Possibly I'm the only person who really cares, but I'd be willing to bet that others out there would like to know which learning strategies are showing more success than others, what patterns there are in the ways people learn a language, and simply how other people use online language learning tools.