Wednesday, December 12, 2012

Conclusion

We didn't meet our original goal of beating the LENSTOOL metric, but realistically, this was more of a learning experience for us than a competition. Viewed primarily as a learning experience, it has been a resounding success. Below are some of the things we learned:

1. Machine learning is not a "one approach fits all" type of field. None of the techniques we learned during the digit recognition competition were helpful, and recognizing this took a couple of weeks. It was a disheartening realization at first, but it forced us to think more creatively than we might have otherwise.

2. It's possible to devise an effective ML algorithm without lots of training data. We didn't figure out how to do this, but some people evidently did (as shown in the leaderboard screenshot below).


Once the competition ends, we're excited to find out which approaches the successful competitors used; these might serve as inspiration for our future ML endeavors.

3. It's helpful to have an advanced degree when solving a problem for which canonical machine learning algorithms are not applicable. Looking at the profiles of top competitors, we saw that most of them had advanced degrees in areas like computer science and data mining; one guy even had a PhD in particle physics.

4. Our best legitimate approach was the very first one, which combined two different benchmark methods. It was the simplest method by far - perhaps the old adage is true, that simpler is better. 

5. Spiral learning is a legitimate educational philosophy! This project reminded us both of "Modeling and Simulation", a required course for first-year students at Olin - we took it 3 years ago in Fall 2009 (class website). In that class we were taught the governing equations for certain real-world systems, and learned the programming/modeling skills necessary to explain and predict the behavior of those systems. This competition let us tackle a similar problem, but at a much higher level - this time, we figured out the governing equations ourselves! It was really satisfying to use the same approach that we used freshman year, this time with three years of experience and confidence behind each of us.

Overall, we didn't do as well as we'd hoped, but we definitely learned a lot from the competition. It was a great experience to work on a real-world problem, with data that had to be deciphered and new concepts that had to be learned. We were able to attack a difficult problem in a field we had no prior knowledge of, and to do so in creative ways. More than anything, this competition has given us the confidence and desire to tackle more data science problems in the future.
