In summary: out of the four algorithms we tried, k-nearest neighbors was the most accurate (95%), followed by our neural network (93%). Our logistic classifier was the next most accurate (81%), followed closely by our linear average classifier (80%). We tried parameter optimization for the logistic classifier and the neural network, and received 0.2% and 2% accuracy boosts, respectively.
Did we try all we could have tried? No. If we were really serious about the performance of our algorithms, there’s a lot we could have investigated - better optimizations, performance enhancements, and more advanced techniques. For us, the purpose of this competition was to introduce us to machine learning, and this was definitely accomplished. We implemented 4 different algorithms from scratch, learned how to test algorithmic performance, and even tackled some machine learning system design with parameter optimization. I’d say our goal was more than accomplished. For our next project, we’ll be sure to try much more advanced techniques, think more carefully about system design, and try to test out any hypothesis we may have for improving our system.
Here are some major things we learned along the way:
- More complicated does not mean better. We saw this with the k-nearest neighbor classifier - which was conceptually simple and easy to implement. It ended up outperforming all our other algorithms, despite other algorithms being more complex (especially neural networks).
- The 80-20 rule really applies. We coded our first algorithm in under an hour, and right away that gave us 80% accuracy. The rest of our time was spent trying to improve on this. When we tried optimizing parameters, we didn’t get much improvement either. However, there’s a huge difference between 95% accuracy and 80% accuracy. Consider how the USPS uses handwriting recognition to read addresses off of letters. Imagine if their classifier only had 80% accuracy - that would be a lot of letters delivered to wrong addresses. Even if it had 99% accuracy - if they deliver millions of letters a day that’s tens of thousands of letters that get incorrectly sent. If precision matters, every tenth or hundredth of a percent counts.
- Visualizing results is helpful. When dealing with big data, little bugs can go unnoticed, especially if they don’t seem to have much effect on the system. Sometimes the best solution is to slow down and watch as the computer trains or tests each sample individually. With digit recognition, it helped a lot to display the picture of each training or test sample as the algorithm classified it.
If we were to make our recognizer more accurate, here’s some things we would try:
- Optimize better. We could try increasing the number of different values we test over for each parameter, or we could try optimizing over different parameters.
- Image preprocessing. We tried a bit of this at the end, but without much success. To decrease the variation between samples, we could rotate each sample to be facing the same direction, remove any border on the samples, etc. Less variation means more accurate detection.
- Introduce artificial training samples. More training samples means more accuracy. Often this will produce better results than trying to improve the algorithm. Though we used all the Kaggle training samples, we could have made some of our own. One way to do this is to make modified copies of the existing samples. For example, it is possible that some digits had thicker line widths than others. If we take each sample, make 3 copies of it with varying line widths and use all of these samples for training, we could multiply the our training sample size by 4. All of the artificial examples would be plausible as real examples, so this could increase our accuracy.
Well, that’s all we have on handwriting recognition. Next up: identifying dark matter halos!!!
(And in case you’re wondering, we are currently placed 429th for the handwriting recognition competition. It’s a popular one!)
No comments:
Post a Comment