Schooling a Machine

In our day-to-day lives we face many tasks, some expected and some unforeseen. Unsurprisingly, unforeseen challenges make us more prone to struggle and failure, while tasks we were ready for are ones we are usually better equipped to complete with flying colors. Although familiar tasks are easier for us, the more often we encounter unfamiliar ones, the less we struggle to complete them.

In other words, we are conditioned to learn from our experiences. Since childhood, we have learned to walk, talk, eat, and dress ourselves, among many other actions, but rarely did we pay attention to the process of mastering these simple but vital tasks. Once we entered school, we began facing new challenges every day, from class activities and homework to assessments and team projects. All of these were facilitated by teachers to help us achieve our end goal of acing final exams at the end of the year. Even though our very first final exam was an unfamiliar concept to us, the knowledge we accumulated throughout the year prepared us, and future exams became less foreign and more expected over time.

The purpose of learning and training, for humans, is to perform better and attain the most desirable results possible. This gives rise to a question: do machines also need to be trained to perform tasks? The easy answer is yes, but the process of training machine models before using them to predict future outcomes is anything but easy. This is because data scientists have to assume the role of a teacher, actively checking whether their machine student is properly learning from the data given to it. If the model does not learn from the input data, data scientists must modify their approach to training it (for example, by choosing a different estimator) until their student is prepared to reliably execute the task it is programmed for.

Now let us revisit the school analogy once more. There are three students in a class; we can call them A, B, and C. A is not interested in learning, B absorbs whatever information is taught to him but only at face value, and C efficiently learns and understands the concepts she is taught. When a test is conducted on exactly the material that was taught, B would get the highest score, A would get the lowest of the three, and C would receive a decent score but lower than B's, as pictured above. However, the outcome would not be the same on a concept-based test. In that event, C would perform the best, B the second best, and A still the worst. When applying this example to machines, models similar to A are said to be underfit, models like B are called overfit, and models like C are the best fit. Since the tasks the machines execute will require them to apply their knowledge, the C model is the best suited for most purposes because it can use what it has learned to adapt to different situations. Even though B received the highest score on a strictly knowledge-based test, its learning would prove extremely rigid when the time came to apply it.
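The three students can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the analogy itself: the "concept" is the made-up rule y = 2x + 1, the taught examples carry a small alternating error, student A is modeled as always answering with the average, student B as a memorizer that repeats the answer of the closest example it has seen, and student C as a straight line fitted by least squares. All function names are hypothetical.

```python
def mse(model, points):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def fit_student_a(points):
    """Underfit: ignores x and always answers with the average y."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y


def fit_student_b(points):
    """Overfit: memorizes every example and repeats the closest one."""
    memory = dict(points)

    def predict(x):
        nearest = min(memory, key=lambda seen: abs(seen - x))
        return memory[nearest]

    return predict


def fit_student_c(points):
    """Best fit: learns the underlying straight line by least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Taught material: y = 2x + 1 plus a small alternating error (assumed data).
taught = [(2.0 * i, 2 * (2.0 * i) + 1 + (1 if i % 2 == 0 else -1))
          for i in range(10)]
# Concept-based test: new x values that follow the clean rule y = 2x + 1.
unseen = [(2.0 * i + 0.5, 2 * (2.0 * i + 0.5) + 1) for i in range(10)]

students = {
    "A": fit_student_a(taught),
    "B": fit_student_b(taught),
    "C": fit_student_c(taught),
}
taught_err = {name: mse(model, taught) for name, model in students.items()}
unseen_err = {name: mse(model, unseen) for name, model in students.items()}

for name in students:
    print(name, "taught error:", round(taught_err[name], 3),
          "unseen error:", round(unseen_err[name], 3))
```

On the taught material the memorizer B makes no errors at all, while on the unseen material C comes out ahead, which mirrors the two tests in the analogy.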

To guard against underfitting and overfitting, we split the data set into training data and testing data. The split can be 80% to 20% (a common choice) or 70% to 30%, but for a model to perform well, the training set should generally be larger than the testing set.
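A minimal sketch of such a split, using only the Python standard library, might look like the following. The function name, the `test_fraction` and `seed` parameters, and the toy data are assumptions made for illustration; machine learning libraries ship their own split helpers, whose defaults may differ.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed makes the shuffle, and therefore the split, repeatable.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 data records
train_rows, test_rows = split_train_test(rows, test_fraction=0.2)
print(len(train_rows), "training rows,", len(test_rows), "testing rows")
```

Shuffling before splitting matters when the records are stored in some order (by date or by class, say), because otherwise the testing set may not resemble the training set.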

What about the event where the task involves completely unforeseen data? For this scenario, we use something called a validation data set, which is data held back from the original training set. The purpose of the validation set is to measure how well a model adapts to data it was not trained on, while also guiding the fine-tuning of the model's settings (its hyperparameters) to give it a better chance at success. The ultimate goal of using these three separate data sets is to produce a model resembling student C. Oftentimes in school, teachers told us something along the lines of "It is more important to understand concepts than to memorize them." Who knew that machines would be held to such a similar standard!
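As a rough sketch of how the three data sets work together, the snippet below splits a toy data set three ways, uses the validation set to choose among candidate models, and touches the testing set only once at the end. The 60/20/20 proportions, the function names, and the three hand-written candidates are all assumptions for illustration.

```python
import random


def split_train_val_test(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, validation, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


def mse(model, points):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def pick_best(candidates, validation_rows):
    """Return the name of the candidate with the lowest validation error."""
    return min(candidates, key=lambda name: mse(candidates[name], validation_rows))


# Toy data following the made-up rule y = 2x + 1.
rows = [(x, 2 * x + 1) for x in range(50)]
train, validation, test = split_train_val_test(rows)

# Hypothetical candidates that differ in one setting, the slope.
candidates = {
    "slope_1": lambda x: 1 * x + 1,
    "slope_2": lambda x: 2 * x + 1,
    "slope_3": lambda x: 3 * x + 1,
}

best = pick_best(candidates, validation)      # tuned on validation data
final_error = mse(candidates[best], test)     # reported once on test data
print("chosen:", best, "test error:", final_error)
```

Keeping the testing set out of the tuning step is the point of the design: if the same data were used both to choose the model and to grade it, the final score would be too optimistic.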
