Hey everyone! In this blog post I share my experience with the Google Science Fair. By the way, from the next post onwards, we'll be talking about technical stuff: mostly web development, Android, and occasionally physics...
So, sometime in April, I was taking a Machine Learning class where we were learning about Linear Regression. The instructor explained that the problem is to fit a line to some data, and to then use the estimated line to predict future values. The obvious solution seemed to be to find the individual lines between every two points using simultaneous equations and take their average. Instead, however, we were taught a method called Gradient Descent.
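For readers who haven't seen it, gradient descent for a simple line fit looks roughly like this. This is only a minimal sketch, not the exact code from the class; the data, learning rate, and iteration count are made-up illustrative values. Note the big loop, which is what the "too many iterations" complaint below refers to:

```python
def gradient_descent(xs, ys, lr=0.01, iterations=5000):
    """Fit y = m*x + b by repeatedly stepping against the gradient
    of the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Partial derivatives of the mean squared error w.r.t. m and b.
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

# Toy data lying exactly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
m, b = gradient_descent(xs, ys)  # m approaches 2, b approaches 0
```

Each pass over the data nudges the slope and intercept a little, so the answer is only approached gradually over thousands of iterations.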
I immediately noticed a few flaws in it. Firstly, it loops too many times. Secondly, it is heavily sensitive to points that don't follow the trend: most of the points show a pattern, but some rare points don't, and a good algorithm shouldn't be swayed by those points, called 'outliers'. Gradient descent was.
When I tried my method, it needed fewer loops and was hence faster, and it wasn't sensitive to outliers when enough data was available.
I quickly registered for the Google Science Fair to showcase my new method and started building my project site. But before that, I had to make sure my algorithm didn't already exist. After some research, I found a method called the Theil-Sen Estimator that was very similar to mine. At first I was discouraged: I thought all my hard work was a waste and that I should probably withdraw from the Google Science Fair. But after closer examination I found that my algorithm was sensitive to outliers when there wasn't enough data, and that's good, because when little data is available you don't want to be ignoring points. The Theil-Sen Estimator, on the other hand, ignores points even when data is scarce. That was a big advantage, so I submitted my project.
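To make the comparison concrete: the Theil-Sen Estimator takes the *median* of all pairwise slopes, while the averaging idea described above takes their *mean*. Here is a rough sketch of that contrast on made-up data (the function names and the data are my own illustration, not the project's actual code):

```python
from itertools import combinations
from statistics import mean, median

def pairwise_slopes(points):
    # Slope of the line through every pair of points (skipping vertical pairs).
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in combinations(points, 2)
            if x2 != x1]

def theil_sen_slope(points):
    # Theil-Sen: the median of all pairwise slopes; robust to outliers.
    return median(pairwise_slopes(points))

def mean_slope(points):
    # The averaging approach: the mean of all pairwise slopes.
    return mean(pairwise_slopes(points))

# Eight points lying exactly on y = 2x, plus one wild outlier at (9, 100).
points = [(x, 2 * x) for x in range(1, 9)] + [(9, 100)]

ts = theil_sen_slope(points)  # the median stays at 2
avg = mean_slope(points)      # the mean gets pulled upward by the outlier
```

With plenty of data, the median simply discards the outlier's influence, whereas the mean drifts toward it; with only a handful of points, neither behaves this cleanly, which is the small-data distinction made above.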
So, moral of the story: don't get discouraged the moment you see an obstacle. Analyse it first, and then decide.