Projects

January 31, 2020

Disaster Response Pipeline

In this project, I analyze disaster data from Figure Eight to build a model that classifies disaster messages.
The data set provided by Figure Eight contains real messages that were sent during disaster events. I built a machine learning pipeline that categorizes these messages, so that each message can be assigned to the relevant categories.
I also built a web app where I can enter a new message and get classification results across several categories. The web app also displays visualizations of the data.

Analysis of Stack Overflow Annual Developer Survey data from 2017 to 2020

When I saw the Stack Overflow survey, I was curious about how important formal education is for becoming a professional developer. I also wanted to know whether India is a good place for developers.

I took the data from the Stack Overflow surveys from 2017 to 2020 and tried to answer the following questions:

Is formal education necessary to become a professional developer?
As a software engineer, is it better to work in India or move to Western countries?
Which country has had the most developers over the last 4 years, and where does India stand in terms of total number of developers?
Will you earn a higher salary if you contribute to open source?

Flower Image Classifier

A dataset of 102 flower categories commonly occurring in the United Kingdom. Each class consists of 40 to 258 images. The images have large scale, pose and light variations.
I used the fastai library and PyTorch for this project.
I got an accuracy of 97.84%.
For a detailed, step-by-step explanation of the code, visit my Blog

Steering Wheel Angle Prediction

This is my final-year engineering major project, done with 3 other teammates. We published a paper in the International Research Journal of Engineering and Technology (IRJET), Volume 7, Issue 3, March 2020, S.NO: 924. Our published paper
We have successfully shown that CNNs are able to learn the entire task of lane and road following without manual decomposition into road or lane marking detection, path planning, and control.
A small amount of training data from relatively few hours of driving was sufficient to train the virtual car to operate in diverse conditions: on highways, local roads and residential roads, in sunny, cloudy, and rainy conditions [4]. The CNN is able to extract meaningful and useful road features from a very sparse training signal (steering alone).

Survival Estimates for Lymphoma Patients

This project covers survival estimates that vary with time.
I analyze survival estimates for a dataset of lymphoma patients, taking censored data into account and using Kaplan-Meier estimates.
For a detailed, step-by-step explanation of the code, visit my Blog
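To illustrate what a Kaplan-Meier estimate does with censored data, here is a minimal sketch in plain Python. It is not the code from the project: the function name `kaplan_meier` and the toy data are made up for illustration. At each observed event time, the running survival probability is multiplied by the fraction of at-risk patients who did not have the event. Censored patients count as at risk up to their censoring time but never count as events.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed, 0 if the patient was censored
    (Hypothetical helper for illustration, not the project's code.)
    """
    # Only times at which an event was actually observed change the curve.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Observed events exactly at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data (made up): five patients, two of them censored.
curve = kaplan_meier([5, 6, 6, 8, 10], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

In practice a library such as lifelines would normally be used, but the loop above shows where censoring enters the calculation: only through the at-risk count.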

Dog vs Cat breed classifier

I used the Oxford-IIIT Pet Dataset by O. M. Parkhi et al., 2012, which features 12 cat breeds and 25 dog breeds. My model needs to learn to differentiate between these 37 distinct categories.
I got an accuracy of 94%, which is an error rate of just 6%, with only a few lines of code. By comparison, the state-of-the-art model in the 2012 paper had 56% accuracy.
For a detailed, step-by-step explanation of the code, visit my Blog here

Estimating Treatment Effect

Evaluating treatment effect models
Comparing predicted and empirical risk reductions, computing the C-statistic-for-benefit, and interpreting ML models for treatment effect estimation
Implementing a T-learner
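As a rough illustration of the T-learner idea, the sketch below fits one model on the treated group and one on the control group, then takes the difference of their predictions as the estimated treatment effect. This is not the project's code. The base learner is a hypothetical one-feature least-squares line (`fit_line`), chosen only to keep the example dependency-free, and the data is made up; a real analysis would more likely use a stronger model on a binary outcome.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def t_learner(xs, treated, ys):
    """Fit separate models for treated and control; return an effect function.

    Illustrative sketch: the returned function gives the predicted outcome
    under treatment minus the predicted outcome under control.
    """
    x1 = [x for x, t in zip(xs, treated) if t == 1]
    y1 = [y for y, t in zip(ys, treated) if t == 1]
    x0 = [x for x, t in zip(xs, treated) if t == 0]
    y0 = [y for y, t in zip(ys, treated) if t == 0]
    slope1, intercept1 = fit_line(x1, y1)  # model for the treated group
    slope0, intercept0 = fit_line(x0, y0)  # model for the control group

    def effect(x):
        return (slope1 * x + intercept1) - (slope0 * x + intercept0)

    return effect


# Toy data (made up): control follows y = x, treated follows y = 0.5 * x + 2.
xs = [0, 1, 2, 3, 0, 1, 2, 3]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 1, 2, 3, 2, 2.5, 3, 3.5]
effect = t_learner(xs, treated, ys)
print(effect(0), effect(2))
```

The "T" refers to the two separate models. Because each model only sees its own group, the estimated effect can vary with the patient's features.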

Risk score model for Diabetic Retinopathy

Building a risk score model for retinopathy in diabetes patients using logistic regression.

Stochastic Gradient Descent

This project builds SGD from scratch, covering the basics you need to know along the way.
Even though high-level APIs can do all of this for us, it is still important to learn a few key concepts (if not all of them).
Each step that SGD takes is shown using an animation.
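For a sense of what "from scratch" means here, the following minimal sketch fits a straight line with stochastic gradient descent in plain Python. It is an illustration under stated assumptions, not the project's code. The function name `sgd_fit_line`, the learning rate, the epoch count and the toy data are all made up. The `history` list records the parameters after each epoch, which is the kind of trace an animation could be drawn from.

```python
import random


def sgd_fit_line(xs, ys, lr=0.1, epochs=200, seed=0):
    """Fit y = w * x + b by SGD on squared error, one sample per step."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    history = []
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of error**2 with respect to w and b for this sample.
            w -= lr * 2.0 * error * xs[i]
            b -= lr * 2.0 * error
        history.append((w, b))  # parameters at the end of each epoch
    return w, b, history


# Toy data (made up): noiseless points on the line y = 2x + 1.
xs = [i / 10.0 - 1.0 for i in range(21)]
ys = [2.0 * x + 1.0 for x in xs]
w, b, history = sgd_fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Because the toy data has no noise, the parameters settle very close to the true slope and intercept. With noisy data, a fixed learning rate would leave them jittering around the best fit.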

House price prediction

I solved the Kaggle house price regression problem using 2 methods.

First method: the traditional and most important approach, applying machine learning techniques such as preprocessing and other feature engineering steps, then fitting a random forest and also trying XGBoost.
Second method: applying a neural network to the problem, for which I tried the fastai library.

Comment Classification problem

Solved using the fastai library.

Written by Kiran U Kamath
You can follow me on Twitter and LinkedIn