Coursera Projects
These are projects I've worked on for Coursera courses. Most are minor and are included here mainly for my own organization; a few, such as the Data Science Capstone, are more substantial.
A series of 10 courses offered by Johns Hopkins University. The series is taught in R and covers a range of topics related to working as a data scientist. Only courses with project work I've stored online are listed.
Course 3: Getting and Cleaning Data
- Getting And Cleaning Data Project - Mainly involved transforming a categorical variable with integer values into one that was more descriptive, plus subsetting the data to pick out interesting features and performing some rudimentary analysis.
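The recoding and subsetting steps described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the R used in the course; the codes, labels, column names, and records are all hypothetical and are not taken from the project data.

```python
# Hypothetical mapping from integer codes to descriptive labels.
ACTIVITY_LABELS = {1: "walking", 2: "sitting", 3: "standing"}


def relabel(records, column, labels):
    """Return copies of the records with integer codes in `column` replaced by labels."""
    return [{**row, column: labels[row[column]]} for row in records]


def select_columns(records, keep):
    """Return copies of the records containing only the columns listed in `keep`."""
    return [{name: row[name] for name in keep} for row in records]


# Made-up records, for illustration only.
records = [
    {"subject": 1, "activity": 1, "mean_x": 0.28, "raw_flag": 0},
    {"subject": 1, "activity": 3, "mean_x": 0.27, "raw_flag": 1},
    {"subject": 2, "activity": 2, "mean_x": 0.26, "raw_flag": 0},
]

tidy = select_columns(
    relabel(records, "activity", ACTIVITY_LABELS),
    ["subject", "activity", "mean_x"],
)

for row in tidy:
    print(row)
```

Both helpers return new records instead of modifying the input, so the raw data stays available for comparison.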
Course 5: Reproducible Research
- Reproducible Research Project 1 - Simple analysis and Rmarkdown writeup of wearable tech data.
- Storm Analysis - Final project of class. Took data from the NOAA storm database and drew conclusions about which weather events had the largest public health and economic impacts. This was written up as an Rmarkdown document.
Course 6: Statistical Inference
- Simulation Exercise - The final project was unusual in that we were supposed to validate the CLT by showing that the sample mean of the exponential distribution is approximately normally distributed when the sample is large enough. I interpreted this strictly and used a Kolmogorov-Smirnov (K-S) test to show, at high significance, that the sample mean was not exactly normally distributed.
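The kind of check described above can be sketched as follows. This is a minimal illustration in Python rather than the R used in the course, and it is not the project's actual code: the rate, sample size, number of simulated means, and seed are assumptions, and the 5% critical value uses the common large-sample approximation 1.36 / sqrt(n).

```python
import math
import random
from statistics import NormalDist, fmean


def sample_means(n_means, sample_size, rate, rng):
    """Draw n_means means, each taken over sample_size exponential variates."""
    return [
        fmean([rng.expovariate(rate) for _ in range(sample_size)])
        for _ in range(n_means)
    ]


def ks_statistic(data, dist):
    """One-sample K-S statistic: largest gap between empirical and reference CDFs."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = dist.cdf(x)
        # Compare the reference CDF with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Assumed settings, chosen for illustration only.
rate, sample_size, n_means = 0.2, 40, 1000
rng = random.Random(42)

means = sample_means(n_means, sample_size, rate, rng)

# Normal distribution predicted by the CLT: mean 1/rate, sd (1/rate)/sqrt(sample_size).
clt_normal = NormalDist(mu=1 / rate, sigma=(1 / rate) / math.sqrt(sample_size))

d_stat = ks_statistic(means, clt_normal)
critical_5pct = 1.36 / math.sqrt(n_means)  # asymptotic approximation

print(f"D = {d_stat:.4f}, approximate 5% critical value = {critical_5pct:.4f}")
```

The mean of a finite number of exponential draws is skewed, so it is only approximately normal. Raising `n_means` makes the test more sensitive to that remaining skew, which is how a strict reading of the exercise can end in rejecting normality.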
Course 7: Regression Models
- mtcars Regression - Used regression to answer several questions about whether automatic transmissions affect fuel efficiency, using the mtcars dataset from Motor Trend. Using ANOVA, I found that weight and horsepower were sufficient to explain fuel efficiency, as adding further variables did not significantly improve the fit.
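The nested-model comparison behind an ANOVA like the one above can be sketched as a partial F statistic. This is a minimal illustration in Python rather than the R used in the course. It runs on synthetic data, not mtcars, and the predictor names `x1`, `x2`, `x3` and the coefficients are assumptions made for the example. It stops at the F statistic and does not compute a p-value.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(rows, y):
    """Residual sum of squares of a least-squares fit of y on rows, with an intercept."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))


def partial_f(rows_small, rows_big, y):
    """F statistic for the extra predictors in the bigger of two nested models."""
    n = len(y)
    p_big = len(rows_big[0]) + 1  # coefficients in the bigger model, incl. intercept
    q = len(rows_big[0]) - len(rows_small[0])  # number of added predictors
    rss_small, rss_big = rss(rows_small, y), rss(rows_big, y)
    return ((rss_small - rss_big) / q) / (rss_big / (n - p_big))


# Synthetic data: y depends on x1 and x2 but not on x3.
rng = random.Random(0)
n = 32
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [rng.gauss(0, 1) for _ in range(n)]
y = [20 - 3 * a - 2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

f_relevant = partial_f([[a] for a in x1], list(zip(x1, x2)), y)
f_irrelevant = partial_f(list(zip(x1, x2)), list(zip(x1, x2, x3)), y)

print(f"F for adding x2 to x1: {f_relevant:.2f}")
print(f"F for adding x3 to x1 + x2: {f_irrelevant:.2f}")
```

A large F statistic means the added variable reduces the residual error by more than noise alone would explain. A small one suggests the simpler model is adequate, which is the pattern the ANOVA result above describes.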
Course 8: Practical Machine Learning
- WeightLiftingPrediction - The course project involved using data from wearable tech used to track exercise to predict whether the wearer was “correctly” performing exercises. I used random forests, stochastic gradient boosting, and linear discriminant analysis, and was able to achieve 99% accuracy with my best model.
Course 9: Developing Data Products
Data Science Capstone (Under construction)