Data Science Projects

I would like to showcase a few of the many data science projects I have worked on. I can’t publicly share many details of my work at several software startups, but I am happy to describe it in interviews.

R Shiny Dashboard
Date: 2019
Team size: 2
Description: An interactive dashboard, built with R Shiny, that displays mental health survey results on a geographical map.
Link: interactive dashboard

Software Package in R and Python
Date: 2019
Team size: 3
Description: A data-cleaning package that I built with my classmates, in both R and Python. It identifies missing data and reports its locations, and it has functions to replace missing entries with a value of your choice or to delete the affected rows or columns. It also provides more summary functions than the usual pandas .describe() method.
Link: Python package, R package
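
To illustrate the kind of operations described above, here is a minimal sketch in plain Python. It is not the package's actual API: the function names (locate_missing, replace_missing, drop_rows_with_missing, drop_columns_with_missing), the list-of-dicts layout, the use of None to mark a missing cell, and the sample records are all assumptions made for this example.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def locate_missing(rows: List[Row]) -> List[Tuple[int, str]]:
    """Return (row_index, column_name) for every missing (None) cell."""
    return [
        (i, col)
        for i, row in enumerate(rows)
        for col, val in row.items()
        if val is None
    ]


def replace_missing(rows: List[Row], value: Any) -> List[Row]:
    """Return a copy of the rows with every missing cell set to `value`."""
    return [
        {col: (value if val is None else val) for col, val in row.items()}
        for row in rows
    ]


def drop_rows_with_missing(rows: List[Row]) -> List[Row]:
    """Return copies of only the rows that have no missing cells."""
    return [dict(row) for row in rows if all(v is not None for v in row.values())]


def drop_columns_with_missing(rows: List[Row]) -> List[Row]:
    """Return a copy of the rows without any column that has a missing cell."""
    bad_columns = {col for _, col in locate_missing(rows)}
    return [
        {col: val for col, val in row.items() if col not in bad_columns}
        for row in rows
    ]


# Hypothetical sample data, invented for this illustration.
records = [
    {"id": 1, "age": 31, "city": "Vancouver"},
    {"id": 2, "age": None, "city": "Toronto"},
    {"id": 3, "age": 45, "city": None},
]

if __name__ == "__main__":
    print(locate_missing(records))
    print(replace_missing(records, "unknown"))
    print(drop_rows_with_missing(records))
    print(drop_columns_with_missing(records))
```

Every function returns a new list and leaves the input untouched, so the report of missing locations can be compared before and after cleaning.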

Statistical Analysis Pipeline within Docker
Date: 2019
Team size: 2
Description: A reproducible two-way ANOVA analysis pipeline for viral DNA data, built with Docker and Makefiles. We carried out a two-way ANOVA (factorial analysis) to compare the main effects and the interaction effect of biological pathway and ocean depth level on the abundance of viral DNA sequences.
Link: GitHub repo
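
For readers unfamiliar with the method, here is a minimal sketch of how a two-way ANOVA splits variation into two main effects, an interaction, and a residual. It is not the project's pipeline code. It assumes a balanced design (the same number of replicates in every cell), computes sums of squares and F statistics only (no p-values), and uses a hypothetical function name, factor labels, and made-up abundance values.

```python
from statistics import fmean
from typing import Dict, List, Tuple

Cells = Dict[Tuple[str, str], List[float]]


def two_way_anova(cells: Cells) -> Dict[str, Dict[str, float]]:
    """Two-way ANOVA table for a balanced, fully crossed design.

    `cells` maps (factor_a_level, factor_b_level) to the observations in that cell.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell
    for a in a_levels:
        for b in b_levels:
            if (a, b) not in cells or len(cells[(a, b)]) != n:
                raise ValueError("design must be fully crossed and balanced")
    if n < 2:
        raise ValueError("need at least two replicates per cell")

    # Means at each level of aggregation.
    grand = fmean([x for obs in cells.values() for x in obs])
    a_mean = {a: fmean([x for b in b_levels for x in cells[(a, b)]]) for a in a_levels}
    b_mean = {b: fmean([x for a in a_levels for x in cells[(a, b)]]) for b in b_levels}
    cell_mean = {key: fmean(obs) for key, obs in cells.items()}

    # Sums of squares for main effects, interaction, and residual.
    ss_a = len(b_levels) * n * sum((m - grand) ** 2 for m in a_mean.values())
    ss_b = len(a_levels) * n * sum((m - grand) ** 2 for m in b_mean.values())
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_a = len(a_levels) - 1
    df_b = len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err

    table = {"residual": {"ss": ss_err, "df": df_err, "ms": ms_err}}
    for name, ss, df in (("A", ss_a, df_a), ("B", ss_b, df_b), ("A:B", ss_ab, df_ab)):
        ms = ss / df
        table[name] = {"ss": ss, "df": df, "ms": ms, "F": ms / ms_err}
    return table


# Made-up abundances: factor A = pathway, factor B = depth level.
example = {
    ("pathway_1", "shallow"): [1.0, 3.0],
    ("pathway_1", "deep"): [3.0, 5.0],
    ("pathway_2", "shallow"): [5.0, 7.0],
    ("pathway_2", "deep"): [7.0, 9.0],
}

if __name__ == "__main__":
    for effect, row in two_way_anova(example).items():
        print(effect, row)
```

In practice a statistics library would also report p-values and handle unbalanced designs; this sketch only shows where each F statistic comes from.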

Statistical Analysis and R Shiny App
Date: 2018 and 2019 (2 weekends)
Team size: 4
Description: I participated in UBC’s Hackseq in 2018 and 2019. The 2018 project worked with the Teracyc ocean project data, which consists of over 300 water samples from around the world, analyzed for viral and bacterial DNA. The Hallam lab at UBC then mapped this data to biochemical metabolic families, and our hackathon team visualized it in our Viral Voyager Shiny app.
My 2019 group project was an antimicrobial resistance clinical analysis pipeline.

More recent projects I am working on include a chatbot and a geospatial analysis of supply chain routes.

Written on July 5, 2020