Final project

Materials from class on Saturday, July 11, 2020

Overview

  • Review the project requirements. The project is required for credit-seeking students and optional for guests
  • Sign up for the final project using the BIOS691 Final project sign-up form, due 06/19/2020
    • Teams of two are encouraged; both teammates should agree and sign up
  • Project proposal submission due 06/26/2020
  • Final project submission due 07/12/2020
  • Ask questions by e-mail

Data

Proposal Submission (due 06/26/2020)

  • Register on GitHub. Learn about GitHub in section 4, "Git and GitHub", of Reproducible research tools
  • Create a repository for your project
    • If working in a team, add your teammate as a collaborator
  • Add a README file describing the selected project. Address the following points:
    • Data type (numerical, text, or images), link to the data source
    • Problem type (binary or multiclass classification, regression)
    • Proposed network architecture (feed-forward, CNN, RNN, etc.); testing multiple architectures is encouraged

Solution

  • Create your code in RMarkdown
  • Describe and implement data download and processing
  • Select, justify, and implement a network architecture suitable for the data
  • Train and evaluate the performance of the network
  • If needed, tweak the network architecture or add regularization to improve model performance
  • Explain all steps and describe the achieved performance

Many Kaggle competitions allow late submissions in the form of text files containing, e.g., samples and their predicted classes. Such submissions are benchmarked against previous submissions, so you can compare how your model performs relative to earlier solutions.
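
As a minimal sketch of such a submission file, assume a hypothetical competition that expects two columns named `id` and `label`. Both column names and the example predictions below are assumptions for illustration; check the sample submission of the competition you selected for the actual format. Base R can then write the predictions to a CSV file:

```r
# Hypothetical predictions; in practice these come from your trained model
sample_ids <- 1:5
predicted_classes <- c(0, 1, 1, 0, 1)

# Column names `id` and `label` are assumptions; match the competition's sample submission
submission <- data.frame(id = sample_ids, label = predicted_classes)

# Written to a temporary directory here; use your project folder in practice
submission_file <- file.path(tempdir(), "submission.csv")

# Omit row names so the file contains only the two expected columns
write.csv(submission, submission_file, row.names = FALSE, quote = FALSE)
```

Reading the file back with `read.csv(submission_file)` is a quick way to confirm the header and the number of rows before uploading.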

Final Submission (due 07/12/2020)

  • Add your code solving the selected competition to your GitHub repository
    • Your code should run (knit) with minimal modifications, e.g., adjusting a file path
    • If working in a team, ensure that both participants have committed to the repository
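
One way to keep path adjustments minimal is to define the data location once near the top of the RMarkdown document and build every other path from it. This is only a sketch; `data_dir` and `train.csv` are hypothetical names, not part of the project requirements:

```r
# Single place to adjust when knitting on another machine (hypothetical folder name)
data_dir <- "data"

# Build all other paths from data_dir with file.path()
train_file <- file.path(data_dir, "train.csv")

# Print a clear message if the data are not where expected
if (!file.exists(train_file)) {
  message("Data file not found: ", train_file, ". Adjust `data_dir` above.")
}
```

With this layout, someone running the code only has to change the single `data_dir` line.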