Overview

Goals

  • The course project offers you the opportunity to showcase your mastery of the course learning goals, encompassing both soft and hard skills. You will collaborate with other students to formulate and answer a research question of your choosing, based on a suitable dataset, and produce work that you would proudly present to a prospective employer.
  • The course project is a chance to challenge yourself and work with minimal guidance to explore new territories such as interactive plots, text analysis, and web apps.

Expectations

Below are a few expectations for your project:

  • Professional: you are expected to produce a professional data-based report comparable in quality to reports from The New York Times and FiveThirtyEight
  • Accessible to individuals with little data literacy
  • Self-contained, i.e., all project-related files such as code and data are in a single repo, and relative file paths are always used, except for files hosted on the web, which are referenced by their absolute URLs
  • Published online via the provided GitHub Assignment repository
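The relative-path expectation above can be checked mechanically. The following is a minimal sketch in Python, used here only for illustration; the function name `classify_reference` and the example paths are hypothetical and not part of the course materials. It labels a file reference as a web URL, a relative path, or an absolute local path (the kind that breaks on a teammate's machine).

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlparse


def classify_reference(ref: str) -> str:
    """Return 'web', 'relative', or 'absolute-local' for a file reference.

    Hypothetical helper for illustration only.
    """
    scheme = urlparse(ref).scheme.lower()
    if scheme in ("http", "https"):
        # Files hosted on the web are referenced by their full URL.
        return "web"
    if scheme == "file":
        # file:// URLs point at one specific machine.
        return "absolute-local"
    # Check both POSIX-style (/home/...) and Windows-style (C:\...) paths.
    if PurePosixPath(ref).is_absolute() or PureWindowsPath(ref).is_absolute():
        return "absolute-local"
    return "relative"


if __name__ == "__main__":
    for ref in ["data/raw/survey.csv",
                "https://example.com/data.csv",
                "/home/me/project/data.csv"]:
        print(ref, "->", classify_reference(ref))
```

Any reference reported as `absolute-local` is a candidate for rewriting as a path relative to the repository.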

Team Size

You are expected to work in a team of 3-4 students.

Domain Area

Each team is free to select any domain area for their project, ranging from agriculture and biology to education and e-sports. The sole requirement is that the data analysis must be sufficiently sophisticated to support a meaningful discussion about it with a prospective employer.

Structure

Your data story should include the following, all within context.

Motivation, research question, & background

  • What question are you trying to answer?
  • Why is it important or interesting?
  • What background information is necessary?
  • What assumptions, terms, and/or acronyms need to be clarified?

Data

  • Data collection
    • What was collected?
    • When was it collected?
    • Why was it collected?
    • How was it collected originally?
    • Who collected it?
  • Data acquisition
    • Where / how did you get the data?
    • What is the source?
  • Data understanding
    • How much data do you have?
    • What types of measurements?
    • Was there anything you needed to clean before getting started?
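The data understanding questions above (how much data, which measurements, what needs cleaning) can be answered with a quick summary pass. Below is a minimal sketch in Python using only the standard library; the function name `summarize_rows` and the sample values are made up for illustration and do not come from any course dataset.

```python
import csv
import io


def summarize_rows(csv_text: str) -> dict:
    """Count rows, list columns, and count empty cells per column.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = list(reader.fieldnames or [])
    n_rows = 0
    missing = {name: 0 for name in columns}
    for row in reader:
        n_rows += 1
        for name in columns:
            value = row.get(name)
            # A short row yields None; an empty cell yields "".
            if value is None or value.strip() == "":
                missing[name] += 1
    return {"n_rows": n_rows, "columns": columns, "missing": missing}


if __name__ == "__main__":
    # Made-up sample data with two empty cells.
    sample = "city,year,rainfall_mm\nOslo,2020,763\nLima,2021,\nCairo,,25\n"
    print(summarize_rows(sample))
```

The per-column count of empty cells is one simple way to spot what needs cleaning before the analysis starts.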

Data insights

It’s your job to explicitly identify and discuss key insights. Don’t simply present the audience with some code and output and expect them to do that work. Specifically address the following questions:

  • What are the important takeaways from the data? What was interesting?
  • Why do these takeaways matter?
  • Was there anything surprising?
  • Overall, what do you want the audience to walk away with? What do you want them to understand about your data and research question?

Conclusions / Big Picture

  • How do the insights connect to answer your research question?
  • What improvements might someone make to your analysis?
  • Are there any limitations or weaknesses of your data / analysis?

Deliverables

Due Dates

Check the class Moodle page.

Where to Submit

Most project-related deliverables are submitted via the provided GitHub Classroom Assignment linked in Moodle.

Project Group GitHub Classroom Assignment

The project is a group (not individual) GitHub Classroom Assignment, meaning that all your team members will share the same (online) GitHub repository. Conflicts will therefore arise whenever multiple members edit the same files and try to push their changes to GitHub. To be on the safe side, each team should create a separate file for each member to do their own analysis in. Afterward, the team members should come together to decide how to combine the different pieces into the final report.

I do not expect the GitHub collaboration process to go smoothly. If you encounter an issue that you cannot resolve on your own in a short period of time, let the instructor know as soon as possible to avoid any delay in your main task: the data analysis.

Proposal

Each team must submit a proposal for their project in the form of a rendered HTML Quarto page added as an appendix to the project website. The proposal must include the following:

  • a title
  • the names of the team members
  • a short description of the project
  • the reasons/inspirations behind choosing this project
  • a rough implementation and responsibility plan, i.e., what needs to be accomplished, who will do what, and when. Think about the list of deliverables when building the plan.

Reflect

When done, each member needs to reflect on this part.

Sketch/Illustration

Each team must submit a rough sketch/illustration, also known as a low-fidelity prototype, depicting the expected data analysis process. Hand-drawn sketches/illustrations, which can be scanned, are encouraged. The purpose of the sketch is to provide visual insight into the process that will be followed when analyzing the data. The sketch should be linked from a rendered HTML Quarto page added as an appendix to the project website.

Reflect

When done, each member needs to reflect on this part.

Progress Presentation

Each team will be required to present their progress to the class multiple times throughout the semester to solicit feedback.

Demo Presentation

Each team will be required to demo their project to the class to solicit feedback before recording the final version that goes with the report.

Code, Report, Video, Presentation

Each team must push the code of their project to GitHub. Each repository must include a README.md file that includes clear instructions on:

  • requirements to run your analysis, e.g., required R version and packages
  • how to run the analysis
  • any known limitations that the analysis currently suffers from, e.g., known bugs or cases that the analysis cannot currently handle
  • resources that were referenced while doing the analysis
  • screenshots of the top part of the generated report

Examples of Well-Structured README Files

The Awesome README GitHub page lists examples of GitHub repositories with well-structured README files. Please check some of them for inspiration; the Aimeos TYPO3 extension project repository is a good example.
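Before submitting, a team could verify that its README.md covers the items listed above. The following is a minimal sketch in Python; the heading names in `REQUIRED_SECTIONS` and the function `missing_sections` are assumptions chosen for illustration, since the course does not prescribe exact heading titles.

```python
# Hypothetical heading titles; adjust to match the headings your README uses.
REQUIRED_SECTIONS = [
    "Requirements",
    "How to Run",
    "Known Limitations",
    "References",
    "Screenshots",
]


def missing_sections(readme_text: str, required=REQUIRED_SECTIONS) -> list:
    """Return the required headings that do not appear in the README.

    Only Markdown '#'-style headings are recognized. Lines starting with
    '#' inside fenced code blocks are also treated as headings, which is
    a known simplification of this sketch.
    """
    headings = set()
    for line in readme_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            headings.add(stripped.lstrip("#").strip().lower())
    return [name for name in required if name.lower() not in headings]


if __name__ == "__main__":
    sample = "# My Project\n\n## Requirements\n\n## How to Run\n"
    print(missing_sections(sample))
```

An empty list means every expected section heading was found; it says nothing about the quality of what is written under each heading.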

The report should be a rendered version of the one or more Quarto files used for the analysis. The rendered version should be part of the project website. See the structure section for details.

The video should be no longer than 5 minutes and should walk the viewer through the analysis process and the results. The video should be thought of as an alternative mode of the written report. The video should be uploaded to the team’s GitHub repository and linked in the report.

The presentation should be the one (or similar to the one) used in the video demonstration. A copy of the presentation should be uploaded to the team’s GitHub repository and linked in the report.

Evaluation and Reflection

Each team member will be required to evaluate and reflect on their own performance, as well as the performance of each of their teammates. Additionally, each student will be asked to evaluate the projects of the other teams.

Important Notes

Teamwork

Working in a team is a chance to improve your communication skills and get to know your teammates better. However, working in a team sometimes poses challenges. To ensure a successful project outcome, below are a few expectations:

  • Active participation–be present, attend classes, do your work, keep your team informed about any unexpected events
  • Active listening–show interest in other team members’ ideas
  • Inclusive environment–invite teammates to participate
  • Each member must be involved in all main aspects of the project, including coding, reporting, and presentation. It is not acceptable for a single member to solely handle one aspect, such as coding, while another focuses solely on the report, and another solely on the presentation.

Code Backup

When working on your data analysis, ensure that you commit your code frequently to GitHub, accompanied by meaningful commit messages, and push your changes regularly. This practice helps prevent any unforeseen issues or loss of progress.

Code Styling

You should follow the COMP112: Code Styling Guidelines. Following them will make your code more readable and easier to maintain.