Project management
Information
The estimated time to complete this training module is 4h.
The prerequisites to take this module are:
- the open data module.
- the terminal module.
- the Git and GitHub module.
Contact Hao-Ting Wang if you have questions on this module, or if you want to check that you successfully completed all the exercises.
Resources
This module was presented by Elizabeth Dupre during the QLSC 612 course in 2020, with content adapted from presentations by Chris Gorgolewski.
The slides are available here.
The video of her presentation is available below:
Exercise
Your understanding of project management
Prepare your answers in an online document (e.g. using hackmd.io).
- Let's pick a license…
  - You want to share code and make sure it can be used as widely as possible, while still getting credited. Which license do you pick and why?
  - You want to share data, get credited, and allow for modifications but not commercial usage. Which license do you pick and why?
- Pick a public dataset and explain whether it is FAIR. You can pick from the list below if you need inspiration.
- Find an example of a neuroimaging paper described on the Open Science Framework (or somewhere else). Is 1. the code available? 2. the documentation for the data analysis available? 3. the data available? For each aspect, briefly summarize the standards followed (if any).
Create a project template for your Brainhack School project
We learned about the community standards for data sharing in the previous exercise, and how they improve communication and collaboration between researchers. The same principle can be applied to your analysis code. Using a consistent project template will help as your project grows in the long run. For this section, please prepare your answer as a GitHub repository. We will follow the first lesson, Set up your project, in the Good Research Code Handbook.
Please skip the Install a project package section. We recommend revisiting this part after completing the python packaging module.
Let's start with a working example. Write down your answers in an online document (e.g. using hackmd.io).
- Check out this repository, which follows the principles of the project setup in the Good Research Code Handbook. It is slightly different.
- Read through the lesson Set up your project.
- The lesson introduced this project layout:
  ├── data
  ├── docs
  ├── results
  ├── scripts
  ├── src
  ├── tests
  ├── .gitignore
  ├── environment.yml
  ├── README.md
  └── setup.py
However, the layout in this project, fmriprep-denoise-benchmark, is slightly different. Here is its layout, excluding some more complex files:
  ├── content
  ├── fmriprep_denoise
  ├── inputs
  ├── results
  ├── scripts
  ├── .gitignore
  ├── LICENSE
  ├── README.md
  ├── requirements.txt
  └── setup.py
Can you map out the directories and files that serve the same purpose as the ones in the lesson? What extra file is in fmriprep-denoise-benchmark? Which directory is missing?
After seeing one example, let's create a directory for your project and sync it to GitHub. The end result will be a logically organized project skeleton that's synced to version control. The instructor should be able to verify your progress by 1. viewing your public GitHub repository and 2. cloning and testing your project on their own machine.
Pick a short and descriptive name for your project….
- Create a folder in your home directory.
- You should have created a GitHub account when you completed the installation module. Now we want to create a GitHub repository of the same name, for convenience.
- Now you should see some suggestions on the GitHub page to help you initialise the project repository. Which one should you use?
The next step is to create a virtual environment and an associated file describing the environment, so you can reinstall everything if you need to start fresh.
- Create a conda environment with the same name as your project.
- Activate this environment and install the same packages you did in the installation module.
- Export this environment and save it as a file named environment.yml at the root of your project repository.
- Now commit and push this environment file to your GitHub repository.
Next, let's create a .gitignore file with the help of GitHub magic.
- On your GitHub repository page, click on Add file, then Create new file.
- Now enter .gitignore into the file name. You will see a drop-down menu appear on the right-hand side.
- Pick the Python template. Commit this file.
- Now use the terminal and pull this change to your local directory.
With the same principle, you can also create a LICENSE file through GitHub.
Next we will populate the directory with a project skeleton and some basic descriptions.
- Create the directories described in the lesson with the command mkdir.
- Try to add these changes to git. You should notice that empty directories are not added. This is because git only tracks files, not directories.
- To work around this issue, let's create a file named .gitkeep in each of the empty directories using the command touch.
- Now try to add and commit these files again.
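If you prefer to script this step, the same skeleton can be sketched in Python. This is only an illustration and not part of the lesson: the helper name create_skeleton and the SKELETON_DIRS list are hypothetical, and the directory names are copied from the lesson's layout shown earlier.

```python
from pathlib import Path

# Directory names taken from the lesson's layout; adjust them to your project.
SKELETON_DIRS = ["data", "docs", "results", "scripts", "src", "tests"]


def create_skeleton(root):
    """Create the skeleton directories under `root`, each with a .gitkeep file.

    Hypothetical helper: it mirrors `mkdir` followed by `touch .gitkeep`.
    """
    root = Path(root)
    created = []
    for name in SKELETON_DIRS:
        directory = root / name
        # Equivalent to `mkdir -p`: no error if the directory already exists.
        directory.mkdir(parents=True, exist_ok=True)
        # Only add a placeholder when the directory has no other content,
        # since git will then have nothing else to track there.
        if not any(directory.iterdir()):
            (directory / ".gitkeep").touch()
        created.append(directory)
    return created
```

Calling create_skeleton(".") from the root of your project creates the directories in place. You still need to add, commit and push the .gitkeep files with git afterwards.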
(Optional) Creating a project package
Creating a project package can be difficult the first time, but the payoff is substantial: your project structure will be clean, you won't need to change Python's path, and your project will be pip installable. This is the standard approach in the data science world. This is a minimal demo of an installable package. We encourage you to revisit Install a project package after completing the python packaging module.
- Create a file under src/ named helloworld.py, with one line:
    print('hello world')
- Create a file under src/ named __init__.py.
- Create a setup.py file in the root directory with the following lines:
    from setuptools import find_packages, setup

    setup(
        name='src',
        packages=find_packages(),
    )
- Finally, activate the project virtual environment you created earlier. Install your package:
    pip install -e .
- To verify the completion of this part, you should be able to run the following code from a python kernel as long as the virtual environment is activated:
    >>> import src.helloworld
    hello world
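If the import fails, it can help to first check whether the package is visible to the active environment at all. The following is a minimal sketch using only the standard library. The helper name is_importable is hypothetical and not part of the lesson.

```python
import importlib.util


def is_importable(package_name):
    """Return True if Python can locate a top-level package or module.

    Hypothetical helper, intended for top-level names such as 'src'.
    It only locates the package and does not execute its code.
    """
    return importlib.util.find_spec(package_name) is not None


# The result depends on whether `pip install -e .` was run
# in the currently active environment.
print(is_importable("src"))
```

If this prints False, check that the project environment is activated and that you ran the install command from the root of the project.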
Follow up with Hao-Ting Wang to validate you completed the exercise correctly.
You completed this training module!
More resources
If you are curious to learn more about BIDS, check the BIDS specifications. There will also be a training module on BIDS in week 2. We encourage you to revisit Install a project package after completing the python packaging module.
Some documentation on standards for project organization:
- the entire “The Turing Way” documentation is relevant, but the section on project design is the most important for this training module.
- the YODA principles
- the Good Research Code Handbook has lots of resources to help you learn industry standard research code management.
Finally, a blog post on choosing a license for open science projects by Titus Brown.