Project File Organization

When starting a computational project, following a well-documented structure saves its creators and future users a lot of trouble, e.g., when they want to replicate findings or locate a particular file.

The structure below is inspired by William Stafford Noble’s 2009 paper "A Quick Guide to Organizing Computational Biology Projects". On a related topic, see the 2014 paper by Greg Wilson et al., "Best Practices for Scientific Computing".

+- data: contains the fixed data sets.
   +- raw
   +- processed, if needed

+- result: contains the computational experiments performed on the data sets 
           in the 'data' directory.
   - NOTEBOOK: records the progress in detail. The entries in this notebook
               should be dated and relatively verbose, with links or embedded
               images or tables displaying the results of the experiments
               performed.  The notebook should also record any
               observations, conclusions, or ideas for future work.  If an
               experiment fails, document how you know that it failed, so
               that this is clear to those who may read the notebook
               later.  You may also transcribe notes from conversations and
               emails.  The notebook can reside online (e.g., Google Docs)
               to give collaborators access to the current status of
               the project.
   +- 2018.04.24
   +- 2018.04.25
   +- <chronological order>

+- doc: contains a separate sub-directory for each manuscript.
   +- paper 1
   +- paper 2
   +- ...

+- src: contains the source code for the project.
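
The layout above can be created with a short script. The following is a minimal sketch, not a prescribed tool: the function names scaffold_project and new_experiment_dir are hypothetical, and it assumes the notebook is kept as a plain-text file named NOTEBOOK inside 'result' (the text above also allows it to live online instead).

```python
from datetime import date
from pathlib import Path


def scaffold_project(root):
    """Create the top-level directories of the layout under `root`."""
    root = Path(root)
    for sub in ("data/raw", "data/processed", "result", "doc", "src"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    # Assumption: the lab notebook is a local plain-text file.
    (root / "result" / "NOTEBOOK").touch(exist_ok=True)
    return root


def new_experiment_dir(root, day=None):
    """Create result/YYYY.MM.DD for `day` (defaults to today)."""
    day = day or date.today()
    path = Path(root) / "result" / day.strftime("%Y.%m.%d")
    path.mkdir(parents=True, exist_ok=True)
    return path
```

For example, scaffold_project("my_project") followed by new_experiment_dir("my_project") creates the skeleton and a dated directory for today's experiment. Zero-padded YYYY.MM.DD names sort chronologically when listed alphabetically, which keeps 'result' in the order described above.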