During a project, good file organization can help in a variety of ways, such as

  • less searching for the right file
  • backups of data that reduce the risk of data loss
  • well-documented work: knowing what you did, how you did it, and when you did it
  • file formats that can be used now and in the future
  • easier reporting on progress to funders and your team
  • compliance with university and funder requirements
  • data structured in ways that facilitate analysis and integration.

 

Best Practices

File Formats

Selecting the optimal file format(s) for your data will help ensure that the data remain accessible for future use, both your own and others'. When selecting tools for your data, pay special attention to the output formats they produce. Use these best practices to reduce the chances of data loss from software or data obsolescence.

  • Open, machine-readable, and non-proprietary formats are preferable
    • If data must be in a proprietary format, include a readme file with details about the software/hardware needed to open the files, and ensure that the files can easily be converted to an open, non-proprietary format
  • Share multiple formats if the format used by your research community is typically proprietary (e.g., MonaLisa_v1.psd AND MonaLisa_v1.tiff)
  • If compression is necessary, use a lossless format
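
One way to confirm that a compression step is lossless is to decompress the result and compare checksums with the original. The sketch below is only an illustration: it assumes gzip as the lossless format, and the function names are hypothetical, not part of any required workflow.

```python
import gzip
import hashlib


def sha256(data: bytes) -> str:
    """Return the SHA-256 checksum of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def compress_losslessly(data: bytes) -> bytes:
    """Compress bytes with gzip (assumed here as the lossless format)."""
    return gzip.compress(data)


def is_lossless_round_trip(original: bytes, compressed: bytes) -> bool:
    """Decompress and check that the checksum matches the original data."""
    return sha256(gzip.decompress(compressed)) == sha256(original)


# Made-up sample data standing in for the contents of a data file.
original = b"site,temp_c\nA,21.5\nB,19.0\n" * 100
compressed = compress_losslessly(original)
print(is_lossless_round_trip(original, compressed))
```

Storing the checksum of the original alongside the compressed file makes it possible to repeat this check later.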

File Naming and Organization

File Naming Conventions

  • Create meaningful names relevant to content, independent of location
  • Avoid very long file names
  • Use underscores (this_is_the_file_name) or “camel case” (ThisIsTheFileName) for separating terms
  • If you include a date, use one of these formats to facilitate sorting: YYYY_MM_DD, YYYY-MM-DD, or YYYYMMDD
  • To facilitate sorting, consider the potential number of files and include placeholder digits (leading zeros) in the name (e.g., for up to one hundred files, begin with …001…)
  • Avoid using spaces and special characters such as ~ ! # & @ ( ) { } [ ] ‘ “ | % $ ; ^
  • Include versioning where needed
  • Be consistent
  • Example: Survey21_Smith_2015_06_01.txt is a text file of the survey with participant 21, conducted by Smith on June 1, 2015.
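
The conventions above can be applied consistently by generating names in a small script. This is a minimal sketch, not a prescribed tool: the function names, the "v02" version suffix, and the exact set of rejected characters (straight and curly quotes are both included) are assumptions made for illustration.

```python
from datetime import date

# Space plus the special characters listed above; quote variants are an assumption.
FORBIDDEN = set(" ~!#&@(){}[]'\"‘’“”|%$;^")


def padded(number: int, expected_total: int) -> str:
    """Zero-pad a sequence number to the width of the expected file count."""
    return str(number).zfill(len(str(expected_total)))


def build_file_name(content, author, when, extension, version=None):
    """Join name parts with underscores and a sortable YYYY_MM_DD date."""
    parts = [content, author, when.strftime("%Y_%m_%d")]
    if version is not None:
        parts.append(f"v{version:02d}")  # hypothetical versioning suffix
    name = "_".join(parts) + "." + extension
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        raise ValueError(f"avoid these characters in file names: {bad}")
    return name


print(build_file_name("Survey21", "Smith", date(2015, 6, 1), "txt"))
# prints Survey21_Smith_2015_06_01.txt
print(build_file_name(f"Image{padded(7, 100)}", "Smith", date(2015, 6, 1), "tiff", version=2))
# prints Image007_Smith_2015_06_01_v02.tiff
```

Generating names from one function keeps the date format and padding identical across a project, which is the point of the "be consistent" item.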

File Version Control