
Research Data Management: Organization & Format

Data Organization

Why should you organize your data?

A clear organizational structure helps secondary users of your data find, identify, select, and obtain the data they require.

How do you organize your data?

For best results, model your data structure fully, from top to bottom and from beginning to end, during the planning phase of a project.

Here are some considerations when planning the organization of your data.

You'll want to devise ways to express the following:

  • the context of data collection: project history, aim, objectives and hypotheses
  • data collection methods: sampling, data collection process, instruments used, hardware and software used, scale and resolution, temporal and geographic coverage, and secondary data sources used
  • dataset structure: data files, study cases, and relationships between files
  • data validation, checking, proofing, cleaning and quality assurance procedures carried out
  • changes made to data over time since their original creation, and identification of different versions of data files
  • information on access and use conditions or data confidentiality
(adapted from UKDA)
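
One way to express the items above is a small structured record kept alongside the data. The following is a minimal sketch in Python, not a standard: the class name, field names, and placeholder values are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetDocumentation:
    """One field per item in the list above (field names are hypothetical)."""
    context: str = ""              # project history, aim, objectives, hypotheses
    collection_methods: str = ""   # sampling, instruments, coverage, secondary sources
    structure: str = ""            # data files, study cases, relationships between files
    quality_assurance: str = ""    # validation, checking, proofing, cleaning
    change_log: list = field(default_factory=list)  # versions and changes over time
    access_conditions: str = ""    # access, use conditions, confidentiality

    def missing_items(self):
        """Return the names of items that have not been filled in yet."""
        return [name for name, value in asdict(self).items() if not value]


# Placeholder values for illustration only.
doc = DatasetDocumentation(
    context="Example project: aims and hypotheses go here",
    collection_methods="Example: survey instrument, sampling approach",
    structure="One CSV file per session; see file list",
    quality_assurance="Values range-checked after entry",
    change_log=["v01: original files", "v02: corrected typos in labels"],
)

print(doc.missing_items())                # ['access_conditions']
print(json.dumps(asdict(doc), indent=2))  # text you could save next to the data
```

Listing the unfilled items makes it easy to see what still needs documenting before the data are shared.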

Format

Why does it matter how you format your data?
 
To maximize the shareability and reusability of your data, carefully consider the format in which you save them. Choosing a format carefully also limits the chance that your data become unreadable when a proprietary format is no longer supported.
 

What format should you use?

The University of Washington Libraries provides a list of preferred file formats that are most likely to remain accessible in the long term.

Campus Services

If you have questions about this content, data management planning, or our services, or would like to request a consultation with a librarian, please contact your subject specialist.

File Naming & Structure

Why is file naming important?

Think of a file name as a unique identifier for each of your files. Following a naming convention simplifies the organization of your files, helps you locate them with ease, and makes it easier for others to understand and reuse your data. This is particularly important when you are working on a collaborative project.

How should you name your files?

Here are some recommended best practices for naming your files:

  • Use names that are brief but descriptive
  • Avoid spaces and special characters (such as *, #, and %)
  • Agree on a naming convention that everyone using the files follows
  • Identify versions of files using dates and version numbers in the file name
  • Use three-letter file extensions to ensure backwards compatibility (ex: .doc, .tif, .txt)
  • Do not use letter case to distinguish different files (ex: datasetA.txt vs. dataseta.txt)
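
Practices like these can be checked automatically before files are shared. The sketch below is one possible check in Python, under an assumed convention of YYYYMMDD_description_vNN.ext; that pattern, the function names, and the sample file names are hypothetical and should be replaced with whatever your group agrees on.

```python
import re

# Hypothetical convention for illustration: YYYYMMDD_description_vNN.ext
NAME_PATTERN = re.compile(
    r"^\d{8}_"                         # date stamp, e.g. 20120615
    r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*_" # brief description, hyphen-separated
    r"v\d{2}"                          # two-digit version number
    r"\.[A-Za-z0-9]{3}$"               # three-character extension
)


def check_name(filename):
    """Return a list of problems found in one file name (empty if none)."""
    problems = []
    if re.search(r"\s", filename):
        problems.append("contains spaces")
    if re.search(r"[^A-Za-z0-9._\-\s]", filename):
        problems.append("contains special characters")
    if not NAME_PATTERN.match(filename):
        problems.append("does not follow YYYYMMDD_description_vNN.ext")
    return problems


def case_collisions(filenames):
    """Return pairs of names that differ only by letter case."""
    seen = {}
    collisions = []
    for name in filenames:
        key = name.lower()
        if key in seen and seen[key] != name:
            collisions.append((seen[key], name))
        else:
            seen.setdefault(key, name)
    return collisions


print(check_name("20120615_blood-sample_v01.txt"))       # []
print(check_name("File 001#.txt"))
print(case_collisions(["datasetA.txt", "dataseta.txt"]))  # [('datasetA.txt', 'dataseta.txt')]
```

Running a check like this over a shared folder gives collaborators a quick list of names to fix rather than relying on everyone remembering the convention.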

How should files be structured?

A folder structure can help uniquely identify the files it contains. Plan the structure of the folders that will hold your data files before you begin to collect your data. Ideas for how to organize your folders include:

  • Data type (text, images, models, etc.)
  • Time (year, month, session, etc.)
  • Subject characteristic (species, age grouping, etc.)
  • Research activity (interview, survey, experiment, etc.)

Consider these examples of file naming and folder structure:

File001.txt   vs.
201206blood_ID0234.txt

MyDocuments\Research\Sample12.jpg   vs.
C:\NEHGrant01234\WWI\Images\London_001.jpg
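
Descriptive paths like the second example in each pair can be generated consistently rather than typed by hand. The following Python sketch assembles a path from a folder hierarchy and a descriptive file name; the function name, its parameters, and the choice of folder levels (research activity, then data type) are assumptions for illustration only.

```python
from pathlib import PurePosixPath


def build_path(project, activity, data_type, date_stamp, description, subject_id, ext):
    """Assemble project/activity/data_type/<descriptive file name>."""
    # File name: date stamp + description + zero-padded subject ID + extension
    file_name = f"{date_stamp}{description}_ID{subject_id:04d}.{ext}"
    return PurePosixPath(project) / activity / data_type / file_name


path = build_path("NEHGrant01234", "survey", "text", "201206", "blood", 234, "txt")
print(path)  # NEHGrant01234/survey/text/201206blood_ID0234.txt
```

Building names from the same few parts every time keeps the folder hierarchy and file names predictable across a project.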

Tools & Resources - Formats & File Naming