Documenting your data in a research project means writing down what you do during the project! Documentation captures your work on the dataset so that others can understand what you did and reproduce it if needed. Your documentation should cover both your step-by-step processes (what you did) and the larger context of the project (why you did it). At a minimum, your documentation should include the following elements:
This helps ensure that all data generated and/or collected is easier to understand, analyze, and reuse. A good practice is to consider whether your documentation would address the following situations:
Possible elements to consider documenting include:
Do you ever change something in your data, such as adding or deleting certain variables, and tell yourself that you'll definitely remember when and why you made those changes without writing them down? The truth is, we rarely remember! That is not a personal failing: we simply have so much going on that we can't recall every decision we make every day. It's completely understandable. This is where version control comes in! Version control is a documentation technique that keeps track of the changes you make to your data over time, so you can easily go back, view those changes, and have a clear record of how your data has evolved. It can be as simple as putting "v1", "v3", "v8", etc. at the end of your file names to mark changes in the file over time, or it can involve specialized tools such as Git that keep track of granular changes in your files.
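As a rough sketch of the file-naming approach described above, the Python snippet below copies a data file to the next "vN" file name and appends a dated note to a plain-text change log explaining what changed. The `_vN` suffix pattern, the function names `next_version_path` and `save_new_version`, and the `CHANGELOG.txt` file name are all assumptions made for illustration, not an established convention.

```python
import re
import shutil
from datetime import date
from pathlib import Path

# Hypothetical naming pattern: "<name>_v<number><extension>", e.g. survey_v3.csv
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<num>\d+)$")


def next_version_path(path: Path) -> Path:
    """Return the file path for the next version of `path`."""
    match = VERSION_PATTERN.match(path.stem)
    if match:
        stem, num = match.group("stem"), int(match.group("num")) + 1
    else:
        # A file with no version suffix is the starting point, so its first copy is v1
        stem, num = path.stem, 1
    return path.with_name(f"{stem}_v{num}{path.suffix}")


def save_new_version(path: Path, note: str, log_name: str = "CHANGELOG.txt") -> Path:
    """Copy `path` to its next version and record when and why in a change log."""
    new_path = next_version_path(path)
    if new_path.exists():
        raise FileExistsError(f"{new_path} already exists; refusing to overwrite it")
    shutil.copy2(path, new_path)  # the earlier version is left untouched
    log_file = path.parent / log_name  # assumed log location: same folder as the data
    with log_file.open("a", encoding="utf-8") as log:
        log.write(f"{date.today().isoformat()}  {new_path.name}  {note}\n")
    return new_path
```

For example, calling `save_new_version(Path("survey_v3.csv"), "Removed the age variable")` would create `survey_v4.csv` next to the original and add a dated line to the log; you would then make your edits in the new copy. A tool such as Git makes this manual bookkeeping unnecessary by recording each change together with a commit message.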