Any management is easier if you have a good handle on what, exactly, you're managing and who is involved. You know your data, but reflect on how aspects of your data might affect how you organize and share them. For example, if there are relationships between or among datasets, you may want to indicate this in your file system or naming convention. Consider the
Another thing to keep in mind is whether your data contain confidential or sensitive information, since that will likely affect how you store the data and who can access them. Confidential information is personally identifying (e.g., names, dates of birth). Sensitive information is any data that, if released to the public, would have an adverse effect (e.g., the location of a nest of an endangered bird species).
Lastly, make sure you consult any funding agency policies: you may need to share and/or preserve your data.
Consider the members of the research team. Who will be responsible for keeping track of where data are and who has access to them? Who will decide whether to share, preserve, or archive the data? Sorting out the roles of your research team in regard to data management is key. That might be easy if you're working alone, but is potentially quite difficult in larger research groups.
Consider what organizing principle will work best for keeping your data under control and easy to find. You can use any number of things--dates, methods, objects, data type, etc.--to group files and create a meaningful directory tree. What works for one study or experiment may not work for others. It's a good idea to think about this before you start collecting data. More importantly, be consistent and use your plan!
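One way to stick to a plan is to let a small script build your paths, so every file lands in the same kind of place. The sketch below is only an illustration: the project, year, and method grouping, the function name `data_path`, and the example names are all assumptions, not a recommended layout.

```python
from datetime import date
from pathlib import PurePosixPath


def data_path(project: str, collected: date, method: str, filename: str) -> PurePosixPath:
    """Build a path grouped by project, then collection year, then method.

    The grouping order is a hypothetical choice; pick whatever principle
    suits your study and apply it every time.
    """
    return PurePosixPath(project) / f"{collected:%Y}" / method / filename


# Hypothetical example: one survey file collected on 14 May 2023.
print(data_path("bird_survey", date(2023, 5, 14), "point_count", "site01.csv"))
```

Because the path comes from one function, changing the organizing principle later means changing one place rather than remembering a rule.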
The following guidelines will help you create useful filenames:
Keeping track of what data you have, what has been done to the data, and where the data are is vital to managing data. Documentation is not glamorous. In 2 years (or, if you're like me, 2 weeks), future you might be baffled by the arrangement of materials, forget where things are, or even what you have. Make sure to
Codebooks and Data Dictionaries
By describing data files, codebooks make data understandable and usable in the future. Codebooks can vary considerably from study to study and discipline to discipline. ICPSR's "What is a Codebook?" and Guide to Codebooks and Princeton University Data and Statistical Services' "How to Use a Codebook" outline uses for and typical features of codebooks. Usually associated with databases, data dictionaries define the fields and relationships among fields and tables. But data dictionaries (and thesauri) can also be used to create a glossary of terms or define values in a study.
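To make the idea concrete, here is a minimal sketch of a data dictionary kept alongside the data and used to check a record. The field names, descriptions, and allowed values are invented for illustration; a real codebook for your study will look different.

```python
# Hypothetical data dictionary: each field has a description, a type,
# and (optionally) a set of allowed values.
DATA_DICTIONARY = {
    "site_id": {"description": "Survey site identifier", "type": str},
    "count": {"description": "Number of birds observed", "type": int},
    "habitat": {
        "description": "Habitat class",
        "type": str,
        "allowed": {"forest", "wetland", "grassland"},
    },
}


def check_row(row: dict) -> list[str]:
    """Return a list of problems found when comparing a row to the dictionary."""
    problems = []
    for field, spec in DATA_DICTIONARY.items():
        if field not in row:
            problems.append(f"missing field: {field}")
            continue
        value = row[field]
        if not isinstance(value, spec["type"]):
            problems.append(f"{field}: expected {spec['type'].__name__}")
        elif "allowed" in spec and value not in spec["allowed"]:
            problems.append(f"{field}: {value!r} is not an allowed value")
    # Fields that appear in the data but are not documented are also flagged.
    for field in row:
        if field not in DATA_DICTIONARY:
            problems.append(f"undocumented field: {field}")
    return problems


print(check_row({"site_id": "A1", "count": 3, "habitat": "forest"}))
print(check_row({"site_id": "A1", "count": "3", "habitat": "swamp"}))
```

The same dictionary doubles as documentation: the descriptions tell future you what each field means.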
The overarching principles of consistency and application (i.e. just doing it the same way every time) are the most important here and apply in metadata as well.
People annoyingly define metadata as "data about data" or, slightly more specifically, "structured data about data." By describing things, metadata helps us identify, discover, assess, and manage those objects. Put another way, metadata can serve as a surrogate for an object and help us find and understand it. As does documentation, metadata makes objects meaningful for the future by providing important context.
An example of a metadata record is a catalog entry for a book, which has a number of values for a number of fields. For example, The Catcher in the Rye is a value for the 'title' field. Metadata standards are collections of fields that have been created to describe certain types of objects. (There are a lot of different types of things out there!) It's always a good idea to use a standard whenever available.
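As a rough sketch, a metadata record can be thought of as a set of field and value pairs, and a standard as the list of fields you are expected to fill in. The three required fields below are an assumption made for the example (the names match elements that exist in Dublin Core, but this is not a full implementation of that or any other standard), and the draft dataset record is hypothetical.

```python
# Hypothetical minimal "standard": fields every record must fill in.
REQUIRED_FIELDS = ("title", "creator", "date")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# A catalog-style record for a book.
book = {"title": "The Catcher in the Rye", "creator": "J. D. Salinger", "date": "1951"}

# A hypothetical, half-finished record for a dataset.
draft = {"title": "Point counts, spring season"}

print(missing_fields(book))
print(missing_fields(draft))
```

A check like this is easy to run before you share or deposit anything, when missing fields are still cheap to fill in.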
A number of standards have been developed for different types of data, usually for data associated with domains of knowledge or disciplines. Often, these standards will have fields that provide necessary context for understanding a dataset. Ask around in your department or research group to see if there's a standard used in your field.
Metadata works best when it's consistent. Don't use synonyms to describe the same thing: choose a term! (Librarians and catalogers often use controlled vocabularies to keep things consistent, but you won't necessarily need to do that. We like this explanation of controlled vocabularies and some related concepts.)
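If you do want something like a lightweight controlled vocabulary, one possible approach is a lookup that maps every variant to the single term you chose. The terms and variants below are made up for illustration; the function name `normalize` is likewise just a placeholder.

```python
# Hypothetical vocabulary: each preferred term maps to the variants it replaces.
VOCABULARY = {
    "forest": {"woodland", "woods"},
    "juvenile": {"young", "immature"},
}

# Invert the vocabulary so any variant (or the preferred term itself)
# looks up the preferred term.
LOOKUP = {
    variant: preferred
    for preferred, variants in VOCABULARY.items()
    for variant in variants | {preferred}
}


def normalize(term: str) -> str:
    """Return the preferred term, or raise if the term is not in the vocabulary."""
    key = term.strip().lower()
    if key not in LOOKUP:
        raise ValueError(f"{term!r} is not in the vocabulary")
    return LOOKUP[key]


print(normalize(" Woods "))
print(normalize("juvenile"))
```

Raising an error on unknown terms is a deliberate choice in this sketch: it forces you to decide whether a new word is a synonym or a genuinely new term.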
If you're having difficulty with metadata, consult with the Library!
Always back up your data. Always. But aside from that, consider how best to store your data to ensure they are accessible and protected. If your data contain confidential or sensitive information, use encryption.
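A backup is only useful if the copy matches the original. The sketch below copies a file and compares SHA-256 checksums afterwards. It uses only Python's standard library, the function names and paths are hypothetical, and it does not handle encryption, which you would need to add separately for confidential or sensitive data.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy a file into backup_dir and verify the copy against the original."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps where possible
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target
```

For example, `back_up(Path("site01.csv"), Path("backups"))` would place a verified copy at `backups/site01.csv`. Keeping the checksum on record also lets you check later that neither copy has changed.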
Good data management practices facilitate data preservation, which may be required by your funding body. The goal is to make sure that the objects are understandable (using documentation and description) and useful. This last part requires that you keep in mind the file formats in which you might save your data. It's good practice to
It's also a good idea to know who is in charge of the data. If you're privately preserving that data (i.e. not depositing it in an archive), make sure you know where it is!
University of Toronto Libraries
© University of Toronto. All rights reserved. Terms and conditions.