By: Joelen Pastva, Head, Collection Management and Metadata Services
The oft-cited definition of metadata as "data about data" seems simple to understand, and yet it obscures the fact that the concept in the wild can be difficult to pin down. Metadata is both ubiquitous and invisible, essential for some but an afterthought for others. How is this so? And perhaps more importantly, why does it matter? A researcher's knowledge of a project and access to its outputs are easily taken for granted. But both depend heavily on the period during which that project was active and on all of its surrounding context, including collaborators, protocols, methods, and software.
Metadata provides a structured way to capture information that is essential in making research and its associated outputs discoverable, reusable, and sharable for demonstrating impact and investigating new projects. But the responsibility of supplying metadata often falls to the researcher, and it can seem like unnecessary work. The following are some practical use cases to demonstrate why metadata is worth the effort.
As incentives have grown to make research and datasets publicly available due to funder mandates and journal policies, so too has the need to make them findable. Although it may seem sufficient to deposit data and other outputs in an appropriate repository, doing so without sufficient metadata serves little purpose because the files remain essentially hidden. Repositories vary in the metadata options they provide, but a good rule of thumb is to be as descriptive as possible so that it is clear what your data is and how it can be used. A README file can often do the trick in serving as a guide to the context surrounding the data, including file naming conventions and tools used. Taking full advantage of other descriptive fields such as project title, keywords, co-author names, and grant information will allow the data to be discovered more broadly, extending the reach of the project in its field. Many field-specific standards provide common data elements and thesauri that can be helpful in guiding best practices for improved interoperability across systems.
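As a rough illustration of the descriptive fields mentioned above, the following minimal Python sketch builds a metadata record and flags any recommended field left empty before deposit. The field names, the list of recommended fields, and the sample values are all hypothetical placeholders. Real repositories define their own schemas, so treat this as a sketch of the idea, not as any repository's actual requirements.

```python
import json

# Hypothetical field names chosen for illustration only;
# check your repository's own metadata schema before depositing.
RECOMMENDED_FIELDS = ["title", "description", "keywords", "creators", "funding"]


def missing_fields(record):
    """Return the recommended fields that are absent or empty in the record."""
    return [field for field in RECOMMENDED_FIELDS if not record.get(field)]


# Placeholder values, not drawn from any real project.
record = {
    "title": "Example project title",
    "description": "What the data is, how it was collected, and how it can be used.",
    "keywords": ["example keyword", "another keyword"],
    "creators": ["A. Researcher", "B. Collaborator"],
    "funding": "",  # intentionally left blank to show the check
}

# Report gaps, then print the record as JSON for review or deposit.
print("Missing fields:", missing_fields(record))
print(json.dumps(record, indent=2))
```

A check like this could be run before submission so that gaps such as missing grant information are caught while the project context is still fresh.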
The lives of data and scholarship don't end once a project is finished. Sharing data and research outputs encourages their citation and reuse in validating results, performing future investigations, and identifying new opportunities for collaboration. This adds value to research dollars already spent, and enables innovation and advancement in the field. Reuse would not be possible without metadata, which supplies the description and context that outsiders need to interpret and use works and data correctly.
Aside from enabling proper citation, well-described and attributed work with clearly articulated relationships to funding agencies and affiliated institutions is incredibly important in helping to track activities of interest for the promotion and tenure process. Research information management systems increasingly rely on automated processes to harvest data about research activities to measure impact. Incomplete metadata can make it more difficult to locate activities, or to make the connections necessary to understand the impact of research teams and grant funds.
Contact your Galter liaison librarian to learn more about how rich metadata for your research outputs and datasets can vastly improve the reach of your work. With a little extra planning, metadata can be easily integrated into project planning, article and data submission, and long-term preservation activities to ensure that your research remains accessible, findable, and usable for the greatest impact.
Updated: September 24, 2021