Research data refers to the evidence that underpins the answer(s) to research question(s), supports hypothesis testing, and is used to validate findings and ensure reproducibility, regardless of its form (e.g., print, digital, or physical). It may consist of quantitative measurements and information, or qualitative statements, collected by researchers in the course of their work through experimentation, observation, modelling, interviews, or other data-collection methods, or of information derived from existing evidence. Data may be raw or primary (e.g., direct from measurement or collection), derived from primary data for subsequent analysis or interpretation (e.g., following quality checks, gap filling, or as an extract from a larger dataset), or derived from existing sources where others may hold the rights. Data may be defined as a ‘relational’ or ‘functional’ component of research, signalling that its identification and value lie in whether and how researchers use it as evidence for claims. Examples of types of research data include measurements, videos, surveys, interviews, photos, samples, transcriptions, translations, models, algorithms, protocols, and standards.
Raw data, also referred to as ‘source data’ or ‘primary data’, is data that has not been processed or analysed since its collection from the source. For this reason, raw data is often difficult to comprehend or use in a meaningful way.
Conversely, processed data is data that has been cleaned, organised, and subjected to various operations or transformations to make it more usable and informative. Compared with raw data, processed data is more structured, more consistent, and easier to analyse. Most commonly, it is processed data that is made available in a data repository.
A dataset is a collection of data files. It can include both data and the means to generate, interpret, analyse, or validate it, as well as accompanying documents and other files.
Open data is data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and share-alike.
Data can be made openly accessible by depositing it in a data repository that allows unrestricted access to its content. The UM has its own data repository called drUM and its aim is to collect, archive, organise, and disseminate data produced by UM researchers.
Besides gaining immediate visibility for their data, researchers can benefit from its increased impact. Because the data is accessible without restrictions or limitations, it can be reused straight away, which can accelerate the entire research cycle.
The FAIR Data Principles are widely adopted guiding principles that promote data reuse. Published in 2016, they comprise Findability (data and metadata should be formatted in a way that allows both human users and software to retrieve them easily), Accessibility (data should be easily accessible), Interoperability (data should allow for integration with other data as well as interoperability with various applications and workflows), and Reusability (both data and metadata should be sufficiently described to facilitate replication and reuse for different purposes).
Research Data Management (RDM) is a term that describes the organisation, storage, documentation, preservation, and sharing of data collected and used in a research undertaking. It involves the everyday management of research data during the lifetime of a research undertaking (e.g., using consistent file-naming conventions which describe the type of data within the file, the initials of the Principal Investigator, and the date). It also addresses collection strategies, backup and storage of data, data documentation, and ethical and legal requirements related to data, including, but not limited to, data anonymisation, data protection, data sharing, data archiving, and data destruction.
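As an illustration of the file-naming convention mentioned above, the following is a minimal sketch in Python. The function name build_filename, the order of the name components, the underscore separator, and the ISO date format are all assumptions made for this example; they are not a convention prescribed by the University.

```python
import re
from datetime import date


def build_filename(data_type: str, pi_initials: str, collected: date, extension: str) -> str:
    """Build a file name from the type of data, the PI's initials, and a date.

    Hypothetical pattern (an assumption for illustration only):
        <data-type>_<PI initials>_<YYYY-MM-DD>.<extension>
    """
    # Lowercase the description and replace runs of other characters with hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", data_type.lower()).strip("-")
    if not slug:
        raise ValueError("data_type must contain at least one letter or digit")
    if not pi_initials.isalpha():
        raise ValueError("pi_initials must contain letters only")
    # An ISO 8601 date (YYYY-MM-DD) makes names sort chronologically.
    return f"{slug}_{pi_initials.upper()}_{collected.isoformat()}.{extension.lstrip('.')}"


if __name__ == "__main__":
    print(build_filename("Interview transcripts", "jb", date(2024, 3, 15), "csv"))
    # prints: interview-transcripts_JB_2024-03-15.csv
```

Generating names from a single function rather than typing them by hand keeps every file in a research undertaking consistent, and the date component makes the files sort chronologically in a directory listing.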
The Data Management Plan (DMP) is a plan that outlines how data is managed from the point of collection at the start of a research undertaking, all the way through to its analysis and elaboration of results, and how it will be used beyond the original research undertaking. Typically, a DMP will cover areas such as data types, formats and volumes of data collected, metadata, quality control, scientific integrity, specifics concerning access, and information concerning publications (as may be applicable).
A Principal Investigator is a researcher responsible for a research undertaking, of any size, conducted for, on behalf of, or in association with the University; on University premises; or using University facilities.