Proper data storage is crucial for ensuring that data is stored securely, is accessible when needed and is shared in a way that maximises its value. To store one’s data properly, consider the following actions:
Identify a suitable data storage location. Recommended storage media include cloud services, the university network drive or designated data repositories. Cloud services enable one to collaborate with other partners beyond the university. It is important to ensure that the service provider is trustworthy and that regular backups are made. The university network drives are suitable for collaborating with other researchers within the university community. Analysed data can be stored and shared via thematic or institutional repositories.
Managing the different versions and copies of one’s research data carefully is extremely important. Suggestions include protecting raw data, distinguishing between temporary and master copies of one’s data, backing up one’s master copy in a physically distinct location, and setting up a version control strategy for identifying the latest version.
Raw data can be protected by storing it in a separate folder that is set to ‘read only’. Actual analyses should be performed on a working copy of one’s data. Since one’s working files may be constantly changing, it is essential to select one place where the master copies of one’s data are located, so that the latest version can always be identified. To rule out the possibility of losing one’s master copy, it is recommended that back-ups of one’s master data files are stored in different locations. To identify a particular version of a file or folder, it is recommended to extend the file name with ordinal numbers indicating major and minor changes (e.g. 'v1.00', 'v1.01', 'v2.06'). In a version control table (or file history or log file), one can document what is new or different in each major version that one keeps.
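The version numbering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name next_version, the '_v' separator before the version number and the two-digit minor number are choices made for this example, not a prescribed convention.

```python
import re

# Assumed pattern: <base>_v<major>.<two-digit minor>.<extension>,
# for example "survey_v1.01.csv". The "_v" separator is an assumption.
PATTERN = re.compile(
    r"^(?P<base>.+)_v(?P<major>\d+)\.(?P<minor>\d{2})(?P<ext>\.[A-Za-z0-9]+)$"
)


def next_version(filename, major_change=False):
    """Return the file name for the next minor or major version."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"No version number found in {filename!r}")
    major = int(match["major"])
    minor = int(match["minor"])
    if major_change:
        # A major change increments the first number and resets the second.
        major, minor = major + 1, 0
    else:
        minor += 1
        if minor > 99:
            raise ValueError("Minor version exceeds two digits; start a new major version")
    return f"{match['base']}_v{major}.{minor:02d}{match['ext']}"


print(next_version("survey_v1.01.csv"))
print(next_version("survey_v1.01.csv", major_change=True))
```

Each new major version produced this way would then get its own entry in the version control table.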
Because numerous files with different content are usually generated during the course of one’s research, it is imperative to devise a logical and standardised folder structure and file naming convention before starting a research project.
For the folder structure, anticipate the type of files to be produced and envision the folders for these files. The structure should be scalable to enable expansion. A well-arranged folder structure in which folders and sub-folders are hierarchical and follow each other logically is invaluable in quickly navigating one’s data and finding what one requires. It can be very helpful to draw up one’s folder structure in a diagram in one’s DMP.
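As an illustration, a planned folder hierarchy can be written down once and created automatically at the start of a project. The sketch below assumes a hypothetical layout (the folder names 'data', 'raw', 'working', 'master', 'documentation' and 'analysis' are examples only) and a hypothetical helper named create_folders; one’s own structure will differ.

```python
from pathlib import Path

# Hypothetical layout: each key is a folder, each value holds its sub-folders.
LAYOUT = {
    "data": {"raw": {}, "working": {}, "master": {}},
    "documentation": {},
    "analysis": {"scripts": {}, "output": {}},
}


def create_folders(root, layout):
    """Create the nested folders under root and return the paths created."""
    created = []
    for name, children in layout.items():
        folder = Path(root) / name
        # exist_ok=True makes the function safe to run again later,
        # for example after the layout has been extended.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
        created.extend(create_folders(folder, children))
    return created
```

Calling create_folders("my_project", LAYOUT) would create the hierarchy under a folder named 'my_project'. Because the layout is a nested dictionary, adding a new sub-folder later only requires a new entry, which keeps the structure scalable.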
For file naming, clear coded names built from elements such as project name, project number, name of research team/department, measurement type, subject, date of creation and version number are recommended. Only use characters from the sets A-Z, a-z, 0-9, hyphen, underscore, and dot. Do not use special characters such as &%$#), as different operating systems can assign different meanings to these characters. An example of a file name could be: ‘MicroArray_NTC023_20230416.xls’ (content description, project number, and date in the international standard format YYYYMMDD). File name conventions could also be included in the DMP.
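A naming convention of this kind can also be checked automatically. The following Python sketch assumes the three-element pattern of the example above (content description, project number, date); the function names is_valid_name and build_name are hypothetical.

```python
import re
from datetime import date

# Only A-Z, a-z, 0-9, hyphen, underscore and dot are allowed.
ALLOWED = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_name(name):
    """Return True if the name uses only the recommended characters."""
    return ALLOWED.fullmatch(name) is not None


def build_name(description, project_number, created, extension):
    """Join the elements with underscores; the date is written as YYYYMMDD."""
    name = f"{description}_{project_number}_{created:%Y%m%d}.{extension}"
    if not is_valid_name(name):
        raise ValueError(f"File name contains characters outside the allowed set: {name!r}")
    return name


print(build_name("MicroArray", "NTC023", date(2023, 4, 16), "xls"))
# prints MicroArray_NTC023_20230416.xls
print(is_valid_name("results&final.xls"))
# prints False
```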
To facilitate retrievability and accessibility, it is essential to generate the appropriate metadata. The purpose of metadata is to provide useful information about one’s data. Bear in mind that metadata should be structured, machine readable and interoperable.
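To make 'structured and machine readable' concrete, the sketch below writes a small metadata record as JSON. The field names are borrowed from Dublin Core elements (title, creator, date, format, description) purely as an assumption for this example; the applicable metadata standard depends on one’s discipline and repository, and all values shown are hypothetical.

```python
import json


def make_metadata(title, creator, created, file_format, description):
    """Collect descriptive fields in a plain dictionary with fixed keys."""
    return {
        "title": title,
        "creator": creator,
        "date": created,          # ISO 8601 date string, e.g. "2023-04-16"
        "format": file_format,
        "description": description,
    }


def to_json(record):
    """Serialise the record as indented JSON with keys in a stable order."""
    return json.dumps(record, indent=2, sort_keys=True)


# Hypothetical example values.
record = make_metadata(
    title="Microarray measurements, project NTC023",
    creator="Example Research Group",
    created="2023-04-16",
    file_format="xls",
    description="Measurements stored in MicroArray_NTC023_20230416.xls",
)
print(to_json(record))
```

Storing such a record next to the data file means that both people and software can read the same description.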
As not all existing file formats are widely accessible or future-proof, it is recommended to use a standard format for one’s stored files. The format is indicated by the extension at the end of the file name, such as .wmv, .mp3, or .pdf. The following characteristics will help to ensure access:
Different measures to secure one’s files include the protection of data files, computer system security and physical data security. Information in data files can be protected by controlling access to restricted materials with encryption. Avoid sending personal or confidential data via email or through File Transfer Protocol (FTP). Instead, transfer it as encrypted data, e.g. via SURFfilesender or WeTransfer. Also, if needed, data should be destroyed in a consistent and reliable manner. Note that deleting files from hard disks only removes the references to them, not the files themselves. Overwrite the files to scramble their contents or else use secure erasing software. The computer one uses to consult, process and store one’s data can be secured by using a firewall, installing anti-virus software to protect one’s data from viruses, installing updates and upgrades for one’s operating system and software, and using secured wireless networks. Furthermore, use passwords on all one’s devices and do not share them with anyone. If necessary, also secure individual files with a password. With simple measures, one can also ensure the physical security of one’s research data. These include locking one’s computer/laptop, locking the door when out of one’s office, not leaving unsecured copies of one’s data lying around, and keeping non-digital material which should not be seen by others in a locked cabinet or drawer.
Other useful tools to support research data during this phase include: