Frequently Asked Questions
- It says that the upload of my data file was ‘completed with errors’ and there is now a red cross next to my file entry in the list of files. What went wrong?
Ingest of tabular data files can be tricky. The Dataverse software, which SODHA relies upon, is known to give false positives when checking for errors during ingest, especially with .xlsx files (see the ‘Tip’ box below).
If the file you uploaded is in Microsoft Excel’s .xlsx format, there is a good chance that the ingest was actually successful. Dataverse wrongly detects errors when
The ingest is successful most of the time despite the error message, as shown by comparing MD5 values (see the next ‘Tip’ box for more information).
If Dataverse claims that the ingest of your file(s) was ‘completed with errors’, we suggest that you follow these steps:
- Try uploading your file(s) again a few times;
- If Dataverse still claims that the ingest was ‘completed with errors’, try comparing the MD5 value for your file on www.sodha.be with the MD5 value that you get when you submit your file to an online MD5 calculator (see the ‘Tip’ box just below for more information about MD5 and MD5 calculators);
- If neither of those steps solved the problem, contact us at firstname.lastname@example.org and we will gladly help if we can.
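MD5 fingerprints (also called checksums) are values computed from the contents of a (data) file by a hashing algorithm. Two files with identical contents always produce the same MD5 value, whereas changing even a single comma in a file will return a different value.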
The Dataverse software automatically produces MD5 values for each file that is deposited on the SODHA platform. To find a file’s MD5 value, click on its name in the Files section of your dataset.
You can compare the MD5 value that your file received after ingest on the SODHA platform with the MD5 value returned by online calculators such as OnlineMD5. You can also use the built-in MD5 calculators of Windows, Mac and Linux, though the operation can be a little more complex.
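If you would rather not submit your file to an online calculator, the comparison can also be done locally. The following is a minimal Python sketch using only the standard library; it is not an official SODHA tool, and the function names are illustrative.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 checksum of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a local file with an MD5 value copied from the file's page."""
    # Ignore surrounding whitespace and letter case in the pasted value.
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_checksum("my_data.xlsx", "<MD5 value shown on SODHA>")` returns `True` when the local file and the deposited file have the same fingerprint. Both the file name and the pasted value are placeholders.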
- After downloading tabular data in .tab format, the resulting file(s) contained only one tab. Where did the contents of the other tabs go?
Unfortunately, processing of tabular data files still poses a few problems. When a tabular data file in a non-open format is deposited on SODHA, the Web application automatically creates a second version of the file in open .tab format. Users can then download the data either in the original format (e.g., .xls, .xlsx, .sav, .sas…) or a more accessible version in .tab (= .tsv, tab-separated values) format.
However, it has been observed, typically with .xls and .xlsx files, that when the original data file has more than one tab, the resulting .tab version retains only the first tab; the other tabs are not carried over into the second version of the file.
In that case, users are advised either to work with the file in its original format or, if that is not possible, to download the data file in its original file format and then to convert it into .tsv or .csv.
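As an illustration of the second option, the sketch below exports every tab of a downloaded .xlsx workbook to its own .tsv file. It assumes the third-party openpyxl package is installed (`pip install openpyxl`); openpyxl does not read the legacy .xls format, so such files would first need to be saved as .xlsx. The function names and the output naming scheme are hypothetical.

```python
import csv
import re
from pathlib import Path


def safe_name(sheet_name):
    """Turn a tab name into a string that is safe to use in a file name."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", sheet_name).strip("_") or "sheet"


def write_tsv(rows, out_path):
    """Write an iterable of rows (sequences of cell values) as tab-separated text."""
    with Path(out_path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)  # empty cells (None) become empty strings


def export_sheets(xlsx_path, out_dir):
    """Export each tab of an .xlsx workbook to <workbook>_<tab>.tsv in out_dir."""
    # Imported here so the helpers above work without openpyxl installed.
    from openpyxl import load_workbook

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # data_only=True reads the stored results of formulas, not the formulas.
    workbook = load_workbook(xlsx_path, read_only=True, data_only=True)
    written = []
    for sheet in workbook.worksheets:
        target = out / f"{Path(xlsx_path).stem}_{safe_name(sheet.title)}.tsv"
        write_tsv(sheet.iter_rows(values_only=True), target)
        written.append(target)
    workbook.close()
    return written
```

Calling `export_sheets("my_data.xlsx", "tsv_export")` (placeholder names) would produce one .tsv file per tab, so that no tab is lost in the conversion.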
- What is the definition of ‘data’ in the context of SODHA? How are ‘data’ and ‘metadata’ differentiated? Is the distinction between primary and secondary data relevant?
In the context of SODHA, ‘data’ designates most if not all data collected and/or reused in the framework of research projects. In the SODHA Deposit Agreement [ DE – EN – FR – NL ] (art. 3), we refer to the definition provided by the EU Directive on open data. Although this is a very far-reaching definition, it allows for new forms of data to arise (e.g. in recent years, social media and social network data), which also need archiving and which, ideally, can be published in reusable form. We also want to preserve the researchers’ agency and let them determine, to a large extent, what constitutes a data collection that should be considered for publication.
‘Metadata’ designates information about data, or, in other words: documentation. Strictly speaking, the SODHA Deposit Agreement [ DE – EN – FR – NL ] characterizes ‘metadata’ as ‘the content of all fields of the archives management system that must be filled in to describe the dataset upon deposit’.
Although the distinction between data and metadata is not always clear, there is a basic layer of information about data (metadata) which is required to disseminate datasets and make them findable online. For example, the title and the authors of a data collection, the subjects covered, the dates of collection, etc. are information that potential reusers need to determine whether a published dataset is relevant to their research.
- Can data files with sensitive information (e.g. personal data) be deposited in SODHA?
Unfortunately, SODHA cannot at this time accept sensitive data (typically though not exclusively, personally identifiable information [PII]) as specified in the SODHA Dataset Publishing Policy [ DE – EN – FR – NL ]. SODHA currently lacks the means to guarantee the safe preservation of and secured access to sensitive data.
However, the State Archives of Belgium are planning to develop a ‘digital vault’ to offer a solution to Belgian scientists who need to store sensitive data collections. Updates on the project will be communicated as soon as possible.
In the meantime, researchers are welcome to document their data (though they might not be able to deposit the data for safety reasons) by creating metadata records (datasets without data files) on the SODHA platform. In this way they can signal the existence of their data even though the files may not be accessed for the time being. See this section of the FAQ for more information.
- I would like to deposit one or several large data files. Can SODHA handle very large deposits? Is there a file size limit?
The SODHA platform can handle files of up to 2.5 GB each. If you would like to deposit files that are larger than this, you can either split the contents of the file into several smaller files to make the deposit possible, or you can contact us so that we can work out a solution.
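For delimited text files (.csv, .tsv), splitting can be scripted. The sketch below is only one possible approach: it assumes that the file has a single header line, that no field contains a line break, and that the 2.5 GB limit means 2,500,000,000 bytes. The function name and the part-naming scheme are hypothetical.

```python
from pathlib import Path

# Assumption: 2.5 GB is interpreted as 2.5 * 10**9 bytes.
SODHA_MAX_BYTES = 2_500_000_000


def split_delimited_file(src, out_dir, max_bytes=SODHA_MAX_BYTES):
    """Split a text table into parts of at most max_bytes, repeating the header."""
    src = Path(src)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    parts = []
    writer = None
    size = 0
    with src.open("rb") as reader:
        header = reader.readline()
        for line in reader:
            # Start a new part when the current one would exceed the limit.
            if writer is None or size + len(line) > max_bytes:
                if writer is not None:
                    writer.close()
                target = out / f"{src.stem}_part{len(parts) + 1}{src.suffix}"
                writer = target.open("wb")
                writer.write(header)
                size = len(header)
                parts.append(target)
            writer.write(line)
            size += len(line)
    if writer is not None:
        writer.close()
    return parts
```

Each part starts with the original header line, so every part can be opened on its own. Binary formats such as .xlsx or .sav cannot be split this way and would need to be divided in the software that produced them.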
- Can you delete my file/dataset?
If the file that you would like to delete has not been published yet, you can do it yourself in the Files tab of your dataset. Check the box for the corresponding file(s), then click on Edit and select Delete.
If you need a published file or dataset to be deleted, you must contact the SODHA administrators at email@example.com and explain why this has to be done.
A word about deleting files
Ideally, what has been published should remain available, in keeping with the philosophy of open science. However, it can happen that files must be deleted because of intellectual property claims or because it was later discovered that a data file still contains personally identifiable information (PII).
Please note that, as stated in the SODHA Deposit Agreement [ DE – EN – FR – NL ], art. 12-14, part of the metadata relevant to data that were deleted must remain accessible, alongside a notice that said data were previously accessible.
- What file formats are recommended for depositing data?
- My data are under an embargo. Where do I specify this?
Two things must be done to properly embargo your data:
- Restrict access to the embargoed data file(s);
- Mention the embargo and its duration in the dedicated field, Availability Status, in the Terms tab of your dataset record.
Once your embargo reaches its end, we advise you to check whether your dataset’s files or metadata need any changes and then to submit it for publication using the Submit for Review button.
- Can I change my datasets’ terms?
You can change the terms of your dataset anytime, but please note that, for these changes to be taken into account, you will need to go through the publication process again. This means that you will need to approve the SODHA Deposit Agreement [ DE – EN – FR – NL ] once again, so any change that you make in your dataset must also be compliant with the agreement.
- How can I know who exactly owns (the rights to) a dataset?
It is often difficult to determine who exactly owns (the rights to) a dataset, whether it is an organization, an individual or a group of individuals. As someone who wants to reuse data published by SODHA, what you need to know is that, except when there is a framework agreement, SODHA enters into agreements only with depositors (via the SODHA Deposit Agreement [ DE – EN – FR – NL ]).
When depositors no longer work for the entity that employed them at the time when they deposited a dataset in SODHA, the responsibility of answering queries and (when applicable) managing requests for access to restricted files falls to the research center to which the depositor used to be affiliated.
Either way, don’t hesitate to contact depositors by using the Contact Depositor button on a dataset’s webpage.
- Who reviews datasets that were submitted for publication?
As stated in the SODHA Dataset Publishing Policy [ DE – EN – FR – NL ], SODHA administrators work exclusively for the State Archives of Belgium. This policy is meant to prevent possible conflicts of interest.
- How do I edit my dataset’s metadata?
If you need to change, correct or update your dataset’s metadata, simply go to the webpage of your dataset and either click on Edit then on Metadata, or view the Metadata tab and click on Add + Edit Metadata.
Once you are finished modifying your dataset’s metadata, you will have to submit your dataset for publication once again by clicking on Submit for Review.
- Is it possible to create datasets without data files? (e.g. if my data have already been deposited elsewhere.)
Yes, SODHA gladly welcomes the creation of metadata records even when the associated data are not (yet or directly) available.
Two situations are most common:
- The data cannot be made accessible yet.
- The data have already been deposited elsewhere, so it would be redundant to make a second data deposit.
Even if the data have already been published, re-publishing the metadata can increase their online visibility. SODHA makes a point of referencing original archives (see the dedicated field in the Terms section of a dataset’s metadata).
Please contact us if your dataset has already received a digital object identifier (DOI)!
If your dataset has already received a DOI, there is a good chance that the metadata can be automatically transferred into SODHA. This way, you won’t have to manually copy everything from one system to another.
- I don’t understand the meaning of certain metadata elements.
If you don’t understand the label of a metadata element, hover over the infobubble next to that element. You will be shown a definition of the element.
You can also consult our Metadata and Terms Guide if you need additional clarification.
- How can I give editing rights for a dataset to another user?
If you would like another user to be able to modify your datasets (edit metadata and terms, add or remove files), you should use the Contact button on the homepage or send an email to firstname.lastname@example.org and explain which user(s) must be given editing rights for which dataset(s). The SODHA administrators will then grant editing rights to the user(s) in question.
Make sure you do this only for people whom you fully trust, as granting editing rights to other users will allow them to download the data file(s) you may have already uploaded.
- I want to deposit several datasets that share a number of features, but I don’t want to copy/paste the recurring metadata all the time. Is there a way to create dataset templates?
Yes, dataset templates can be created on the SODHA platform. Contact us at email@example.com and explain to us which information you would like to encode in a template so that you don’t have to enter it repeatedly.
- How can I get a machine-readable export of a dataset’s metadata?
To obtain a copy of the metadata that describe a dataset, access the webpage of the dataset, select the Metadata tab and click on Export Metadata. You can choose which format you want for the metadata export.
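The same export can in principle be retrieved programmatically, because the Dataverse software provides an export endpoint in its native API. The sketch below assumes that this API is reachable on www.sodha.be, which this FAQ does not confirm; the DOI in the usage note is a placeholder, and the list of available exporters depends on the installation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def export_url(base_url, persistent_id, exporter="dataverse_json"):
    """Build the URL of the Dataverse metadata export for a published dataset."""
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/export?{query}"


def fetch_export(base_url, persistent_id, exporter="dataverse_json"):
    """Download the export and return it as text (requires network access)."""
    url = export_url(base_url, persistent_id, exporter)
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")
```

For instance, `fetch_export("https://www.sodha.be", "doi:10.1234/EXAMPLE", "ddi")` would request the DDI version of the metadata for a dataset with that (made-up) DOI. If the request fails, the Export Metadata button described above remains the reference method.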
- How does SODHA manage dataset versions?
Access to data
- Who grants access to restricted data?
If depositors restrict access to one or several files of their dataset, they will receive e-mail notifications when someone requests access to those files. See this page for more information.
- I requested access to a restricted file/dataset a while ago but I didn’t get any feedback. What do I do?
If you clicked on Request Access some time ago and you didn’t get any feedback, you can send the depositor a reminder by using the Contact Depositor button on the webpage of the dataset.
If you are still unable to reach them, you can also try contacting the research center to which the author of the dataset is affiliated (usually mentioned in the field Author). See also the Producer field.
- How can I restrict access to my dataset’s files?
Check the boxes that correspond to the files for which you want to restrict access, then click on the topmost Edit button and select Restrict.
You will then be the only authorized user who can download the files.
The padlock icon indicates that access to the file is restricted. For you, the depositor, it appears open and in green because depositors can download their own files. To other users, it appears closed and in red.
Please note that, as stated in the SODHA Deposit Agreement [ DE – EN – FR – NL ] and the SODHA Access and Reuse Policy [ DE – EN – FR – NL ], when depositors submit for publication datasets that contain one or several files with restricted access, they must fill in the field Terms of Access in the Terms section of their dataset’s metadata.
- How can I change my personal information? How can I change my password?
If you need to change your personal information or your password, click on your user name in the top right corner of any webpage on the SODHA website and select Account Information.
Next, click on Edit Account and select either Account Information or Password.
- Where can I find an overview of my research data?
To see all the datasets that you deposited in SODHA, click on your name in the toolbar and select My Data.
You will then have access to a list of all the datasets you deposited, along with search facets.
To see all the datasets in which you are mentioned, use the Advanced Search functionality, which you can access on the homepage.
You can then enter your name in the Author > Name field.