Frequently Asked Questions


Data


In the context of SODHA, ‘data’ designates most if not all data collected and/or reused in the framework of research projects. In the SODHA Deposit Agreement (Consid. 3), we refer to the definition provided by the EU Directive on open data. Although this definition is very far-reaching, it allows for new forms of data to arise (e.g. in recent years, social media and social network data), which also need archiving and which, ideally, can be published in reusable form. We also want to preserve the researchers’ agency and let them determine, to a large extent, what constitutes a data collection that should be considered for publication.

‘Metadata’ designates information about data, or, in other words: documentation. Strictly speaking, the SODHA Deposit Agreement characterizes ‘metadata’ as ‘the content of all fields of the archives management system that must be filled in to describe the dataset upon deposit’.

Although the distinction between data and metadata is not always clear, there is a basic layer of information about data (metadata) which is required to disseminate datasets and make them findable online. For example, the title and the authors of a data collection, the subjects covered, the dates of collection, etc. are information that potential reusers need to determine whether a published dataset is relevant to their research.

List of files with exclamation mark signs

Ingest of tabular data files can be tricky. The Dataverse software, which SODHA relies upon, is known to give false positives when checking for errors during ingest, especially with .xlsx files (see the ‘Tip’ box below).

Tip

If the file you uploaded is in Microsoft Excel’s .xlsx format, there is a good chance that the ingest was actually successful. Dataverse wrongly detects errors when

  • an .xlsx file contains a user-encoded linebreak (Alt + Enter in Excel);
  • an .xlsx file contains only one row of data;
  • a cell in the first line of an .xlsx file contains a very long sequence of characters.

The ingest is successful most of the time despite the error message, as shown by comparing MD5 values (see the next ‘Tip’ box for more information).


If Dataverse claims that the ingest of your file(s) was ‘completed with errors’, we suggest that you follow these steps:

  1. Try uploading your file(s) again a few times;

  2. If Dataverse still claims that the ingest was ‘completed with errors’, try comparing the MD5 value for your file on www.sodha.be with the MD5 value that you get when you submit your file to an online MD5 calculator (see the ‘Tip’ box just below for more information about MD5 and MD5 calculators);

  3. If neither of those steps solved the problem, contact us at sodha@arch.be and we will gladly help if we can.

Tip

MD5 fingerprints (also called checksums) are values that an algorithm computes from the contents of a (data) file. Two files with identical contents always yield the same MD5 value, whereas changing even a single comma in a file will return a different one.

The Dataverse software automatically produces MD5 values for each file that is deposited on the SODHA platform. To find a file’s MD5 value, click on its name in the Files section of your dataset.


You can compare the MD5 value that your file received after ingest on the SODHA platform with the MD5 value returned by online calculators such as OnlineMD5. You can also use the built-in MD5 calculators of Windows, Mac and Linux, though the operation can be a little more complex.
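If you prefer not to send your file to an online calculator, the comparison can also be done locally. The following is a minimal sketch using Python’s standard hashlib module; the function names md5_of_file and matches_published_md5 are illustrative and not part of SODHA or Dataverse.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 value of a file as a lowercase hexadecimal string."""
    digest = hashlib.md5()
    # Read the file in chunks so that large files do not have to fit in memory.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare the local MD5 value with the one copied from the platform."""
    # Ignore surrounding whitespace and letter case in the pasted value.
    return md5_of_file(path) == published_md5.strip().lower()
```

To use it, copy the MD5 value displayed for your file on the SODHA platform and pass it as the second argument, together with the path to the file on your computer. A result of True means that both values are identical.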


Unfortunately, SODHA cannot at this time accept sensitive data (typically though not exclusively, personally identifiable information [PII]) as specified in the SODHA Dataset Publishing Policy. SODHA currently lacks the means to guarantee the safe preservation of and secured access to sensitive data.

However, the State Archives of Belgium are planning to develop a ‘digital vault’ to offer a solution to Belgian scientists who need to store sensitive data collections. Updates on the project will be communicated as soon as possible.

In the meantime, researchers are welcome to document their data (though they might not be able to deposit the data for safety reasons) by creating metadata records (datasets without data files) on the SODHA platform. In this way they can signal the existence of their data even though the files may not be accessed for the time being. See this section of the FAQ for more information.

The SODHA platform can handle files of up to 2.5 GB each. If you would like to deposit files that are larger than this, you can either split the contents of the file into several smaller files to make the deposit possible, or you can contact us so that we can work out a solution.
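As an illustration of the first option, here is a minimal sketch that splits a delimited text file (such as a .csv file) into smaller parts, repeating the header row in each part. It assumes that the first line is a header and that every record occupies exactly one line (no line breaks inside cells); the function name split_delimited_file and the naming of the parts are hypothetical choices, not a SODHA requirement.

```python
from pathlib import Path


def split_delimited_file(path, max_bytes):
    """Split a delimited text file into parts of at most max_bytes each.

    The header row is copied into every part. A single record that is
    larger than the limit is still written, so that no data is lost.
    """
    source = Path(path)
    parts = []
    out = None
    written = 0
    with source.open("rb") as src:
        header = src.readline()
        for line in src:
            # Start a new part when the current one would exceed the limit.
            if out is None or written + len(line) > max_bytes:
                if out is not None:
                    out.close()
                part_name = f"{source.stem}_part{len(parts) + 1:03d}{source.suffix}"
                part_path = source.with_name(part_name)
                out = part_path.open("wb")
                out.write(header)
                written = len(header)
                parts.append(part_path)
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

For example, calling split_delimited_file("survey.csv", 2_000_000_000) would keep each part comfortably below the 2.5 GB limit. If you split a file this way, it is a good idea to explain in your dataset’s documentation how the parts fit together.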

If the file that you would like to delete has not been published yet, you can do it yourself in the Files tab of your dataset. Check the box for the corresponding file(s) then click on Edit and select Delete:

File interface with Delete option highlighted

If you need a published file or dataset to be deleted, you must contact the SODHA administrators at sodha@arch.be and explain why this has to be done.

A word about deleting files

Ideally, what has been published should remain available, in keeping with the philosophy of open science. However, it can happen that files must be deleted because of intellectual property claims or because it was later discovered that a data file still contains personally identifiable information (PII).

Please note that, as stated in the SODHA Deposit Agreement, art. 12-14, part of the metadata relevant to data that were deleted must remain accessible, alongside a notice that said data were previously accessible.


Please see the dedicated page about recommended file formats.



Legal aspects


Two things must be done to properly embargo your data:

  1. Restrict access to the embargoed data file(s);

  2. Mention the embargo and its duration in the dedicated field in the Terms tab of your dataset record, Availability Status:

Terms tab with field Availability Status highlighted

Please note that, as specified in the SODHA Access and Reuse Policy, embargos can last no longer than 1 (one) year.

It is advised that, once your embargo reaches its end, you check your dataset for any necessary change in terms of files or metadata and that you then submit it for publication using the Submit for Review button:

Draft Unpublished dataset with Submit for Review button highlighted

You can change the terms of your dataset anytime, but please note that, for these changes to be taken into account, you will need to go through the publication process again. This means that you will need to approve the SODHA Deposit Agreement once again, so any change that you make in your dataset must also be compliant with the agreement.

It is often difficult to determine who exactly owns (the rights to) a dataset, whether it is an organization, an individual or a group of individuals. As someone who wants to reuse data published by SODHA, what you need to know is that, except when there is a framework agreement, SODHA enters into agreements only with depositors (via the SODHA Deposit Agreement).

When depositors no longer work for the entity that employed them at the time when they deposited a dataset in SODHA, the responsibility of answering queries and (when applicable) managing requests for access to restricted files falls to the research center to which the depositor used to be affiliated.

Either way, don’t hesitate to contact depositors by using the Contact Depositor button on a dataset’s webpage:

Dataset information with Contact Depositor button highlighted

As stated in the SODHA Dataset Publishing Policy, SODHA administrators work exclusively for the State Archives of Belgium. This policy is meant to prevent possible conflicts of interest.



Metadata


If you need to change, correct or update your dataset’s metadata, simply go to the webpage of your dataset and either click on Edit then on Metadata, or view the Metadata tab and click on Add + Edit Metadata:

Dataset information with Add plus Edit Metadata button highlighted

Once you are finished modifying your dataset’s metadata, you will have to submit your dataset for publication once again by clicking on Submit for Review.

Yes, SODHA gladly welcomes the creation of metadata records even though the associated data might not be (yet or directly) available.

Two situations are most likely:

  1. The data cannot be made accessible yet.

In that case, please consult the SODHA Access and Reuse Policy about embargo conditions. See also the question just above in the present FAQ.

  2. The data has already been deposited elsewhere, so it would be redundant to make a second data deposit.

Even if the data have already been published, re-publishing the metadata can increase their online visibility. SODHA makes a point of referencing original archives (see the dedicated field in the Terms section of a dataset’s metadata).

Please contact us if your dataset has already received a digital object identifier (DOI)!

If your dataset has already received a DOI, there is a good chance that the metadata can be automatically transferred into SODHA. This way, you won’t have to manually copy everything from one system to another.

If you don’t understand the label of certain metadata elements, don’t hesitate to hover over those elements. You will be shown an infobubble with a definition of the element:

Data deposit form with infobubble of a field appearing

You can also consult our Metadata and Terms Guide if you need additional clarifications.

If you would like another user to be able to modify your datasets (edit metadata and terms, add or remove files), you should use the Contact button on the homepage or send an email to sodha@arch.be and explain which user(s) must be given editing rights for which dataset(s). The SODHA administrators will then grant editing rights to the user(s) in question.

Make sure you do this only for people whom you fully trust, as granting editing rights to other users will allow them to download the data file(s) you may have already uploaded.

Yes, dataset templates can be created on the SODHA platform. Contact us at sodha@arch.be and explain to us which information you would like to encode in a template so that you don’t have to enter it repeatedly.

  • How can I get a machine-readable export of a dataset’s metadata?

To obtain a copy of the metadata that describe a dataset, access the webpage of the dataset, select the Metadata tab and click on Export Metadata. You can choose which format you want for the metadata export.

Dataset Metadata tab with Export Metadata button highlighted
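The same export can in principle be retrieved programmatically. The Dataverse software provides a metadata export endpoint in its native API (/api/datasets/export); the sketch below assumes that this endpoint is reachable on www.sodha.be, which is not confirmed here. The function names and the DOI mentioned in the usage note are hypothetical examples.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_export_url(persistent_id, exporter="ddi", base_url="https://www.sodha.be"):
    """Build the URL of a metadata export for a published dataset.

    persistent_id is the dataset's DOI, e.g. "doi:10.5072/FK2/EXAMPLE".
    exporter is a format name known to Dataverse, e.g. "ddi" or "dataverse_json".
    """
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/export?{query}"


def fetch_metadata_export(persistent_id, exporter="ddi",
                          base_url="https://www.sodha.be", timeout=30):
    """Download the export as text (assumes the endpoint is publicly reachable)."""
    url = build_export_url(persistent_id, exporter, base_url)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For instance, build_export_url("doi:10.5072/FK2/EXAMPLE", "dataverse_json") returns the address of a JSON export for a fictitious dataset; replace the DOI with the one shown on your dataset’s webpage. If the request fails, use the Export Metadata button described above instead.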

  • How does SODHA manage dataset versions?

Please see the SODHA Dataset Version Management Policy.



Access to data


  • Who grants access to restricted data?

If depositors restrict access to one or several files of their dataset, they will receive e-mail notifications when someone requests access to those files. See this page for more information.

  • I requested access to a restricted file/dataset a while ago but I didn’t get any feedback. What do I do?

If you clicked on Request Access some time ago and you didn’t get any feedback, you can contact the depositor of this dataset by using the Contact Depositor button on the webpage of the dataset.

If you are still unable to reach them, you can also try contacting the research center to which the author of the dataset is affiliated (usually mentioned in the field Author). See also the Producer field.

  • How can I restrict access to my dataset’s files?

Check the boxes that correspond to the files for which you want to restrict access, then click on the topmost Edit button and select Restrict:

File add and edit interface with Restrict option highlighted

You will then be the only authorized user who can download the files.

File with restricted access as seen by depositor

The padlock icon indicates that the file cannot be downloaded. For you, the depositor, it appears in green and open because depositors can download their own files. To other users, it will appear closed and in red:

File with restricted access as seen by other, non-admin users

Please note that, as stated in the SODHA Deposit Agreement and the SODHA Access and Reuse Policy, when depositors submit for publication datasets that contain one or several files with restricted access, they must fill in the field Terms of Access in the Terms section of their dataset’s metadata:

Terms tab with Terms of Access field highlighted



User information


  • How can I change my personal information? How can I change my password?

If you need to change your personal information or your password, click on your user name in the top right corner of any webpage on the SODHA website and select Account Information:

Personal account dropdown menu opened with Notifications option highlighted

Next, click on Edit Account and select either Account Information or Password.

  • Where can I find an overview of my research data?

To see all the datasets that you deposited in SODHA, click on your name in the toolbar and select My Data:

Personal account dropdown menu opened with My Data option highlighted

You will then have access to a list of all the datasets you deposited, along with search facets.

To see all the datasets in which you are mentioned, use the Advanced Search functionality, which you can access on the homepage:

Homepage with Advanced Search button highlighted

You can then enter your name in the Author > Name field.