Find the answers to your questions regarding this repository service, data publishing and preservation here.
What is the Data Repository service?
The Data Repository service is a data publication platform for research datasets. Any data that has been used during a research project and is referred to in published articles can be stored on the platform to preserve it for the long term and make it available to others.
You can start depositing right away: begin with the first step of the online deposit workflow and upload your data (requires registration). If you need to publish large datasets or have special requirements for your publication, please contact us.
Who can use the Data Repository?
The Data Repository is open to all researchers and scientists affiliated with research institutions or universities, as well as to individual researchers. Registration is required to deposit research data, and also to access and download files of specific privately shared datasets.
Why would I publish my dataset?
By storing and publishing your dataset in a repository, you ensure that the data remains available and preserved for a long time, and you make it findable and reusable for other researchers. For example, once you have published an article, it is worthwhile to make the research data that was used and produced during the project available to other researchers or anyone else. They can then replicate the steps and reproduce the conclusions of the published article.
Depending on the conditions in your funding contract, publishing in a trusted repository might successfully conclude your research project, because you then comply with the requirements of the funder and/or publisher concerning the handling of your research data.
Can I publish any data I have or find elsewhere online?
No. You cannot (re)publish data that does not belong to you: you need to hold the rights to and ownership of the data, and if the data is not yours, explicit permission from the original owner is required. Also, the service is meant for the publication of research data; any other data will be rejected.
So what does this all cost?
Uploading and publishing small datasets is free, but requires registration and therefore approval by the service administrators. Larger datasets exceeding several gigabytes (GB) need a data contract, which can be requested and negotiated by contacting us.
There is no limit on the number of deposits and/or collections you create.
What is needed for publication?
To publish a dataset you need one or more files that can be published, and you need to meet the basic metadata requirements, such as providing a title and description for your publication. It is important that you are allowed to publish the data and, if applicable, own the intellectual property for the data and metadata you are publishing.
What is metadata and why should I add it to my publication?
Metadata is the accompanying information that describes and enriches the dataset and its files. It makes sure the dataset can be understood, discovered, correctly cited and authorised for use by other parties. Metadata is added during the creation of the publication and should not be altered after the publication is finalised.
In general, several types of metadata are distinguished: technical, descriptive and additional metadata. Technical metadata, such as file and system information and persistent identifiers, is gathered and assigned automatically by the repository. Descriptive and possibly additional metadata need to be provided by the data producer and/or owner of the dataset. Basic (mandatory) metadata includes, among others, the title, author, description, date and license of the dataset. If more descriptive metadata is available that is useful to add to the publication, it can be added during the ingest of the data.
Much of the metadata is assigned to the publication using a schema, usually called the metadata schema. Basic metadata is defined in the default metadata schema, but communities can also create their own schemas that can be used for publication of datasets.
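As an illustration of the mandatory metadata mentioned above, the following minimal Python sketch checks a metadata record for missing basic fields before a deposit is made. The key names and the function name are assumptions chosen for this example, and the list covers only the five fields named above; the repository's default schema may use different names and require more.

```python
# Hypothetical key names for the basic (mandatory) metadata named in this FAQ.
# The repository's actual default schema may differ.
MANDATORY_FIELDS = ("title", "author", "description", "date", "license")


def missing_mandatory_fields(metadata: dict) -> list[str]:
    """Return the mandatory fields that are absent or empty in `metadata`."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = metadata.get(field)
        # Treat None, empty strings and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example record with an empty description and no license.
record = {
    "title": "Example measurements",
    "author": "A. Researcher",
    "description": "",
    "date": "2020-01-01",
}
print(missing_mandatory_fields(record))  # prints ['description', 'license']
```

Running a check like this locally can catch incomplete records before you start the online deposit workflow.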
What is a persistent identifier?
A persistent identifier (PID) is a unique identifier for a digital object and consists of a prefix number followed by a unique string of characters, for instance a UUID. Prefixes are assigned by internationally recognised DONA (Digital Object Numbering Authority) agencies and hosted at a PID service provider. The Data Repository creates PIDs that are based on the Handle system. The repository's principal PID provider for allocation and resolution of PIDs is hosted at SURFsara. You can read more about PIDs on the website of Persistent Identifiers for eResearch.
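The prefix and suffix structure described above can be sketched in a few lines of Python. In the Handle system the prefix and suffix are separated by the first slash, and handles can generally be resolved through the public Handle proxy at hdl.handle.net. The prefix and UUID in the example are made up, the function names are hypothetical, and it is an assumption that this repository's PIDs resolve through the public proxy.

```python
# Public Handle proxy; whether this repository's PIDs resolve here is an assumption.
HANDLE_RESOLVER = "https://hdl.handle.net/"


def split_handle(handle: str) -> tuple[str, str]:
    """Split a Handle into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a valid handle: {handle!r}")
    return prefix, suffix


def resolver_url(handle: str) -> str:
    """Build a URL that asks the Handle proxy to resolve `handle`."""
    split_handle(handle)  # raises ValueError if the handle is malformed
    return HANDLE_RESOLVER + handle


# Made-up prefix and UUID suffix, for illustration only.
example = "12345/123e4567-e89b-12d3-a456-426614174000"
print(split_handle(example))
print(resolver_url(example))
```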
How can I organise my data?
There are several ways to organise your data. Generally, files that belong to a single dataset can be published in a single deposit. If your dataset is very large and can be divided naturally or easily, you can split it across several deposits. These deposits can then be grouped into a collection with its own metadata. From this collection, other users can easily find each deposit.
If you have many collections and deposits to make, doing this through the website can be tedious. Please contact the advisors of SURFsara for assistance in organising your dataset deposits.
My dataset is very large. What can I do to publish it?
If your dataset is several gigabytes or even terabytes in size, it can be difficult or even impossible to upload the data through a browser to the Data Repository. This largely depends on the size of your dataset, the speed of your network connection and other technical limitations, such as browser timeouts. Also, the repository currently does not accept files larger than 2 GB, and a new deposit is limited to 10 files and a total size of 10 GB.
To avoid these problems, please contact the advisors of SURFsara to set up efficient data transfers to a separate storage location. Once the transfers have completed, the advisors will help you create the deposits and collections.
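To judge whether a planned deposit fits within the upload limits mentioned above, a quick local check along the following lines may help. This is a sketch under assumptions: it reads the limits as 2 GB per file, 10 GB in total and 10 files per new deposit, it takes 1 GB to mean 1024³ bytes, and the function name is hypothetical. The repository may count sizes differently.

```python
# Limits as read from this FAQ; 1 GB is assumed to be 1024**3 bytes.
MAX_FILE_BYTES = 2 * 1024**3
MAX_TOTAL_BYTES = 10 * 1024**3
MAX_FILES = 10


def deposit_problems(file_sizes: dict[str, int]) -> list[str]:
    """Return reasons why a set of files (name -> size in bytes) exceeds the limits."""
    problems = []
    if len(file_sizes) > MAX_FILES:
        problems.append(f"{len(file_sizes)} files exceed the limit of {MAX_FILES}")
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} is larger than 2 GB")
    if sum(file_sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("total size is larger than 10 GB")
    return problems


# Example: one small file and one file of 3 GB.
planned = {"readme.txt": 2_048, "raw_data.bin": 3 * 1024**3}
print(deposit_problems(planned))  # prints ['raw_data.bin is larger than 2 GB']
```

An empty list means the files fit within the assumed limits; otherwise, contacting the advisors as described above is the way forward.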
If you have any other questions not mentioned here, please contact us. We will get back to you as soon as possible.