...

The Data Repository service is a data publication platform for research data sets. Any data that has been used during research projects and is referred to in published articles can be stored on the platform to preserve it for the long term and make it available to others.

...

The Data Repository is open to all researchers and scientists affiliated with research institutions or universities, as well as to individual researchers. Registration is required to deposit research data, and also to access and download files of specific closed data sets.

Why would I publish my data set?

By storing and publishing your data set in a repository you ensure that the data is preserved and available for a long time, and you make it findable and reusable for other researchers. For example, as soon as you have published an article, it is worthwhile to make the research data that was used and produced during the project available to other researchers or anyone else. They can then replicate the steps and reproduce the conclusions of the published article.

...

So what does this all cost?

Larger data sets need data publication contracts, which can be requested and negotiated by contacting us. There is no limit on the number of deposits or collections you create.

What is needed for publication?

In order to publish a data set you need one or more files that can be published, and you must adhere to the basic metadata requirements, such as providing the title and description of your publication. It is important that you are allowed to publish the data and, if applicable, own the intellectual property rights to the data and metadata you are publishing.

Furthermore, for large publications the following information is needed:

  • How many data sets are being published?
  • Which file formats are used and what kind of data is contained?
  • How many files are included in the data sets and what is their file size distribution?
  • How many downloads are expected in the short and long term?
  • Which publications (articles) are related to the data set publications?
  • For how long does the data set need to be available?

In order to get the data to our premises, it is useful to know where your data is currently residing.
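
The questions above about the number of files and their size distribution can be answered with a short script before you contact us. The following is a minimal sketch, not a tool provided by the Data Repository: the function names and the size buckets are illustrative assumptions that you can adjust to your own data.

```python
from collections import Counter
from pathlib import Path

# Hypothetical size buckets as (upper bound in bytes, label); adjust as needed.
BUCKETS = [
    (1024**2, "< 1 MiB"),
    (100 * 1024**2, "1 MiB - 100 MiB"),
    (1024**3, "100 MiB - 1 GiB"),
]
LARGEST = ">= 1 GiB"


def bucket_for(size: int) -> str:
    """Return the label of the first bucket whose upper bound exceeds size."""
    for upper, label in BUCKETS:
        if size < upper:
            return label
    return LARGEST


def summarise(root: str) -> dict:
    """Count the files under root and group their sizes into buckets."""
    sizes = [p.stat().st_size for p in Path(root).rglob("*") if p.is_file()]
    return {
        "file_count": len(sizes),
        "total_bytes": sum(sizes),
        "distribution": dict(Counter(bucket_for(s) for s in sizes)),
    }
```

Calling `summarise("path/to/your/data")` returns the file count, the total size in bytes and the number of files per bucket, which can be included in a publication request.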

What is metadata and why should I add it to my publication?

Metadata is the accompanying information that describes and enriches the data set and its files. It makes sure the data set can be understood, discovered, correctly cited and authorised for use by other parties. Metadata is added during the creation of the publication and should not be altered after the publication is finalised.

In general, several types of metadata are distinguished: technical, descriptive and additional metadata. Technical metadata, such as file and system information and persistent identifiers, is automatically gathered and assigned by the repository. Descriptive and possibly additional metadata need to be provided by the data producer and/or owner of the data set. Basic (mandatory) metadata includes, among others, the title, author, description, date and license of the data set. If more descriptive metadata is available that is useful to add to the publication, it can be added during the ingest of the data.

Much of the metadata is assigned to the publication using a schema, usually called the metadata schema. Basic metadata is defined in the default metadata schema, but communities can also create their own schemas that can be used for the publication of data sets.
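
Because metadata should not be altered after a publication is finalised, it can help to check a record for completeness before depositing. The following is a minimal sketch that only covers the mandatory fields named above; the field names and the `missing_fields` helper are illustrative assumptions, and the actual default schema may require more fields or use different names.

```python
# Mandatory fields named in the text ("among others"). The field names are
# illustrative assumptions, not the repository's actual schema keys.
MANDATORY_FIELDS = ("title", "author", "description", "date", "license")


def missing_fields(metadata: dict) -> list:
    """Return the mandatory fields that are absent or empty in metadata."""
    return [
        field
        for field in MANDATORY_FIELDS
        if not str(metadata.get(field) or "").strip()
    ]


# Hypothetical record, for illustration only.
record = {
    "title": "Example data set",
    "author": "A. Researcher",
    "description": "",
    "date": "2020-01-01",
}
print(missing_fields(record))  # ['description', 'license']
```

An empty result means all fields in the sketch are filled in; it does not replace the validation performed by the repository itself.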

What is a persistent identifier?

...

There are several ways to organise your data. Generally, files that belong to a single data set can be published in a single deposit. If your data set is very large and can be divided naturally or easily into different deposits, this is certainly possible. These deposits can thereafter be grouped into a collection with its own metadata. From this collection another user can easily find each deposit in that collection.

If you have many collections and deposits to make, it can be tedious to do this through the website. Please contact the advisors of SURFsara for assistance in organising your data set deposits.

My data set is very large, what can I do to publish it?

If your data set exceeds several terabytes, it can be difficult or even impossible to upload the data through a browser to the Data Repository. This largely depends on the size of your data set, the speed of your network connection and other technical limitations, such as browser timeouts. Also, the repository currently limits a new deposit to files of at most 2 GB each, a total size of 10 GB and a maximum of 10 files.
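
The deposit limits mentioned above can be checked locally before starting an upload. The following is a minimal sketch under stated assumptions: it takes GB to mean 10^9 bytes, which the text does not specify, and the `check_deposit` helper is hypothetical rather than part of the Data Repository.

```python
GB = 10**9  # assumption: the text does not say whether GB means 10^9 or 2^30 bytes

MAX_FILE_BYTES = 2 * GB    # largest allowed single file
MAX_TOTAL_BYTES = 10 * GB  # largest allowed total size of a new deposit
MAX_FILES = 10             # largest allowed number of files in a new deposit


def check_deposit(sizes: dict) -> list:
    """Return a list of problems for a planned deposit; empty if within limits.

    sizes maps each file name to its size in bytes.
    """
    problems = []
    if len(sizes) > MAX_FILES:
        problems.append(f"{len(sizes)} files exceed the limit of {MAX_FILES} files")
    if sum(sizes.values()) > MAX_TOTAL_BYTES:
        problems.append("total size exceeds 10 GB")
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} is larger than 2 GB")
    return problems
```

If the returned list is not empty, the data set will likely need to be split over several deposits or handled through a separate arrangement.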

...

How can I start publishing my data set?

First contact us to explain and discuss your publication needs. Once we have approved your request and you have been given access, you can start depositing right away: begin with the first step of the online deposit workflow and upload your data.

If you need to publish large data sets or have special requirements for your publication, also contact us.

...