Certification support seminar on FAIR data
By Josefine Nordling and Henrik Jakobsen
The EOSC-Nordic group focusing on FAIR data arranged a plenary webinar on certification support in early September for a selected group of Nordic and Baltic data repositories. The webinar aimed to inform participants about the relevant certifications that are available and that incorporate FAIR elements, where adherence to best practices and standards is key to enabling trust.
The webinar also addressed the concrete support measures the project can realistically offer data repositories that choose to participate in the certification process or parts of it. Lastly, the webinar presented a successful certification adoption story, which offered several inspirational takeaways for repositories considering taking part in the certification process.
The FAIR data work package leader Andreas Jaunsen from NordForsk introduced the project and its work so far. During the project’s first year, the task force working with FAIR practices assessed the FAIR maturity levels of around 100 data repositories. Ten randomly selected datasets per data repository were included in the machine-actionable FAIR maturity assessment exercise. The assessment is based on 22 maturity indicators, which together address all the letters of FAIR. Around 25 % of the selected data repositories did not meet the minimum requirement of having their datasets identified by a globally unique identifier (GUID) and were thus discarded from the sample. The task force found that around two-thirds of the assessed data repositories scored medium-low, and 12 % scored high in the FAIR maturity evaluation. The FAIR maturity assessments will be repeated regularly, as data repositories might improve their scores throughout the project’s lifetime. Some general lessons from this exercise are that all data repositories should be registered in the Registry of Research Data Repositories, re3data.org, to make them discoverable. All datasets should be identified by a GUID (preferably a persistent identifier). The concept of FAIR digital objects for published datasets should be implemented. There also needs to be a specific license agreement in place covering all datasets. Read more about this work in the newly published deliverable D4.1 An assessment of FAIR-uptake among regional digital repositories.
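The sampling and scoring steps described above can be illustrated with a minimal Python sketch. It is not the task force's actual tooling: the `Dataset` class, the function names, the rule that a single dataset without a GUID discards the whole repository, and the band thresholds are all assumptions made for illustration. Only the sample size of ten and the 22 indicators come from the text.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

NUM_INDICATORS = 22  # number of maturity indicators, as stated in the text
SAMPLE_SIZE = 10     # datasets sampled per repository, as stated in the text


@dataclass
class Dataset:
    guid: Optional[str]      # globally unique identifier, or None if missing
    passed_indicators: int   # hypothetical input: indicators passed (0..22)


def sample_datasets(datasets: List[Dataset], k: int = SAMPLE_SIZE,
                    seed: int = 0) -> List[Dataset]:
    """Randomly pick up to k datasets from a repository (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(datasets, min(k, len(datasets)))


def repository_score(sample: List[Dataset]) -> Optional[float]:
    """Mean fraction of indicators passed, or None if the repository is discarded.

    Assumption: one sampled dataset without a GUID discards the repository.
    """
    if not sample or any(not d.guid for d in sample):
        return None
    total_passed = sum(d.passed_indicators for d in sample)
    return total_passed / (NUM_INDICATORS * len(sample))


def band(score: Optional[float], low: float = 0.4, high: float = 0.7) -> str:
    """Map a score to a maturity band; the thresholds are hypothetical."""
    if score is None:
        return "discarded"
    if score >= high:
        return "high"
    if score >= low:
        return "medium"
    return "low"


# Usage with made-up data: twelve datasets, each passing 20 of 22 indicators.
repo = [Dataset(guid=f"doi:10.1234/example-{i}", passed_indicators=20)
        for i in range(12)]
sample = sample_datasets(repo)
print(band(repository_score(sample)))
```

Re-running the same functions at a later date would correspond to the regular re-evaluation mentioned above, since only the input data changes.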
Certifications + FAIR requirements & support modes
Mari Kleemola from the Finnish Social Science Data Archive (FSD) introduced the FAIR ecosystem, the TRUST principles, and FAIR certification in the first part of the webinar. According to Mari, certification requires effort. Still, it ultimately serves as a self-assessment and competence-building exercise, allowing the repository to identify key areas for improvement while also demonstrating its level of FAIRness and trustworthiness to stakeholders, users, and the designated community as a whole. Birger Jerlehag from the Swedish National Data Service (SND) subsequently described how WP4 of the EOSC-Nordic project could support participating repositories in continuously becoming more FAIR or in working towards the CoreTrustSeal (CTS) certification. Support modes include, for example, help with documenting workflows, drafting user licenses, establishing long-term digital preservation strategies, and meeting the needs of the designated community. Birger explained that a short questionnaire would be sent out to all webinar participants to map the repositories’ maturity levels and identify areas where assistance is needed.
Lessons learned from one particular certification process
Trond Kvamme from the Norwegian Centre for Research Data (NSD) shared NSD’s certification adoption story to shed some light on what a certification process means in practice and the benefits it brings, and to inspire others to go down the same path. Their Research Data department initiated the certification process in 2014. NSD acquired the Data Seal of Approval (DSA) certification in January 2015, following two self-assessment exercises in which the requirements were reviewed and an iterative process with a board of reviewers. Helped in part by a CTS database that includes the documented implementation processes of other data repositories, they were able to renew the certification with little effort in 2018, when they acquired the CoreTrustSeal (CTS) certification.
The main drivers for applying for certification were to improve internal processes and to standardise documentation practices. Further motivations were to demonstrate their trustworthiness as a data archive to users and stakeholders and to raise general awareness of research data management practices.
The whole certification process proved to be relatively successful for NSD. They can now identify the areas where their processes align with the OAIS model, which has led to improved process documentation and more efficient processes that are directly linked to the organisational policies and strategy. As a result, they have now initiated a revision of their strategy. Furthermore, the certification process proved to be very helpful when dealing with technical challenges, as it provided the framework and context needed to overcome them.
All in all, the certification process has brought many advantages to NSD, with higher-quality technical tools and processes in place and better-aligned services. A lesson learned is to perform self-assessments regularly in order to keep the documentation, processes, and services up to date and to be better equipped when dealing with FAIR requirements.
Main questions and discussion points during the webinar
The second half of the webinar consisted of a Q&A session. The participants had the opportunity to pose questions to the FAIR data work package leader Andreas Jaunsen and the other presenters, mainly on FAIR scores, technical issues, and certification. In this section, we run through the main questions.
The first set of questions revolved around FAIR data and the technical aspects of examining repositories. For example, participants wanted to know how to gain access to the FAIR Scores, where to find the FAIR Evaluator, what to do if their repository had not been evaluated, and how accurate and comparable the FAIR Scores should be considered. Andreas Jaunsen answered that the FAIR Scores are visible in the D4.1 deliverable, which is now available on the EOSC-Nordic web page. Alternatively, participants could contact him directly via email. Additionally, the FAIR Maturity Evaluator is accessible via GitHub.
Secondly, all repositories taking part in this webinar should have been evaluated in the project’s earlier stages. Should this not be the case, the participants were asked to contact Andreas. It was also noted that recordings of the FAIRification webinar held on the 22nd of April, which presented the FAIR Scores results, are available on our YouTube channel.
Thirdly, in terms of the FAIR Scores’ accuracy and comparability among the sampled repositories, Andreas Jaunsen explained that the FAIR Maturity Evaluator executes 22 specific FAIR metrics, which are technically and precisely defined. The only manually conducted part of the survey was the selection of datasets for a given repository. The results produced by the FAIR Maturity Evaluator itself, however, are considered correct and reliable. A list of FAIRification recommendations for addressing failed FAIR evaluation tests can be found in the EOSC-Nordic Knowledge Hub.
The second set of questions dealt primarily with certification. Participants asked, for example, what the first steps are to become certified and improve FAIRness, what the institutional challenges are with the CTS certification, and how much time and resources it takes.
To the first question, Mari Kleemola responded that the repositories should first and foremost get in contact with this work package. The repositories should also assess the 16 requirements, evaluate how well these fit their needs, and ask themselves how well they understand them and whether they have the corresponding processes, documentation, and practices in place. The focus should be on the quality of the services and their sustainability rather than on certification as such. For an immediate improvement of FAIRness and FAIR metrics, repositories are advised to make their metadata and data machine-actionable and testable. For more inspiration, we recommend reading about FSD’s recent experiences with becoming more FAIR. Moreover, please contact the project or the webinar organisers, and note that EOSC-Nordic WP4 will host several hackathons and FAIRification events in the coming months and years.
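As one possible illustration of the advice above on machine-actionable and testable metadata, the following Python sketch builds a dataset description as schema.org JSON-LD and checks it for a few fields. The webinar did not prescribe any particular format, so the use of schema.org, the function names, the chosen set of required fields, and the placeholder identifier are all assumptions.

```python
import json
from typing import Dict, List, Tuple


def dataset_jsonld(identifier: str, name: str, description: str,
                   license_url: str) -> Dict[str, str]:
    """Build a minimal schema.org Dataset record (illustrative field selection)."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "identifier": identifier,
        "name": name,
        "description": description,
        "license": license_url,
    }


def missing_fields(record: Dict[str, str],
                   required: Tuple[str, ...] = ("identifier", "name", "license")
                   ) -> List[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


record = dataset_jsonld(
    identifier="https://doi.org/10.1234/example",  # placeholder, not a real DOI
    name="Example survey dataset",
    description="Hypothetical dataset used to illustrate the metadata record.",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

# Serialised like this, the record could be embedded in a dataset landing page.
print(json.dumps(record, indent=2))
print(missing_fields(record))
```

A check like `missing_fields` is deliberately simple; its purpose is only to show that metadata in a structured form can be tested automatically, which is harder to do with free-text landing pages.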
Regarding the institutional challenges with FAIR certification, Mari Kleemola answered that certification requires commitment from the institution’s management and that the main challenge is being able to describe the repository’s organisation model properly. Birger Jerlehag stressed that the repository should also ensure that the relevant agreements and contracts are in place (especially if services are outsourced) before proceeding with a process like this. Challenges also relate to sustainability, e.g., a lack of plans for situations such as file formats becoming obsolete, the need for migration, or an (unexpected) end of funding.
Finally, regarding the CTS certification, Mari Kleemola explained that the required time and resources depend on the organisation’s documentation, both internal and public, and on its practices. If the repository does not have these things in place and needs to create a lot of documentation, it will inevitably take more time. The formal CTS certification process, from submission to receiving the certification, takes on average about 4 to 6 months, depending on how many review rounds are needed. The administrative fee for CoreTrustSeal certification is €1,000. Given that the certification is valid for three years, the annual cost works out to about €333. Waivers and volume discounts are available; please see the CoreTrustSeal website for further information. Indeed, CTS is not “one-size-fits-all”. If your repository is not geared towards long-term preservation, it can still be useful to self-assess against the CTS requirements, get feedback from EOSC-Nordic, and improve your processes. WP4 also works with other EOSC-related projects and the CTS Board, as a further goal is to develop the CTS framework so that it better meets the needs of various repositories.
Wrap-up & looking ahead
Is your data repository interested in taking part in the certification process, receiving support in improving aspects of its data management practices, or learning more about the project’s support modes? Feel free to contact the FAIR data work package leader Andreas Jaunsen at andreas.jaunsen(at)nordforsk.org