PaNOSC updates research data policy framework to be FAIR
Ten years after the PaN-data policy, PaNOSC has released an updated FAIR Data Policy framework for research data. It describes a common framework for the management of scientific data at photon and neutron facilities, and will be used by the PaNOSC and ExPaNDS communities to ensure that research data is FAIR.
The participating facilities are used by researchers from universities, publicly funded research entities, and industry. Every experiment at these facilities generates raw data, which is then analysed by the research team. The results of publicly funded research are expected to be published in peer-reviewed scientific journals and made publicly available. In the case of proprietary research, beam time is purchased by the experimental team and the results usually remain confidential.
The original PaNdata framework was strongly influenced by the OECD “Principles and guidelines for access to research data from public funding”. The current framework builds on the PaNdata framework by including the recommendations of the “Turning FAIR data into reality” report by the European Commission’s FAIR Expert Group. The report outlines how to interpret the FAIR principles, a set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable. This framework, like the previous one, strives for a careful balance between competition and collaboration in science.
By definition, to be FINDABLE, a data object should be uniquely and persistently identifiable; to be ACCESSIBLE, it should be reachable by machines and humans under the conditions explained in this policy; to be INTEROPERABLE, the data should use a formal, accessible, shared, and broadly applicable language for knowledge representation; and to be REUSABLE, the data object should have a plurality of accurate and relevant attributes (usage license, provenance, community standards).
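The four definitions can be read as a checklist over a dataset's metadata. The following Python sketch is only an illustration of that reading, not part of the policy framework: the `DataObject` record, its field names, and the `fair_report` helper are all hypothetical, and a real assessment (for example against the FAIR Data Maturity Model) would check far more than whether a field is filled in.

```python
from dataclasses import dataclass


@dataclass
class DataObject:
    """Hypothetical metadata record for one dataset (field names are illustrative)."""
    identifier: str = ""   # unique, persistent identifier
    access_url: str = ""   # where humans and machines can retrieve the data
    data_format: str = ""  # formal, shared format used for the data
    license: str = ""      # usage license
    provenance: str = ""   # how and where the data was produced


def fair_report(obj: DataObject) -> dict:
    """Return a coarse yes/no answer per FAIR letter, based only on filled-in fields."""
    return {
        "findable": bool(obj.identifier),
        "accessible": bool(obj.access_url),
        "interoperable": bool(obj.data_format),
        # Reusability needs both a license and provenance in this simplified check.
        "reusable": bool(obj.license and obj.provenance),
    }


# Example record with placeholder values; provenance is deliberately left empty.
record = DataObject(
    identifier="doi:10.0000/example",
    access_url="https://data.example.org/datasets/1",
    data_format="NeXus/HDF5",
    license="CC-BY-4.0",
)
report = fair_report(record)
print(report)
```

In this sketch the example record counts as findable, accessible, and interoperable, but not as reusable, because its provenance field is empty.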
Having an open access data policy with data in well-defined formats has many benefits:
- It makes previously measured data available for further analysis without the necessity to repeat the experiment.
- It promotes data use, cross-disciplinary research and machine learning.
- Raw data becomes open to scrutiny by other researchers, which ensures scientific integrity and reproducibility of experiments.
- Scientists can mine data in previously unknown ways or reapply new methods to existing data.
The data format is an essential part of making data interoperable and machine readable. Fortunately, the photon and neutron community has a standard data format (NeXus/HDF5) that has been adopted by a majority of photon and neutron sources, and is supported by some detector suppliers and a growing number of data analysis software packages. PaNOSC therefore recommends NeXus/HDF5 as the format for raw data; in addition to the detector data, it includes sample, instrument, and scientific metadata. The full strength of this digital approach will be reached when all data, from the detector to the final publication, are included in a machine-readable digital object, giving full advantage to the experimental team and the scientific community.
The PaNOSC Data Policy framework has been prepared in three phases: (1) a first draft was prepared, based on the PaNdata data policy framework, during the breakout sessions of WP2 at the first Annual Meeting of PaNOSC; (2) a series of ten review meetings was conducted with experts from PaNOSC and ExPaNDS to review and enhance the contents; and (3) the framework was evaluated according to the FAIR Data Maturity Model, which led to further improvements of the data policy.