As a solution consultant, I often get this question. I’d like to give my clients a simple YES or NO answer. However, as a former IT project manager and a data governance adventurer, I can tell you that when we assessed whether a system followed data integrity principles, we rarely answered with just a Yes or a No.
So is your solution FAIR compliant?
That is because the FAIR principles are quite diverse. Most of the time, it is easy to determine whether we fulfil some of the criteria. However, knowing whether we can tick ALL the FAIR boxes is another story and generally takes a (fair) number of workshops with many system and data specialists who:
- Have a deep knowledge of how the system works
- Have a good understanding of the FAIR guidelines
- Are able to define how those guidelines are applicable to the system
Some of the most critical data in the lab are equipment measurements. Producing these measurements takes time, technical abilities, material and instrument resources, and almost all important scientific decisions are based on them. Hence, it is often a top priority to be sure that this data is managed properly, following the FAIR principles.
Putting my solution consultant hat back on, I would like to finally give a proper answer that is a bit more than just Yes. For this article, I have taken a deeper look at how the ONE Lab solution specifically tackles the FAIR principles for equipment measurements and will give you a digest of what I learned.
Back to FAIR basics
I cannot write about FAIR compliance without mentioning the FAIR Guiding Principles (1).
Keep those in mind, as I will explain how ONE Lab addresses each principle in the following sections.
To be Findable
F1. (meta)data are assigned a globally unique and persistent identifier
F2. data are described with rich metadata (defined by R1 below)
F3. metadata clearly and explicitly include the identifier of the data it describes
F4. (meta)data are registered or indexed in a searchable resource
As we cannot do anything with the data unless we are able to find it, I would say that this step is probably the most important.
In ONE Lab, measurements are assigned a unique identifier, along with the various metadata used to describe them (F1/R1).
The main metadata used to describe measurements are:
- The equipment used to produce them
- The creator of the measurement itself
- The context in which it was created (e.g. a specific step in a procedure or an experiment)
- The sample on which the analysis was performed (if applicable)
Each type of descriptor (equipment, creator, context and sample) also has its own set of metadata (F2).
Measurement data and metadata are stored in a database (the Hub) whose architecture and data model follow the F3 and F4 principles. As I will explain in detail in the next section, this database is easily searchable through our RESTful API.
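To make the Findable principles more concrete, here is a minimal Python sketch of a measurement that carries its own identifier and its four descriptors. It is an illustration only: the class names (Descriptor, Measurement, Registry), the field names and the in-memory registry are assumptions made for this example and do not reflect the actual Hub data model.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Descriptor:
    """Hypothetical piece of metadata: equipment, creator, context or sample."""
    kind: str
    label: str
    attributes: dict = field(default_factory=dict)  # the descriptor's own metadata (F2)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Measurement:
    """Hypothetical measurement with a unique identifier (F1)."""
    value: float
    unit: str
    equipment: Descriptor
    creator: Descriptor
    context: Descriptor
    sample: Optional[Descriptor] = None  # only if applicable
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def metadata_record(self) -> dict:
        """Metadata that explicitly includes the identifier of the data it describes (F3)."""
        return {
            "measurement_id": self.id,
            "equipment_id": self.equipment.id,
            "creator_id": self.creator.id,
            "context_id": self.context.id,
            "sample_id": self.sample.id if self.sample else None,
        }


class Registry:
    """Toy in-memory stand-in for a searchable resource (F4)."""

    def __init__(self) -> None:
        self._by_id = {}

    def register(self, measurement: Measurement) -> None:
        self._by_id[measurement.id] = measurement

    def get(self, measurement_id: str) -> Measurement:
        return self._by_id[measurement_id]

    def find(self, kind: str, label: str) -> list:
        """Return the measurements whose descriptor of the given kind has this label."""
        hits = []
        for measurement in self._by_id.values():
            descriptor = getattr(measurement, kind)
            if descriptor is not None and descriptor.label == label:
                hits.append(measurement)
        return hits
```

With such a structure, a measurement can be looked up either directly by its identifier or indirectly through any of its descriptors, which is the essence of F1 to F4.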
To be Accessible
A1. (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1 the protocol is open, free, and universally implementable
A1.2 the protocol allows for an authentication and authorization procedure, where necessary
A2. metadata are accessible, even when the data are no longer available
Now that we know where to find our measurements, we can do something with them only if we are able to access them.
ONE Lab has what we call the Hub API. The Hub API is a standard RESTful API with a resolver that allows us to retrieve virtually any data in the database, based on its unique identifier (A1). Because ONE Lab is a Web application, our RESTful API allows interaction with the database by using standard technologies at an enterprise level (A1.1/A1.2).
As raw measurement data (as well as formatted and calculated data) are stored directly in the database, the data and its descriptors remain available even after the original data has been removed from its source system (A2). For instance, if the original file stored on the computer operating the equipment is removed, the data would still be accessible in ONE Lab along with its associated metadata. Most of the time, our customers also have archiving systems, such as a Scientific Data Management System (SDMS), that act as an additional level of security.
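As an illustration of what "retrievable by identifier over a standard protocol" means in practice, the Python sketch below prepares an authenticated HTTPS GET request for a single identifier using only the standard library. The base URL, the /resolve/ path and the bearer-token scheme are assumptions made for this example; they are not the documented Hub API, so please refer to the actual API documentation for real endpoints.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical base URL, for illustration only.
BASE_URL = "https://hub.example.com/api"


def build_lookup_request(identifier: str, token: str) -> Request:
    """Prepare (but do not send) a GET request for one identifier.

    HTTPS is an open, free and universally implementable protocol (A1.1),
    and the Authorization header carries the credentials (A1.2).
    """
    # Percent-encode the identifier so that it is safe inside a URL path.
    url = f"{BASE_URL}/resolve/{quote(identifier, safe='')}"
    headers = {
        "Authorization": f"Bearer {token}",  # assumed authentication scheme
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")
```

Passing the returned object to urllib.request.urlopen would send the request; the sketch stops before that step because the endpoint is fictional.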
To be Interoperable
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles
I3. (meta)data include qualified references to other (meta)data
Knowing how to access our measurements, we can now start playing around with our data. One of the first things we would like to do is probably enrich it with data from other in-house databases or even compare it to public data.
Often, this is made difficult by the lack of interoperability of our data, which is every data scientist’s nightmare. Here again, we make things easier by allowing you to leverage industry standards to describe measurements, equipment and recipes. The data and associated metadata can also be extracted as JSON from our API and then processed by machines (I1).
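To show why JSON output matters for machine processing, here is a small Python sketch that parses a measurement payload and flattens it into a single row, ready to be joined with other datasets. The payload shape and its field names are invented for this example and are not the actual format returned by the API.

```python
import json

# Hypothetical payload, for illustration only.
PAYLOAD = """
{
  "id": "m-001",
  "value": 7.02,
  "unit": "pH",
  "metadata": {
    "equipment": {"id": "eq-42", "name": "pH meter"},
    "creator": {"id": "u-7", "name": "A. Scientist"},
    "context": {"id": "step-3", "name": "Buffer preparation"},
    "sample": null
  }
}
"""


def flatten(record: dict) -> dict:
    """Turn a nested measurement record into one flat row."""
    row = {
        "measurement_id": record["id"],
        "value": record["value"],
        "unit": record["unit"],
    }
    # Keep only the identifier of each descriptor; a missing one becomes None.
    for kind, descriptor in record["metadata"].items():
        row[f"{kind}_id"] = descriptor["id"] if descriptor else None
    return row


row = flatten(json.loads(PAYLOAD))
```

Because every descriptor keeps its identifier, the flattened row can still be linked back to the full equipment, creator, context or sample record.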
While measurement and equipment definitions can be compatible with the Allotrope Foundation Ontologies (AFO) for vocabularies of measurement terms, our customers are also free to use their own vocabularies that conform to standards that may be important in their industry (2) (I2). Our recipes follow the S88 standard for procedure authoring, and we also offer an integration with QUDT ontologies for unit management.
Our vocabularies and vocabulary entries can be linked to external references (such as identifiers provided by FAIR compliant standards). (Meta)data also have additional descriptors, such as a mapping to system ontologies, which make it possible to define specific relationships between (meta)data (I3).
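The two ideas above (vocabulary entries pointing to external identifiers, and typed relationships between records) can be sketched in a few lines of Python. The class names, predicate names and record identifiers are hypothetical and chosen only for this example. The unit IRI follows the QUDT unit namespace pattern and should be checked against the QUDT catalogue before being relied upon.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class VocabularyEntry:
    """A term, optionally linked to an identifier in an external standard (I2)."""
    term: str
    external_iri: Optional[str] = None


@dataclass(frozen=True)
class QualifiedReference:
    """A typed link between two records: subject --predicate--> object (I3)."""
    subject_id: str
    predicate: str
    object_id: str


def group_by_predicate(subject_id: str, refs: List[QualifiedReference]) -> Dict[str, List[str]]:
    """Collect, per predicate, the records that a given subject points to."""
    grouped: Dict[str, List[str]] = {}
    for ref in refs:
        if ref.subject_id == subject_id:
            grouped.setdefault(ref.predicate, []).append(ref.object_id)
    return grouped


# Illustrative entry; the IRI is assumed to follow the QUDT unit namespace.
DEGREE_CELSIUS = VocabularyEntry("degree Celsius", "http://qudt.org/vocab/unit/DEG_C")

# Hypothetical relationships between a measurement and its descriptors.
REFERENCES = [
    QualifiedReference("m-001", "producedBy", "eq-42"),
    QualifiedReference("m-001", "performedOn", "s-9"),
    QualifiedReference("m-002", "producedBy", "eq-42"),
]
```

The point of qualifying a reference is that the link states how two records are related, not merely that they are related.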
To be Reusable
R1. (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (meta)data are released with a clear and accessible data usage license
R1.2. (meta)data are associated with detailed provenance
R1.3. (meta)data meet domain-relevant community standards
Finally, once you have found, retrieved and processed your data, you want to do it all over again! That is why your data needs to be reusable.
As explained before, measurement data are described by a plurality of metadata (equipment, creator, context, and sample). A critical layer of context is also added by S88 compliant recipes describing laboratory processes. This brings real meaning to results, as it allows data to be understood in the context of the whole process and not just as records coming from the instrument. Associating measurements with processes makes them even more reusable, as it allows for big picture decision-making (regarding an overall process, for instance) (R1).
Measurements are also accessible from different entry points in ONE Lab that fit different user roles:
- From equipment logbooks for metrologists
- From the task review screen for lab managers
- From task plans for lab scientists
Of course, the flexibility offered by the Hub API allows building other types of interfaces to access measurement data and fit other data consumers’ needs (R1). A general practice that we often see is to push the data to a data lake, making it available to a broader audience alongside data coming from other sources.
Measurement data also follow a lifecycle involving policy signatures, which are configured according to the data's level of quality. As we are talking about scientific company data, the usage license is generally defined globally at the enterprise level. ONE Lab allows setting up permissions and collaborative spaces (private, public) accordingly (R1.1).
Thanks to the metadata that comes with equipment measurements in ONE Lab, we are able to trace back to what equipment produced the raw data, where it was retrieved, how it was processed and in which context it was generated, making its “origin story” clear to the user (R1.2).
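A provenance trail of this kind can be pictured as an ordered list of steps. The Python sketch below is a simplified illustration under assumed names (ProvenanceStep, origin_story and the step fields are hypothetical); it is not how ONE Lab stores provenance internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProvenanceStep:
    """One hypothetical event in the life of a measurement."""
    action: str  # e.g. "acquired" or "processed"
    agent: str   # the equipment, software or person responsible
    detail: str  # where or how the step happened


def origin_story(steps: List[ProvenanceStep]) -> str:
    """Render the provenance trail as a numbered, human-readable list."""
    return "\n".join(
        f"{number}. {step.action} by {step.agent}: {step.detail}"
        for number, step in enumerate(steps, start=1)
    )
```

Keeping the steps in chronological order is what lets a reader trace a result back to the equipment that produced the raw data.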
Data standards such as the ones from the Allotrope Foundation or QUDT play an important part in making the data interoperable but also reusable. ONE Lab enables the application of Allotrope and QUDT ontologies to measurements (and their metadata). By leveraging those standards, we guarantee that our measurements will be sustainably described and that the datasets produced over time will be comparable (R1.3).
Is ONE Lab FAIR compliant?
Returning to my first statement, as you have seen, this question requires a rather long explanation. However, for critical data such as equipment measurement data, we have demonstrated that ONE Lab provides the framework to implement the FAIR guidelines. With its architecture, RESTful API, and rich data model compatible with industry standards, ONE Lab can help you bring more data integrity to your lab and ultimately make better-informed decisions today and tomorrow.
Next time you’re wondering “is ONE Lab FAIR compliant?”, the right answer will be: Yes.
(1) Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18