https://confluence.simplprogramme.eu/display/SIMPL/High+Level+Architecture
The data layer encompasses the building blocks to enable the exchange of data assets and applications. The SMP
offers data consumers the means to access different types of data from different providers, enabling
interoperability between providers and consumers. The layer contains the services to share and manage
data and applications and to perform basic analysis.
Two prominent capabilities are Application sharing and Data sharing. They contain the building blocks that providers and consumers need to exchange both data and applications, and they create the connection between stakeholders for sharing data and application resources. The data sharing capability encompasses several functions, including the management of various types of data sharing, from transferring a single file of a few megabytes to transferring a terabyte-sized data dump. The Simpl-Open architecture foresees a simple data transfer mechanism and two special types of data transfer: bulk transfer and data streaming. A datastore connector handles the connection to the data provider's backend data store, which can range from simple file storage to a relational database system. Additionally, this layer addresses a number of scenarios in which Data processing tools are desirable for processing data as close as possible to the source. Among these tools, data anonymization tools help data providers and consumers protect the privacy of data owners. In addition, Simpl-Open offers Data governance tools with which consumers and providers can verify the integrity and quality of the required datasets.
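The datastore connector, the three transfer types, and the integrity check described above can be illustrated with a minimal sketch. It is not part of the Simpl-Open specification: the names `DatastoreConnector`, `FileConnector`, `SqliteConnector`, `sha256_of`, `choose_transfer_mode`, and `BULK_THRESHOLD_BYTES` are hypothetical, the 1 GiB threshold is an arbitrary assumption, and SHA-256 is only one possible way to verify integrity.

```python
import hashlib
import sqlite3
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Iterator

# Assumed cut-off between a simple and a bulk transfer (illustrative only).
BULK_THRESHOLD_BYTES = 1024 ** 3  # 1 GiB


class DatastoreConnector(ABC):
    """Hypothetical interface hiding the provider's backend data store."""

    @abstractmethod
    def read_chunks(self) -> Iterator[bytes]:
        """Yield the dataset as a sequence of byte chunks."""


class FileConnector(DatastoreConnector):
    """Backend: a plain file on simple file storage."""

    def __init__(self, path: Path, chunk_size: int = 65536) -> None:
        self.path = path
        self.chunk_size = chunk_size

    def read_chunks(self) -> Iterator[bytes]:
        with self.path.open("rb") as handle:
            while chunk := handle.read(self.chunk_size):
                yield chunk


class SqliteConnector(DatastoreConnector):
    """Backend: a relational database, here SQLite as a stand-in."""

    def __init__(self, connection: sqlite3.Connection, query: str) -> None:
        self.connection = connection
        self.query = query

    def read_chunks(self) -> Iterator[bytes]:
        # Each row becomes one comma-separated UTF-8 line.
        for row in self.connection.execute(self.query):
            yield (",".join(str(value) for value in row) + "\n").encode("utf-8")


def sha256_of(connector: DatastoreConnector) -> str:
    """Compute a digest a consumer could compare against the provider's."""
    digest = hashlib.sha256()
    for chunk in connector.read_chunks():
        digest.update(chunk)
    return digest.hexdigest()


def choose_transfer_mode(size_bytes: int, continuous: bool) -> str:
    """Pick one of the three transfer types named in the text."""
    if continuous:
        return "streaming"
    return "bulk" if size_bytes >= BULK_THRESHOLD_BYTES else "simple"
```

Because both backends expose the same `read_chunks` method, the integrity check and the transfer logic do not need to know whether the data comes from a file or from a database.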
Sharing applications is similar to sharing data. Indeed, at their core, applications are no more than a collection of data that is marked as executable. However, application sharing differs in its formats and in the fact that multiple files usually need to be combined correctly before the application can run. It also raises additional considerations around the security of executable code and the trust that consumers place in the provided application. Simpl-Open will define the procedures for sharing applications. The application sharing capability considers three types of applications: full-fledged software packages, isolated algorithms, and machine learning models. Each type of application comes with its own specifics and runtime environment requirements that Simpl-Open should adhere to.
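One way to picture the points above (several files that must be combined correctly, and consumer trust in executable code) is a manifest that lists every file of an application together with a checksum. The sketch below is purely illustrative: `ApplicationType`, `ApplicationManifest`, and `verify_bundle` are hypothetical names, and Simpl-Open has not defined this format or procedure.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class ApplicationType(Enum):
    """The three application types named in the text."""
    SOFTWARE_PACKAGE = "software_package"
    ALGORITHM = "algorithm"
    ML_MODEL = "ml_model"


@dataclass(frozen=True)
class ApplicationManifest:
    """Hypothetical description of a shared application."""
    name: str
    app_type: ApplicationType
    entry_point: str            # file that starts the application
    checksums: dict[str, str]   # file name -> expected SHA-256 hex digest


def verify_bundle(manifest: ApplicationManifest, files: dict[str, bytes]) -> list[str]:
    """Return a list of problems; an empty list means the bundle is complete."""
    problems = []
    if manifest.entry_point not in manifest.checksums:
        problems.append(f"entry point not listed: {manifest.entry_point}")
    for filename, expected in manifest.checksums.items():
        content = files.get(filename)
        if content is None:
            problems.append(f"missing file: {filename}")
        elif hashlib.sha256(content).hexdigest() != expected:
            problems.append(f"checksum mismatch: {filename}")
    # Files that the provider never declared are also reported.
    for filename in sorted(files.keys() - manifest.checksums.keys()):
        problems.append(f"unlisted file: {filename}")
    return problems
```

A consumer would run such a check before executing anything, so that an incomplete or tampered bundle is rejected early. A checksum only shows that the files match the manifest; establishing trust in the manifest itself needs a separate mechanism that is out of scope here.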
Additional building blocks of the data layer orchestrate data resources across Simpl-Open actors. The Data orchestration and Distributed execution capabilities allow actors to pool data from different sources and to manage partial sets of data across infrastructure providers when executing distributed applications. Combined, these capabilities allow consumers to gather data from different providers and spread it over distributed infrastructure, where it is fed into an application.
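The flow described above (pool data from several providers, split it into partial sets across infrastructure, feed each set into an application) can be sketched as follows. This is a toy, single-process illustration under stated assumptions: `pool_records`, `partition`, and `run_distributed` are hypothetical names, round-robin is just one possible partitioning strategy, and a real deployment would run each partition on a separate infrastructure provider.

```python
from typing import Callable, Iterable


def pool_records(sources: dict[str, Iterable[dict]]) -> list[dict]:
    """Merge records from all providers, tagging each with its provider."""
    return [
        {**record, "provider": provider}
        for provider, records in sources.items()
        for record in records
    ]


def partition(records: list[dict], nodes: list[str]) -> dict[str, list[dict]]:
    """Assign records to infrastructure nodes in round-robin order."""
    if not nodes:
        raise ValueError("at least one node is required")
    assignment: dict[str, list[dict]] = {node: [] for node in nodes}
    for index, record in enumerate(records):
        assignment[nodes[index % len(nodes)]].append(record)
    return assignment


def run_distributed(
    assignment: dict[str, list[dict]],
    app: Callable[[list[dict]], object],
) -> dict[str, object]:
    """Feed each partial data set into the application (sequentially here)."""
    return {node: app(part) for node, part in assignment.items()}
```

For example, pooling three records from two providers, partitioning them over two nodes, and applying a summing function yields one partial result per node, which the consumer can then combine.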