CloudButton Serverless Data Analytics Platform

Objectives


Our main goal is to create CloudButton: a Serverless Data Analytics Platform. CloudButton will democratize big data by greatly simplifying the overall life cycle and programming model thanks to serverless technologies. To demonstrate the impact of the project, we target two settings with large data volumes: bioinformatics (genomics, metabolomics) and geospatial data (LiDAR, satellite).

High Performance Serverless Run-time

We will create the first FaaS compute run-time for Big Data analytics, overcoming the current limitations of existing serverless platforms.

Mutable Shared Data Middleware for Serverless Computing

We will create Distributed Mutable Data Structures leveraging the Red Hat Infinispan In-Memory Data Grid. Our middleware will provide language-level constructs for data persistence, dependability and concurrency control to serverless functions.

CloudButton Toolkit

We will design Serverless Cloud Programming Abstractions that can express a wide range of existing data-intensive applications with minimal changes. We will also develop new tools and methodologies to port existing data-intensive applications from the HPC, data analytics and machine learning domains to the CloudButton toolkit.



CloudButton toolkit


Open Source software results for the CloudButton project.


Lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs in the cloud
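
To make "embarrassingly parallel" concrete, here is a minimal sketch of the pattern: the same function is applied to independent chunks of data and the partial results are combined afterwards. It uses only the Python standard library as a local stand-in for cloud functions and is not the Lithops API. The names `word_count`, `parallel_map` and the sample chunks are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(chunk: str) -> int:
    """Independent task: count the words in one chunk of text."""
    return len(chunk.split())


def parallel_map(func, items, max_workers=4):
    """Apply func to every item independently; results keep the input order.

    Assumption: local threads stand in for serverless function invocations.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


# Hypothetical input: each chunk can be processed without the others.
chunks = [
    "serverless data analytics",
    "big data",
    "embarrassingly parallel jobs in the cloud",
]

counts = parallel_map(word_count, chunks)  # one task per chunk
total = sum(counts)                        # combine the partial results
print(counts, total)
```

Because no task depends on another, the number of workers can grow with the number of chunks, which is the property a serverless back end can exploit.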

Faasm

High-performance stateful serverless runtime based on WebAssembly

Crucial

Stateful Distributed Applications over Function-as-a-Service Platforms


Infinispan

Data grid platform and highly scalable NoSQL cloud data store

Lithops-METASPACE

Lithops-based Serverless implementation of the METASPACE spatial metabolomics annotation pipeline

Serverless shell

Shell scripting for serverless


Use cases


Genomics

Serverless technologies can overcome the scaling limitations of research centres' computational resources, improving scalability and productivity when processing large datasets.

Metabolomics

Expand the analysis of metabolomics raw data and boost external access and efficient re-use of open data.

Geospatial

Conduct geospatial analyses in order to increase productivity, scalability and performance of relevant environmental applications using open access LiDAR and satellite data.

Results


Deliverables

D1.1 Public Project Website
D2.1 Experiments and Initial Specifications
D2.2 Data Management Plan, 1st version
D2.3 CloudButton Architecture Specs and Early Prototypes
D2.4 Data Management Plan, 2nd version
D2.5 Reference Implementation of Architectural Building Blocks
D2.6 Data Management Plan, 3rd version
D3.1 Initial specs of the Serverless Compute and Execution Engine
D3.2 Serverless Compute Engine Design and Prototypes
D3.3 Serverless Compute Engine Reference Implementation
D4.1 Initial prototype for stateful serverless computation
D4.2 Specification and partial support for degradable objects
D4.3 Full implementation of the BLOSSOM middleware
D5.1 CloudButton Initial API Definition
D5.2 CloudButton Prototype of Abstractions, Fault-tolerance and Porting Tools
D5.3 CloudButton Toolkit Reference Implementation
D6.1 Communication plan
D6.2 Communication report
D6.3 Final dissemination, exploitation, and adoption report

Publications - Green/Gold Open Access


Partners


The CloudButton consortium is a well-balanced team of industrial and academic partners.

CloudButton contributes to Big Data Value Public-Private Partnership activities:

BDVA Public-Private Partnership

About


Project title: CloudButton: Serverless Data Analytics Platform
Grant agreement ID: 825184
Partners:
    Universitat Rovira i Virgili (Spain)
    IBM Israel - Science and Technology LTD (Israel)
    Imperial College London (United Kingdom)
    Institute Mines-Télécom (France)
    Red Hat Limited (Ireland)
    European Molecular Biology Laboratory (Germany)
    Atos Spain SA (Spain)
    The Pirbright Institute (United Kingdom) [Participation ended]
    Answaretech SL (Spain) [Participation ended]
    Fundación Matrix, Investigación y Desarrollo Sostenible (Spain)
    The James Hutton Institute (United Kingdom)
Duration: January 2019 - March 2022
Overall budget: 4,277,507.50€
Programme: Horizon 2020 Framework Programme (H2020) >
    PRIORITY 'Industrial leadership' >
       Leadership in enabling and industrial technologies >
          Information and Communication Technologies (ICT)
Topic: ICT-12-2018-2020 - Big Data technologies and extreme-scale analytics
Funding scheme: RIA - Research and Innovation action