CloudButton Serverless Data Analytics Platform

Objectives


Our main goal is to create CloudButton: a Serverless Data Analytics Platform. CloudButton will democratize big data by greatly simplifying the overall life cycle and programming model through serverless technologies. To demonstrate the impact of the project, we target two settings with large data volumes: bioinformatics (genomics, metabolomics) and geospatial data (LiDAR, satellite).

High Performance Serverless Run-time

We will create the first FaaS compute run-time for Big Data analytics, overcoming the limitations of existing serverless platforms.

Mutable Shared Data Middleware for Serverless Computing

We will create Distributed Mutable Data Structures leveraging the Red Hat Infinispan In-Memory Data Grid. Our middleware will provide serverless functions with language-level constructs for data persistence, dependability and concurrency control.

CloudButton Toolkit

We will design Serverless Cloud Programming Abstractions that can express a wide range of existing data-intensive applications with minimal changes. We will also develop new tools and methodologies to port existing data-intensive applications from the HPC, data analytics and machine learning domains to the CloudButton toolkit.
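As a rough illustration of what "minimal changes" could look like, the sketch below runs an ordinary function over a list of inputs through an executor that exposes a `map()` method. It does not use the CloudButton toolkit API: `count_gc` and `run_map` are hypothetical names, and a local thread pool from the Python standard library stands in for a serverless backend. The assumption is that a serverless executor offering the same `map()` interface could be swapped in without touching the calling code.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(sequence):
    """Return the number of G or C bases in a DNA sequence."""
    return sum(1 for base in sequence.upper() if base in "GC")


def run_map(func, items, executor_factory=ThreadPoolExecutor):
    """Apply func to every item through an executor with a map() method.

    The default is a local thread pool. Assumption: a serverless executor
    with the same context-manager and map() interface could be passed in
    as executor_factory instead.
    """
    with executor_factory() as executor:
        # executor.map preserves the order of the input items.
        return list(executor.map(func, items))


# Toy input standing in for a large genomics dataset.
reads = ["ACGT", "GGCC", "ATAT"]
gc_counts = run_map(count_gc, reads)
print(gc_counts)
```

The application logic (`count_gc`) stays a plain function; only the executor passed to `run_map` would change when moving from local to remote execution.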

Use cases


Genomics

Serverless technologies can overcome the scaling limitations of research centres' computational resources, improving scalability and productivity when processing large datasets.

Metabolomics

Expand the analysis of raw metabolomics data and boost external access to, and efficient re-use of, open data.

Geospatial

Conduct geospatial analyses to increase the productivity, scalability and performance of relevant environmental applications using open access LiDAR and satellite data.

Results


Deliverables

D1.1 Public Project Website

D2.1 Experiments and Initial Specifications

D2.2 Data Management Plan, 1st version

D3.1 Initial specs of the Serverless Compute and Execution Engine

D4.1 Initial prototype for stateful serverless computation

D5.1 CloudButton Initial API Definition

D6.1 Communication plan

D2.3 CloudButton Architecture Specs and Early Prototypes

D2.4 Data Management Plan, 2nd version

D3.2 Serverless Compute Engine Design and Prototypes

D4.2 Specification and partial support for degradable objects

D5.2 CloudButton Prototype of Abstractions, Fault-tolerance and Porting Tools

D6.2 Communication report

Publications - Green/Gold Open Access

On the FaaS Track: Building Stateful Distributed Applications with Serverless Architectures

FaaS Orchestration of Parallel Workloads

ServerMix: Trade-Offs and Challenges of Serverless Data Analytics

Please, do not decentralize the Internet with (permissionless) blockchains!

On the correctness of Egalitarian Paxos

Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing

Triggerflow: Trigger-based Orchestration of Serverless Workflows (Conference paper)

Benchmarking parallelism in FaaS platforms

Decentralize the feedback infrastructure!

Serverless End Game: Disaggregation enabling Transparency

Serverless Predictions: 2021-2030

Triggerflow: Trigger-based Orchestration of Serverless Workflows (Journal)

Transparent Serverless execution of Python multiprocessing applications

Trade-Offs and Challenges of Serverless Data Analytics

Stateful Serverless Computing with Crucial

Outsourcing Data Processing Jobs with Lithops

EGEON: Software-Defined Data Protection for Object Storage

MLLess: Achieving Cost Efficiency in Serverless Machine Learning Training

Bringing scaling transparency to Proteomics applications with serverless computing

Serverless Elastic Exploration of Unbalanced Algorithms

A milestone for FaaS pipelines; object storage vs VM-driven data exchange

Primula: a Practical Shuffle/Sort Operator for Serverless Computing

Leaderless State-Machine Replication: Specification, Properties, Limits

State-machine replication for planet-scale systems

The serverless shell

J-NVM: Off-heap Persistent Objects in Java

Efficient Replication via Timestamp Stability

OffsampleAI: artificial intelligence approach to recognize off-sample mass spectrometry images

Spatial Metabolomics and Imaging Mass Spectrometry in the Age of Artificial Intelligence

Using Biological Signals for Mass Recalibration of Mass Spectrometry Imaging Data

[Degree thesis] A compressed file partitioner for scalable Genomics analysis with Serverless technology

[Degree thesis] Porting Genomics pipelines to the Cloud. Serverless Computing as an avenue for scalable variant calling

[Degree thesis] Serverless OCaml Genomic Pipeline Parallelisation Engine

[Master's thesis] Machine Learning on a Serverless Architecture

[Master's thesis] Painless Data Analytics in the Cloud. Grouping data in serverless architectures

[Master's thesis] Study of the Feasibility of Serverless Access Transparency for Python Multiprocessing Applications

Partners


The CloudButton consortium is a well-balanced team of industrial and academic partners, listed in the About section below.

CloudButton contributes to Big Data Value Public-Private Partnership activities:

BDVA Public-Private Partnership

About


Project title: CloudButton: Serverless Data Analytics Platform
Grant agreement ID: 825184
Partners:
    Universitat Rovira i Virgili (Spain)
    IBM Israel - Science and Technology LTD (Israel)
    Imperial College London (United Kingdom)
    Institute Mines-Télécom (France)
    Red Hat Limited (Ireland)
    European Molecular Biology Laboratory (Germany)
    Atos Spain SA (Spain)
    The Pirbright Institute (United Kingdom) [Participation ended]
    Answaretech SL (Spain) [Participation ended]
    Fundación Matrix, Investigación y Desarrollo Sostenible (Spain)
    The James Hutton Institute (United Kingdom)
Duration: January 2019 - March 2022
Overall budget: 4,277,507.50€
Programme: Horizon 2020 Framework Programme (H2020) >
    PRIORITY 'Industrial leadership' >
       Leadership in enabling and industrial technologies >
          Information and Communication Technologies (ICT)
Topic: ICT-12-2018-2020 - Big Data technologies and extreme-scale analytics
Funding scheme: RIA - Research and Innovation action