Big Data Management and Analytics: Spatio-Temporal Split Vertical Federated Learning

  • Jose Antonio Lorencio Abril

Research output: Thesis (Master's Thesis)

Abstract

Data are central to any Machine Learning (ML) application but often remain scattered across different parties' databases, hindering the development of effective and reliable models. Reluctance to share valuable data assets, driven by competitive concerns and strict privacy laws such as the General Data Protection Regulation (GDPR) in Europe, adds complexity to data sharing. The problem is further complicated for spatio-temporal data, which can reveal individual identities through movement patterns when merged with other data sources, creating a barrier to enhancing ML training processes through broader data sharing.

Federated Learning (FL) has been proposed as a solution to data-sharing limitations: it provides a secure way to collaboratively train an ML model without sharing the raw data. FL variants include Horizontal Federated Learning (HFL), in which the data are partitioned among clients in the sample space (each client holds different samples described by the same features), and Vertical Federated Learning (VFL), in which the partition is in the feature space (each client holds different features of the same samples).
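
The difference between the two partitioning schemes can be illustrated with a toy table. The following is a minimal sketch, not code from the thesis: the sensor identifiers, feature names, values, and the helper functions `horizontal_split` and `vertical_split` are all hypothetical and chosen only to show which part of the table each client would hold.

```python
# Toy dataset: rows are samples (e.g. sensors), columns are features.
# All names and values are hypothetical, for illustration only.
FEATURES = ["speed", "flow", "temperature", "humidity"]
DATA = {
    "s1": [52.0, 310.0, 18.5, 0.61],
    "s2": [47.5, 290.0, 19.0, 0.58],
    "s3": [60.1, 350.0, 17.2, 0.66],
    "s4": [38.9, 240.0, 20.3, 0.55],
}


def horizontal_split(data, sample_groups):
    """HFL-style partition: each client holds all features for a subset of samples."""
    return [{sid: data[sid] for sid in group} for group in sample_groups]


def vertical_split(data, features, feature_groups):
    """VFL-style partition: each client holds a subset of features for all samples."""
    clients = []
    for group in feature_groups:
        idx = [features.index(name) for name in group]
        clients.append({sid: [row[i] for i in idx] for sid, row in data.items()})
    return clients


# Two HFL clients: same four features, disjoint sets of samples.
hfl_clients = horizontal_split(DATA, [["s1", "s2"], ["s3", "s4"]])

# Two VFL clients: same four samples, disjoint sets of features.
vfl_clients = vertical_split(
    DATA, FEATURES, [["speed", "flow"], ["temperature", "humidity"]]
)
```

In the horizontal case each client could train the same model on its own rows, whereas in the vertical case no single client sees a complete feature vector, so the clients must cooperate to produce each prediction.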


In this Master's thesis, we demonstrate the potential of FL for privacy-preserving forecasting of space-time series. We propose a VFL framework in which different clients hold different forecasting tasks and data that are relevant to one another, and we show through a set of experiments that the proposed framework effectively leverages spatial and temporal information to perform privacy-preserving forecasts.
Original language: English
Qualification: Master of Science
Awarding Institution
  • Université Paris-Saclay
  • Université libre de Bruxelles
  • Polytechnic University of Catalonia (UPC)
Supervisors/Advisors
  • Graser, Anita, Supervisor
  • Seghouani, Nacéra, Advisor, External person
Award date: 6 Sept 2024
Publication status: Published - Sept 2024

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 11 - Sustainable Cities and Communities

Research Field

  • Multimodal Analytics

Keywords

  • Machine learning
  • Federated learning
  • Geographical information systems
