The Final Meeting is a two-half-day closing event of the DFG Priority Programme "Algorithms for Big Data". It is open to the public. The program includes talks by selected participants of the priority programme and by external experts in this and neighboring fields.

 

Dates:

Begin: Thursday, June 9, 2022, at 13:00

End: Friday, June 10, 2022, around 13:00

 

Venue:

Campus Westend, foyer of the PA Building (Google Maps), Goethe-Universität Frankfurt; see here for directions.

 

Further Information:

The program features sessions with invited speakers as well as shorter talks by SPP members.

The confirmed invited speakers are listed in the schedule below.

Schedule

Thursday

12:30 - 13:00 Registration and Coffee
13:00 - 13:20 Opening Remarks
Prof. Dr. Ulrich Meyer
13:20 - 14:05 Invited Talk 1: Learned Indexes
Prof. Dr. Paolo Ferragina, University of Pisa


Key-value stores and search engines pose a continuously growing need to efficiently store, retrieve, and analyze massive sets of (possibly long) keys under the many different requirements posed by users, devices, and applications. This new level of complexity could not be handled properly by known data structures, so academic and industrial researchers have recently begun to devise new approaches that integrate classic data structures (such as B-trees) with various kinds of machine-learning models and tools, giving rise to what are now called “Learned Data Structures”. In this talk, I’ll survey the evolution of compressed and learned data structures, that is, structures offering efficient access to compressed data, discuss their theoretical and experimental performance, and point out new challenges worthy of future research.
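The core idea behind a learned index can be illustrated with a minimal sketch (a toy example for this announcement, not the speaker's actual data structures): fit a simple model that predicts a key's position in a sorted array, record the model's maximum error, and then search exactly only within that error window.

```python
import bisect

class LearnedIndex:
    """Toy learned index: a linear model predicts the position of a key
    in a sorted array; an exact search is done within the error bound."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        # Fit a simple linear model: position ~ slope * key + intercept.
        # (A least-squares fit would be the usual choice; the two
        # endpoints are used here only for brevity.)
        k0, k1 = self.keys[0], self.keys[-1]
        self.slope = (n - 1) / (k1 - k0) if k1 != k0 else 0.0
        self.intercept = -self.slope * k0
        # Maximum prediction error over all keys.
        self.err = max(abs(self._predict(k) - i)
                       for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(self.slope * key + self.intercept)

    def lookup(self, key):
        """Return the rank of `key` in the sorted keys, or None if absent."""
        p = self._predict(key)
        lo = max(0, p - self.err)
        hi = min(len(self.keys), p + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

The point of the construction is that the final search touches only an error-bounded window rather than the whole array; real learned indexes replace the single line with piecewise models and much tighter error guarantees.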

14:05 - 14:30 Contrib Talk 1: Project Massive Text Indices
Dr. Florian Kurpicz

 

A text index is a data structure that provides additional information about a given text to speed up answering different types of queries, e.g., pattern matching queries that ask whether (or how often, or where) a pattern occurs in the text.
In this talk, I present the main results on scaling text indices that we developed in the project "Massive Text Indices". In addition, I give an overview of recent results that go beyond the scope of classical text indices.
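As a minimal illustration of the kind of queries a text index answers, here is a toy suffix-array index (not the scalable structures developed in the project) that locates all occurrences of a pattern by binary search:

```python
import bisect

def build_suffix_array(text):
    """Naive suffix-array construction: sort all suffix start positions
    lexicographically by the suffix they start. Scalable construction is
    exactly what the project addresses; this O(n^2 log n) version is
    only for illustration."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def occurrences(text, sa, pattern):
    """Return the sorted starting positions of `pattern` in `text`.
    All suffixes starting with `pattern` form a contiguous range in the
    suffix array, found here by binary search on the length-|pattern|
    prefixes of the sorted suffixes."""
    prefixes = [text[i:i + len(pattern)] for i in sa]
    lo = bisect.bisect_left(prefixes, pattern)
    hi = bisect.bisect_right(prefixes, pattern)
    return sorted(sa[lo:hi])
```

Counting occurrences is then just `hi - lo`, i.e., the width of the matching range, without scanning the text.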

14:30 - 15:00 Break
15:00 - 15:45 Invited Talk 2: Quarks and Bytes
Prof. Dr. Volker Lindenstruth, FIAS & U Frankfurt/M

 

At the Large Hadron Collider (LHC) at CERN in Geneva, very large scale experiments are conducted. In these experiments, each nucleus-nucleus collision can generate thousands of new elementary particles. For the understanding of the dynamics and the underlying physics, it is paramount to identify those particles. They are measured with a variety of detectors in order to determine their various properties. In addition, the particles traverse a large magnetic field so that their momentum can be determined. Typically, those detectors measure space points, which have to be connected to the particles' trajectories; this generates a combinatorial background that makes the computing cost prohibitively expensive. Novel algorithms based on cellular automata and Kalman filters have been developed to minimize the computational overhead. These algorithms exhibit a linear dependence of the tracking time on the number of space points. The Kalman filter was likewise optimized to perform very fast. Today, these algorithms are the baseline for many new detector developments. The ALICE experiment at the LHC has undergone a significant upgrade and will resume beam time in a few weeks. The data rates exceed 600 GB/s, and all data of the experiment have to be processed online. An appropriate compute farm with 16,000 CPU cores and 2,000 GPUs, capable of handling this task, has been deployed. The processed data are stored in a 100 PB high-performance mass storage system, from which further analysis is performed. The data are also distributed worldwide for regional analysis. The presentation will outline the research field and the data-processing challenges involved.

15:45 - 16:10 Contrib Talk 2: Graph-Based Methods for Rational Drug Design
Prof. Dr. Petra Mutzel


Rational drug design deals with computational methods to accelerate the development of new drugs. Among other tasks, it is necessary to analyze huge databases of small molecules. Since a direct relationship between the structure of these molecules and their effect (e.g., toxicity) can be assumed in many cases, a wide set of methods is based on the modeling of the molecules as graphs with attributes.

In this talk, we will survey our main results concerning structural molecular similarity searches and molecular clustering and put them into the wider context of graph similarity search. In particular, we discuss algorithms for computing graph similarity w.r.t. maximum common subgraphs and their extension to domain specific requirements.

We will end our talk with future work planned in the domain of chemistry as well as neurosciences.

16:10 - 16:30 Break
16:30 - 17:15 Invited Talk 3: Big Data for Big Challenges: Using Social Media Advertising Data to Advance the SDGs
Dr. Ingmar Weber

 

In 2015, in a rare global consensus, all 193 UN member states adopted the 2030 Agenda for Sustainable Development to tackle the world’s most pressing challenges. This agenda is organized around 17 Sustainable Development Goals (SDGs), ranging from No Poverty to Gender Equality to Peace, Justice and Strong Institutions. But goals are meaningless without a way to track their progress and to identify the areas most in need of help. In this talk, I will present joint work with different UN agencies and academic collaborators on using social media advertising data to monitor international development and humanitarian crises. In a nutshell, Facebook and other advertising platforms provide so-called “audience estimates” of how many of their users match certain targeting criteria, such as being located in a particular area, being of a particular age and gender, or having a particular device type. The answers to these count-oracle queries are provided free of charge, effectively creating a digital census over almost 3 billion users on Facebook’s network. This digital census can be used to map poverty, track digital gender gaps, or monitor migration, including during the current Ukraine crisis. I will describe some of these use cases and discuss challenges related to using noisy big data.
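The count-oracle idea can be sketched abstractly. In the toy code below, `count_oracle` is a hypothetical stand-in for a platform's audience-estimate endpoint (the real APIs are not shown here), and the audience numbers are made up purely for demonstration:

```python
def digital_gender_gap(count_oracle, region):
    """Toy illustration of deriving an indicator from audience-estimate
    count oracles: the ratio of women to men reachable on the platform
    in a region, a simple proxy for a digital gender gap.

    `count_oracle(criteria)` returns how many users match the given
    targeting criteria; it is a hypothetical stand-in, not a real API."""
    women = count_oracle({"region": region, "gender": "female"})
    men = count_oracle({"region": region, "gender": "male"})
    return women / men if men else None

# A fake oracle backed by made-up numbers, for demonstration only.
fake_audience = {
    ("X", "female"): 400_000,
    ("X", "male"): 500_000,
}

def fake_oracle(criteria):
    return fake_audience[(criteria["region"], criteria["gender"])]
```

Real indicators combine many such counts across regions, ages, and device types, and must correct for the noise and rounding the platforms apply to their estimates.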

17:15 - 17:40 Contrib Talk 3: Is there a moral obligation for algorithm designers? From the choice of random graph models to fairness in machine learning
Prof. Dr. Katharina Zweig

In the SPP, my project was concerned with better algorithmic methods for computing the expected number of neighbors of any two given nodes in a large graph. We explored ideas in software and hardware and gained some surprising insights. This project was the final part of a research question that I started working on in 2007; while this has been one of the most interesting intellectual challenges I have worked on as an algorithm designer, the usage of the random graph model raises some societal questions as well. In this talk, I will report on our findings in the SPP project and then sketch some societal questions that arise from algorithmic design decisions. Finally, I will discuss whether there is a moral obligation for algorithm designers to anticipate the possible consequences of their design choices - and give a tentative answer.

  Social Event/Dinner at Campus Westend

Friday

09:00 - 09:45 Invited Talk 4: Recent Advances in Fully Dynamic Graph Algorithms: A Practitioner's View
Dr. Kathrin Hanauer, U Wien

 

Graphs are a powerful means to model various interrelationships, and in many scenarios they are almost naturally subject to change. While a number of algorithms for dynamic settings have been designed and analyzed in theory in recent years, such algorithms and their environments are far less well understood in practice.

In this talk, we will consider, as examples, recent dynamic algorithms for two fundamental graph problems from a practical perspective. We will also look at the differences and challenges in evaluating dynamic algorithms experimentally, as opposed to static algorithms, and at what we can learn from them.
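As one concrete instance of a dynamic graph problem, incremental connectivity (edges are only inserted, never deleted) can be solved with a classic union-find structure; this is a standard textbook example, not necessarily one of the problems covered in the talk:

```python
class IncrementalConnectivity:
    """Union-find with path halving and union by size: supports edge
    insertions and connectivity queries on a dynamic graph in
    near-constant amortized time per operation."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, v):
        # Path halving: make v point at its grandparent as we walk up.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def insert_edge(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        # Union by size: attach the smaller tree under the larger root.
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def connected(self, u, v):
        return self.find(u) == self.find(v)
```

The fully dynamic setting of the talk's title is much harder precisely because edge deletions break this simple merge-only structure.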

09:45 - 10:30 Contrib Talk 4: Project Fast Inexact Combinatorial and Algebraic Solvers for Massive Networks
Dr. Alexander van der Grinten
Contrib Talk 5: Scalable Generation of Random Graphs
Dr. Manuel Penschuck
10:30 - 11:00 Break
11:00 - 11:25 Contrib Talk 6: Project Scaling Up Generic Optimization
Prof. Dr. Joachim Giesen, Uni Jena, & Prof. Dr. Soeren Laue, TU Kaiserslautern

 

The goal of generic optimization (GENO) is to generate optimization solvers from formal problem specifications. We present our approach to generic optimization and discuss algorithmic challenges and solutions for scaling this approach to different compute backends such as multicore CPUs, GPUs, clusters, and databases.

11:25 - 12:10 Invited Talk 5: Workflow-efficient Algorithms
Prof. Dr. Christian Scheideler, U Paderborn

 

By "workflow-efficient algorithms" I mean algorithms that minimize work (in terms of computational work or messages) and span (i.e., the length of the critical path in the computation), possibly by making use of hybrid architectures and specialized hardware. Workflows are mostly known from business processes, where it is natural to make use of parallelism and of units with different, specialized capabilities. This also appears to be the only way to achieve continued scalability in computing, since clock frequencies and miniaturization appear to have reached their limits. Appropriate models and architectures for workflow-efficient approaches have appeared in many contexts, such as MapReduce, (Adaptive) Massively Parallel Computation (MPC), processing-in-memory (PIM) architectures, cloud-assisted computing, and even oracle-based and quantum computing, while hardware developments like CPUs with hybrid architectures, GPUs, TPUs, FPGAs, and RDMA allow for many more approaches in this direction. In my presentation, I will give some simple examples of workflow-efficient algorithms from parallel and distributed computing to motivate future research on workflow-efficient algorithms.
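The work/span terminology can be made concrete with a standard example (a parallel reduction, chosen here for illustration and not taken from the talk): summing n numbers by combining disjoint pairs in rounds has work Θ(n) but span Θ(log n), since each round's additions are independent.

```python
def reduction_work_and_span(values):
    """Sum `values` by halving rounds, counting total additions (work)
    and the number of sequential rounds (a proxy for span: within a
    round, all pair-sums could run in parallel)."""
    values = list(values)
    work, span = 0, 0
    while len(values) > 1:
        # One parallel round: combine disjoint pairs.
        pairs = [values[i] + values[i + 1]
                 for i in range(0, len(values) - 1, 2)]
        work += len(pairs)       # each addition is one unit of work
        if len(values) % 2:      # odd element carries over unchanged
            pairs.append(values[-1])
        values = pairs
        span += 1                # one more step on the critical path
    return (values[0] if values else 0), work, span
```

For 16 inputs this performs 15 additions (work n-1, the same as a sequential loop) in only 4 rounds (span log2 n), which is exactly the gap between work and span that workflow-efficient design tries to exploit.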

12:10 - 12:15 TCS4Future: Pledge for sustainable research in theoretical computer science
Prof. Dr. Johannes Fischer
 
 
Human activity over the last decades has been changing our planet's climate by releasing greenhouse gases into the atmosphere. The Earth's atmosphere now contains about 410 ppm carbon dioxide (compared to 280 ppm in the 18th century) [1,2]. The resulting global warming is anticipated to cause considerable harm to human civilization throughout the 21st century and beyond, particularly if the concentration rises above 450 ppm [3]. To mitigate this, we must urgently reduce our carbon emissions. There are initiatives on many levels of international politics setting goals for this reduction [4,5]. For instance, the IPCC panel of the United Nations advocates a 45% reduction of carbon dioxide emissions by 2030 relative to 2010 levels, to limit global warming to 1.5°C [6].

We, as researchers in Theoretical Computer Science, acknowledge that our activities also contribute to the problem, in particular by the greenhouse gas emissions caused by our travel.

We believe that our research community should aim for a significant reduction of carbon emissions and evolve towards more sustainable practices. We believe that this can be achieved while preserving the quality of our research.

As an objective, we commit ourselves to reducing our emissions by at least 50% before 2030 relative to pre-2020 levels.

12:15 - 12:20 Closing Remarks
Prof. Dr. Ulrich Meyer

Sister Event:

Some of our colleagues from the CS department are organizing the 82nd GI Workshop on Algorithms and Complexity (Theorietag) immediately before the SPP Final Meeting at the same venue. Please consider attending that workshop, too.