# Fall Days on Algorithms and Models for Real-Life Systems

*Hotel De Oolderhof, Herten-Roermond,* (see Google maps)

*November 2-6, 2015*

The IPA Fall Days are an annual multi-day event, dedicated to a specific theme of current interest to the research community of IPA. This year’s Fall Days are dedicated to Algorithms and Models for Real-Life Systems, fitting in the Real World Algorithmics and Models focus area of the research school for the period 2013-2018.

The programme of the Fall Days is composed by Michael Kaisers (CWI), Walter Kosters (UL) and Kevin Verbeek (TU/e).

#### Registration

Registration closed on 17 October.

As always, IPA PhD students can attend for free. For others, details on pricing will be made available upon request.

To make maximal use of the available capacity, we process applications on the following basis: Registrations are treated “first come, first served”. All PhD students (IPA and non-IPA) have to share a (quite spacious) room. Others may also be asked to share if we run out of rooms. We will accept participants until we run out of space at the hotel!

#### Programme

Monday 2 November is reserved for the IPA PhD workshop. Our keynote speaker for the PhD Workshop is Bogdan Vasilescu, winner of the IPA Dissertation Award of 2014. The workshop aims to promote the interaction and cohesion with PhD students from other universities on both a social and technical level, provide a hospitable setting in which presentation skills can be demonstrated and improved through constructive feedback by peers, and be fun.

The 3-6 November programme will cover the following topics: Algorithms and models for/inspired by biology; Geographical Information Systems; Smart Energy Systems and Robotics.

**The programme is available below.**

**MONDAY 2 NOVEMBER: PHD WORKSHOP**

11:30-12:00 | Registration |

12:00-13:10 | Lunch |

13:10-13:15 | Opening PhD Workshop |

13:15-13:45 | Mahmoud Talebi, TU/eContinuous Approximation of Stochastic Models for Wireless Sensor NetworksWireless sensor networks are usually composed of a large number of nodes, and with the increasing processing power and power efficiency they are expected to run more complex protocols in the future. This poses problems in the field of verification and performance evaluation of wireless networks. We address this issue by proposing a method of modeling large networks by dynamical systems rather than explicit Markov models, called Mean-Field Approximation. We apply this method to the slotted ALOHA protocol, and establish results on the long term trends of the protocol within very large networks, especially regarding the stability of ALOHA-type protocols. We then extend the modeling technique in order to express characteristics of a network running a CSMA/CA protocol. |

13:45-14:15 | Thomas Nägele, RUTowards a multidisciplinary model-based methodology for the development of motion control applicationsMotion control solutions nowadays are increasingly complex due to the rapid growth of the number of sensors and actuators. A strict separation of the domains involved also increases the development complexity. To improve both quality and speed of the development process, we will aim for close integration of the cyber and the physical part of development. We aim to develop a multidisciplinary model-based methodology that supports close integration of the domains involved in the development of motion control applications. The methodology should enable early validation of design concepts in both software and hardware. |

14:15-14:45 | Vincent Bloemen, UTMulti-Core On-The-Fly SCC DecompositionThe main advantages of Tarjan’s strongly connected component (SCC) algorithm are its linear time complexity and ability to return SCCs on-the-fly, while traversing or even generating the graph. Until now, most parallel SCC algorithms sacrifice both: they run in quadratic worst-case time and/or require the full graph in advance. The current talk presents a novel parallel, on-the-fly SCC algorithm. It preserves the linear-time property by letting workers explore the graph randomly and carefully communicating partially completed SCCs. We prove that this strategy is correct. For efficiently communicating partial SCCs, we develop a concurrent, iterable disjoint set structure (combining union-find with a cyclic list). We demonstrate scalability on a 64-core machine using 75 real-world graphs (from model checking), synthetic graphs (combinations of trees, cycles and linear graphs), and random graphs. Previous work did not show speedups for graphs containing a large SCC. We observe that our parallel algorithm is typically 10-30×faster compared to Tarjan’s algorithm for graphs containing a large SCC. Comparable performance (with respect to the current state-of-the-art) is obtained for graphs containing many small SCCs. |

14:45-15:00 | Coffee break |

15:00-15:30 | Önder Babur, TU/eTowards Statistical Comparison and Analysis of ModelsModel comparison is an important challenge in model-driven engineering, with many application areas such as model versioning and domain model recovery. There are numerous techniques that address this challenge in the literature, ranging from graph-based to linguistic ones. Most of these involve pairwise comparison, which might work, for instance, for model versioning with a small number of models to consider. However, they mostly ignore the case when there is a large number of models to compare, such as in common domain model/metamodel recovery from multiple models. In this paper we present a new perspective on model comparison and analysis. We propose the representation of models in vector space model as in information retrieval, and applying clustering techniques, to compare and analyse a large set of models. We demonstrate our approach on synthetic datasets of models generated via genetic algorithms from a subset of Ecore. |

15:30-16:00 | Wytse Oortwijn, UTDistributed Binary Decision Diagrams for Symbolic ReachabilityBinary decision diagrams are used in symbolic verification to efficiently represent state spaces. A crucial algorithm is symbolic reachability, which systematically yields all reachable system states. We propose novel algorithms for BDD-based symbolic reachability targeting high-performance networks of multi-core machines. Additionally, we provide a distributed hash table, cluster-based work stealing operations, and several caching structures that employ Remote Direct Memory Access (RDMA). All distributed data structures are lockless and carefully designed to minimize the number of roundtrips. We demonstrate speedups up to 45x with 64 machines, relative to sequential runs. Besides maintaining good time-efficiency, the proposed algorithms also allow substantially more memory to be efficiently employed, compared to non-distributed implementations. |

16:00-16:30 | Coffee break |

16:30-16:35 | Award Ceremony |

16:30-17:30 | Bogdan Vasilescu, UC DavisDeciphering secrets of social coding“Social coding” is a phrase made popular by GitHub, the online collaborative coding platform home to millions of users and repositories. It has come to represent a paradigm shift in software development, especially in the open source world. Coding has always been human-centric, but never more so than with the advent of GitHub. Code is rarely written for oneself, is meant to be shared, discussed, and changed by others as needed.This talk reports on some of my PhD adventures in learning about the social side of software engineering, from transparent platforms such as GitHub and Stack Overflow. Through a series of mixed-methods empirical studies, combining quantitative (data mining and statistical analysis) data with qualitative (survey-based) evidence, I will offer insights into some of the effects associated with developing software in such a public, social world. |

17:30-18:30 | Drinks |

18:30- | Dinner |

**TUESDAY 3 NOVEMBER**

09:00-10:00 | Bettina Speckmann, TU/eAlgorithms for Geographic DataA significant part of today’s data is geographic, has a geographic component (is geo-referenced), or benefits from geographic interpretation. In this talk I will first give an introduction to geographic data as it appears in various application areas, such as automated cartography and motion data analysis. I will then discuss algorithms for two representative problems: (1) the automated construction of flow maps, a specific type of thematic map, and (2) the analysis of the collective motion of a set of moving entities like people or birds. |

10:00-10:30 | Coffee |

10:30-11:15 | Kevin Buchin, TU/eAlgorithms for Animal Movement AnalysisThe recent explosion in animal movement data has resulted in the need for new algorithms for their analysis. We consider several examples of ecological questions and the resulting algorithmic challenges. Taking the problem of delineating stopovers in migration flight as a starting point we focus on the algorithmic problem of trajectory segmentation, that is, splitting a trajectory based on movement characteristics. We discuss algorithms for two approaches to this problem. Criteria-based segmentation splits a trajectory such that each part fulfills a given spatio-temporal criteria. Model-based segmentation assumes an underlying movement model and segments by fitting this model, balancing the likelihood of the model and its complexity. |

11:15-12:00 | Frank Staals, Madalgo, DenmarkRepresentative TrajectoriesAn important task in trajectory analysis is defining a meaningful representative for a set of similar trajectories. How to formally define and find such a representative is a challenging problem. We propose and discuss two possible definitions, and present efficient algorithms to compute such representatives. The definitions differ in how they incorporate the temporal component of the trajectories. In case time is not important, our definition measures the quality of a representative trajectory in terms of the homotopy area between the representative and the input trajectories. We investigate which properties of the trajectories allow us to compute a representative trajectory efficiently, and show that the problem is NP-hard in general. In case time is important, we define a representative c that consists of pieces of the input trajectories, switches from one entity to another only if they are within a small distance of each other, and such that at any time t, the point c(t) is as central as possible. We measure centrality in terms of the radius of the smallest disk centered at c(t) enclosing all entities at time t, and we discuss how our techniques can be adapted to other measures of centrality. |

12:15-13:30 | Lunch |

14:00-15:00 | Jan-Henrik Haunert, Universität OsnabrückOptimization approaches for automation in cartographyIn cartography, maps are classically produced with different scales and thus with different levels of detail. Map generalization is the problem of deriving a map of a smaller scale from a given map. It is usually divided into sub-problems such as the selection of important objects for the output map, the geometric simplification of lines, and the aggregation of areas. To develop automatic solutions for those problems, a model of cartographic constraints and criteria of cartographic quality needs to be formalized. Once this model has been defined, it can be solved by optimization. In my talk I will give an overview on map generalization and present a method for the aggregation of areas. I will then present an optimization-based approach for the generation of focus+context maps, in which different scales are combined in a single graphic, for example, to provide a user with both detailed information about an area of interest and overview information about the geographic context. |

15:00-15:30 | Coffee |

15:30-16:15 | Wouter Meulemans, giCentre (UK)Shaping Maps: Models, Measures and AlgorithmsThis talk encompasses two problems that relate to modeling and measuring shapes and the corresponding algorithms.The first problem is about the creation of schematic maps (abstract, stylistic depictions of geographic regions). To capture “shape” in this context, we review prominent similarity measures, assessing their suitability as an optimization criterion for the schematization process. Based on this, we present a heuristic algorithm, guided by a similarity measure, that can effectively and efficiently compute schematic maps.The second problem revolves around the question “what is the shape of a point set?”. We look at the geometric modeling and present shortest-path graphs as a way to describe shape. These shortest-path graphs are illustrated in two use cases from GIS and cartographic visualization. |

16:15-17:00 | Arthur van Goethem, TU/eAhead of the Curve - Curved SchematizationSimilar to cartographic generalization, schematization is focussed on deriving a map with a lower level of detail. In contrast to generalization, however, the maintained accuracy is not the main concern. Schematization is concerned with maintaining or emphasing the key message in a map. This is done using the “less is more” principle. By removing unrequired detail, the key concepts gain more focus. Very informally (very, very informally) one could argue, “generalization tries to maximize the detail for a given scale”, whereas “schematization tries to minimize the detail for a given scale”.Curves are a natural fit for schematization. They are able to represent many different shapes, significantly reducing the complexity. While manual cartography uses curves often, automated methods have left this area reasonably unexplored.In this talk we investigate three different forms of automated schematization using curves. Firstly we look at different ways to schematization country boundaries. Secondly, we investigate how to schematize delays on (road) networks. Lastly, we let our minds run free and explore the limits of schematization. If we push the concept of schematization to its extremes, does it still make sense? And what do we end up with? Is it possible to reach a near stenography simply by using schematization? |

18:00-20:00 | Dinner |

**WEDNESDAY 4 NOVEMBER**

09:00-10:00 | Dragan Bosnacki, TU/eAlgorithms for reverse engineering of biological networksDiscovering the connections between the genes in the living cells is still one of the open research questions in genetics and in particular in computational biology. Besides the fundamental importance, this kind of knowledge paves the way for prevention and development of medicines for various diseases.We present algorithms that combine statistics and graph theory that are used to reverse engineer genetic networks from so called perturbation experiments. In this kind of experiments the activity of a set of genes is artificially perturbed and the effects on the activity of the other genes is measured. The task is to establish the true connections between genes or at least to rank the possible interactions based on their likelihood. In this context an important problem is to distinguish between direct and indirect connections. We show how the scale-free structure of the biological networks can be exploited for efficient network inference. We also demonstrate how some of the algorithms can be parallelized, e.g., using GPUs, which makes them potentially scalable to the entire genome (the set of all genes) of an organism. |

10:00-10:30 | Coffee |

10:30-11:15 | Jetty Kleijn, ULPetri Net ModellingPetri nets describe local changes in a system caused by activities of local entities. Thus the framework of Petri nets provides an approach to the modelling of distributed systems based on an explicit representation of potentially concurrent behaviour. A wealth of theory and techniques has been developed for analysis of behaviour, simulation, and visualisation of the systems modelled in this way. The combination of graphical representation and underlying precise mathematical definitions makes Petri nets attractive for many application areas like the modelling of biological systems.In this lecture we introduce basic elements of the Petri net modelling approach and demonstrate how Petri nets may be applied to represent various aspects of biological processes. The process of gradient formation in systems biology serves as an example. |

11:15-12:00 | Katy Wolstencroft, ULWorkflows for Bioinformatics AnalysesScientific workflow systems, such as Taverna, enable the chaining together of distributed analysis tools and data resources into complex analysis pipelines. Workflows are ideal for analysing high throughput omics data, but they are equally applicable for gathering distributed data for the interpretation of biological results or for the population of biological models. In practice, many workflows combine both types of analysis. Here we describe examples of analysing and integrating biological data using Taverna workflows. Through Taverna, scientists have access to several thousand different tools and resources that are freely available, from a large range of life science institutions. The workflows themselves are reusable, executable models of the bioinformatics protocols they encapsulate and can be shared in workflow repositories, such as myExperiment. |

12:15-13:30 | Lunch |

14:00-15:00 | Michel Westenberg, TU/eVisual analytics for integrative network analysisBiological networks pose many challenges for visualization of both topological structure and node/edge attributes. There are different types of network, each with specific structural characteristics. For example, gene regulatory networks (describing gene-to-gene interactions) exhibit out-degrees with a scale-free distribution, in-degrees bound by a low maximum and few and small cycles. Besides these global structural properties, subnetworks of a specific structure, called motifs, provide important knowledge about gene regulatory networks to domain experts. Standard layout methods typically do not exploit these structural characteristics. The first part of the talk is about visualization techniques that were designed specifically for a certain type of biological network. The second part of the talk is devoted to integrative network analysis, in which node attributes play an important role. Integrative network analysis makes use of computational methods to extract smaller subnetwork modules, and performs enrichment analysis to annotate the modules with ontology terms or other available knowledge. Visual analysis of these annotated modules allows an analyst to focus on the relevant parts of the network only, and to quickly formulate hypotheses about the underlying biology. |

15:00-15:30 | Coffee |

15:30-16:15 | Fons Verbeek, ULFrom models to data to models ...This presentation will address the use of model systems in biology to reveal the mechanisms of the normal and abnormal functioning of an organism. Lots of different molecular engineering methods have boosted the research in biology as a consequence of which a deluge of different data produced. These data need for organised and need to support the further research. We will start from models that simply support to add dimensions to visualization. Next, a model should support the state of the art thinking and, of course finally we arrive at models that help us predict. The data that we will use are images, features, stochastics and domain knowledge, all derived from typical model systems like zebrafish and mouse. |

16:15-17:00 | Stefano Schivo, UTANIMO: get down from your throne and play with biological networksThanks to a constant technological improvement, biological laboratories can produce increasingly larger amounts of data. However, precisely because of the amount of data, it is difficult to interpret the results of many experiments. Bioinformatics was born to solve this problem: applying computational methods to the scariest problems of big data, we make the biologist’s life simpler. So, with the magnificent power of computer science at their fingertips, biomedical researchers will make short work of all illnesses, right? No. Unfortunately, there is a catch: biologists have studied biology, not computer science. The most powerful computational methods are nearly useless if they are too complex to be accessed by a biologist. With ANIMO (Analysis of Networks with Interactive MOdeling), we bring to the biologists the power of computational methods, taking the excellent formal models from their glorious pedestal and bringing them among the “real scientists”. ANIMO combines the power of advanced computational methods with a user-friendly interface that makes the biologist feel at home. With a few mouse clicks, the domain expert (which is now a biologist, and not a computer scientist) can compose a model of known biochemical interactions and match it against experimental data. This helps to formulate new hypotheses, which in turn will drive experimental research further on. The day when doctors will use model checking to choose the most effective medicines for each patient is nearer than we think! |

18:00-20:00 | Dinner |

**THURSDAY 5 NOVEMBER**

09:00-10:00 | Hendrik Jan Hoogeboom, ULGraph Polynomial for Gene RearrangementsCiliates are an ancient group of unicellular organisms. They have the remarkable property that their DNA is stored in two vastly different types of nuclei. During conjugation a germline nucleus called the micronucleus (MIC) is transformed into a somatic nucleus called the macronucleus (MAC). In this way, each MIC gene is transformed into its corresponding MAC gene, in a process that we call gene assembly. Various formal models for this gene transformation process were proposed.One of the models studied is the assembly polynomial (by Burns etal.) which abstracts basic features of the assembly process, like the number of segments excised. We give a general introduction to the mathematics for gene assembly, and the properties of this polynomial in particular.Joint work with Robert Brijder (Hasselt, B) |

10:00-10:30 | Coffee |

10:30-11:15 | Peter Bosman, CWIMulti-objective optimization for deformable image registration of medical imagesCurrently, two major challenges dominate the field of deformable image registration. The first challenge is related to the tuning of the developed methods to specific problems (i.e. how to best combine different objectives such as similarity measure and transformation effort). This is one of the reasons why, despite significant progress, clinical implementation of such techniques has proven to be difficult. The second challenge is to account for large anatomical differences (e.g. large deformations, (dis)appearing structures) that occurred between image acquisitions. In this talk, we present an approach based on multi-objective optimization to improve registration robustness and to simplify tuning for specific applications. We specifically consider the use of an advanced model-based evolutionary algorithm for optimization and a dual-dynamic transformation model (i.e. two “non-fixed” grids: one for the source- and one for the target image) to accommodate for large anatomical differences. The approach computes and presents multiple outcomes that represent efficient trade-offs between the different objectives (a so-called Pareto front). As a proof of concept, we performed tests on both artificial and real data with disappearing structures. Furthermore, the case of prone-supine image registration for 2D axial slices of breast MRI scans was evaluated. Results demonstrate strong potential of the proposed approach to account for large deformations and (dis)appearing structures in deformable image registration. |

11:15-12:00 | Sander Bohte, CWIContinuous deep-time neural reinforcement learningAs living organisms, one of our primary characteristics is the ability to rapidly process and react to unknown and unexpected events. To this end, we are able to recognise an event or a sequence of events and learn to respond properly. Responding in the real world requires one to learn to recognise both what is important, and also when to act. Reinforcement Learning (RL) is typically used to solve complex tasks: to learn the how. To respond quickly — to learn when — the environment has to be sampled often enough without unduly complicating the learning problem. Here, we derive a continuous-time version of on-policy SARSA reinforcement learning in a working-memory neural network model, AuGMEnT. Using a neural working memory network resolves the what problem, our when solution is built on the notion that in the real world, instantaneous actions of duration dt are actually impossible. We demonstrate how we can decouple action duration from the internal time-steps in the neural RL model using an action selection system, and we show how a hardwired accessory network can avoid the need for alternating forward and backward phasing for learning. This CT-AuGMEnT successfully learns to react to the events of a continuous-time task without pre-imposed specifications about the duration of the events or the delays between them. |

12:15-13:30 | Lunch |

14:00-15:00 | Wolfram Burgard, Freiburg UniversityProbabilistic Techniques for Mobile Robot NavigationProbabilistic approaches have been discovered as one of the most powerful approaches to highly relevant problems in mobile robotics including perception and robot state estimation. Major challenges in the context of probabilistic algorithms for mobile robot navigation lie in the questions of how to deal with highly complex state estimation problems and how to control the robot so that it efficiently carries out its task. In this talk, I will present recently developed techniques for efficiently learning a map of an unknown environment with a mobile robot. I will also describe how this state estimation problem can be solved more effectively by actively controlling the robot. For all algorithms I will present experimental results that have been obtained with mobile robots in real-world environments. |

15:00-15:30 | Coffee |

15:30-16:15 | Jan BroeninkModel-Driven development of Robot SoftwareCurrent research on model-driven design of embedded control software for robotic applications is presented. A Model-driven way of design asks for a structured approach using models and models of models, so meta models. Model-driven design of robot software supports efficiency of the design work, which is really necessary as high-tech industry is in a need for faster development of better quality products. We use an integral approach where both the discrete-event part and the continuous-time part of a computer-controlled robot is modeled. Both a way of working and supporting tools are being developed. The current state of this development is presented. |

16:15-17:00 | Wolfgang Ketter, Erasumus UniversityTBATBA |

18:00-20:00 | Dinner |

20:45- | Social Event |

**FRIDAY 06 NOVEMBER**

09:30-10:30 | Damien Ernst, University of LiègeMicrogrids and their destructuring effects on the electrical industryMicrogrids are small electrical power systems operated in parallel with the main utility grid. These last few years, microgrids have been strongly developing and the classical “Generation-Transmission-Distribution” model seems to be loosing its (almost) absolute predominance. In this talk, we answer three main questions:Question 1: What are the main factors that have set off the development of microgrids?Question 2: Are microgrids an epiphenomenon? If no, to what extend will they develop?
Question 3: What advice to give to (a) power production companies, (b) distribution companies (c) transmission companies, (d) governments to deal with microgrids? |

10:30-11:00 | Coffee |

11:00-12:00 | Michael Kaisers, CWICoopetition in energy systems and optimizationTBA |

12:15-13:30 | Lunch |