Research Seminar Informatica

OUrsi stands for "Open University Research Seminar Informatica". The OUrsi is the research seminar of the Department of Computer Science of the Open University of the Netherlands. OUrsi talks typically take place on the last Tuesday of the month at 11:00 in Heerlen, and are normally streamed online for the benefit of interested parties elsewhere. The OUrsi features presentations on ongoing research in all areas of Computer Science.

If you are interested in giving a presentation, or attending one (live in Heerlen or online), please contact Hugo Jonker. Below are listed the scheduled presentations, with the upcoming OUrsi  highlighted .


No. Date Presenter Title  


33. 16 October Johan Jeuring TBA

Johan Jeuring, OU & University of Utrecht

32. 25 September Josje Lodder Developing an e-learning tool for inductive proofs abstract

Developing an e-learning tool for inductive proofs
Josje Lodder

Proving is a central subject of most logic course. Students attending such courses do not only have to learn how to prove consequences in a formal system, but they also have to reason about formal systems. A typical example of an exercise that occurs in many textbooks is the following: prove that the number of left parentheses in a logical formula is equal to the number of right parentheses. These kinds of properties can be proved by structural induction, where the structure of the proof follows the structure of the inductive definition of the logical language. Students have conceptual as well as technical problems with inductive proofs. In this talk I will discuss our ideas and first results concerning the development of an e-learning tool that helps students to practice with inductive proofs.
31. 26 June Hugo Jonker Investigating Adblock wars abstract

Investigating Adblock wars
Hugo Jonker

We investigate the extent to which websites react to filter list updates. We collected data on the Alexa Top-10K for 120 days, and on specific sites newly added to filter lists for 140 days. By evaluating how long a filter rule triggers on a website, we can gauge how long it remains effective. We matched websites with both regular adblocking filter lists (Easylist) and with filter lists that remove anti-adblocking (primarily UblockProtector / NanoProtector). From our data, we observe that the effectiveness of the Easylist adblocking filter decays initially, but after ~80 days seems to stabilize. However, we found no evidence for any significant decay in effectiveness of the more specialized, but less widely used, Ublock Protector / Nano Protector list.
30. 24 April Stijn de Gouw Human-in-the-Loop Simulation of Cloud Services abstract, slides

Human-in-the-Loop Simulation of Cloud Services
Stijn de Gouw

We present an integrated tool-suite for the simulation of software services which are offered on the Cloud. The tool -suite uses the Abstract Behavioral Specification (ABS) language for modeling the software services and their Cloud deployment. For the real-time execution of the ABS models we use a Haskell backend which is based on a source-to-source translation of ABS into Haskell. The tool-suite then allows Cloud engineers to interact in real-time with the execution of the model by deploying and managing service instances. The resulting human-in-the-loop simulation of Cloud services can be used both for training purposes and for the (semi-)automated support for the real-time monitoring and management of the actual service instances.
29. 27 March Benjamin Krumnow The Shepherd Project: Automated security audits of website login processes abstract, slides

The Shepherd Project: Automated security audits of web login processes
Benjamin Krumnow

HTTP does not have memory. HTTP Cookies were created to address this. These cookies thus allow web sites to remember things, such as whether you are logged in or not. After logging in, the cookies authenticate the user, so that the user does not have to type his password with every click. If an attacker steals these cookies, he could thus impersonate a user. In 2010, the Firesheep plugin for Firefox demonstrated how easy it was to steal such cookies for many popular web sites (Gmail, Facebook, etc.). The problem that Firesheep exploited was that cookies were sent over an insecure channel (HTTP instead of HTTPS). Such an attack is called a session hijacking attack.

In the Shepherd project, we investigate the adoption of countermeasures against session hijacking attacks. Over the last year, we have developed an automatic selenium-based scanner. This scanner performs an automated audit of the security of login processes of web sites.
In this OUrsi talk, I will give you an overview of the current scanning framework, explain how it assesses login security and talk about our current study, where we analyse over 65K websites. I hope that we will have lively discussion to identify possible unclear points and future directions.

28. 27 February Vincent van der Meer Future-proofing recovery of deleted files abstract, slides

To maintain the ability to recover deleted files: about the further development and future-proofing of this digital forensic specialization
Vincent van der Meer

Reconstructing deleted files, file carving, is a crucial part of digital forensic work, often concerning child pornography. The explosive growth of data makes file carving increasingly time-consuming and therefore increasingly inadequate, so that evidence remains hidden. File carving is still set up separately per device and per app, resulting in a considerable delay to develop file carving for new apps. This research will investigate how the app-specific part can be made as small as possible by abstraction and generalization. Spin-off will be a widely applicable, modular, maintainable and expandable framework with corresponding benchmark to compare the results of new technologies.
27. 19 February Hassan Alizadeh Intrusion Detection and Traffic Classification using Application-aware Traffic Profiles abstract, slides

Intrusion Detection and Traffic Classification using Application-aware Traffic Profiles
Hassan Alizadeh

Along with the growing number of applications and end-users, online network attacks and advanced generations of malware have continuously proliferated. Many studies have addressed the issue of intrusion detection by inspecting aggregated network traffic with no knowledge of the responsible applications/services. Such systems fail to detect intrusions in applications whenever their abnormal traffic fits into the network normality profiles. This research addresses the problem of detecting intrusions in (known) applications when their traffic exhibits anomalies. To do so, a network monitoring tool needs to: (1) identify the source application responsible for the traffic; (2) have per-application traffic profiles; and (3) detect deviations from profiles given a set of traffic samples. This work is mainly devoted to the last two topics. Upon an initial survey on traffic classification techniques, this research proposes the use of three different kinds of Gaussian Mixture Model (GMM) learning approaches, namely classic, an unsupervised and Universal Background Model (UBM) to build per-application profiles. The effectiveness of the proposed approaches has been demonstrated by conducting different sets of experiments on two public datasets collected from real networks, where the source of each traffic flow is provided. Experimental results indicate that while the proposed approaches are effective for most applications, the UBM approach significantly outperforms other approaches.
26. 19 February Fenia Aivaloglou How Kids Code and How We Know abstract, slides

How Kids Code and How We Know
Fenia Aivaloglou

Programming education is nowadays beginning from as early as the elementary school age. Block-based programming languages like Scratch are increasingly popular, and there is substantial research showing that they are suitable for early programming education. However, how do children program, and are they really learning? In this talk we will focus on Scratch and explore the programs that children write. By scraping the Scratch public repository we have download and analyzed 250K projects in terms of complexity, used abstractions and programming concepts, and code smells. We will further discuss the effect of code smells to novice Scratch programmers and our experiments in Dutch classrooms. Focusing on how to teach good programming practices to children, we designed the edx MOOC "Scratch: Programmeren voor kinderen" and we analyzed the results of its first run, where more that 2K children actively participated. We found factors that can predict successful course completion, as well as age-related differences in the students' learning performance. To further examine the children's motivation and self-efficacy and their relation to learning performance we run a Scratch course in three schools and we will present our initial findings.
25. 30 January 2018 Greg Alpár Blockchain for identity management, seriously? abstract

Blockchain for identity management, seriously?
Greg Alpár

Right now (Dec. 11, 2017, 15:45) 1 Bitcoin is €14228.71. This amazing success of a cryptocurrency has made the blockchain technology, underlying that of Bitcoin, so exciting for businesses that it is often called the next big revolution after the internet. Blockchains are proposed to solve all kinds of problems in our digital world. Besides the financial context, identity is considered to be another promising area for blockchains. Is it a good idea? In this OUrsi session, I would like to share with you some thoughts with regard to the application of blockchains in identity management. Since there is no big results in our study yet, this talk will hopefully turn into a fruitful discussion.

24. 30 November Pekka Aho Automated Model Extraction and Testing of Graphical User Interface Applications abstract, slides

Automated Model Extraction and Testing of Graphical User Interface Applications
Pekka Aho

The industry practice for test automation through graphical user interface is script-based. In some tools the scripts can be recorded from the user (capture & replay tools) and in others, the scripts are written manually. The main challenge with script-based test automation is the maintenance effort required when the GUI changes. One attempt to reduce the maintenance effort has been model-based testing that aims to generate test cases from manually created models. When the system under testing (SUT) changes, changing the models and re-generating the test cases is supposed to reduce the maintenance effort. However, creating the formal models allowing test case generation requires very specialized expertise. Random testing (monkey testing) is a maintenance-free and scriptless approach to automate GUI testing. In random GUI testing, the tool emulates an end-user by randomly interacting with the GUI under testing by pressing buttons and menus and providing input to other types of GUI widgets. More recent research aims to automatically extract the GUI models by runtime analysis while automatically exploring and interacting with the GUI (or during random testing). Test case generation based on extracted models is problematic but by comparing extracted GUI models of consequent versions it is possible to detect changes on the GUI. This approach aims to address the maintenance problem with scriptless regression testing and change analysis.
23. 30 November Twan van Laarhoven Local network community detection: K-means clustering and conductance abstract, slides

Local network community detection: K-means clustering and conductance
Twan van Laarhoven
Joint work Elena Marchiori

Local network community detection is the task of finding a single community of nodes in a graph, concentrated around few given seed nodes. The aim is to perform this search in a localized way, without having to look at the entire graph. Conductance is a popular objective function used in many algorithms for local community detection. In this talk I will introduce a continuous relaxation of conductance. I will show that continuous optimization of this objective still leads to discrete communities. I investigate the relation of conductance with weighted kernel k-means for a single community, which leads to the introduction of a new objective function, σ-conductance. Conductance is obtained by setting σ to 0. I will present two algorithms for optimizing σ-conductance, which show promising results on benchmark datasets.
22. 27 June Jeroen Keiren An O(m log n) algorithm for stuttering equivalence abstract, slides

An O(m log n) algorithm for stuttering equivalence
Jeroen Keiren

In this presentation, I will discuss a new algorithm to determine stuttering equivalence with time complexity O(m log n), where n is the number of states and m is the number of transitions of a Kripke structure. Theoretically, this algorithm substantially improves upon existing algorithms, which all have time complexity of the order O(mn) at best. Moreover, when compared with other algorithms, the new algorithm has better or equal space complexity. Practical results confirm these findings: they show that the algorithm can outperform existing algorithms by several orders of magnitude, especially when the Kripke structures are large.
21. 13 June Stef Joosten Software development in relation algebra with Ampersand abstract, slides

Software development in relation algebra with Ampersand
Stef Joosten

Relation algebra can be used as a programming language for building information systems. We will demonstrate this principle by discussing a database-application for legal reasoning. In this case study, we will focus on the mechanisms of programming in relation algebra. Besides being declarative, relation algebra comes with attractive promises for developing big software. The case study was compiled with Ampersand, a compiler whose design and development is the focus of a research collaboration between Ordina and the Open University.
20. 9 May,
13.30 - 14.30
Hieke Keuning Code Quality Issues in Student Programs abstract, slides

Code Quality Issues in Student Programs
Hieke Keuning

Abstract: Because low quality code can cause serious problems in software systems, students learning to program should pay attention to code quality early. Although many studies have investigated mistakes that students make during programming, we do not know much about the quality of their code. We have examined the presence of quality issues related to program-flow, choice of programming constructs and functions, clarity of expressions, decomposition and modularization in a large set of student Java programs. We investigated which issues occur most frequently, if students are able to solve these issues over time and if the use of code analysis tools has an effect on issue occurrence. We found that students hardly fix issues, in particular issues related to modularization, and that the use of tooling does not have much effect on the occurrence of issues.
19. 4 April Greg Alpár The Alphabet of ABCs abstract, slides

The Alphabet of ABCs
Greg Alpár

The digital world becomes more and more a part of the physical world. Data, bound to our identity, relationships and communications, is processed all around the world. This may result in privacy harms, including unwanted advertising, data creep, exclusion, insecurity, etc. Although much of this data could possibly be used without linking personally to you, currently everybody seems to accept that we have to give up on the control over our privacy. A fundamental technological paradigm today is that you almost have to identify yourself in order to use a given digital service. What if we could avoid the need for this? For instance, you could just prove that you have a train ticket without a central party storing an account about you and about all your trips. Or you could merely show that you are over 18 of age without having to show (and let save) your name and credential number, so you are eligible to buy a bottle of 'jenever'. Or you could do online shopping with giving away as little personal information as you used to in a department store in the 70s. Attribute-based credentials (ABCs) provide a wide variety of possible scenarios, in which you can prove that you are entitled to do things that otherwise you can only do by having to use identifiers. In the last OUrsi meeting Fabian presented the IRMA technology, one of the most important practical ABC projects, and its applications. In this OUrsi meeting, I will talk about the neat cryptographic and mathematic techniques that allow for the above-mentioned high level of flexibility and practical privacy. (Don't get scared, you won't need lots of background in mathematics to understand my presentation!)
18. 7 March Rolando Trujillo
(University of Luxembourg)
Counteracting active attacks in social network graphs abstract, slides

Counteracting active attacks in social network graphs
Rolando Trujillo

Active privacy attacks in social networks attempt to de-anonymize a social graph by using social links previously established between the attackers and their victims. Malicious users are hard to identify a priori, so active attacks are typically prevented by anonymizing (perturbing) the social graph prior publication. This talk is an introduction to this type of privacy attack and the anonymization techniques that have been proposed to counteract it. Towards the end of the talk we will discuss drawbacks of current approaches and possible improvements.
17. 7 February 2017 Fabian van den Broek IRMA, as simple as ABC abstract, slides

IRMA, as simple as ABC
Fabian van den Broek

I will present the IRMA technology. IRMA (I Reveal My Attributes) is an attribute-based credential system in which users can obtain credentials stating attributes about themselves (e.g.\ I'm an employee of the Open University, with employee-number 1234). These attributes can be disclosed selectively to verifiers (e.g. I'm an employee of the Open University, xxxxxxxxxxxxxxxxxxxxx), offering a very flexible and potentially privacy-friendly way of authentication. This presentation will mostly focus on the progress we made in the last two years and the progress we are hoping to achieve in the coming years.

16. 25 October Jeroen Keiren Game-based approach to branching bisimulation abstract, slides

Game-based approach to branching bisimulation
Jeroen Keiren

Branching bisimilarity and branching bisimilarity with explicit divergences are typically used in process algebras with silent steps when relating implementations to specifications. When an implementation fails to conform to its specification, i.e., when both are not related by branching bisimilarity (with explicit divergence), pinpointing the root causes can be challenging. In this paper, we provide characterisations of branching bisimilarity (with explicit divergence) as games between Spoiler and Duplicator, offering an operational understanding of both relations. Moreover, we show how such games can be used to assist in diagnosing non-conformance between implementation and specification.
15. 4 October Arjen Hommersom Flexible Probabilistic Logic for Solving the Problem of Multimorbidity abstract, slides

Flexible Probabilistic Logic for Solving the Problem of Multimorbidity
Arjen Hommersom

One of the central questions in artificial intelligence is how to combine knowledge representation principles with machine learning methods. Effectively dealing with structure and relations in data opens up a wealth of applications in areas such as economics, biology, and healthcare. In this talk I will discuss a recent project proposal, which aims to develop a new methodology that provides powerful methods for deriving knowledge from structured data using state-of-the-art probabilistic logics. These methods will be applied to deal with a highly significant and challenging problem in modern medicine: managing patients with multimorbidity, i.e., patients that have multiple chronic conditions. The main purpose of this talk is to obtain feedback as this project will likely soon be presented to an evaluation committee in the exact sciences.

14. 28 June Hugo Jonker Gaming the H-index abstract

Gaming the H-index
Hugo Jonker

In this talk, we discuss some early results looking into formally modelling the scientific process, and elucidating what attacks can be performed on bibliographic metrics. Specifically, we distinguish between hacking, fraud, and gaming, provide initial notions of these three terms and initial results towards identifying which gaming attacks are possible on the H-index.
13. 31 May Josje Lodder Strategies for axiomatic Hilbert-style proofs abstract, slides

Strategies for axiomatic Hilbert-style proofs
Josje Lodder

Een van de onderwerpen die studenten tegenkomen bij een logicacursus is een afleidingssysteem zoals natuurlijke deductie of axiomatisch afleiden. Het vinden van afleidingen is niet altijd eenvoudig. Studenten moeten hier veel mee oefenen, en een e-learning systeem kan hier bij helpen. Voor natuurlijke deductie bestaan er verschillende ondersteunende tools, maar voor axiomatisch afleiden zijn er geen tools die studenten helpen bij het maken van deze afleidingen. Om gerichte feedback en feedforward te kunnen geven hebben we strategie├źn ontwikkeld waarmee we automatisch axiomatische bewijzen kunnen genereren, en we gaan deze strategie├źn ook gebruiken om uitwerkingen van studenten te volgen, zodat we op elk moment gericht feedback kunnen geven.
12. 26 April Bastiaan Heeren Ask-Elle: an Adaptable Programming Tutor for Haskell Giving Automated Feedback abstract, slides

Ask-Elle: an Adaptable Programming Tutor for Haskell Giving Automated Feedback
Bastiaan Heeren

Ask-Elle is a tutor for learning the higher-order, strongly-typed functional programming language Haskell. It supports the stepwise development of Haskell programs by verifying the correctness of incomplete programs, and by providing hints. Programming exercises are added to Ask-Elle by providing a task description for the exercise, one or more model solutions, and properties that a solution should satisfy. The properties and model solutions can be annotated with feedback messages, and the amount of flexibility that is allowed in student solutions can be adjusted. The main contribution of our work is the design of a tutor that combines (1) the incremental development of different solutions in various forms to a programming exercise with (2) automated feedback and (3) teacher-specified programming exercises, solutions, and properties. The main functionality is obtained by means of strategy-based model tracing and property-based testing. We have tested the feasibility of our approach in several experiments, in which we analyse both intermediate and final student solutions to programming exercises, amongst others.
11. 29 March Hugo Jonker MitM attacks evolved abstract, slides

MitM attacks evolved
Hugo Jonker

The security community seems to be thoroughly familiar with man-in-the-middle attacks. However, the common perception of this type of attack is outdated. It originates from when network connections were fixed, not mobile, before 24/7 connectivity became ubiquitous. The common perception of this attack stems from an era before the vulnerability of the protocol's context was realised. Thanks to revelations by Snowden and by currently available man-in-the-middle tools focused on protocol meta-data (such as so-called `Stingrays' for cellphones), this view is no longer tenable. Security protocols that only protect the contents of their messages are insufficient. Contemporary security protocols must also take steps to protect their context: who is talking to whom, where is the sender located, etc. In short: the attacker has evolved. It's high time for our security models and requirements to catch up.
10. 26 February
13.45-14.30, Utrecht
Tanja Vos TESTAR: Automated Testing at the User Interface Level abstract, slides

TESTAR: Automated Testing at the User Interface Level
Tanja Vos

Testing applications at the User Interface level is an important yet expensive and labour-intensive activity. Several tools exist to automate UI level testing. These tools are based on capture replay or visual imagine recognition. We present TESTAR, a tool for automated GUI testing that takes a totally different approach and has demonstrated to be highly useful in practice. Using TESTAR, you can start testing immediately and you do not need to specify test cases in advance. TESTAR automatically generates and executes random test sequences based on a structure that is automatically derived from the UI through the accessibility API. Defects that can be found this way are crashes or suspicious outputs. For defects related to functionality, expected outputs need to be added to the tool. This can be done incrementally. The structure on which testing is based is build automatically during testing, the UI is not assumed to be fixed and tests will run even though the UI evolves, which will reduce the maintenance problem that threatens the approaches mentioned earlier.
9. 23 February Freek Verbeek Cross-layer invariants for Network-on-Chips abstract, slides

Cross-layer invariants for Network-on-Chips
Freek Verbeek

One of the major challenges in the design and verification of manycore systems is cache coherency. In bus-based architectures, this is a well-studied problem. When replacing the bus by a communication network, however, new problems arise. Cross-layer deadlocks can occur even when the protocol and the network are both deadlock-free when considered in isolation. To counter this problem, we propose a methodology for deriving cross-layer invariants for Network-on-Chips. These invariants relate the state of the protocols run by the cores to the state of the communication network. We show how they can be used to prove the absence of cross-layer deadlocks. Our approach is generally applicable and shows promising scalability.
8. 26 January 2016 Harald Vranken About a recently submitted research proposal

7. 27 October Bernard van Gastel Inschatten van energieverbruik van software door middel van een typesysteem abstract, slides

Inschatten van energieverbruik van software door middel van een typesysteem
Bernard van Gastel

Energieverbruik speelt een steeds bepalendere rol binnen ICT. In veel toepassingsgebieden is het belangrijk om het energieverbruik te voorspellen, en ervoor te kunnen optimaliseren. Deze methoden moeten praktisch zijn: zowel snel feedback geven als voor grote programma/bibliotheken toepasbaar zijn. Omdat het energieverbruik sterkt afhangt van de hardware die gebruikt wordt, moet de gebruikte hardware snel in te stellen zijn en te wisselen zijn. In dit kader stellen wij een type systeem voor dat statisch gebruikt kan worden om energie verbruik compositioneel af te leiden. Dit type systeem werkt op een kleine theoretische programmeertaal waarbij hardware interacties expliciet gemaakt zijn.
6. 29 September
10.00-11.00, Eindhoven
Sung-Shik Jongmans Protocol Programming with Automata: (Why and) How? abstract, slides

Protocol Programming with Automata: (Why and) How?
Sung-Shik Jongmans

With the advent of multicore processors, exploiting concurrency has become essential in writing scalable programs on commodity hardware. One important aspect of exploiting concurrency is programming protocols. In this talk, I explain my recent work on programming protocols based on a theory of automata.
5. 25 August Sylvia Stuurman Setup for a experiment on teaching refactoring slides
4. 30 June Jeroen Keiren State Space Explosion: Facing the Challenge abstract, slides

State Space Explosion: Facing the Challenge
Jeroen Keiren

When processes run in parallel, the total number of states in a system grows exponentially; this is typically referred to as the state space explosion problem. The holy grail in model checking is finding a way in which we can effectively cope with this state space explosion. In this talk I discuss a line of research in which we use an equational fixed point logic and parity games as a basis for verification. I will focus on the latter, and discuss ideas that can be used to reduce these parity games. For those wondering whether this general theory has any applications at all I discuss industrial applications of this technology. If you want to know what these applications are, come listen to my talk!
3. 26 May Arjen Hommersom Probabilistic Models for Understanding Health Care Data abstract, slides

Probabilistic Models for Understanding Health Care Data
Arjen Hommersom

In the first part of this talk, I will give an overview of my recent research in using probabilistic models for health care applications. In the second part, I will zoom into recent results from the NWO CAREFUL project, where we attempt to reconcile probabilistic automata learning with Bayesian network modelling. In particular, the aim is to identify the probabilistic structure of states within a probabilistic automaton. We propose a new expectation-maximisation algorithm that exploits the state-based structure of these models. This model is currently being applied to learn models from psychiatry data to provide novel insights into the treatment of psychotic depressions. Preliminary results will be discussed.
2. 28 April Hugo Jonker FP-Block: usable web privacy by controlling browser fingerprinting abstract, slides

FP-Block: usable web privacy by controlling browser fingerprinting
Hugo Jonker

Online tracking of users is used for benign goals, such as detecting fraudulous logins, but also to invade user privacy. We posit that for non-oppressed users, tracking within one website does not have a substantial negative impact on privacy, while it enables legitimate benefits. In contrast, cross-domain

tracking negatively impacts user privacy, while being of little benefit to the user. Existing methods to counter tracking treat any and all tracking similar: client-side storage is blocked, or all sites are fed random characteristics to hamper re-identification. We develop a tool, FP-Block, that counters cross-domain tracking, while allowing intra-domain tracking. For each visited site, FP-Block generates a unique, consistent fingerprint: a "web identity". This web identity is then used for any interaction with that site, including for third-party embedded content. This ensures that ubiquitously embedded parties will see different, unrelatable fingerprints from different sites, and can thus no longer track the user across the web.
1. March 31 2015 Christoph Bockisch A design method for modular energy-aware software abstract, slides

A design method for modular energy-aware software
Christoph Bockish

Nowadays achieving green software by reducing the overall energy consumption of the system is becoming more and more important. A well-known solution is to make the software energy-aware by extending its functionality with energy optimizers, which monitor the energy consumption of software and adapt it accordingly. Modular design of energy-aware software is necessary to make the extensions manageable and to cope with the complexity of the software. To this aim, we propose a model of the energy utilization -- or more general resource utilization -- and an accompanying design method for components in energy-intensive systems. Optimizer components can analyze such Resource Utilization Models (RUM) of other components in terms of such a model and adapt their behavior accordingly. We show how compact RUMs can be extracted from existing implementations. Using the Counterexample-Guided Abstraction Refinement (CEGAR) approach, along with model-checking tools, abstract models can be generated that help establish key properties relating to energy consumption. This approach is illustrated by the concrete example of a network manager sub-system.