Phone: +31 (0)45 576 2143
For students with a background in security and/or formal methods, I am happy to supervise thesis projects on security and on privacy, under the broad theme "Security and Privacy in Modern Times". Concrete projects will align with my current research. For an idea of the type of subjects I am happy to supervise:
Projects can be tailored to MSc as well as BSc students. Other topics exist and can be discussed - contact me if you're interested.
Note. In general, I expect the thesis to be written in English, and to provide a solid basis for a (later to be written) publication. See also the page about my approach to supervision.
In general, I'm interested in research that identifies the "bad guys", research that helps to identify security or privacy weaknesses (or the impact of such weaknesses), and research to mitigate security and privacy issues. More concretely, here is an incomplete list of project categories I'm happy to supervise. If you're looking for a project in one of these categories, do contact me.
This is a partial list of project ideas. Contact me to discuss specific subjects of interest to you.
In federated learning, each participant trains a model locally, then
the locally trained models are aggregated into a global model. This
allows for privacy in training ML models. However, it also allows a
malicious agent to submit an incorrectly trained model. The goal of this
project is to measure the reliability of submitted model parameters.
Skills used: machine learning, deep learning.
Co-supervisor: Mina Alishahi (OU).
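To make the problem setting concrete, here is a minimal sketch of one possible reliability check, not the method this project is expected to deliver. It assumes each client submits a flat list of parameters, flags submissions that lie far from the coordinate-wise median, and averages the rest. The function names, the threshold `factor` and the sample values are all hypothetical.

```python
from statistics import median


def coordinate_median(updates):
    """Coordinate-wise median of equally long parameter vectors."""
    return [median(column) for column in zip(*updates)]


def distance(a, b):
    """Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def flag_suspicious(updates, factor=3.0):
    """Return indices of updates unusually far from the median update.

    The threshold `factor` is an arbitrary illustrative choice.
    """
    centre = coordinate_median(updates)
    dists = [distance(u, centre) for u in updates]
    typical = median(dists)
    # If the typical distance is 0, any update that deviates at all is flagged.
    return [i for i, d in enumerate(dists) if d > 0 and d > factor * typical]


def federated_average(updates):
    """Plain unweighted average of the given parameter vectors."""
    return [sum(column) / len(updates) for column in zip(*updates)]


# Hypothetical submissions from four clients; the last one is implausible.
updates = [
    [0.10, 0.20, 0.30],
    [0.12, 0.18, 0.31],
    [0.09, 0.21, 0.29],
    [5.00, -4.00, 9.00],
]
suspicious = flag_suspicious(updates)
trusted = [u for i, u in enumerate(updates) if i not in suspicious]
global_model = federated_average(trusted)
print("suspicious clients:", suspicious)
print("global model:", global_model)
```

A distance-to-median check like this only catches crude manipulation; a subtly mistrained model would pass, which is part of what makes measuring reliability an open question.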
Online reviews have become important in selecting which goods to
purchase. However, online reviews can be manipulated, and may even
be for completely different products. The goal of this project is
to distinguish bogus reviews from genuine reviews of the offered product.
Skills used: NLP, web scraping.
Co-supervisor: Mina Alishahi (OU).
Online abuse has become a rampant phenomenon. In this project,
we focus on collecting data from social media that allows
analysis of the origins of online abuse: either the development of
victims, or the development of abusers. How does abuse start? When
does abuse cross a line? Do victims retaliate? Do abusers become
increasingly abusive? How much of this can be automatically
identified at a large scale?
Skills used: scraping, text analysis, security assessment, predictive modelling.
Co-supervisor: Clara Maathuis (OU).
Scientific fraud is increasingly becoming a
problem. Several classes of fraud (such as plagiarism) can be
automatically detected. However, such detection methods are
focused on the results of one specific type of fraud, instead of
the underlying incentives behind fraud.
In this project, we build upon previous work that developed methods to identify outliers in publication metrics. This project focuses on "secondary" or derived publication metrics, such as number of co-authors, average number of papers with co-authors, etc. The goal is to identify which of such "derived" or secondary publication metrics are useful as indicators for scientific fraud.
Skills used in project: basics of set theory, basic Python programming.
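As an illustration of what a derived metric could look like, the sketch below computes two of the examples mentioned above for a single author. It assumes each paper is represented as a set of author names, and it reads "average number of papers with co-authors" as the mean number of joint papers per distinct co-author; both are assumptions made for this example, and the names and data are hypothetical.

```python
from collections import Counter


def derived_metrics(papers, author):
    """Compute simple derived metrics for `author`.

    `papers` is a list of sets of author names (an assumed representation).
    """
    joint = Counter()
    for authors in papers:
        if author in authors:
            # Set difference: everyone on the paper except the author.
            joint.update(authors - {author})
    n_coauthors = len(joint)
    avg_joint = sum(joint.values()) / n_coauthors if n_coauthors else 0.0
    return {
        "coauthors": n_coauthors,
        "avg_papers_per_coauthor": avg_joint,
    }


# Hypothetical publication list.
papers = [
    {"alice", "bob"},
    {"alice", "bob", "carol"},
    {"alice", "dave"},
    {"bob", "carol"},
]
metrics = derived_metrics(papers, "alice")
print(metrics)
```

Whether values like these actually indicate fraud is exactly what the project would need to establish; the sketch only shows that such metrics are cheap to derive from basic set operations.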
There exist several rival frameworks that
automate browsers: specific frameworks such as
Puppeteer+DevTools (for Chrome) and Marionette (for Firefox),
but also generic frameworks such as Playwright
(Chrome/FF/Webkit) and Selenium+WebDriver. The goal of this
project is to investigate differences and similarities between
these, including automatically determining their browser
fingerprint surfaces. That is: to what extent do such frameworks
have a different browser fingerprint than the browser they rely
on for automation?
Skills used: web automation, programming.
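The comparison at the heart of this question can be sketched as a diff between two property snapshots. The example below assumes a fingerprint has already been collected as a dictionary of property names and values, once from a regular browser and once from the same browser under automation. Collecting those snapshots is the hard part and is not shown. The sample values are invented for illustration and are not measurements.

```python
MISSING = "<absent>"


def fingerprint_diff(baseline, automated):
    """Return the properties whose values differ between two snapshots.

    Each result maps a property name to (baseline value, automated value);
    a property missing from one snapshot is reported as MISSING.
    """
    diff = {}
    for key in sorted(baseline.keys() | automated.keys()):
        before = baseline.get(key, MISSING)
        after = automated.get(key, MISSING)
        if before != after:
            diff[key] = (before, after)
    return diff


# Hypothetical snapshots; property names follow common browser APIs,
# but the values are made up for this example.
baseline = {
    "navigator.webdriver": False,
    "navigator.languages": ["en-US", "en"],
    "window.outerWidth": 1280,
}
automated = {
    "navigator.webdriver": True,
    "navigator.languages": ["en-US", "en"],
    "window.outerWidth": 1280,
    "window.someAutomationGlobal": "present",  # hypothetical leaked global
}
surface = fingerprint_diff(baseline, automated)
for name, (before, after) in surface.items():
    print(f"{name}: {before!r} -> {after!r}")
```

Repeating such a diff per framework and per browser would give a first, rough picture of how each framework's fingerprint surface deviates from the browser it drives.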
Online marketplaces where small companies
and individuals can sell goods have become commonplace.
Examples of such marketplaces include those for traditional
goods (Amazon.com, bol.com, etc.), but also marketplaces for
digital goods such as e-books.
Typically, such marketplaces offer some quality control to help users gauge sellers, such as reviews. However, these controls may be gamed. On Amazon, it is possible to swap out a specific item (e.g., a jar of honey) for a much more expensive item (e.g., a drone) while keeping the reviews and user ratings(!). In e-books, a boom in spammy books has occurred (see the "fighting ebook spam" project below).
The goal of this project is to semi-formalise this problem and design and develop a way towards automated detection of such baiting / bait and switch shenanigans.
Skills used: programming, web scraping.
Possible directions: formalisation, machine learning, theoretical generic approach, practical market-specific solution.
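One very naive starting point for the practical direction is to check whether the reviews of a listing talk about the product in its title at all. The sketch below is only a toy heuristic under that assumption: it measures the fraction of reviews that share at least one non-trivial word with the listing title. The stop-word list, function names and sample texts are hypothetical.

```python
import re

# Tiny illustrative stop-word list; a real one would be much larger.
STOP_WORDS = {"a", "an", "and", "for", "of", "the", "with"}


def tokens(text):
    """Lowercased alphabetic words in `text`, minus stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def on_topic_fraction(title, reviews):
    """Fraction of reviews sharing at least one word with the title."""
    if not reviews:
        return 0.0
    title_words = tokens(title)
    hits = sum(1 for review in reviews if tokens(review) & title_words)
    return hits / len(reviews)


# Hypothetical listing whose reviews mostly describe a different product.
title = "Quadcopter drone with camera"
reviews = [
    "Delicious honey, great on toast",
    "The honey jar arrived intact",
    "Great drone, stable flight",
]
fraction = on_topic_fraction(title, reviews)
print(f"{fraction:.2f} of reviews mention the listed product")
```

A low fraction hints that the reviews may have been written for a different item, but word overlap is easy to fool and ignores synonyms, which is why a more formal treatment of the problem is part of the project.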
Web bots (scrapers) automatically traverse
the internet to gather data from and measure aspects of web
sites. Web bots may be used for benign as well as nefarious
purposes. To combat nefarious bots, web sites sometimes employ
bot detection methods. Unfortunately, anti-bot measures affect the
reliability of studies performed using web bots. Thanks to
preliminary work, some lower bounds on the prevalence of web bot
detection are known. However, previous work uses two orthogonal
approaches to identify bot detection. As such, a comprehensive
picture is missing. The goal of this project is to construct
a classifier, train it to recognise bot detection, and determine
how often web bot detection is used on the internet. The input
for the recognition is to come from two orthogonal approaches:
fingerprint-surface based detection of web bots and behavioral
detection of web bots.
Skills used: web scraping, machine learning.
NTFS has been the default file system in Windows since
Windows 2000. In addition to a plethora of NTFS drivers by
Microsoft, there are also third-party drivers, such as a macOS
driver by Paragon and NTFS-3G, included by default in Ubuntu.
These drivers may all behave subtly differently. Disk allocation
strategies, fragmentation patterns and other properties may
reveal what NTFS driver operated on a disk. The goal of this
project is to find specific characteristics which indicate a
specific driver. Being able to identify one or more drivers used
on a disk from its contents has applications in digital
forensics, such as finding hidden OSes, or assisting in
determining file origins.
Skills used: experiment design, programming, virtual machines, low-level analysis.
Privacy is a hot research topic. Many
papers analyse privacy of systems. To do so, they have to specify
what privacy (in their specific case) actually is. This has led to
a handful of different formalisations of privacy. The goal of the
project is to establish a formal framework in which at least three
approaches to privacy definitions (observational equivalence,
unlinkability and quantified privacy) can be formalised and
compared. Are they equivalent? If not, in which cases do they differ?
Expected prior knowledge: basic formal modelling, basic formal analysis (trace equivalences, observational equivalences).
A phone can read out its own vibration
using its accelerometers. This vibration is unique for each phone and
cannot be imitated: a physical unclonable function, or PUF.
However, many apps can trigger the buzz function and read out
accelerometer values. The goal of this project is to develop an
app that allows for authentication using this PUF functionality
in a secure way.
Skills used in project: Android programming, security analysis.
Co-supervisor: dr. Fabian van den Broek.
This project builds upon the work of the
Pwitter project in developing a privacy layer for Twitter. In this
project, the concept is extended to a more generic framework,
beyond the simple structure of Twitter (where there only exists a
follow relation). You will build a layer on top of an existing,
complex social network that enables a user to privately
communicate over the social network, while retaining privacy
against other users and the social network. The specific privacy
guarantees enabled by your layer will be formally analysed.
Skills used in this project: browser plugin programming, formal security analysis.
Spam is not only an email problem. There are
ebooks that are copy-pasted together, flung together quickly with
no regard for quality of content, only to provide a revenue
stream for their authors. There are various schemes related to
ebook spam. There is a scheme in which the books themselves
provide the revenue. This type of scheme relies on selling many
books to turn a profit, and thus is more likely to use fraudulent
means to promote the book (fake reviews, etc.). Another type of
scheme relies on Amazon's Kindle Direct Publishing
programme, in which Amazon pays out money depending on the number
of pages read.
The goal of this project is to investigate the current state of ebook spam schemes, and devise countermeasures.
Skills used in the project: web scraping, programming.