Below is a collection of ideas for student projects. Some are half-baked, some are not even written down. If you are interested in systems-oriented computer science, talk to me in person. In general, I expect that students have a solid understanding of operating systems and computer networks and that they are able to handle programming tasks well.
Sonification of Status Information
This is a topic for someone interested in computer-generated sounds. I am interested in algorithms that convert status information obtained by monitoring systems (say, a computer network monitor) into audible sounds that (i) are not intrusive but (ii) can signal significant changes in conditions. There is existing work in this space. Some people have recently written special programming languages that allow sound generation to be described as programs.
This topic is somewhat experimental, and producing a 'good' or even just a 'reasonable' solution will be difficult since most naive approaches quickly become annoying. This is a topic for students with a strong interest in music, sound generation, etc.
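To make the idea concrete, here is a minimal Python sketch of one naive mapping: a normalized load value is turned into a pitch, and a short, quiet sine tone is rendered. The function names, the frequency range, and the amplitude are all assumptions chosen for illustration; this is exactly the kind of simple approach that is likely to become annoying and would need refinement.

```python
import math
import struct

def load_to_frequency(load, low_hz=220.0, high_hz=880.0):
    """Map a normalized load in [0, 1] to a pitch on an exponential scale.

    The frequency range is an arbitrary assumption (two octaves from A3).
    """
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range readings
    return low_hz * (high_hz / low_hz) ** load

def render_tone(freq_hz, duration_s=0.2, rate=8000, amplitude=0.2):
    """Return 16-bit little-endian mono PCM bytes for a quiet sine tone."""
    n = round(duration_s * rate)  # number of samples
    samples = (
        int(32767 * amplitude * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(n)
    )
    return b"".join(struct.pack("<h", s) for s in samples)
```

The returned bytes can be written to a WAV file with the standard `wave` module (one channel, sample width 2, frame rate equal to `rate`). An exponential pitch scale is used because equal steps in load then correspond to equal musical intervals.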
OpenIntel Data Analysis
Researchers in the Netherlands collect data once per day from a large portion of the Domain Name System as part of the OpenIntel project. See <https://www.openintel.nl/> for more information.
The data collected is a big haystack and there are likely many needles inside but they are difficult to find. The data processing usually happens using Apache big data tools, but it might be possible to work with smaller subsets of the data using Python data analysis tools.
Possible questions to research could be:
- Describe the dynamics of CDN concentration over time.
- Analyze the structural properties of IPv6 addresses announced in the DNS.
- Analyze and compare the dynamics of IP address changes for IPv4 and IPv6 addresses.
This requires either good knowledge of the Python data analysis tools or excellent knowledge of Java/Scala and Apache big data analysis frameworks.
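As a small illustration of the IPv6 structure question, the following Python sketch classifies the interface identifier (the low 64 bits) of an address using only the standard `ipaddress` module. The three categories and the threshold for "low" identifiers are assumptions made for this example; a real analysis would need a finer classification and would read addresses from the OpenIntel data rather than from a list.

```python
import ipaddress
from collections import Counter

def classify_iid(address):
    """Roughly classify the interface identifier of an IPv6 address."""
    iid = int(ipaddress.IPv6Address(address)) & ((1 << 64) - 1)
    if iid <= 0xFFFF:
        # Only the lowest 16 bits are used, e.g. 2001:db8::1
        # (threshold is an assumption for this sketch).
        return "low"
    if (iid >> 24) & 0xFFFF == 0xFFFE:
        # ff:fe in the middle of the identifier suggests a MAC-derived
        # (modified EUI-64) address.
        return "eui-64"
    return "other"

def summarize(addresses):
    """Count how many addresses fall into each category."""
    return Counter(classify_iid(a) for a in addresses)
```

For example, `summarize(["2001:db8::1", "2001:db8::211:22ff:fe33:4455"])` counts one "low" and one "eui-64" address.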
IPv6 Target Address Generation
The IPv4 address space is small enough that it is feasible to scan the entire address space in a reasonably short time. With IPv6 addresses, such brute-force scans are no longer feasible, and several authors have proposed algorithms and heuristics to generate likely IPv6 target addresses. The goal of this work will be to analyze and compare several proposals and, in the ideal case, to experiment with possible improvements.
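To give a feel for what a target generation heuristic looks like, here is a deliberately naive Python baseline. It is not one of the published proposals: it simply assumes that /64 prefixes seen in a seed list are likely to contain further hosts with small interface identifiers. The function name and the `max_iid` parameter are hypothetical.

```python
import ipaddress

def low_iid_candidates(seeds, max_iid=4):
    """Generate candidate addresses with small interface identifiers
    in every /64 prefix that appears among the seed addresses."""
    seed_set = {ipaddress.IPv6Address(s) for s in seeds}
    # The upper 64 bits of each seed identify its /64 prefix.
    prefixes = sorted({int(a) >> 64 for a in seed_set})
    candidates = []
    for prefix in prefixes:
        for iid in range(1, max_iid + 1):
            addr = ipaddress.IPv6Address((prefix << 64) | iid)
            if addr not in seed_set:  # skip addresses we already know
                candidates.append(addr)
    return candidates
```

A comparison of real proposals could use a baseline like this to measure how many additional responsive addresses a more elaborate algorithm finds for the same number of probes.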
eBPF in the Linux Kernel
The Berkeley Packet Filter (BPF) was designed long ago to filter network packets in the OS kernel. A user space program compiles a high-level filter expression into bytecode for a virtual machine residing in the OS kernel, which is executed when packets arrive.
Linux recently got an extended version (eBPF), which can be triggered by events (not just arriving packets) and which can do actions other than filtering. This opens the door to a number of new usages. A good intro page: <http://www.brendangregg.com/ebpf.html>
Possible questions to research could be:
- How secure, robust and scalable is eBPF?
- Prototype new usages of eBPF.
This topic requires excellent C programming skills and the ability to understand low-level system aspects.
LMAP Implementation Extensions
RFC 8193 and RFC 8194 define an information model and a data model for large-scale network measurement systems. An implementation of the data model (called lmapd) is available, but several extensions are lacking. One requirement is that lmapd should be able to run efficiently on systems with limited resources.
There is room to improve the current implementation and to prototype measurement scenarios in order to learn about any shortcomings of the design or implementation.
This requires good C programming skills.
Identification of people, applications, devices, etc. in aggregated network flow data.
RIPE Atlas Measurements
The RIPE NCC maintains a huge collection of measurement probes, called the RIPE Atlas project <https://atlas.ripe.net/>. It is possible to run your own experiments in order to obtain insights into how the network behaves. There are extensive REST APIs to access RIPE Atlas from your favorite scripting language.
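As a rough sketch of what working with the REST API might look like in Python, the code below builds the results URL for a measurement and summarizes ping results per probe. It assumes the v2 results endpoint and the `prb_id` and `avg` fields of ping results; the helper names are hypothetical, and the field semantics should be checked against the RIPE Atlas documentation before relying on them.

```python
import statistics

API_BASE = "https://atlas.ripe.net/api/v2"

def results_url(measurement_id):
    """Build the URL for the JSON results of a measurement (assumed v2 layout)."""
    return f"{API_BASE}/measurements/{measurement_id}/results/?format=json"

def median_rtt_per_probe(results):
    """Map probe ID to the median of the 'avg' RTT over that probe's ping results."""
    per_probe = {}
    for r in results:
        # A non-positive 'avg' is assumed to mean that no reply was received.
        if r.get("avg", -1) > 0:
            per_probe.setdefault(r["prb_id"], []).append(r["avg"])
    return {probe: statistics.median(rtts) for probe, rtts in per_probe.items()}
```

The list passed to `median_rtt_per_probe` would typically come from fetching `results_url(...)` with `urllib.request.urlopen` and decoding the response with `json.load`.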
Software Update Robustness / Antifragile Software
We are interested in whether it is possible to extend attacks on networked devices by causing repair mechanisms (e.g., software updates) to fail via secondary attacks.
This may be related to ideas coming from the antifragile software community, namely, how to design systems that do not become fragile when attacked but instead become more robust.