Project Topic Resources
Sample Project Topics
Below is a list of potential topics for the course project. Feel free to adapt one of these or propose another topic.
- Data breaches at universities. Combine reports of data breaches with information on universities. Join the two data sets based on names and then examine the data to see if particular characteristics of the university affect the probability of a breach occurring (e.g., public vs. private, enrollment, etc.). One could also look at university rankings to see if there is any correlation between university ranking and breach probability.
- Online password database hacks. Examine a dataset on web password database breaches to estimate the annual probability of a breach occurring, how many customers are affected per breach, and the success rate per hacker group. Another potential area of inquiry is to gather supplementary data on the affected website (e.g., category, website ranking, country, hosting provider, server software type).
- Bitcoin mining pool DoS attack measurement and modeling. Bitcoin mining pools frequently suffer DoS attacks by competitors hoping to increase their chances of computing the next critical hash value for the blockchain. One research project could investigate reports of past DoS attacks and report back on the observed incidence of such attacks. Another project could set up a game-theory model of mining pools deciding when to attack and when to mine.
- SEC cybersecurity reports by industry. The SEC now requires publicly traded companies that have experienced or are at risk of experiencing a cybersecurity breach to disclose information in their regulatory filings. Using data collected by SMU researchers for 4,000 publicly traded companies in 2012, investigate how the reports vary across industries and other characteristics.
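For the university breach topic, the name-based join and the comparison across characteristics can be sketched as follows. This is only a minimal illustration using the Python standard library: the records, the field names (`name`, `organization`, `control`), and the helper functions are all hypothetical, and real name matching will usually need more careful cleaning than shown here.

```python
import re
from collections import defaultdict


def normalize_name(name):
    """Lowercase, drop punctuation and a leading article so name variants match."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    words = [w for w in name.split() if w != "the"]
    return " ".join(words)


def breach_rate_by_group(universities, breaches, group_key):
    """Share of universities in each group with at least one reported breach."""
    breached = {normalize_name(b["organization"]) for b in breaches}
    totals = defaultdict(int)
    hits = defaultdict(int)
    for u in universities:
        group = u[group_key]
        totals[group] += 1
        if normalize_name(u["name"]) in breached:
            hits[group] += 1
    return {g: hits[g] / totals[g] for g in totals}


# Made-up records for illustration only (hypothetical institutions).
universities = [
    {"name": "Alpha State University", "control": "public"},
    {"name": "Delta University", "control": "public"},
    {"name": "Beta College", "control": "private"},
    {"name": "The Gamma Institute of Technology", "control": "private"},
]
breaches = [
    {"organization": "ALPHA STATE UNIVERSITY", "year": 2012},
    {"organization": "Alpha State University", "year": 2013},
    {"organization": "Delta University", "year": 2013},
    {"organization": "Gamma Institute of Technology", "year": 2013},
]

rates = breach_rate_by_group(universities, breaches, "control")
print(rates)
```

Counting each university once, regardless of how many breaches it reported, gives the share of institutions ever breached. Estimating an annual probability would additionally require dividing by the number of years observed.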
Here are some additional topics that could lead to suitable projects:
- Patching policies
- Vulnerability disclosure
- Cyberwar
- ISP assistance in cleaning up malware on consumer machines
- Information sharing
- Interdependent security
- Cyberinsurance
- Bitcoin
- Data breach research
- Attitudes towards privacy
- Behavioral economics of information security
- Extensions to Gordon-Loeb model
- Advertising fraud (empirical analysis, models)
- Fake antivirus software
- Challenges in empirical computer security research
- Network security (BGP, DNSSEC, ...)
- Proactive versus reactive security investment models
- Payment system security
- Economic issues in identity management
- Mapping false-positive and false-negative rates between ROC curves and known costs
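As one illustration of the last topic, a point on a ROC curve can be turned into an expected cost per case once the base rate of attacks and the cost of each error type are assumed. The sketch below uses the standard expected-cost formula, prevalence × FNR × cost of a miss plus (1 − prevalence) × FPR × cost of a false alarm. The ROC points, prevalence, and costs are invented for the example.

```python
def expected_cost(fpr, tpr, prevalence, cost_fp, cost_fn):
    """Expected cost per case at one ROC operating point (fpr, tpr)."""
    fnr = 1.0 - tpr  # false-negative rate is the complement of the true-positive rate
    return prevalence * fnr * cost_fn + (1.0 - prevalence) * fpr * cost_fp


def cheapest_operating_point(roc_points, prevalence, cost_fp, cost_fn):
    """Return the (fpr, tpr) pair with the lowest expected cost."""
    return min(
        roc_points,
        key=lambda pt: expected_cost(pt[0], pt[1], prevalence, cost_fp, cost_fn),
    )


# Hypothetical ROC curve and cost assumptions: 1% of cases are attacks,
# and a missed attack costs 100 times as much as a false alarm.
roc_points = [(0.0, 0.0), (0.05, 0.60), (0.20, 0.90), (1.0, 1.0)]
best = cheapest_operating_point(roc_points, prevalence=0.01, cost_fp=1.0, cost_fn=100.0)
print(best, expected_cost(best[0], best[1], 0.01, 1.0, 100.0))
```

Changing the assumed prevalence or cost ratio can move the cheapest point along the curve, which is the kind of sensitivity a project on this topic could examine.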
Data Resources
If you are considering an empirical paper, look through some data sources to find one that supports a suitable analysis. In particular, look for datasets that include both numerical and categorical variables. Another approach is to pick a class of cybercrime and gather as much information as you can to estimate its cost, who is affected by it, and the likelihood of attack.
- ONI assessment of country-level censorship filtering
- Copyright and government requests to Google for content removal
- Data breaches
- Hacked online databases
- Country-level ZeroAccess botnet data
- PhishTank repository of phishing URLs
- SANS Internet Storm Center (including API with lots of data access), list of suspicious domains, observed compromised IPs
- Shadowserver Botnet Statistics
- AA419 volunteers tracking advance-fee fraud scams
Linking data sources
- US Hospital database
- US University database
- IP to ASN mapping
- GeoLite IP to country mapping
- GeoLite Python library
Again, this is a partial list. The idea here is to find supplemental data that can shed more light on existing security-related data sources.
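To give a sense of how a linking source is used, the sketch below attaches an ASN to each observed IP address with a longest-prefix match over a prefix table. It relies only on Python's standard `ipaddress` module. The prefix table, the ASNs (taken from the ranges reserved for documentation), and the function names are hypothetical; a real project would load the table from one of the mapping sources listed above.

```python
import ipaddress


def build_prefix_table(rows):
    """Parse (prefix, asn) rows and sort so the most specific prefixes come first."""
    table = [(ipaddress.ip_network(prefix), asn) for prefix, asn in rows]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup_asn(table, ip):
    """Return the ASN of the longest matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(ip)
    for network, asn in table:
        if addr in network:
            return asn
    return None


# Hypothetical prefix-to-ASN rows using documentation address space.
rows = [
    ("192.0.2.0/24", 64500),
    ("198.51.100.0/24", 64501),
    ("198.51.100.128/25", 64502),
]
table = build_prefix_table(rows)

# Hypothetical compromised IPs, e.g. from a blocklist feed.
observed = ["192.0.2.17", "198.51.100.5", "198.51.100.200", "203.0.113.9"]
linked = {ip: lookup_asn(table, ip) for ip in observed}
print(linked)
```

The linear scan is adequate for small tables. For a full routing table, a radix tree or a sorted-interval search would be a more practical choice.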