Project Topic Resources
Sample Project Topics
Below is a list of potential topics for the course project. Feel free to adapt one of these or propose another topic.
- Data breaches at universities. Combine reports of data breaches with information on universities. Join the two data sets based on names and then examine the data to see if particular characteristics of the university affect the probability of a breach occurring (e.g., public vs. private, enrollment, etc.). One could also look at university rankings to see if there is any correlation between university ranking and breach probability. A similar project could be run on another industry where comprehensive data on institutions is available.
- Online password database hacks. Examine a dataset on web password database breaches to estimate the annual probability of a breach occurring, how many customers are affected per breach, and the success rate per hacker group. Another potential area of inquiry is to gather supplementary data on the affected website (e.g., category, website ranking, country, hosting provider, server software type).
- Scholarship Scams. University email accounts receive many emails offering scholarships with very low requirements. These are often scams. See, for example, http://www.ecollegefitness.com/scholarshipawards2.html or http://www.collegecognitive.com/collegescholarship.html. For this project, one could investigate the ruse behind the scam by engaging with the website, with the goal of explaining the scammers' business model. Additionally, targeted Google searches could uncover similar websites that are being advertised.
- SEC cybersecurity reports by industry. The SEC now requires publicly traded companies that have experienced or are at risk of experiencing a cybersecurity breach to disclose information in their regulatory filings. Using data collected by SMU researchers for 4,000 publicly traded companies in 2012, investigate how the reports vary across industries and other characteristics.
- Investigate scam prevalence and trends. AA419 and Escrow Fraud have tracked scam websites for years, but how is the threat evolving? Is the number of scam websites increasing or decreasing across categories? Are scammers shifting top-level domains or hosting infrastructure? Are some scams becoming more prevalent relative to others?
- Website defacement trends. Examine a database of defaced webservers for trends.
- Investigate the UK Computer Crime Database. Alice Hutchings, a criminologist at the University of Cambridge, has collated a database of computer crimes occurring in the UK. Examine the database for trends and perhaps link it to other sources.
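For the dataset-driven topics above, such as the password database hacks, a first pass at the estimates might look like the following Python sketch. Everything in it is an assumption made for illustration: the record layout, the site names, the account counts, and the population size of 1000 sites are invented and do not come from any of the datasets mentioned. Reported breaches also undercount actual ones, so figures obtained this way are better read as lower bounds.

```python
from statistics import median

# Hypothetical records: (year, site, accounts_affected). Invented for illustration.
breaches = [
    (2010, "site-a.example", 1_200_000),
    (2011, "site-b.example", 50_000),
    (2011, "site-c.example", 6_500_000),
    (2012, "site-d.example", 300_000),
]

# Assumed number of sites that could have been breached in any given year.
population_size = 1000


def annual_breach_probability(records, n_sites):
    """Return {year: share of the population with a reported breach that year}."""
    sites_per_year = {}
    for year, site, _ in records:
        # A set counts each site at most once per year.
        sites_per_year.setdefault(year, set()).add(site)
    return {year: len(sites) / n_sites
            for year, sites in sorted(sites_per_year.items())}


def median_accounts_affected(records):
    """Median is less sensitive than the mean to a few very large breaches."""
    return median(count for _, _, count in records)


probabilities = annual_breach_probability(breaches, population_size)
typical_size = median_accounts_affected(breaches)
print(probabilities)
print(typical_size)
```

The hardest part in practice is usually choosing the denominator (which sites count as "at risk"), and that choice deserves explicit discussion in a project report.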
For more ideas, watch Dr. Richard Clayton's talk at SMU, "Evil on the Internet", in which he gives examples of live websites that are engaged in scams. A great project topic would be to choose one such scam and investigate it more thoroughly.
Here are some additional topics that could lead to suitable projects:
- Patching policies
- Vulnerability disclosure
- Cyberwar
- ISP assistance in cleaning up malware on consumer machines
- Information sharing
- Interdependent security
- Cyberinsurance
- Bitcoin
- Data breach research
- Attitudes towards privacy
- Behavioral economics of information security
- Extensions to Gordon-Loeb model
- Advertising fraud (empirical analysis, models)
- Fake antivirus software
- Challenges in empirical computer security research
- Network security (BGP, DNSSEC, ...)
- Proactive versus reactive security investment models
- Payment system security
- Economic issues in identity management
- Mapping false-positive and false-negative rates between ROC curves and known costs
Data Resources
If you are considering an empirical paper, look at some data sources and see if you can find one that supports a suitable analysis. In particular, look for datasets that include both numerical and categorical variables. Another approach is to pick a class of cybercrime and gather as much information as you can to estimate its cost, who is affected by it, and how likely an attack is.
- ONI assessment of country-level censorship filtering
- Copyright and government requests to Google for content removal
- Data breaches
- Hacked online databases
- Country ZeroAccess botnet
- PhishTank repository of phishing URLs
- SANS Internet Storm Center (including API with lots of data access), list of suspicious domains, observed compromised IPs
- Shadowserver Botnet Statistics
- AA419 volunteers tracking advance-fee fraud scams
Linking data sources
- US Hospital database
- US University database
- IP to ASN mapping
- GeoLite IP to country mapping
- GeoLite Python library
Again, this is a partial list. The idea here is to find supplemental data that can shed more light on existing security-related data sources.
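Linking a security dataset to supplemental data often comes down to a join on institution name, as in the university data breach topic. The Python sketch below shows one way this might be done. The rows, the field names (`name`, `control`, `organization`, `year`), and the helper functions are all hypothetical, and exact matching on normalized names is only a starting point: real data usually needs manual review of the unmatched records.

```python
import re


def normalize_name(name):
    """Lowercase, drop punctuation, and collapse whitespace so variants match."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    return " ".join(name.split())


# Hypothetical rows standing in for a university database and breach reports.
universities = [
    {"name": "Example State University", "control": "public"},
    {"name": "Sample College", "control": "private"},
    {"name": "Demo A&M University", "control": "public"},
]
breach_reports = [
    {"organization": "EXAMPLE STATE UNIVERSITY", "year": 2012},
    {"organization": "Demo A & M University", "year": 2011},
    {"organization": "Unlisted Institute", "year": 2012},
]


def link_breaches(universities, breach_reports):
    """Join reports to universities on normalized name; keep the leftovers."""
    by_name = {normalize_name(u["name"]): u for u in universities}
    matched, unmatched = [], []
    for report in breach_reports:
        university = by_name.get(normalize_name(report["organization"]))
        if university is None:
            unmatched.append(report)  # candidates for manual review
        else:
            matched.append({**report, **university})
    return matched, unmatched


def breach_rate_by_group(universities, matched, key):
    """Share of universities in each group with at least one matched breach."""
    totals, breached = {}, {}
    for u in universities:
        totals[u[key]] = totals.get(u[key], 0) + 1
    # The set ensures a university with several breaches is counted once.
    for _, group in {(m["name"], m[key]) for m in matched}:
        breached[group] = breached.get(group, 0) + 1
    return {group: breached.get(group, 0) / n for group, n in totals.items()}


matched, unmatched = link_breaches(universities, breach_reports)
rates = breach_rate_by_group(universities, matched, "control")
print(rates)
print([r["organization"] for r in unmatched])
```

Keeping the unmatched reports visible matters: a high unmatched share suggests the name normalization is too strict, which would bias any breach rate computed from the matched rows.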