Web Privacy Measurement (WPM) is the observation of websites to detect, characterize and quantify privacy-impacting behaviors. Applications of WPM include the detection of price discrimination, targeted news articles and new forms of browser fingerprinting. Although originally focused solely on privacy violations, WPM now encompasses measuring security violations on the web as well.
For these studies to be truly large-scale and repeatable, creating an automated measurement platform is necessary. At least within the academic literature, measurement infrastructures in the field of WPM have been largely one-off and do not comprehensively address the engineering challenges within this realm.
OpenWPM, a flexible, stable, scalable and general web measurement platform, is our solution to this infrastructure vacuum. It makes it easy to collect data for privacy studies on a scale of thousands to millions of sites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection.
OpenWPM has been developed and tested on Ubuntu 14.04/16.04. An installation script, install.sh, is included to install both the system and Python dependencies automatically. A few of the Python dependencies require specific versions, so you should install the dependencies in a virtual environment if you’re installing on a shared machine. If you plan to develop OpenWPM’s instrumentation extension or run tests, you will also need to install the development dependencies included in
It is likely that OpenWPM will work on platforms other than Ubuntu.
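The virtual environment advice above can be sketched as follows. This is only an illustration under stated assumptions: it uses the `venv` module from the Python 3 standard library, while the text does not say which virtual environment tool or Python version OpenWPM expects. The function name `create_isolated_env` and the directory name in the usage note are hypothetical and not part of OpenWPM.

```python
import sys
import venv
from pathlib import Path


def create_isolated_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment in env_dir and return its interpreter path.

    Hypothetical helper, not part of OpenWPM. Assumes Python 3, whose
    standard library provides the venv module.
    """
    # EnvBuilder.create() makes the directory and writes pyvenv.cfg;
    # with_pip=True also bootstraps pip into the new environment.
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(str(env_dir))

    # The interpreter lives in bin/ on POSIX systems and Scripts/ on Windows.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Calling `create_isolated_env(Path("openwpm-env"))` would give an interpreter whose installed packages stay separate from the system-wide ones. The version-pinned Python dependencies could then be installed with that interpreter, leaving other users of a shared machine unaffected.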
Our primary technical contributions thus far are as follows:
Note that OpenWPM is under active development and should be considered experimental software.