Crawl of the Alexa Top Million domains from October 5-7, 2016 using ZBrowse, a headless Chrome browser instrumented to track object dependencies.

Study Details

Study
Security Challenges in an Increasingly Tangled Web
World Wide Web Conference (WWW), April 2017
Authors
Deepak Kumar, Zane Ma, Zakir Durumeric, Ariana Mirian, Joshua Mason, J. Alex Halderman, Michael Bailey
Citation
@InProceedings{tangledweb17,
	title={Security Challenges in an Increasingly Tangled Web},
	author={Kumar, Deepak and Ma, Zane and Durumeric, Zakir and Mirian, Ariana and
		Mason, Joshua and Halderman, J. Alex and Bailey, Michael},
	booktitle = {Proceedings of the 26th International Conference on World Wide Web},
	year={2017}
}
Contact
Deepak Kumar, Censys Research Team
Tags
TCP/80, ZBrowse

Dataset Details

The dataset contains one JSON blob per website and presents the dependencies loaded by that website as a tree.
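As a rough illustration of how such a per-website dependency tree might be consumed, the sketch below parses one JSON blob per line and walks each tree in pre-order. The schema is not documented on this page, so the key names `url` and `children` are hypothetical placeholders, as is the assumption that the decompressed file holds one blob per line. Adjust both to match the actual ZBrowse output.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple

# ASSUMPTION: these key names are hypothetical; the real ZBrowse
# schema is not described on this page and may differ.
URL_KEY = "url"
CHILDREN_KEY = "children"


def iter_sites(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Parse one JSON blob per non-empty line (assumed layout)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def walk_dependencies(root: Dict[str, Any]) -> Iterator[Tuple[int, str]]:
    """Yield (depth, url) for every node of the tree in pre-order.

    An explicit stack is used instead of recursion so that unusually
    deep trees cannot hit Python's recursion limit.
    """
    stack = [(0, root)]
    while stack:
        depth, node = stack.pop()
        yield depth, node.get(URL_KEY, "")
        # Push children in reverse so they are visited in original order.
        for child in reversed(node.get(CHILDREN_KEY, [])):
            stack.append((depth + 1, child))


def max_depth(root: Dict[str, Any]) -> int:
    """Length of the longest dependency chain below the root."""
    return max(depth for depth, _ in walk_dependencies(root))
```

Because `iter_sites` accepts any iterable of strings, it can be fed an open text file or a streaming decompressor, which avoids loading the full dataset into memory.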

File Download

File Name | MetaData | SHA-1 Fingerprint | Size | Updated At
www_17.lz4 | unavailable | 78C0491969DA6CC84E032CFEE1B9408437A7E2D4 | 42 GB | 2016-06-12
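Since the listing publishes a SHA-1 fingerprint, a download can be checked against it before use. The following is a minimal sketch using Python's standard `hashlib`. The function names and the local file path are illustrative assumptions, while the expected digest is the one listed above. The file is read in chunks because it is too large to hold in memory.

```python
import hashlib

# Fingerprint copied from the file listing above.
EXPECTED_SHA1 = "78C0491969DA6CC84E032CFEE1B9408437A7E2D4"


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(path: str, expected: str = EXPECTED_SHA1) -> bool:
    """Compare a file's SHA-1 with the expected digest, ignoring hex case."""
    return sha1_of_file(path).lower() == expected.lower()
```

Calling `matches_fingerprint("www_17.lz4")`, where the path is a hypothetical local download location, returns `True` only if the digest matches the listed fingerprint. Judging by the `.lz4` extension, the file then needs to be decompressed with an LZ4 tool before the JSON can be read.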