Finding samples of various types of security-related data can be a giant pain. This is my attempt to keep a somewhat curated list of security-related data I've found, created, or been pointed to. If you perform any kind of analysis with any of this data, please let me know and I'd be happy to link to it from here or host it here. Hopefully, looking at others' research and analysis will inspire people to add on, improve, and create new ideas.
All data generated and hosted by Security Repo is released under the following license (exceptions noted where applicable).
Security Repo by Mike Sconzo is licensed under a Creative Commons Attribution 4.0 International License.
Q: How do you give without having to do anything?
A: Simply visit this site.
I've decided that I'm going to start posting the logs from this site to the site. It's a great way to open source some data, and after a few discussions I don't think any privacy will be violated. If I receive a lot of backlash about this decision perhaps I'll reverse it, but until further notice web logs for this domain will be available here.
Data
Created
- Network
- MACCDC2012 - Generated with Bro from the 2012 dataset
- Bro logs generated from various Threatglass samples
- Snort logs generated from various Threatglass samples
Exploit kits and benign traffic, unlabeled data. 6663 samples available.
- tg_snort_fast.7z - Snort Fast Alert format logs (5MB)
- tg_snort_full.7z - Snort Full Alert format logs (9MB)
- Gameover Zeus DGA sample - 31000 DGA domains from Dec 2014
- Domain Transfer Data - Old domain transfer data from several registrars, JSON format. (8MB)
- Modbus and DNP3 logs - ICS logs generated w/Bro from various PCAPs (1MB)
- Malware
- System
- Web Logs from Security Repo - these logs are generated by you the community, and me updating this site.
- Squid Access Log - combined from several sources (24MB compressed, ~200MB uncompressed)
- auth.log - approx 86k lines, and mostly failed SSH login attempts
- Honeypot data - Data from various honeypots (Amun and Glastopf) used for various BSides presentations posted below. Approx 994k entries, JSON format.
- Other
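As a rough illustration of working with the auth.log sample listed above, here is a minimal Python sketch that counts failed SSH password attempts per source IP. It assumes the log uses the usual OpenSSH syslog wording ("Failed password for ... from <ip> port <n>"); the function name and the regular expression are illustrative assumptions and have not been checked against the hosted file.

```python
import re
from collections import Counter

# Assumption: entries follow the common OpenSSH message format.
# Other message variants in the real file would simply be skipped.
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def count_failed_logins(lines):
    """Return a Counter mapping source IP to failed password attempts.

    `lines` can be any iterable of strings, such as an open file handle.
    """
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts
```

To try it on the real data, pass an open file handle, for example `count_failed_logins(open("auth.log", errors="replace"))`, where the path is wherever you saved the download, and then call `.most_common(10)` on the result to see the noisiest sources.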
3rd Party
- Other
- Network
- KDD Cup 1999 Data - Network connection data [License Info: Unknown]
- NETRESEC - Publicly available PCAP files - loads of great PCAP files [License Info: Unknown]
- Internet-Wide Scan Data Repository - Various types of scan data [License Info: Unknown]
- Detecting Malicious URLs - Mirror - URLS/features/labels [License Info: Unknown]
- hackertarget 500K HTTP Headers - HTTP Headers [License Info: Unknown]
- Threatglass - PCAPs that contain various exploit kits as well as some legit traffic mixed in. [License Info: Unknown]
- pcapr - Searchable repository of PCAPs, look for various phrases to pull out the security-related ones (e.g. exploit, xss, etc...) [License Info: TOS]
- OpenDNS public domain lists - various domain lists [License Info: Public Domain]
- MIT 1999 DARPA Intrusion Detection Evaluation Data Set - Labeled attack and non-attack data (PCAP and system logs) [License Info: Unknown]
- MIT 1998 DARPA Intrusion Detection Evaluation Data Set - Network and file system data [License Info: Unknown]
- DDS legit and DGA labeled domains - DDS Blog [License Info: Unknown]
- Honeypot Data - DDS Blog [License Info: Unknown]
- Honeypot Data with GeoIP info - DDS Blog [License Info: Unknown]
- DGA Domains - updated frequently [License Info: License]
- Malware URLs - updated daily list of domains and URLs associated with malware [License Info: Disclaimer posted in link]
- UDP Scan data - provided by Rapid7 [License Info: Unknown]
- Continuously updated IP block list - Created by Packetmail (?) [License Info: no for-sale or paywall use]
- Common Crawl - "open repository of web crawl data that can be accessed and analyzed by anyone" [License Info: Open]
- Malware Traffic Analysis - a site with labeled exploit kits and phishing emails. [License Info: Unknown]
- Simple Web Traces - Cloud Storage, DDoS, DNSSEC, and many more types of PCAPs. [License Info: Various]
- SiLK - LBNL-05 Anonymized enterprise packet header traces. [License Info: Unknown]
- DGA Archive - Multiple DGA data sets generated by the actual algorithm vs. captured network traffic. [License Info: CC BY-NC-SA 3.0]
- Information Security Centre of Excellence (ISCX) - Data related to Botnets and Android Botnets. [License Info: Unknown]
- CSIC 2010 HTTP Dataset - Labeled (normal, anomalous) HTTP data in CSV format. [License Info: Unknown]
- VAST Challenge 2012 - IDS logs generated by IEEE [License Info: Unknown]
- University of Victoria Botnet Dataset - Malicious and benign traffic from LBNL and Ericsson (merged publicly available data) [License Info: Unknown]
- UCSD Network Telescope Dataset on the Sipscan - Public and restricted datasets of various malware and other network traffic. [License Info: Available on dataset page]
- UNSW-NB15 - This data set has nine families of attacks, namely, Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode and Worms. (CSV data) [License Info: Unknown]
- Stratosphere IPS Public Datasets - PCAPs, Samples, etc... [License Info: Unknown]
- Awesome Industrial Control System Security - Has links to SCADA PCAPs and other SCADA related resources [License Info: Apache License 2.0 (site), Data: various]
- Cisco Umbrella Popularity List - Daily list of the top 1 million most popular domains [License Info: Unknown]
- Alexa Top 1 Million - The static 1 million most popular sites by Alexa [License Info: Unknown]
- Using machine learning to detect malicious URLs - Code and labeled URL data. [License Info: Unknown]
- Majestic Million Domains - Top million domains with the most referring subnets. [License Info: Attribution 3.0 Unported (CC BY 3.0)]
- IoT device captures - IoT Device PCAP by Aalto University Research [License Info: Listed on site]
- Project Bluesmote - Syrian Bluecoat Proxy Logs [License Info: Public Domain]
- Data for a Black Hat 2017 Handout - Various types of data (network, host, etc...) for different use cases (e.g. Remote Exploitation, Spear Phishing, Ransomware, WebShell) [License Info: Apache 2]
- Aktion Open Source Exploit Detection Tool - Variety of different kinds of data centered around exploit detection [License Info: Apache 2]
- Aktion V2 Open Source Exploit Detection Tool - Variety of different kinds of data centered around exploit detection [License Info: Apache 2]
- 2017-SUEE-data-set - PCAP files that show various HTTP attacks (slowloris, slowhttptest, slowloris-ng) [License Info: Unknown]
- UCI ML Repository - Website Phishing Data Set - A collection of phishing websites as well as legitimate ones. [License Info: Listed on site]
- 2007 TREC Public SPAM Corpus - SPAM Corpus [License Info: Listed on site]
- ML Driven Web Application Firewall - Machine learning driven web application firewall to detect malicious queries with high accuracy (URL data) [License Info: Unknown]
- West Point NSA Data Sets - Snort IDS, DNS Service, and Web Server logs. [License Info: Unknown]
- Phish-IRIS - A small scale multi-class phishing web page screenshots archive [License Info: Listed on site]
- DGArchive - Samples of DGA domains from various types of malware. [License Info: Contact for access/info]
- Netlab360 DGA Domains - Samples of DGA domains from various types of malware. [License Info: Unknown]
- Quantcast Top Sites - Most popular sites on the Internet according to Quantcast. [License Info: Unknown]
- DomCop Top 1M - Top One Million sites according to DomCop. [License Info: Unknown]
- Blackweb Domains - A project that aims to categorize as many domains as possible, also provides a whitelist. [License Info: Unknown]
- Charles University SIS Access Log Dataset - The package contains an anonymized server log collected on a live installation of a student information system run by Charles University between May and November 2018 [License Info: Creative Commons Attribution 4.0 International]
- Malware
- System
- File
- contagio malware dump - A resource for files/data regarding targeted attacks [License Info: Unknown]
- VirusShare.com - Because Sharing is Caring [Login Required] - Huge collection of downloadable/torrentable malware files for various architectures [License Info: Unknown]
- Vx Heaven - sorted by AV set of virus samples (available via BitTorrent) [License Info: Unknown]
- TechHelpList SPAM List - Samples of SPAM messages and associated threat that was delivered in addition to other rich information [License Info: Unknown]
- MalShare - A community driven public malware repository. [License Info: TOS]
- URLhaus - Daily malware batches. [License Info: CC-0]
- MalwareBazaar - Daily malware batches. [License Info: CC-0]
- Password
- Threat Feeds
- ISP Abuse Email Feed - Feed showing IOCs from various Abuse reports (other feeds also on the site) [License Info: Unknown]
- VXvault - List of URLs and MD5s that are malicious [License Info: Unknown]
- AlienVault OTX - Build your own threat feed from community contributors, complete with API [License Info: Legal Info]
- Tracker - Malware hashes and their associated campaigns [License Info: About]
- Malware Domain List - Labeled malicious domains and IPs [License Info: Unknown]
- Clean MX Phishing DB - URLs and IPs associated with phishing emails, also targets are listed where determined [License Info: Unknown]
- Clean MX Virus DB - Labeled URLs and IPs associated with various types of malware [License Info: Unknown]
- TechHelpList MalTLQR Upatre and Dyreza Tracker - IPs and hashes for Upatre and Dyreza families [License Info: Unknown]
- CyberCrime Tracker - Labeled URLs and IPs for various malware families [License Info: Unknown]
- CyberCrime ZbotScan - List of hashes associated with various Zbot variants [License Info: Unknown]
- abuse.ch trackers - Trackers for ransomware, ZeuS, SSL Blacklist, SpyEye, Palevo, and Feodo [License Info: Unknown]
- Unit 42 Indicators - Indicators from the Unit 42 reports [License Info: Unknown]
- Threat Feeds - Threat feed aggregator [License Info: Various]
- C2IntelFeeds - Automatically created C2 feeds, currently VPNs and various C2. [License Info: Unknown]
Contact
If you dig the site, have data, need data, or whatever, find me on Twitter or GitHub.
Misc
Various things that I needed to stick someplace.