Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set"
Description
Machine learning-based intrusion detection requires suitable and realistic
data sets for training and testing. However, data sets that originate from
real networks are rare. Network data is considered privacy-sensitive, and the
purposeful introduction of malicious traffic is usually not possible. In this
paper we introduce a labeled data set captured at a smart factory located
in Vienna, Austria, during normal operation and during penetration tests with
different attack types. The data set contains 173 GB of PCAP files, which
represent 16 days (395 hours) of factory operation. It includes MQTT, OPC UA,
and Modbus/TCP traffic. The captured malicious traffic was generated
by a professional penetration tester who performed two types of attacks: (a)
aggressive attacks that are easier to detect and (b) stealthy attacks that are
harder to detect. Our data set includes the raw PCAP files and extracted
flow data. Labels for packets and flows indicate whether packets (or flows)
originated from a specific attack or from benign communication. We describe
the methodology for creating the data set, conduct an analysis of the data
and provide detailed information about the recorded traffic itself. The data
set is freely available to support reproducible research and the comparability
of results in the area of intrusion detection in industrial networks.
File description:
a_day1, a_day2, s_day1, s_day2, tf_a and tf_s: Main data set. Files starting with "tf" are training files containing only benign operational data; all other files are attack files containing both operational data and attack data.
images.zip: Contains images describing the data.
extractions.zip: Contains extracted packets and flows in both labeled and unlabeled form.
a_day_tuesday_dos.zip: Additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.
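The naming rule for the main data set can be sketched in a few lines of Python. This is a minimal illustration, assuming the files are referred to by the base names listed above; the helper names `is_training_file` and `split_main_files` are hypothetical and not part of the data set.

```python
# Base names of the main data set files, as listed in the file description.
MAIN_FILES = ["a_day1", "a_day2", "s_day1", "s_day2", "tf_a", "tf_s"]


def is_training_file(name: str) -> bool:
    """Return True for training files (benign-only), identified by the "tf" prefix."""
    return name.startswith("tf")


def split_main_files(names):
    """Split file names into (training, attack) lists, preserving input order."""
    training = [n for n in names if is_training_file(n)]
    attack = [n for n in names if not is_training_file(n)]
    return training, attack


training, attack = split_main_files(MAIN_FILES)
print("training:", training)
print("attack:", attack)
```

The split only covers the six main files; images.zip, extractions.zip, and the unlabeled a_day_tuesday_dos.zip fall outside this rule and would need separate handling.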
Files
Total size: 136.6 GiB
MD5 checksum | Size
---|---
md5:4f64a19cdf5d0806512cb2cb34bb09a5 | 2.8 GiB
md5:e64e23cc64756365a207292bc11c8fc3 | 1.9 GiB
md5:31c7c2d1186095372e917885fb0c7e0e | 21.2 GiB
md5:80e6f68d3a5024131fd7d8b69f788f20 | 5.2 GiB
md5:3083b39a47ebdc9261a3d1e37d285c6a | 4.3 MiB
md5:cad30e6c1d0e7feeb06c41e3674974d0 | 5.5 GiB
md5:94c35bf961d07abb2617080fe4783d57 | 6.0 GiB
md5:e1e689c877b45858fc6c41fd000f85da | 40.3 GiB
md5:1a809db9022a5073ff4f30f00639addb | 53.7 GiB
Additional details
Dates
- Submitted: 2024-08-11. First submission of paper. Dataset available for editors and reviewers only.