nPrint OS Detection
Dataset Overview
This dataset was created by re-labeling the traffic originally captured in the CICIDS 2017 dataset. The labels were generated using the mapping of IP address to operating system available on the CICIDS website. The traffic was first split by the source IP address of each packet, and each source's packets were then split sequentially into 100-packet samples. Although two versions of the task were tested, the leaderboard and dataset below represent the harder OS detection task, across 13 classes.
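The splitting procedure described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the `(source_ip, packet_bytes)` tuple representation and the function name `split_into_samples` are hypothetical, and dropping a trailing group of fewer than 100 packets is an assumption, since the handling of leftover packets is not stated here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet representation: (source IP, raw packet bytes).
Packet = Tuple[str, bytes]


def split_into_samples(
    packets: Iterable[Packet], sample_size: int = 100
) -> List[Tuple[str, List[bytes]]]:
    """Group packets by source IP, then cut each group into sequential samples."""
    by_source: Dict[str, List[bytes]] = defaultdict(list)
    for src_ip, raw in packets:
        # Capture order is preserved within each source IP.
        by_source[src_ip].append(raw)

    samples: List[Tuple[str, List[bytes]]] = []
    for src_ip, pkts in by_source.items():
        # Assumption: a trailing group shorter than sample_size is dropped.
        for start in range(0, len(pkts) - sample_size + 1, sample_size):
            samples.append((src_ip, pkts[start:start + sample_size]))
    return samples


if __name__ == "__main__":
    # Two interleaved sources: 250 packets from one, 100 from the other.
    trace = [("10.0.0.1", b"x")] * 250 + [("10.0.0.2", b"y")] * 100
    for src, sample in split_into_samples(trace):
        print(src, len(sample))
```

Each resulting sample contains packets from a single sender, which is what allows one OS label to be attached to the whole sample.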
Task Description
The task is to classify the operating system of the device that sent each 100-packet sample.
Links and Facts
- Dataset Link: Google Drive
- Dataset Size (Uncompressed): < 1 GB
- Disallowed Features: IPv4 Source IP, IPv4 Destination IP, TCP Source Port, TCP Destination Port, TCP SEQ and TCP ACK Numbers.
- Number of Classes: 13
- pcapML Metadata Comment Format:
sampleID,easylabel_hardlabel
- Protocols: IPv4, TCP
- Metric to Optimize: Balanced Accuracy
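As a rough sketch of how the metadata comment and the metric listed above might be handled in practice, the snippet below parses a `sampleID,easylabel_hardlabel` comment and computes balanced accuracy as the unweighted mean of per-class recall. Several things here are assumptions: the function names are hypothetical, the parser assumes the easy label contains no underscore (it splits on the first underscore only), the label strings in the example are made up, and the metric follows the standard definition of balanced accuracy rather than any scoring script specific to this leaderboard.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def parse_comment(comment: str) -> Tuple[str, str, str]:
    """Split 'sampleID,easylabel_hardlabel' into its three fields.

    Assumption: the easy label itself contains no underscore.
    """
    sample_id, labels = comment.strip().split(",", 1)
    easy_label, hard_label = labels.split("_", 1)
    return sample_id, easy_label, hard_label


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class recall over the classes present in y_true."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # Made-up comment string, used only to illustrate the format.
    print(parse_comment("42,linux_ubuntu-server"))
    print(balanced_accuracy(["a", "a", "a", "b"], ["a", "a", "b", "b"]))
```

Because every class contributes equally to the mean, a classifier cannot score well by predicting only the most common operating systems among the 13 classes.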
Special Dataset Notes
None
Citation(s)
Original CICIDS Dataset
@article{sharafaldin2018toward,
title={Toward generating a new intrusion detection dataset and intrusion traffic characterization.},
author={Sharafaldin, Iman and Lashkari, Arash Habibi and Ghorbani, Ali A},
journal={ICISSP},
volume={1},
pages={108--116},
year={2018}
}
nPrint OS Detection Dataset
@inproceedings{10.1145/3460120.3484758,
author = {Holland, Jordan and Schmitt, Paul and Feamster, Nick and Mittal, Prateek},
title = {New Directions in Automated Traffic Analysis},
year = {2021},
isbn = {9781450384544},
url = {https://doi.org/10.1145/3460120.3484758},
doi = {10.1145/3460120.3484758},
series = {CCS '21}
}