Link Search Menu Expand Document

Nmap Network and IoT Device Fingerprinting

Dataset Overview

This dataset was gathered by actively probing remote network devices across the internet using Nmap. Labels were gathered through a combination of SSH and Telnet banner grabs (for routers) and Shodan (for IoT devices)

Task Description

The task is to use the raw packet responses from Nmap’s probes to accurately classify the 15 classes of devices.

  • Dataset Link: Google Drive
  • Dataset Size (Uncompressed): < 1GB
  • Disallowed Features: None
  • Number of Classes: 15
  • pcapML Metadata Comment Format: sampleID,label,probe_name
  • Protocols: IPv4, TCP, ICMP
  • Optimization Metric: balanced accuracy

Special Dataset Notes

Device fingerprinting represents a unique task in that each packet in the dataset is a response to a specific probe. As such, each response packet in the dataset is named according to the probe that it was responding to. This extra information is helpful when extracting features from the dataset, and is encoded in the pcapML metadata comment.

The IPs have been cryptopanned prior to release.

Citation(s)

Routers:

@article{holland2020classifying,
  title={Classifying Network Vendors at Internet Scale},
  author={Holland, Jordan and Teixeira, Ross and Schmitt, Paul and Borgolte, Kevin and Rexford, Jennifer and Feamster, Nick and Mayer, Jonathan},
  journal={arXiv preprint arXiv:2006.13086},
  year={2020}
}

IoT:

@inproceedings{10.1145/3460120.3484758,
author = {Holland, Jordan and Schmitt, Paul and Feamster, Nick and Mittal, Prateek},
title = {New Directions in Automated Traffic Analysis},
year = {2021},
isbn = {9781450384544},
url = {https://doi.org/10.1145/3460120.3484758},
doi = {10.1145/3460120.3484758},
series = {CCS '21}
}

Leaderboard


ModelBalanced AccuracyPaperCode
nPrint Autogluon95.4CCS 2021 - ArxivnPrint