Link Search Menu Expand Document

Usage

pcapml_fe exposes a simple iterator over a sorted pcapml encoded dataset. The iterator can return raw bytes, scapy packets or dpkt packets. The iterator contains all of the information about a single traffic sample, including the sampleID, the metadata associated with the traffic, the raw packet buffers, and the timestamps associated with each raw packet.

Below is a full example (source available here) of loading and iterating over the pcapml encoded DTLS dataset available for download

Note: The third import provides helpers that can read raw bytes into dpkt or scapy Ethernet frames (dpkt_readEther and scapy_readEther, respectively). The helpers are not strictly necessary to parse pcapML files.

import argparse
import pcapml_fe
from pcapml_fe_helpers import *

def main():
    '''
    Reads a pcapng file labeled and sorted with pcapml, presenting traffic samples to 
    the user for features to be extracted from. To test the method on a new dataset
    the only needed change is to load in a different dataset
    '''
    parser = argparse.ArgumentParser()
    parser.add_argument('pcapml_dataset')
    args = parser.parse_args()
    
    for traffic_sample in pcapml_fe.sampler(args.pcapml_dataset):
        extract_info(traffic_sample)

def extract_info(traffic_sample):
    '''
    Each sample contains the sampleID, metadata and a list of packets 
    with their associated timestamps
    '''
    print("Sample ID:", traffic_sample.sid)
    print("Sample metadata:", traffic_sample.metadata)
 
    '''
    iterating over the traffic sample (packets and timestamps)
    '''
    for pkt in traffic_sample.packets:
        # Print packet timestamp and raw bytes
        print(pkt.ts, pkt.raw_bytes)

        dpacket = dpkt_readEther(pkt.raw_bytes)
        dpacket.pprint()

        spacket = scapy_readEther(pkt.raw_bytes)
        print(spacket.summary())
        pass

if __name__ == '__main__':
    main()