ITEA is the Eureka Cluster on software innovation

IoT Dataset 2024b for Training AI Models

Project
20020 ENTA
Type
New service
Description

Network traffic was collected from nine widely used Internet of Things (IoT) devices: Amcrest Camera, Smart Coffeemaker, Ring Doorbell, Amazon Echodot, Google Nestcam, Google Nestmini, Kasa Powerstrip, Samsung 32 inch Smart Television, and Amazon Smartplug. The traffic collected from each device is stored as individual .pcap files. The dataset can be used for research in the area of IoT traffic analysis. It is available for download from IEEE Dataport: https://ieee-dataport.org/documents/dalhousie-nims-lab-iot-2024-dataset

Contact
Nur Zincir-Heywood
Email
zincir@cs.dal.ca
Research area(s)
IoT Traffic Classification, Activity Detection, Traffic Analysis
Technical features

This dataset presents real-world IoT device traffic captured under a scenario termed "Active," reflecting typical usage patterns encountered by everyday users. Our methodology emphasizes the collection of authentic data, employing rigorous testing and system evaluations to ensure fidelity to real-world conditions while minimizing noise and irrelevant capture. The dataset comprises traffic from nine popular IoT devices, with each device's traffic stored in individual files. For our research, we extract flows from these files using two flow analysis tools: Tranalyzer and NFStream. The dataset is organized into device-specific folders, each containing the labeled capture files for the "Active" scenario. Comprehensive details regarding our setup and methodology are provided in our IEEE/IFIP NOMS - AnNet 2024 paper, and the dataset's structure is explained in the readme file. Notably, all captured data is benign and free of malware. This dataset serves as a valuable resource for understanding IoT device behaviour and for applying AI models to network traffic patterns in a real-world context.
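To illustrate what extracting flows from a capture file involves, the following is a minimal sketch using only the Python standard library. It is not Tranalyzer or NFStream and does not reproduce their features. It assumes classic .pcap files with Ethernet framing and microsecond timestamps, and it only groups IPv4 TCP/UDP packets into bidirectional flows by 5-tuple. The function names (read_pcap_packets, five_tuple, aggregate_flows) and the example path are hypothetical and not part of the dataset.

```python
import struct


def read_pcap_packets(path):
    """Yield (timestamp, frame_bytes) for each record in a classic .pcap file."""
    with open(path, "rb") as f:
        header = f.read(24)  # pcap global header is 24 bytes
        if len(header) < 24:
            return
        magic = header[:4]
        if magic == b"\xd4\xc3\xb2\xa1":
            endian = "<"  # file written in little-endian byte order
        elif magic == b"\xa1\xb2\xc3\xd4":
            endian = ">"  # file written in big-endian byte order
        else:
            raise ValueError("not a classic microsecond-resolution pcap file")
        linktype = struct.unpack(endian + "I", header[20:24])[0]
        if linktype != 1:  # 1 = Ethernet
            raise ValueError("this sketch only handles Ethernet captures")
        while True:
            rec = f.read(16)  # per-packet record header
            if len(rec) < 16:
                break
            ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
            frame = f.read(incl_len)
            if len(frame) < incl_len:
                break  # truncated file
            yield ts_sec + ts_usec / 1e6, frame


def five_tuple(frame):
    """Return (src_ip, dst_ip, src_port, dst_port, proto) or None.

    Handles untagged Ethernet + IPv4 + TCP/UDP only; VLAN tags, IPv6 and
    fragmentation are ignored in this sketch.
    """
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":
        return None
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    proto = frame[23]
    if proto not in (6, 17) or len(frame) < 14 + ihl + 4:
        return None
    src = ".".join(str(b) for b in frame[26:30])
    dst = ".".join(str(b) for b in frame[30:34])
    sport, dport = struct.unpack("!HH", frame[14 + ihl:14 + ihl + 4])
    return src, dst, sport, dport, proto


def aggregate_flows(packets):
    """Group packets into bidirectional flows with simple per-flow statistics."""
    flows = {}
    for ts, frame in packets:
        key = five_tuple(frame)
        if key is None:
            continue
        src, dst, sport, dport, proto = key
        a, b = (src, sport), (dst, dport)
        # Sort the endpoints so both directions map to the same flow key.
        flow_key = (min(a, b), max(a, b), proto)
        stats = flows.setdefault(
            flow_key, {"packets": 0, "bytes": 0, "first": ts, "last": ts}
        )
        stats["packets"] += 1
        stats["bytes"] += len(frame)
        stats["first"] = min(stats["first"], ts)
        stats["last"] = max(stats["last"], ts)
    return flows


# Example usage (hypothetical path; see the dataset readme for the real layout):
# flows = aggregate_flows(read_pcap_packets("path/to/device/capture.pcap"))
```

Per-flow statistics such as packet count, byte count, and duration (last minus first) are the kind of features a flow exporter produces in far greater detail. For actual experiments, a dedicated tool is the better choice.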

Integration constraints

The data can be used for model training for Artificial Intelligence (AI), Machine Learning (ML), or Deep Learning (DL) experiments.

Targeted customer(s)

Researchers in the area of network traffic analysis, whether from academia, government, or industry.

Conditions for reuse

The terms of reuse are dictated by the IEEE Dataport licensing agreement.

Confidentiality
Public
Publication date
22-07-2024
Involved partners
Dalhousie University (CAN)