IoT Dataset 2024b for Training AI Models
- Project
- 20020 ENTA
- Type
- New service
- Description
Network traffic was collected from nine widely used Internet of Things (IoT) devices: Amcrest Camera, Smart Coffeemaker, Ring Doorbell, Amazon Echodot, Google Nestcam, Google Nestmini, Kasa Powerstrip, Samsung 32 inch Smart Television, and Amazon Smartplug. The traffic collected from each device is stored as individual .pcap files. The dataset can be used for research in IoT traffic analysis and is available for download from IEEE Dataport: https://ieee-dataport.org/documents/dalhousie-nims-lab-iot-2024-dataset
- Contact
- Nur Zincir-Heywood
- zincir@cs.dal.ca
- Research area(s)
- IoT Traffic Classification, Activity Detection, Traffic Analysis
- Technical features
This dataset presents real-world IoT device traffic captured under a scenario termed "Active," which reflects the typical usage patterns of everyday users. Our methodology emphasizes the collection of authentic data, employing rigorous testing and system evaluations to ensure fidelity to real-world conditions while minimizing noise and irrelevant capture. The dataset comprises traffic from nine popular IoT devices, with each device's traffic stored in individual files. For our research, we extract flows from these files using two flow analysis tools: Tranalyzer and NFStream. The dataset is organized into device-specific folders, each containing the "Active" scenario and its correspondingly labeled files. Comprehensive details of our setup and methodology are provided in our IEEE/IFIP NOMS - AnNet 2024 paper, and the readme file explains the dataset's structure in full. Notably, all captured data is benign and free of malware. This dataset serves as a valuable resource for understanding IoT device behaviour and for applying AI models to network traffic patterns in a real-world context.
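The folder organization described above can be explored with a short script before any flow extraction. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that the archive unpacks into one top-level folder per device and that the captures use the classic libpcap format rather than pcapng. The function names `index_by_device` and `count_packets` are hypothetical, and the readme file remains the authoritative description of the layout.

```python
import struct
from collections import defaultdict
from pathlib import Path

# Classic libpcap magic numbers, as seen when the first 4 bytes are read
# little-endian, mapped to the byte order used by the rest of the file.
_MAGIC_TO_ENDIAN = {
    0xA1B2C3D4: "<",  # microsecond timestamps, little-endian file
    0xD4C3B2A1: ">",  # microsecond timestamps, big-endian file
    0xA1B23C4D: "<",  # nanosecond timestamps, little-endian file
    0x4D3CB2A1: ">",  # nanosecond timestamps, big-endian file
}


def index_by_device(root):
    """Map each top-level folder name under root to its sorted .pcap paths.

    Assumption: the first path component below root names the device.
    """
    index = defaultdict(list)
    for path in sorted(Path(root).rglob("*.pcap")):
        device = path.relative_to(root).parts[0]
        index[device].append(path)
    return dict(index)


def count_packets(pcap_path):
    """Count packet records in a classic .pcap file.

    Raises ValueError for files that are not classic libpcap (e.g. pcapng).
    A truncated final record is still counted if its header is complete.
    """
    with open(pcap_path, "rb") as fh:
        global_header = fh.read(24)  # fixed-size pcap global header
        if len(global_header) < 24:
            raise ValueError(f"{pcap_path}: too short to be a pcap file")
        (magic,) = struct.unpack("<I", global_header[:4])
        endian = _MAGIC_TO_ENDIAN.get(magic)
        if endian is None:
            raise ValueError(f"{pcap_path}: not a classic pcap file")
        count = 0
        while True:
            record_header = fh.read(16)  # ts_sec, ts_frac, incl_len, orig_len
            if len(record_header) < 16:
                break
            _, _, incl_len, _ = struct.unpack(endian + "IIII", record_header)
            fh.seek(incl_len, 1)  # skip the packet bytes themselves
            count += 1
        return count
```

Calling `index_by_device` on the unpacked dataset folder and then `count_packets` on each returned path gives a quick per-device inventory, which can help confirm that a download is complete before running Tranalyzer or NFStream on the files.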
- Integration constraints
The data can be used to train models in Artificial Intelligence (AI), Machine Learning (ML), or Deep Learning (DL) experiments.
- Targeted customer(s)
Researchers in the area of Network Traffic Analysis, whether from Academia, Government, or Industry.
- Conditions for reuse
The terms of reuse are dictated by the IEEE Dataport licensing agreement.
- Confidentiality
- Public
- Publication date
- 22-07-2024
- Involved partners
- Dalhousie University (CAN)