Under Review at the ACM MM 2026 Dataset Track
IMPACT: A Multi-view Industrial Assembly Benchmark
for Fine-grained Action Understanding, Procedural Reasoning, and Error Analysis
A synchronized five-view RGB-D dataset and benchmark for industrial assembly under viewpoint shift, partial observability, multi-route execution, and anomaly recovery.
Assembly Task and Recording Setup
IMPACT is collected around real angle grinder assembly and disassembly with synchronized ego and exo RGB-D, gaze, audio, and cognitive metadata.
Dataset Overview
IMPACT comprises 112 trials from 13 participants and 39.5 hours of synchronized industrial assembly data.
Two physical product configurations
Two angle grinder configurations provide a natural cross-configuration generalization setting.
Partial-order procedural execution
Valid trajectories follow a prerequisite graph rather than a single rigid sequence.
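As a minimal sketch of what validity under a prerequisite graph means, the check below tests whether an observed step order respects every prerequisite. The step names and the graph are hypothetical placeholders, not the released IMPACT procedure metadata, and the function name `is_valid_order` is introduced here for illustration only.

```python
from typing import Dict, List, Set


def is_valid_order(sequence: List[str], prerequisites: Dict[str, Set[str]]) -> bool:
    """Return True if every step appears only after all of its prerequisites."""
    done: Set[str] = set()
    for step in sequence:
        # A step is executable only once all of its prerequisites are completed.
        if not prerequisites.get(step, set()) <= done:
            return False
        done.add(step)
    return True


# Hypothetical graph (assumption): step -> set of steps that must precede it.
EXAMPLE_GRAPH = {
    "mount_guard": set(),
    "attach_handle": set(),
    "mount_disc": {"mount_guard"},
    "tighten_nut": {"mount_disc"},
}
```

Under this toy graph, `attach_handle` can come before or after `mount_guard`, so more than one route is valid, while placing `mount_disc` before `mount_guard` is rejected.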
Anomaly and compliance supervision
Explicit anomaly recovery supervision is paired with compliance phases and a six-category anomaly taxonomy.
Trial-level cognitive workload
Each trial is paired with NASA-TLX workload measurement.
Benchmark Tasks
Download
The public release combines repository-hosted protocol assets with Google Drive bundles for annotations, multi-view media, ego eye-tracking traces, released feature packs, and a quick-start sample subset.
| Type | Status | Content |
|---|---|---|
| Protocol Assets | In Repo | Official splits, mappings, labels, and procedure metadata included in this repository. |
| Starter Code | In Repo | Task wrappers, evaluation scripts, and released baselines included in this repository. |
| Full Annotation Bundle | Released | Official release folder containing annotations_v1.zip. |
| RGB Videos | Released | Five-view RGB archives distributed via the Google Drive release: videos_ego, videos_front, videos_left, videos_right, and videos_top. |
| Depth Streams | Released | Exocentric depth archives distributed via the Google Drive release: depth_front, depth_left, depth_right, and depth_top. |
| Ego Audio | Released | audio_ego.zip for the egocentric audio stream, distributed via the Google Drive release. |
| Ego Eye Tracking | Released | eye_tracking_ego.zip for egocentric eye-tracking TSV traces, distributed via the Google Drive release. |
| Pre-extracted Features | Released | Released feature bundles for I3D, MViTv2, and VideoMAEv2, distributed via the Google Drive release. |
| Sample Pack | Released | Quick-start subset with 3 executions, multi-view media, eye-tracking traces, released features, and task annotations for download verification and review. |
| Hugging Face Mirror | Live | Public dataset mirror for the IMPACT release bundles. |
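For download verification, one simple option is to confirm that a downloaded archive opens and that its members pass their CRC checks. The sketch below uses only the Python standard library. The helper name `first_corrupt_member` and the local path in the usage comment are assumptions, and the sketch does not replace any checksums the release itself may provide.

```python
import zipfile
from pathlib import Path
from typing import Optional


def first_corrupt_member(archive_path: Path) -> Optional[str]:
    """Return the name of the first corrupt member, or None if all CRCs check out."""
    with zipfile.ZipFile(archive_path) as archive:
        # testzip() reads every member and compares it against its stored CRC.
        return archive.testzip()


# Usage (hypothetical local path):
# bad = first_corrupt_member(Path("downloads/annotations_v1.zip"))
# print("archive OK" if bad is None else f"corrupt member: {bad}")
```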
Release Update
Planned dataset maintenance updates will be announced here.
Annotation Version Update
A refined annotation release is planned for late June 2026. It will reduce residual annotation noise, improve label accuracy, and include a detailed changelog documenting the differences and corrections between versions. The current public release is Annotation v1, intended for academic and early use.
Annotation Schema
Aligned supervision spans interaction dynamics, procedural structure, compliance, and state evolution.
Fine-grained actions
Hand-specific labels capture coordinated bimanual interaction.
Procedural structure
Coarse steps and completion events connect segmentation to graph-based reasoning.
Assembly states
Component-wise ternary states support direct ASR and indirect PSR.
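To illustrate how component-wise ternary states could be represented and compared over time, here is a hedged sketch. The three state names, the component names, and the idea of recovering step evidence by diffing consecutive state vectors are assumptions made for illustration; the released label mapping and task protocols in the repository are authoritative.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class ComponentState(IntEnum):
    # Hypothetical value names (assumption); the released mapping may differ.
    DETACHED = 0
    IN_PROGRESS = 1
    ATTACHED = 2


def state_changes(
    states: List[Dict[str, ComponentState]]
) -> List[Tuple[int, str, ComponentState, ComponentState]]:
    """List (index, component, old, new) for each change between consecutive state vectors."""
    changes = []
    for i in range(1, len(states)):
        prev, curr = states[i - 1], states[i]
        for component in sorted(curr):
            # A component missing from the previous vector is treated as unchanged.
            old = prev.get(component, curr[component])
            if old != curr[component]:
                changes.append((i, component, old, curr[component]))
    return changes
```

Each reported change marks a point where a component moved between states, which is the kind of signal a state-based method could use to infer step progress without predicting steps directly.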
Compliance and anomalies
Compliance phases are paired with anomaly categories for diagnosis.
Benchmark
Released methods are organized by task family, with detailed protocols maintained on the task-specific pages.
Released coverage
- TAS, CV-TA, CV-SM, AF-S, PSR, ASR, PPR, and ATR are released.
- AF-L is documented in the repository with its task protocol and baseline notes.
- Detailed model lists and launch commands are maintained in the repository.
Metrics
- Dense prediction metrics for temporal and compliance tasks.
- Retrieval and classification metrics for cross-view tasks.
- Task-specific forecasting, state, and procedural reasoning metrics.
Leaderboard
Leaderboard links and submission policy will be added when the public benchmark release is finalized.
Resources
The repository is the operational layer for reproducible benchmarking and protocol inspection.
Paper, Citation, and License
These items are visible early because they matter to both users and reviewers.
Paper & Citation
Paper PDF and citation links will be added here when the public release is finalized.
@inproceedings{impact2026,
title = {IMPACT: A Dataset for Multi-Granularity Human Procedural Action Understanding in Industrial Assembly},
author = {TBD},
booktitle = {ACM Multimedia},
year = {2026}
}
License
- Repository-authored code: repository root LICENSE.
- Dataset assets and documentation: LICENSE-DATA.
- third_party/: per-directory provenance and license notices.
Acknowledgements
We gratefully acknowledge Lei Qi, Weitong Kong, Chen Zhang, Haiwen Sun, and Yuwei Hu for their valuable support throughout the data collection and annotation process for IMPACT.