IIoT Lightweight IDS Generalization Failure

Arxiv other 2026-07-01T00:00:00

Abstract

Lightweight machine learning models are increasingly proposed for intrusion detection in Industrial Internet of Things (IIoT) networks due to their suitability for resource-constrained edge deployment. However, most reported results evaluate these models only within their training network, leaving behavior on unseen industrial networks largely unverified. This study trains four representative lightweight architectures on one widely used IIoT dataset and evaluates them, without retraining, on two independent and structurally distinct IIoT datasets, using a feature representation restricted to attributes available across all three sources. Explainability analysis, corroborated across two architecturally distinct top-performing models, shows that both rely overwhelmingly on coarse port-category features. A direct comparison of port-category prevalence across datasets reveals that the most influential category occurs in source-domain attack traffic at 96 to 435 times the rate observed in the two target domains, indicating that coarsening port resolution relocates rather than removes a documented shortcut. Evaluation under naturally imbalanced class distributions, rather than the balanced distributions common in prior work, reveals a further effect: the choice of evaluation protocol can reverse which target network appears to pose the greater generalization challenge. Adversarial robustness and the capacity to recover performance through limited exposure to target-domain data are also assessed. Robustness to adversarial perturbation is found to be unrelated to a model's ability to generalize across networks, and recovery through limited adaptation varies considerably by architecture. These findings suggest that deployment readiness for lightweight IIoT intrusion detection should be assessed using cross-network evaluation under realistic class distributions, rather than within-domain accuracy alone.
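
The port-category prevalence comparison described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's method: the three-way binning in `port_category` follows the standard IANA port ranges, the function names are hypothetical, and flows are represented as simple `(destination_port, is_attack)` pairs. The paper's actual port categories and datasets are not given in the abstract.

```python
from collections import Counter


def port_category(port: int) -> str:
    """Hypothetical coarse binning based on IANA port ranges (an assumption)."""
    if port < 1024:
        return "well_known"
    if port < 49152:
        return "registered"
    return "dynamic"


def category_prevalence(flows, category: str) -> float:
    """Fraction of attack flows whose destination port falls in `category`.

    `flows` is an iterable of (destination_port, is_attack) pairs.
    """
    attack_ports = [port for port, is_attack in flows if is_attack]
    if not attack_ports:
        return 0.0
    counts = Counter(port_category(p) for p in attack_ports)
    return counts[category] / len(attack_ports)


def prevalence_ratio(source_flows, target_flows, category: str) -> float:
    """Ratio of a category's prevalence in source attacks to target attacks."""
    source = category_prevalence(source_flows, category)
    target = category_prevalence(target_flows, category)
    # A category absent from target attacks gives an unbounded ratio.
    return float("inf") if target == 0 else source / target


if __name__ == "__main__":
    # Synthetic toy flows, not drawn from any real dataset.
    source = [(80, True)] * 8 + [(50000, True)] * 2 + [(443, False)] * 5
    target = [(80, True)] * 1 + [(50000, True)] * 99
    print(prevalence_ratio(source, target, "well_known"))
```

A large ratio for the category a model depends on most is the kind of signal the abstract describes: the feature separates attacks from benign traffic in the source network but carries little information in the target networks.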
