1. Dataset Poisoning IoCs
- Data distribution anomalies:
  – Sudden spike in rare labels or features.
  – Statistical patterns that shift relative to golden (trusted reference) datasets.
- Suspicious data sources:
  – Unverified or untrusted dataset contributions.
  – Metadata tampering (timestamps, authorship).
- Performance degradation:
  – Accuracy improves on the training set but collapses on validation data.
  – Bias indicators (model favoring adversary-preferred outputs).
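
One way to check for the label-distribution anomalies above is to compare an incoming batch against a golden dataset. The sketch below is a minimal illustration, not a production detector: the function name `label_shift_report`, the `spike_ratio` parameter, and the use of the Population Stability Index (PSI) as the drift score are all assumptions made for this example.

```python
import math
from collections import Counter

def label_shift_report(golden_labels, incoming_labels, spike_ratio=3.0, eps=1e-6):
    """Compare label frequencies of an incoming batch against a golden dataset.

    Returns (psi, spiking_labels). All names and defaults here are illustrative.
    """
    if not golden_labels or not incoming_labels:
        raise ValueError("both label lists must be non-empty")

    golden = Counter(golden_labels)
    incoming = Counter(incoming_labels)
    n_gold, n_in = len(golden_labels), len(incoming_labels)

    psi = 0.0
    spiking = []
    for label in sorted(set(golden) | set(incoming), key=str):
        # Clamp to eps so labels missing from one side do not produce log(0).
        p = max(golden[label] / n_gold, eps)
        q = max(incoming[label] / n_in, eps)
        psi += (q - p) * math.log(q / p)
        # Flag labels whose share grew by at least spike_ratio.
        if q / p >= spike_ratio:
            spiking.append(label)
    return psi, spiking

golden = ["cat"] * 50 + ["dog"] * 48 + ["fox"] * 2
batch = ["cat"] * 40 + ["dog"] * 40 + ["fox"] * 20
psi, spiking = label_shift_report(golden, batch)
print(f"PSI={psi:.3f}, spiking labels={spiking}")
```

A common rule of thumb treats a PSI above roughly 0.25 as a significant shift, but any threshold should be tuned on your own data before it drives alerts.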
2. Adversarial Sample IoCs
- Misclassification patterns:
  – High-confidence wrong predictions after minimal input changes.
- Perturbation footprints:
  – Inputs containing imperceptible noise vectors.
- Model instability:
  – Prediction confidence fluctuating abnormally.
  – Softmax outputs approaching a uniform distribution (near-random guessing).
- Detection evasion attempts:
  – Inputs crafted to bypass filters or guardrails.
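
Two of the signals above, near-uniform softmax outputs and predictions that flip under tiny input changes, can be scored with a few lines of code. This is a rough sketch under stated assumptions: `predict` is a hypothetical callable that maps a list of floats to class probabilities, and random noise is used as a cheap stand-in for a real adversarial attack, so a low flip rate here does not prove robustness.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Entropy scaled to [0, 1]: 0 means one-hot, 1 means uniform."""
    if len(probs) < 2:
        raise ValueError("need at least two classes")
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def flip_rate(predict, x, epsilon=0.01, trials=50, seed=0):
    """Fraction of small random perturbations that change the predicted class.

    `predict` is a hypothetical model wrapper: list[float] -> list[float].
    """
    rng = random.Random(seed)
    base = argmax(predict(x))
    flips = 0
    for _ in range(trials):
        noisy = [v + rng.uniform(-epsilon, epsilon) for v in x]
        if argmax(predict(noisy)) != base:
            flips += 1
    return flips / trials

# Toy two-class model used only to exercise the helpers.
toy_model = lambda x: softmax([x[0], -x[0]])
print(normalized_entropy(toy_model([0.0])))   # close to uniform
print(flip_rate(toy_model, [5.0]))            # far from the decision boundary
print(flip_rate(toy_model, [0.001]))          # right next to the boundary
```

Inputs with a high flip rate or high normalized entropy are candidates for closer review, not confirmed adversarial samples.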
3. Compromised Weight IoCs
- Integrity failures:
  – Hash/signature mismatch on model weights.
  – Absence of expected cryptographic watermarks.
- Backdoor triggers:
  – Model responding to specific hidden triggers (e.g., unusual tokens).
- Output drift:
  – Deviation from benchmark results on golden datasets.
- Unexpected behaviors:
  – Model produces harmful outputs (toxicity, bias) that were not present before.
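
The hash-mismatch check above is the easiest of these to automate. The sketch below uses Python's standard `hashlib` and `hmac` modules. The helper names `sha256_of_file` and `weights_match` are made up for this example, and it assumes the expected digest comes from a trusted source such as a signed release manifest.

```python
import hashlib
import hmac

def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large weight files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def weights_match(path, expected_sha256):
    """Return True if the file's SHA-256 equals the expected hex digest."""
    actual = sha256_of_file(path)
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A matching hash only shows the file is the one that was published. It says nothing about whether the published weights were already backdoored, so it complements the behavioral checks rather than replacing them.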
4. System-Level IoCs
- Infrastructure anomalies:
  – Unauthorized model file modifications.
  – Unexplained retraining events in logs.
- Data pipeline compromise:
  – Unusual network connections during dataset ingestion.
- Account abuse:
  – Insider or compromised credentials used to inject malicious updates.
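
Unexplained retraining events can be surfaced by cross-checking logs against approved change windows. The sketch below is purely illustrative: the event schema (`type`, `time`, `actor`), the window tuples, and the function name `unexplained_retraining` are assumptions for this example, not the format of any real logging or MLOps tool.

```python
from datetime import datetime

def unexplained_retraining(events, approved_windows):
    """Return retraining events that fall outside every approved window.

    events: dicts with 'type', 'time' (ISO 8601 string), 'actor' (hypothetical schema).
    approved_windows: (start, end, actor) tuples with ISO 8601 strings.
    """
    windows = [
        (datetime.fromisoformat(start), datetime.fromisoformat(end), actor)
        for start, end, actor in approved_windows
    ]
    flagged = []
    for event in events:
        if event["type"] != "retrain":
            continue
        when = datetime.fromisoformat(event["time"])
        approved = any(
            start <= when <= end and event["actor"] == actor
            for start, end, actor in windows
        )
        if not approved:
            flagged.append(event)
    return flagged

events = [
    {"type": "retrain", "time": "2024-05-01T02:10:00", "actor": "ml-pipeline"},
    {"type": "retrain", "time": "2024-05-03T23:45:00", "actor": "jdoe"},
    {"type": "deploy", "time": "2024-05-04T09:00:00", "actor": "ml-pipeline"},
]
windows = [("2024-05-01T02:00:00", "2024-05-01T04:00:00", "ml-pipeline")]
for event in unexplained_retraining(events, windows):
    print("Unexplained retraining:", event)
```

In practice the approved windows would come from a change-management system, and flagged events would feed an investigation queue rather than an automatic block.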
Key Takeaway
AI IoCs extend beyond system logs: they also cover data integrity, adversarial perturbations, and cryptographic validation of models. Monitoring for these signs is critical to detecting and containing AI-specific attacks early.
#AI #ThreatIntel #IndicatorsOfCompromise #AdversarialAI #MLSecurity #CyberDudeBivash
