Why
A tagging strategy without enforcement degrades silently. Within months, compliance drops as teams deploy via console, forget tags in ad-hoc scripts, or bypass shared IaC modules. Enforcement is the only path to near-100% tag compliance, and it must be layered: prevent at the source, detect what slips through, and remediate automatically where safe.
What
Implement a defence-in-depth enforcement stack across three preventive layers and three detective layers, rolled out progressively over ~12 weeks to avoid blocking legitimate deployments.
Defence in Depth
| Prevent | Detect & Fix |
|---|---|
| Layer 1: IaC Module Defaults | Layer 4: Compliance Scanning |
| Layer 2: CI/CD Pipeline Gates | Layer 5: Auto-Remediation |
| Layer 3: Cloud-Native Policies | Layer 6: Alerting & Tickets |
Combined target: > 98% tag coverage (spend-weighted)
How
Embed Tags in Shared IaC Modules
Make the mandatory tags required input variables in all shared Terraform modules, CloudFormation templates, and Bicep modules. Teams cannot deploy through these modules without providing them.
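
    # Terraform example: mandatory tag variable.
    # Attribute names are merged straight into the tag map, so they use the
    # same hyphenated keys that the cloud policies check (e.g. cost-center).
    variable "mandatory_tags" {
      type = object({
        cost-center   = string
        business-unit = string
        application   = string
        environment   = string
        owner         = string
      })

      validation {
        condition = contains(
          ["prod", "stg", "dev", "sbx"],
          var.mandatory_tags.environment
        )
        error_message = "The environment tag must be one of: prod, stg, dev, sbx."
      }
    }

    locals {
      all_tags = merge(var.mandatory_tags, {
        managed-by = "terraform"
        # timestamp() changes on every run, so this tag shows a diff on each
        # plan unless resources ignore it (lifecycle ignore_changes) or the
        # date comes from a stable source such as a time_static resource.
        created-date = formatdate("YYYY-MM-DD", timestamp())
      })
    }

This layer catches roughly 80–90% of violations. It misses console deployments, CLI one-offs, and resources auto-created by managed services.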
Add CI/CD Pipeline Validation
Add a scanning step to every CI/CD pipeline that deploys infrastructure. Start in warning mode, then promote to blocking mode after 2 weeks.
| Tool | What It Does |
|---|---|
| tflint | Terraform linter with custom rules for required tags |
| checkov | Policy-as-code scanner (Terraform, CF, ARM, Bicep) |
| OPA / Rego | General-purpose policy engine for plan output |
| Sentinel | HashiCorp policy-as-code (Terraform Cloud/Enterprise) |
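As an illustration of a policy check on plan output, the sketch below scans the JSON produced by `terraform show -json` for resources that are created or updated without the required tags. It is a minimal example under stated assumptions, not a replacement for the tools above: the tag keys are illustrative, the function names `find_missing_tags` and `gate` are hypothetical, and tags whose values are only known at apply time are not inspected.

```python
# Illustrative tag keys; substitute the organisation's mandatory set.
REQUIRED_TAGS = frozenset({"cost-center", "environment", "owner"})


def find_missing_tags(plan, required=REQUIRED_TAGS):
    """Return {resource address: sorted missing tag keys} for a plan dict.

    `plan` is the parsed output of `terraform show -json <planfile>`.
    """
    violations = {}
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        actions = set(change.get("actions", []))
        if not actions & {"create", "update"}:
            continue  # deletes and no-ops are not checked
        after = change.get("after") or {}
        if "tags" not in after and "tags_all" not in after:
            continue  # resource type exposes no tags attribute
        # Prefer tags_all (includes provider default tags) when it is present.
        tags = after.get("tags_all") or after.get("tags") or {}
        missing = sorted(set(required) - set(tags))
        if missing:
            violations[rc["address"]] = missing
    return violations


def gate(plan, blocking):
    """Print findings and return a process exit code.

    Warning mode always returns 0; blocking mode returns 1 on any violation.
    """
    violations = find_missing_tags(plan)
    level = "ERROR" if blocking else "WARNING"
    for address, missing in sorted(violations.items()):
        print(f"{level}: {address} is missing tags: {', '.join(missing)}")
    return 1 if (violations and blocking) else 0
```

In a pipeline, the plan would be exported with `terraform show -json tfplan > plan.json`, loaded with `json.load`, and the return value of `gate(plan, blocking=...)` passed to `sys.exit`. Switching `blocking` from `False` to `True` is the promotion from warning mode to blocking mode.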
Pipeline flow:

    git push
      → pre-commit: tflint (local, fast)
      → CI step 1: checkov / OPA scan
      → CI step 2: terraform plan
      → CI step 3: policy eval on plan output
      → CI step 4: terraform apply (only if all pass)

Deploy Cloud-Native Policies
Roll out in three phases: Audit → Enablement → Deny over ~4 weeks.
AWS — Service Control Policies (SCPs)
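
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Deny",
          "Action": ["ec2:RunInstances", "rds:CreateDBInstance", "s3:CreateBucket"],
          "Resource": ["arn:aws:ec2:*:*:instance/*", "arn:aws:rds:*:*:db:*", "arn:aws:s3:::*"],
          "Condition": { "Null": { "aws:RequestTag/cost-center": "true" } }
        },
        {
          "Effect": "Deny",
          "Action": ["ec2:RunInstances", "rds:CreateDBInstance", "s3:CreateBucket"],
          "Resource": ["arn:aws:ec2:*:*:instance/*", "arn:aws:rds:*:*:db:*", "arn:aws:s3:::*"],
          "Condition": { "Null": { "aws:RequestTag/environment": "true" } }
        },
        {
          "Effect": "Deny",
          "Action": ["ec2:RunInstances", "rds:CreateDBInstance", "s3:CreateBucket"],
          "Resource": ["arn:aws:ec2:*:*:instance/*", "arn:aws:rds:*:*:db:*", "arn:aws:s3:::*"],
          "Condition": { "Null": { "aws:RequestTag/owner": "true" } }
        }
      ]
    }

Each tag has its own statement because keys listed under one condition operator are ANDed: a single `Null` block containing all three keys would deny only requests that are missing every tag. `Resource` is limited to the resource types being created because `ec2:RunInstances` is also authorised against subnets, images, and other resources that carry no request tags. Also deploy AWS Tag Policies at OU level for allowed values and case enforcement.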
Azure — Azure Policy
Use built-in policies: “Require a tag and its value on resources” and “Inherit a tag from the resource group / subscription if missing”. Assign at Management Group scope for org-wide coverage. Start with Audit effect, then promote to Deny.
Enable Cost Management Tag Inheritance in settings — this propagates Subscription/RG tags to cost records even if resources themselves are untagged. Azure has the strongest inheritance story of all three providers.
GCP — Compensating controls
GCP has no native “require label” deny constraint. Compensate with IaC discipline (default_labels in provider block), CI/CD gates (OPA/Sentinel), and Cloud Asset Inventory feeds to Cloud Functions for detective labelling.
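The detective labelling step can be sketched as the core of a Cloud Function that receives Cloud Asset Inventory feed messages over Pub/Sub. This is a minimal sketch under assumptions: the event is taken to carry a base64-encoded JSON payload with `asset.name`, `asset.assetType`, and labels under `asset.resource.data.labels`; the label keys are illustrative; and `handle_feed_message` is a hypothetical name. It only reports what is missing and leaves the actual labelling or alerting call to the caller.

```python
import base64
import json
from typing import Optional

# Illustrative label keys; substitute the organisation's mandatory set.
REQUIRED_LABELS = ("cost-center", "environment", "owner")


def missing_labels(asset, required=REQUIRED_LABELS):
    """Return the required label keys that are absent from an asset dict."""
    labels = asset.get("resource", {}).get("data", {}).get("labels") or {}
    return [key for key in required if key not in labels]


def handle_feed_message(event) -> Optional[dict]:
    """Decode one feed message and describe the violation, if any.

    Returns None for deletions, for payloads without an asset, and for
    assets that already carry every required label.
    """
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    asset = payload.get("asset")
    if not asset or payload.get("deleted"):
        return None
    missing = missing_labels(asset)
    if not missing:
        return None
    return {
        "name": asset.get("name"),
        "asset_type": asset.get("assetType"),
        "missing": missing,
    }
```

Keeping the decision logic free of API calls makes it testable without a GCP project; the returned dictionary can then drive whichever labelling or notification client the team uses.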
Deploy Detective Controls
Set up compliance scanning, alerting, and auto-remediation for what slips through prevention.
Scanning:
| Provider | Tool | What It Does |
|---|---|---|
| AWS | AWS Config Rules | required-tags managed rule, continuous eval |
| Azure | Azure Policy Compliance | Real-time dashboard, drill to non-compliant |
| GCP | Cloud Asset Inventory | Export to BigQuery for trending |
| Cross-cloud | Cloud Custodian / Steampipe | SQL or YAML policies across all providers |
Auto-remediation by environment:
| Environment | Action for Untagged Resources |
|---|---|
| Sandbox | Auto-tag defaults + alert. 7 days untagged → stop. 30 days → terminate |
| Dev | Auto-tag defaults + alert. 14 days untagged → stop |
| Staging | Alert owner + ticket. SLA: 5 business days |
| Production | Alert owner + P3 ticket. NEVER auto-remediate. SLA: 10 business days |
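The table above can be expressed as a small decision function, which is useful as the shared core of whatever remediation tooling is adopted. This is a sketch under assumptions: the action names are hypothetical labels rather than real API calls, the thresholds are read as "at least N days", and an unrecognised environment is treated like production so that nothing is auto-remediated by accident.

```python
def remediation_actions(environment, days_untagged):
    """Return the ordered actions for an untagged resource.

    Action names are illustrative labels for a downstream executor.
    """
    env = environment.strip().lower()

    if env in ("sbx", "sandbox"):
        actions = ["auto_tag_defaults", "alert"]
        if days_untagged >= 30:
            actions.append("terminate")
        elif days_untagged >= 7:
            actions.append("stop")
        return actions

    if env == "dev":
        actions = ["auto_tag_defaults", "alert"]
        if days_untagged >= 14:
            actions.append("stop")
        return actions

    if env in ("stg", "staging"):
        # Ticket SLA: 5 business days.
        return ["alert_owner", "ticket"]

    # Production, and anything unrecognised: never auto-remediate.
    # Ticket priority P3, SLA: 10 business days.
    return ["alert_owner", "ticket_p3"]
```

For example, `remediation_actions("sbx", 8)` yields auto-tagging, an alert, and a stop, while `remediation_actions("prod", 400)` still yields only an alert and a P3 ticket.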
Progressive Rollout Schedule
| Phase | Timeline | Goal |
|---|---|---|
| Phase 1: Visibility | Weeks 1–4 | Audit-mode policies. Baseline metrics. Build case. |
| Phase 2: Enablement | Weeks 5–8 | IaC modules updated. CI/CD warns. Tagging sprint. |
| Phase 3: Enforcement | Weeks 9–12 | Audit → Deny. Auto-remediation in non-prod. SLAs. |
| Phase 4: Optimisation | Ongoing | Tighten targets 90% → 95% → 98%. Quarterly review. |
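Tracking the 90% → 95% → 98% targets requires a consistent definition of spend-weighted coverage. One plausible definition, assumed here rather than taken from the strategy itself, is the share of total spend that comes from resources carrying every mandatory tag with a non-empty value. The sketch below computes it from `(cost, tags)` pairs; the tag keys and the function name are illustrative.

```python
# Illustrative tag keys; substitute the organisation's mandatory set.
REQUIRED_TAGS = ("cost-center", "environment", "owner")


def spend_weighted_coverage(resources, required=REQUIRED_TAGS):
    """Fraction of spend attributable to fully tagged resources.

    `resources` is an iterable of (cost, tags) pairs, where `tags` is a dict.
    A resource counts as compliant only if every required key has a
    non-empty value. Returns 1.0 when there is no spend to cover.
    """
    total = 0.0
    compliant = 0.0
    for cost, tags in resources:
        total += cost
        if all(tags.get(key) for key in required):
            compliant += cost
    return compliant / total if total else 1.0


# Two resources: one fully tagged and expensive, one cheap and partly tagged.
resources = [
    (750.0, {"cost-center": "cc-1", "environment": "prod", "owner": "team-a"}),
    (250.0, {"owner": "team-b"}),
]
coverage = spend_weighted_coverage(resources)  # 0.75, versus 50% by resource count
```

Weighting by spend keeps attention on the resources that matter for cost allocation: a handful of untagged low-cost resources barely moves the figure, while one untagged large database does.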
Deliverable Checklist
- Shared IaC modules updated with mandatory tag variables
- CI/CD pipeline scanning step deployed (warning → blocking)
- Cloud-native policies deployed per provider (Audit → Deny)
- Tag Inheritance enabled (Azure Cost Management)
- Detective scanning operational (Config Rules / Policy / CAI)
- Auto-remediation active for sandbox/dev
- Alert routing configured (Slack/Teams → owner)
- Ticketing integration for prod violations with SLAs