Build Zero Trust on Microsoft Azure — A Hands-On Architecture Guide for Cloud Engineers

By CyberDudeBivash • Date: September 20, 2025 (IST)

 


Executive summary

This is your practical, step-by-step blueprint to ship Zero Trust on Azure—grounded in identity-first controls, private-by-default networking, least-privilege automation, and continuous verification with policy & telemetry. You’ll stand up:

  • A landing zone with management groups, policy baselines, and hub-and-spoke networking (private endpoints everywhere).

  • Identity guardrails with Entra ID: PIM (JIT admin), Conditional Access, workload identities, and access reviews.

  • Network segmentation with Azure Firewall Premium, DDoS Standard, DNS Private Resolver, and NSG/ASG micro-segments.

  • App & data protections: managed identity everywhere, Key Vault with purge protection, Purview discovery, Defender controls.

  • Continuous compliance: Azure Policy + GitHub Actions/ADO pipelines with OIDC federation and policy-as-code.

  • Detection & response: Microsoft Sentinel analytics, UEBA, and SOAR playbooks.


Reference architecture (mental model)

Tenant (Entra ID)
└─ Management Groups: /Root
   ├─ Platform (Connectivity, Identity)
   ├─ Landing-Zones (Corp, Online, SAP, Data)
   └─ Sandbox

Subscriptions (per LZ)
├─ Hub (shared) ─ Azure Firewall Premium, DDoS, DNS Private Resolver, Bastion
└─ Spokes (per app tier) ─ AKS/AppSvc + Private Endpoints to PaaS (KV, Storage, SQL)

Identity
├─ Entra ID PIM (JIT), Conditional Access, Identity Protection
└─ Workload identities (Managed Identity, Federated OIDC from CI)

Network
├─ Hub/Spoke vWAN or Virtual Network Manager with security admin rules
└─ UDR to Firewall; deny internet egress except approved FQDNs

Data & Apps
├─ Key Vault (RBAC, purge protection), CMK for PaaS, Purview scans
└─ Defender for Cloud/Endpoint/K8s; posture via Azure Policy

Monitoring
└─ Log Analytics + Microsoft Sentinel + Automation (Logic Apps, Runbooks)

Prerequisites

  • Entra ID tenant with temporary Global Administrator access activated through PIM; at least 3 subscriptions (Management/Hub, Landing Zone App, Shared Services).

  • Tooling: Azure CLI, Terraform or Bicep, GitHub/ADO, jq, pwsh.

  • Security baseline: MFA enforced; legacy auth disabled; break-glass accounts stored offline.


Part 1 — Foundation (Landing Zone + Policy Baseline)

1. Create management group hierarchy (least privilege)

az account management-group create --name platform
az account management-group create --name landing-zones
az account management-group create --name sandbox

2. Assign subscriptions to groups

az account management-group subscription add \
  --name landing-zones --subscription <APP_SUB_ID>

3. Deploy core policy baseline (deny public exposure)

  • Deny public IP on PaaS; Require private endpoints; Enforce tags; Require diagnostic settings to Log Analytics; Allowed locations/SKUs; Key Vault soft-delete + purge protection; Disk encryption.

Tip: Keep policies modular. Use initiative (policy set) IDs and version them in Git.

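Example (Bicep): assigning a “Require Private Endpoints on Storage” initiative at the landing-zones management group:

// Deploy at management-group scope, for example:
//   az deployment mg create --management-group-id landing-zones --location <REGION> --template-file <FILE>.bicep
targetScope = 'managementGroup'

resource policyAssignment 'Microsoft.Authorization/policyAssignments@2022-06-01' = {
  // Assignment names at management-group scope are limited to 24 characters
  name: 'require-storage-pe'
  properties: {
    displayName: 'Require Private Endpoints on Storage'
    // Initiative ID as given in this guide; a custom initiative's full ID also includes the scope where it is defined
    policyDefinitionId: '/providers/Microsoft.Authorization/policySetDefinitions/Require-Private-Endpoints-Storage'
    enforcementMode: 'Default'
    parameters: {}
  }
}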

4. Turn on Defender for Cloud (plans per workload)

az security pricing create --name VirtualMachines --tier Standard
az security pricing create --name AppServices --tier Standard
az security pricing create --name KubernetesService --tier Standard

5. Log analytics + diagnostics everywhere (centralized)

LA_RG="rg-secops"; LA_WS="log-cdb-eastus"
az monitor log-analytics workspace create -g $LA_RG -n $LA_WS

# Example: route Activity Logs to LA
az monitor diagnostic-settings create \
  --name "to-la" --resource "/subscriptions/<SUB_ID>" \
  --workspace "/subscriptions/<SUB_ID>/resourcegroups/$LA_RG/providers/microsoft.operationalinsights/workspaces/$LA_WS" \
  --export-to-resource-specific true \
  --logs '[{"category": "Administrative","enabled": true}]'

Part 2 — Identity-First Controls (Entra ID)

6. Enforce MFA & block legacy auth (Conditional Access)

Policies to create (start with “Report-only”, then On):

  • Require MFA for all users, exclude break-glass.

  • Block legacy protocols (POP/IMAP/SMTP Basic).

  • Require compliant/hybrid-joined device for admin roles.

  • High-risk sign-ins ⇒ require MFA; high-risk users ⇒ require password change (Identity Protection).

7. Privileged Identity Management (PIM)

  • Make all admin roles eligible, not permanent.

  • Require approval, MFA, reason, and ticket for activation.

  • Set activation time limits and notifications.
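The activation requirements above can be read as a simple checklist. The following is a minimal sketch of that checklist in Python, useful for reasoning about or unit-testing your own approval workflow; it is not how PIM is implemented. The `ActivationRequest` fields and the 4-hour `MAX_ACTIVATION` limit are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List

# Hypothetical activation time limit; set this to your own PIM policy value.
MAX_ACTIVATION = timedelta(hours=4)


@dataclass
class ActivationRequest:
    role: str
    mfa_passed: bool
    reason: str
    ticket_id: str
    approved_by: str  # empty string until an approver signs off
    duration: timedelta


def activation_problems(req: ActivationRequest) -> List[str]:
    """Return every requirement the request fails; an empty list means it may proceed."""
    problems = []
    if not req.mfa_passed:
        problems.append("MFA not satisfied")
    if not req.reason.strip():
        problems.append("missing reason")
    if not req.ticket_id.strip():
        problems.append("missing ticket")
    if not req.approved_by.strip():
        problems.append("not approved")
    if req.duration > MAX_ACTIVATION:
        problems.append("duration exceeds limit")
    return problems
```

A request that passes MFA, carries a reason and ticket, has an approver, and stays within the time limit returns an empty list; anything else returns the specific gaps.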

8. Access Reviews

  • Quarterly access reviews for high-priv groups (Owners, App Admins).

  • Automate remediation (remove if no response).

9. Workload identities (no secrets in CI)

  • Prefer Managed Identity for Azure resources.

  • For CI/CD (GitHub Actions, ADO), use federated credentials (OIDC):

az ad app create --display-name "cicd-gh-oidc"

# Create the service principal for the app so it can receive role assignments
az ad sp create --id <appId>

# Create federated credential mapping GitHub repo to the app
az ad app federated-credential create --id <appId> --parameters '{
  "name": "gh-main",
  "issuer": "https://token.actions.githubusercontent.com",
  "subject": "repo:Org/Repo:ref:refs/heads/main",
  "audiences": ["api://AzureADTokenExchange"]
}'

# Assign least-priv RBAC to the app at subscription/resource group scope
az role assignment create --assignee <appId> --role "Contributor" \
  --scope "/subscriptions/<SUB_ID>/resourceGroups/rg-landingzone"
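A common failure with federated credentials is a `subject` that does not match what GitHub actually sends. The sketch below builds the subject strings GitHub Actions uses for branch, environment, and pull-request runs, and models the exact, case-sensitive comparison Entra ID applies. The function names are hypothetical helpers, not part of any SDK.

```python
from typing import Optional


def github_oidc_subject(org: str, repo: str, *,
                        branch: Optional[str] = None,
                        environment: Optional[str] = None,
                        pull_request: bool = False) -> str:
    """Build the subject claim GitHub Actions issues for one trigger type."""
    chosen = [branch is not None, environment is not None, pull_request]
    if sum(chosen) != 1:
        raise ValueError("choose exactly one of branch, environment, pull_request")
    prefix = f"repo:{org}/{repo}"
    if branch is not None:
        return f"{prefix}:ref:refs/heads/{branch}"
    if environment is not None:
        return f"{prefix}:environment:{environment}"
    return f"{prefix}:pull_request"


def subject_matches(configured: str, presented: str) -> bool:
    """Model the federated-credential check: an exact, case-sensitive match."""
    return configured == presented
```

For the credential above, `github_oidc_subject("Org", "Repo", branch="main")` yields `repo:Org/Repo:ref:refs/heads/main`. A workflow that runs from another branch, or from a GitHub environment, presents a different subject and needs its own federated credential.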

Part 3 — Network Segmentation (Private-by-Default)

10. Hub & Spoke

  • Hub: Azure Firewall Premium, DDoS Standard, DNS Private Resolver, Bastion.

  • Spokes: per app domain; UDRs to force all egress via firewall; NSG/ASG segmenting tiers.

11. Egress allowlisting

  • Use Firewall FQDN tags and app rules; deny outbound internet by default.

  • Centralize DNS—split-horizon, block malicious domains, resolve Private Endpoints.
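Deny-by-default egress is easier to review when the allowlist is tested before it reaches the firewall. The following is a simplified model of FQDN allowlisting, assuming leading-wildcard patterns such as `*.blob.core.windows.net`; Azure Firewall's own matching rules may differ in detail, and the `ALLOWED_FQDNS` entries are hypothetical examples.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Hypothetical allowlist; replace with the destinations your workloads need.
ALLOWED_FQDNS = (
    "login.microsoftonline.com",
    "*.blob.core.windows.net",
    "github.com",
)


def is_egress_allowed(fqdn: str, allowlist: Iterable[str] = ALLOWED_FQDNS) -> bool:
    """Return True only if the destination matches an allowlist entry (deny by default)."""
    # Normalize: trim whitespace, drop a trailing root dot, compare in lowercase.
    host = fqdn.strip().rstrip(".").lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowlist)
```

Because the whole hostname must match a pattern, a lookalike such as `github.com.evil.example` is rejected even though it begins with an allowed name.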

12. Private Endpoints everywhere

  • Storage/SQL/Key Vault/Container Registry/Service Bus: disable public network access, add Private Endpoints.

# Example: disable public network on Key Vault
az keyvault update -g rg-app -n kv-app --public-network-access Disabled

13. WAF + Front Door

  • Place Azure Front Door (WAF) or App Gateway (WAF_v2) in front of public apps; enable bot & OWASP rules; enforce TLS 1.2 or later.


Part 4 — Secure Apps & Data

14. Managed identity from code to data

  • App Service/AKS → MSI → Key Vault (secrets/keys); use Key Vault references for App Config.

  • For AKS: enable Azure AD integration, Azure CNI, Network Policies, Defender for Containers, and Azure Policy for AKS.

AKS security flags (example)

az aks create -g rg-aks -n aks-zta \
  --enable-aad --aad-admin-group-object-ids <groupId> \
  --network-plugin azure --network-policy azure \
  --enable-defender --defender-config workspaceResourceId=/subscriptions/<SUB>/resourceGroups/$LA_RG/providers/Microsoft.OperationalInsights/workspaces/$LA_WS \
  --enable-managed-identity

15. Encrypt data (CMK) & classify

  • Enable customer-managed keys for Storage/SQL/AKS where feasible.

  • Onboard to Microsoft Purview: auto-label sensitive data; scan data estates; enforce DLP where applicable.

16. Secrets governance

  • Key Vault: RBAC-only (disable access policies), soft-delete + purge protection, private endpoint, logging enabled.

  • No credentials in code/variables. Rotate secrets and keys with Key Vault rotation policies or automation.


Part 5 — Continuous Compliance (Policy-as-Code + Pipelines)

17. Policy repo layout

/policy
  /definitions
  /initiatives
  /assignments
/landingzone
  /bicep or /terraform
/.github/workflows

18. GitHub Actions (OIDC) → Deploy infra & enforce policy

Example workflow (excerpt):

name: deploy-landingzone
on: [push]
jobs:
  plan-apply:
    permissions:
      id-token: write   # required for OIDC login
      contents: read
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Az Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUB_ID }}
      - name: Terraform Init/Plan
        run: |
          terraform -chdir=landingzone init
          terraform -chdir=landingzone plan -out tfplan
      - name: Policy Compliance Check
        run: az policy state summarize --management-group landing-zones
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform -chdir=landingzone apply -auto-approve tfplan
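The compliance step in a workflow like this only gates the pipeline if something turns the summary into a pass/fail decision. Here is a minimal sketch of such a gate in Python. It assumes the summary JSON carries a `results.resourceDetails` list of `complianceState`/`count` entries; verify that shape against your own `az policy state summarize` output before relying on it. The 98% default mirrors the validation target used later in this guide.

```python
from typing import Tuple


def counts_from_summary(summary: dict) -> Tuple[int, int]:
    """Extract (compliant, non_compliant) resource counts from a summary dict.

    The key names used here are an assumption about the summary's shape.
    """
    compliant = non_compliant = 0
    for detail in summary.get("results", {}).get("resourceDetails", []):
        state = str(detail.get("complianceState", "")).lower()
        count = int(detail.get("count", 0))
        if state == "compliant":
            compliant += count
        elif state == "noncompliant":
            non_compliant += count
    return compliant, non_compliant


def compliance_percent(compliant: int, non_compliant: int) -> float:
    """Percentage of evaluated resources that are compliant."""
    total = compliant + non_compliant
    if total == 0:
        return 100.0  # nothing evaluated yet, so nothing is failing
    return 100.0 * compliant / total


def passes_gate(summary: dict, threshold: float = 98.0) -> bool:
    """Return True when compliance meets or exceeds the threshold."""
    return compliance_percent(*counts_from_summary(summary)) >= threshold
```

In a pipeline, you could save the summary as JSON, load it with `json.load`, and exit non-zero when `passes_gate` returns False so the apply step never runs.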

19. Guardrails you should codify

  • Deny public IP on PaaS & VMs.

  • Require Private Endpoints, Diagnostic Settings, CMK for specific SKUs.

  • Enforce tag schema (Owner, DataClass, Env).

  • Restrict locations/SKUs, allow only approved images (SIG).

  • Kubernetes: policy to block privileged pods, require signed images, enforce ingress/egress policies.
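Guardrails such as the tag schema are easy to pre-check in CI before Azure Policy denies a deployment. The sketch below checks a resource's tags against the Owner/DataClass/Env schema named above. It treats tag names as case-insensitive and empty values as missing; both are illustrative choices, and the function name is hypothetical.

```python
from typing import Dict, List

REQUIRED_TAGS = ("Owner", "DataClass", "Env")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return the required tags that are absent or have an empty value."""
    present = {name.lower() for name, value in tags.items() if str(value).strip()}
    return [tag for tag in REQUIRED_TAGS if tag.lower() not in present]
```

A resource tagged `{"owner": "team-a", "Env": ""}` is reported as missing `DataClass` and `Env`, which gives reviewers a concrete list to fix rather than a generic policy denial.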


Part 6 — Detection, Response & Automation

20. Microsoft Sentinel (connect & detect)

  • Connect: Azure AD, M365D, Defender for Cloud, Azure Activity, Firewall, AKS.

  • Analytics (KQL examples):

A. Impossible travel for privileged roles

SigninLogs
| where ResultType == 0
| where isnotempty(LocationDetails)
| where AuthenticationRequirement == "multiFactorAuthentication"
// Name the aggregate explicitly so the next filter can reference it
| summarize Locations = make_set(Location) by UserPrincipalName, bin(TimeGenerated, 1h)
| where array_length(Locations) > 1
| join kind=inner (
    IdentityInfo
    | where AssignedRoles has "Privileged"
  ) on $left.UserPrincipalName == $right.AccountUPN
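The query above is a coarse heuristic: it buckets successful sign-ins by user and hour, then flags any bucket with more than one location. A sketch of the same grouping in Python can help you test the logic against sample data before tuning the rule. It covers only the bucketing step, not the privileged-role join, and the tuple layout is an assumption made for this example.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

SignIn = Tuple[str, datetime, str]  # (user principal name, timestamp, location)


def multi_location_signins(signins: Iterable[SignIn]) -> Dict[Tuple[str, datetime], List[str]]:
    """Group sign-ins by (user, hour) and keep buckets that span several locations."""
    buckets = defaultdict(set)
    for user, timestamp, location in signins:
        # Truncate to the start of the hour, like bin(TimeGenerated, 1h).
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[(user, hour)].add(location)
    return {key: sorted(locs) for key, locs in buckets.items() if len(locs) > 1}
```

Two sign-ins by the same user from different locations within the same clock hour are flagged; the same pair split across an hour boundary is not, which is one reason to treat this rule as a starting point rather than true impossible-travel detection.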

B. Public exposure drift (should be zero)

AzureActivity
| where OperationNameValue =~ "MICROSOFT.NETWORK/PUBLICIPADDRESSES/WRITE"
// Accept both status spellings seen in activity log data
| where ActivityStatusValue in ("Success", "Succeeded")
| summarize count() by Caller, bin(TimeGenerated, 1h)

21. SOAR playbooks (Logic Apps)

  • Auto-quarantine VM/Pod on high-confidence alert.

  • Disable user & revoke refresh tokens on high-risk sign-in.

  • Open ticket with enriched WHOIS/Geo and policy failures.
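Before building Logic Apps, it helps to write the playbook decisions down as plain logic that can be reviewed and tested. The following sketch maps an alert to the ordered actions listed above. The `Alert` fields and the action names are hypothetical labels; a real playbook would call the relevant connectors instead of returning strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alert:
    kind: str            # "vm", "pod", or "signin" (hypothetical labels)
    confidence: str      # "high", "medium", or "low"
    risk: str = "none"   # sign-in risk level, if applicable


def plan_response(alert: Alert) -> List[str]:
    """Return containment actions first, then the ticket that records them."""
    actions = []
    if alert.kind in ("vm", "pod") and alert.confidence == "high":
        actions.append(f"quarantine_{alert.kind}")
    if alert.kind == "signin" and alert.risk == "high":
        actions.extend(["disable_user", "revoke_refresh_tokens"])
    # Every alert gets a ticket with enrichment, even when nothing is contained.
    actions.append("open_ticket_with_enrichment")
    return actions
```

Keeping containment conditional on high confidence or high risk, while always opening a ticket, limits the blast radius of a false positive without losing the audit trail.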


Part 7 — Step-by-Step Build Checklist 

  1. Create management groups and map subscriptions.

  2. Enable Defender for Cloud plans; create Log Analytics workspace.

  3. Deploy policy initiatives at MG scope (deny public, require diag, private endpoints).

  4. Set Tenant-wide CA: require MFA, block legacy auth, device conditions for admins.

  5. Configure PIM for all admin roles; enable Access Reviews.

  6. Stand up hub (Firewall, DDoS, DNS PR, Bastion); spokes per app.

  7. Force egress through firewall with UDRs; define FQDN allowlists.

  8. Disable public network access on PaaS; add Private Endpoints.

  9. Deploy AKS/App Service with managed identity; bind to Key Vault (private).

  10. Enable Purview scans and CMK where needed.

  11. Wire Sentinel connectors; deploy analytics rules; test alerts.

  12. Create CI/CD with OIDC to Azure; run Terraform/Bicep from pipeline; gate by policy compliance.

  13. Add image signing (ACR content trust/Notary), Defender image scans, and K8s policies.

  14. Back up & lock critical resources (resource locks on KV, FW).

  15. Run tabletop: admin compromise, data exfil attempt, public exposure drift. Fix gaps.


Operational tips & pitfalls

  • Don’t bypass policy for speed; add controlled exemptions with expiry and owners.

  • Measure: % resources with private endpoints, policy compliance %, mean time to policy fix, Sentinel MTTR.

  • Break-glass: 2 cloud-only accounts with strong secrets, stored offline; exclude from CA (carefully audited).

  • Cost: DDoS Standard + Firewall are worth it; use Azure Savings Plans and right-size SKUs; collect only useful logs.
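The measurements listed above are simple ratios and averages, so they can be computed from exported inventory and incident data. This is a minimal sketch with hypothetical inputs; where the numbers come from (Resource Graph, Sentinel exports, and so on) is left to your environment.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when there is nothing to measure."""
    return 0.0 if whole == 0 else round(100.0 * part / whole, 1)


def mean_time_to_resolve(incidents: Iterable[Tuple[datetime, datetime]]) -> timedelta:
    """Average of (closed - opened) across incidents given as (opened, closed) pairs."""
    durations = [closed - opened for opened, closed in incidents]
    if not durations:
        return timedelta(0)
    return sum(durations, timedelta(0)) / len(durations)
```

For example, 45 of 60 PaaS resources behind private endpoints gives `percent(45, 60)` of 75.0, and two incidents resolved in 2 and 4 hours give a mean time to resolve of 3 hours.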


Minimal Terraform skeleton 

module "hub" {
  source       = "./modules/hub"
  location     = "eastus"
  ddos_enabled = true
}

module "policy" {
  source   = "./modules/policy"
  mg_scope = "/providers/Microsoft.Management/managementGroups/landing-zones"
}

module "kv" {
  source           = "./modules/keyvault"
  rg_name          = "rg-app"
  name             = "kv-app-zta"
  public_network   = false
  purge_protection = true
  private_endpoint = true
}

Validation: “done means done”

  • Policy compliance ≥ 98% at MG scope; 0 public IPs on PaaS.

  • 100% of privileged roles are eligible (PIM), not permanent; Conditional Access blocks test sign-ins from non-compliant devices as expected.

  • Firewall logs show only allowlisted destinations; DNS PR resolving to PE addresses.

  • Sentinel detects your test incidents; SOAR runs containment.


Migration strategy (existing workloads)

  1. Discover & classify: inventory networks, public endpoints, identities.

  2. Contain: put spokes behind Firewall; enable DDoS; route traffic.

  3. Harden: move secrets to Key Vault; enable MI; disable PaaS public access.

  4. Refactor: add Private Endpoints; switch to OIDC in CI; replace shared secrets.

  5. Prove: enforce policy; onboard to Sentinel; run tabletops; fix.


#CyberDudeBivash #ZeroTrust #MicrosoftAzure #CloudSecurity #EntraID #ConditionalAccess #PIM #AzureFirewall #PrivateEndpoint #KeyVault #Purview #DefenderForCloud #AKS #Sentinel #PolicyAsCode #Terraform #GitHubActions
