Every AWS architecture guide says the same thing: keep production and staging in separate environments. But "separate" is vague, and in practice the boundary erodes over time. A peering connection added for a one-off data copy that never got removed. A shared Transit Gateway where route table scoping was meant to be configured "later." A central logging VPC that has subnets in both environments.

The isolation exists on the architecture diagram. Whether it exists in the actual AWS routing tables is a different question — and the one that auditors ask.

Why Prod/Staging Isolation Actually Matters

The risks of a network path between production and staging are not hypothetical:

Data leakage

Staging environments routinely use weaker credentials, broader IAM policies, and less restrictive security groups. A developer debugging a staging service can accidentally query a production database if the network path exists — and the production database doesn't know the request came from staging.

Config drift applied to production

Changes validated in staging may behave differently if staging can actually reach production services. A misconfigured service discovery entry, a DNS split-horizon failure, or a load balancer routing rule can cause staging traffic to hit production endpoints in ways that are very hard to debug.

Audit scope expansion

For PCI DSS, SOC 2, and similar frameworks, auditors treat any system with network connectivity to the production environment as potentially in scope. If staging VPCs can reach production, your entire staging fleet can be pulled into the audit. This dramatically increases both the compliance burden and the remediation cost of any staging vulnerability.

How Isolation Breaks in Practice

Most teams start with good intentions. The prod/staging boundary breaks in one of three patterns, and all three are common enough to check for explicitly.

1. The Peering That Never Got Removed

Someone on the platform team needed to copy a database from production to staging for a load test. They created a VPC peering connection, did the copy, and filed a ticket to remove it. The ticket got closed as "done" but the peering was never deleted. Six months later, the peering is still active. No one looks for it because no one knows it exists.

VPC peering connections don't expire once accepted. They stay in the active state indefinitely unless explicitly deleted. The AWS console shows peering connections as a flat list, with no visual indication that a peering between vpc-prod-api and vpc-staging-api represents an isolation breach.
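
A flat list becomes easy to audit once each VPC is mapped to an environment. The following is a minimal Python sketch, not a finished tool. It assumes the peering list has already been exported (for example from `aws ec2 describe-vpc-peering-connections`) and that a `vpc_env` mapping from VPC ID to environment exists; the mapping, the function name, and the sample IDs are hypothetical.

```python
def cross_env_peerings(peerings, vpc_env):
    """Return IDs of active peerings that join a prod VPC to a staging VPC."""
    flagged = []
    for pcx in peerings:
        # Only active peerings carry traffic; skip pending, rejected, or deleted ones.
        if pcx["Status"]["Code"] != "active":
            continue
        sides = {
            vpc_env.get(pcx["RequesterVpcInfo"]["VpcId"]),
            vpc_env.get(pcx["AccepterVpcInfo"]["VpcId"]),
        }
        if sides == {"prod", "staging"}:
            flagged.append(pcx["VpcPeeringConnectionId"])
    return flagged


# Hypothetical sample shaped like the "VpcPeeringConnections" list in the CLI output.
sample_peerings = [
    {
        "VpcPeeringConnectionId": "pcx-0abc1234",
        "Status": {"Code": "active"},
        "RequesterVpcInfo": {"VpcId": "vpc-prod-api"},
        "AccepterVpcInfo": {"VpcId": "vpc-staging-api"},
    },
    {
        "VpcPeeringConnectionId": "pcx-0def5678",
        "Status": {"Code": "deleted"},
        "RequesterVpcInfo": {"VpcId": "vpc-prod-db"},
        "AccepterVpcInfo": {"VpcId": "vpc-staging-db"},
    },
    {
        "VpcPeeringConnectionId": "pcx-0aaa9999",
        "Status": {"Code": "active"},
        "RequesterVpcInfo": {"VpcId": "vpc-prod-api"},
        "AccepterVpcInfo": {"VpcId": "vpc-prod-db"},
    },
]
sample_vpc_env = {
    "vpc-prod-api": "prod",
    "vpc-prod-db": "prod",
    "vpc-staging-api": "staging",
    "vpc-staging-db": "staging",
}

print(cross_env_peerings(sample_peerings, sample_vpc_env))
```

Running a check like this on a schedule turns the forgotten ticket into a finding instead of a surprise.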

2. Shared Transit Gateway with Incorrect Route Tables

Transit Gateways make it easy to connect many VPCs. The isolation control is the TGW route table: you create separate route tables for production and staging, attach each environment's VPCs to the appropriate table, and do not add cross-environment routes.

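The failure mode is a route table that was created with propagation enabled and never scoped. Propagation is configured per attachment: if every attached VPC, including staging VPCs that were attached "temporarily", propagates its CIDR into the production TGW route table, then production VPCs have a route to staging even though the TGW was designed to enforce isolation. The reverse direction opens the same way when production attachments propagate into the staging route table. A TGW left on its defaults (default route table association and propagation enabled) behaves exactly like this, because every new attachment joins one shared route table.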
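
The scoping check can be automated against the list of propagations for a route table. This is a small Python sketch under stated assumptions: the records are shaped like the output of `aws ec2 get-transit-gateway-route-table-propagations`, while the `attachment_env` mapping, the "shared" label for shared services attachments, and the sample IDs are hypothetical.

```python
def foreign_propagations(propagations, attachment_env, table_env):
    """Return attachment IDs that propagate into a route table of another environment.

    Attachments labelled "shared" are allowed; unlabelled attachments are flagged.
    """
    return [
        p["TransitGatewayAttachmentId"]
        for p in propagations
        if p["State"] == "enabled"
        and attachment_env.get(p["TransitGatewayAttachmentId"]) not in (table_env, "shared")
    ]


# Hypothetical propagations for the production route table.
sample_propagations = [
    {"TransitGatewayAttachmentId": "tgw-attach-prod-api", "ResourceId": "vpc-prod-api", "State": "enabled"},
    {"TransitGatewayAttachmentId": "tgw-attach-staging-api", "ResourceId": "vpc-staging-api", "State": "enabled"},
    {"TransitGatewayAttachmentId": "tgw-attach-shared", "ResourceId": "vpc-shared", "State": "enabled"},
]
sample_attachment_env = {
    "tgw-attach-prod-api": "prod",
    "tgw-attach-staging-api": "staging",
    "tgw-attach-shared": "shared",
}

print(foreign_propagations(sample_propagations, sample_attachment_env, "prod"))
```

Flagging unlabelled attachments is a deliberate choice in this sketch: an attachment nobody has classified is more likely to be a "temporary" one.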

3. The Shared Services VPC That Bridges Both

Shared services VPCs (for DNS, logging, monitoring, secrets management) are reachable from both production and staging by design: their routes are propagated into both environments' TGW route tables because they need to serve both. But if this VPC has broad routing or permissive NACLs and hosts anything that forwards traffic (a proxy, a bastion, a NAT instance), it can become a transit path: traffic from staging reaches the shared services VPC and from there reaches production VPCs, without any direct peering between prod and staging.

The two-hop problem: Checking whether a production VPC peers directly with a staging VPC is not sufficient. A shared services VPC with attachments in both environments creates a two-hop path that peering checks won't surface.

How to Enforce Isolation

The most reliable approach is defense in depth: enforce isolation at the VPC peering layer and at the TGW route table layer, and monitor continuously for violations.

Separate VPCs with No Direct Peering

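Production and staging VPCs should have no VPC peering connections between them. This is the simplest control: if there's no peering, there's no direct path. If each environment lives in its own AWS account, you can also prevent cross-environment peering from being created at all with an SCP (Service Control Policy). The policy below, attached to the production accounts, denies any peering request whose accepter VPC is outside the production account. The value is an ARN pattern with wildcards, so the condition uses ArnNotLike (StringNotEquals would compare the pattern literally and deny every request), and the statement is scoped to the peering connection resource, which is where ec2:AccepterVpc is evaluated:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "ec2:CreateVpcPeeringConnection",
      "Resource": "arn:aws:ec2:*:*:vpc-peering-connection/*",
      "Condition": {
        "ArnNotLike": {
          "ec2:AccepterVpc": "arn:aws:ec2:*:PROD-ACCOUNT-ID:vpc/*"
        }
      }
    }
  ]
}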

Transit Gateway Route Table Scoping

If production and staging share a TGW, create separate route tables and explicitly control which VPC CIDRs appear in each:

# Create a dedicated route table for production VPCs
aws ec2 create-transit-gateway-route-table \
  --transit-gateway-id tgw-0abc1234 \
  --tag-specifications 'ResourceType=transit-gateway-route-table,Tags=[{Key=Name,Value=prod-rt}]'

# Associate production VPC attachments with this route table only
aws ec2 associate-transit-gateway-route-table \
  --transit-gateway-route-table-id tgw-rtb-prod \
  --transit-gateway-attachment-id tgw-attach-prod-api

# Do NOT propagate staging VPC routes into the production route table
# Verify what's currently propagated:
aws ec2 get-transit-gateway-route-table-propagations \
  --transit-gateway-route-table-id tgw-rtb-prod \
  --query 'TransitGatewayRouteTablePropagations[*].{VPC:ResourceId,State:State}' \
  --output table

Check for Routes Between Prod and Staging CIDRs

To verify no route from a production VPC subnet reaches a staging CIDR (e.g., staging is 10.20.0.0/16):

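# List routes in the production VPC whose destination is exactly the staging CIDR
# (the trailing [] flattens the per-route-table lists, so "no match" prints as [])
aws ec2 describe-route-tables \
  --filters "Name=vpc-id,Values=vpc-prod-0abc1234" \
  --query "RouteTables[].Routes[?DestinationCidrBlock=='10.20.0.0/16'][]" \
  --output json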

# Check TGW route table for staging CIDR
aws ec2 search-transit-gateway-routes \
  --transit-gateway-route-table-id tgw-rtb-prod \
  --filters "Name=route-search.longest-prefix-match,Values=10.20.0.1" \
  --output table

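If neither command returns a route, the production VPC has no route that names the staging CIDR exactly and the production TGW route table has no route covering 10.20.0.1. That's what you want, with one caveat: the first query matches only the exact destination 10.20.0.0/16, so a broader prefix such as 10.0.0.0/8 that points at a TGW or peering connection would still cover the staging range and needs a separate check.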
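
Because a broader prefix such as 10.0.0.0/8 also covers the staging range, an overlap test is a useful complement to an exact-match query. The sketch below is one way to do that in Python, using only the standard library. It assumes route records shaped like the `Routes` entries returned by `aws ec2 describe-route-tables`; the function name and the sample routes are hypothetical, and a hit is only a candidate, since a route toward a TGW still depends on what the TGW route table contains.

```python
import ipaddress


def routes_reaching(routes, target_cidr):
    """Return active routes that overlap target_cidr and point at a TGW or a peering."""
    target = ipaddress.ip_network(target_cidr)
    hits = []
    for route in routes:
        dest = route.get("DestinationCidrBlock")
        if dest is None:
            # IPv6 and prefix-list routes are out of scope for this sketch.
            continue
        if route.get("State") != "active":
            continue
        # Only targets that can lead to another VPC are interesting here.
        if not (route.get("TransitGatewayId") or route.get("VpcPeeringConnectionId")):
            continue
        if ipaddress.ip_network(dest).overlaps(target):
            hits.append(route)
    return hits


# Hypothetical routes from a production VPC route table.
sample_routes = [
    {"DestinationCidrBlock": "10.10.0.0/16", "GatewayId": "local", "State": "active"},
    {"DestinationCidrBlock": "10.0.0.0/8", "TransitGatewayId": "tgw-0abc1234", "State": "active"},
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0abc1234", "State": "active"},
    {"DestinationCidrBlock": "10.20.5.0/24", "VpcPeeringConnectionId": "pcx-0abc1234", "State": "blackhole"},
]

for route in routes_reaching(sample_routes, "10.20.0.0/16"):
    print(route["DestinationCidrBlock"], route.get("TransitGatewayId") or route.get("VpcPeeringConnectionId"))
```

In the sample, the exact-match query would return nothing, while the overlap test surfaces the 10.0.0.0/8 route toward the TGW.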

How to Prove It to Auditors

Enforcement is necessary but not sufficient. Auditors need evidence that the isolation has been maintained, not just that it exists at the moment they ask.

"We think it's isolated" is not an answer. "Here is a signed report from 14 days ago showing no path existed, and here is the scan from yesterday showing the same" is an answer.

The practical evidence package for prod/staging isolation is therefore a dated history of check results rather than a single point-in-time answer.

The Isolation Rule Approach in Netway

Netway lets you define isolation rules directly in its configuration. You specify two groups of VPCs — for example, group_a = [vpc-prod-api, vpc-prod-db] and group_b = [vpc-staging-api, vpc-staging-db] — and Netway runs a breadth-first search on the full topology graph on every scan to determine whether any path exists between the two groups.
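
To make the idea concrete, here is a generic breadth-first search between two groups of nodes. It is an illustrative Python sketch, not Netway's implementation: it assumes the topology is available as undirected edges between VPCs and connectors (peerings, TGWs) and that every edge can carry traffic, which ignores route tables and direction. The function name, the edge list, and the IDs are hypothetical.

```python
from collections import deque


def find_path(edges, group_a, group_b):
    """Return the shortest path from any node in group_a to any node in group_b, or None."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)

    targets = set(group_b)
    parent = {node: None for node in group_a}  # doubles as the visited set
    queue = deque(group_a)
    while queue:
        node = queue.popleft()
        if node in targets:
            # Walk the parent links back to the starting group.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical topology: no direct prod/staging peering, but a shared VPC bridges both.
sample_edges = [
    ("vpc-prod-api", "tgw-0abc1234"),
    ("vpc-prod-db", "tgw-0abc1234"),
    ("tgw-0abc1234", "vpc-shared"),
    ("vpc-shared", "pcx-0def5678"),
    ("pcx-0def5678", "vpc-staging-api"),
]
group_a = ["vpc-prod-api", "vpc-prod-db"]
group_b = ["vpc-staging-api", "vpc-staging-db"]

path = find_path(sample_edges, group_a, group_b)
print(" -> ".join(path) if path else "isolated")
```

The sample topology has no direct peering between the groups, yet the search still finds the multi-hop route through the shared VPC, which is the case a direct peering check misses.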

The result is a timestamped pass/fail record per scan. If the isolation holds, the scan logs a pass. If a path is found, the scan logs a violation with the exact path traced:

ISOLATION VIOLATION DETECTED
Rule: production-staging-isolation
Path: vpc-prod-api → pcx-0abc1234 → vpc-staging-api
Detected: 2026-06-14T09:42:11Z
Scan ID: scan_8f2a3b1c

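This gives you the exact peering connection to investigate and remove. More importantly, it gives you an auditable record: every scan in the history is a documented, timestamped check of the segmentation boundary. That record supports the segmentation testing that PCI DSS v4.0 requires at least every 12 months (requirement 11.4.5) and at least every six months for service providers (11.4.6). Those requirements are written as penetration tests of the segmentation controls, so confirm with your assessor how far continuous scan evidence can stand in for a manual test.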
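
A scan history is only useful as evidence if it can be summarized quickly. The sketch below shows one way to do that in Python, assuming the history has been exported as a list of records. The record fields (`scan_id`, `timestamp`, `result`) and the function names are hypothetical and are not a documented Netway export format.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse a UTC timestamp such as 2026-06-14T09:42:11Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize_history(scans):
    """Summarize scan records: total count, violating scan IDs, longest gap between scans."""
    ordered = sorted(scans, key=lambda s: parse_ts(s["timestamp"]))
    times = [parse_ts(s["timestamp"]) for s in ordered]
    longest_gap = max((b - a for a, b in zip(times, times[1:])), default=timedelta(0))
    return {
        "scans": len(ordered),
        "violations": [s["scan_id"] for s in ordered if s["result"] != "pass"],
        "longest_gap": longest_gap,
    }


# Hypothetical exported history.
sample_history = [
    {"scan_id": "scan_0001", "timestamp": "2026-06-12T09:40:00Z", "result": "pass"},
    {"scan_id": "scan_0002", "timestamp": "2026-06-13T09:41:00Z", "result": "pass"},
    {"scan_id": "scan_8f2a3b1c", "timestamp": "2026-06-14T09:42:11Z", "result": "violation"},
]

summary = summarize_history(sample_history)
print(summary["scans"], summary["violations"], summary["longest_gap"])
```

The longest gap matters as much as the pass count: a history with a multi-week hole cannot show that isolation held during that window.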

The change log between scans surfaces the moment isolation breaks — not the next time someone manually checks.