TACUNS
Module 2

Asymmetric Routing — 'Website Opens Intermittently, Ping Works'

The Call Comes In

This is how asymmetric routing almost always presents to TAC. The words vary but the pattern is always the same:

  • "Website opens intermittently — sometimes it works, sometimes it hangs"
  • "Ping works fine but the application fails"
  • "Only some users are affected — same office, same VLAN"
  • "One-way traffic — we can reach the server but it cannot respond"
  • "Started after we added a second ISP / replaced the core switch / added a load balancer"
  • "Issue happens only for TCP, UDP seems fine"
  • "After failover everything broke"

Why This Pattern Is Distinctive

Ping working while TCP fails is the first signal. An ICMP echo exchange has no handshake or sequence state for a firewall to validate, so an echo reply that returns by a different path usually still reaches the client. TCP is different: the full three-way handshake must pass through the same stateful firewall in both directions. When return traffic bypasses the firewall that saw the SYN, the SYN-ACK either arrives at a device with no session record for that flow and is dropped or reset, or it reaches the client directly and the firewall then typically rejects the client's ACK because it never saw the handshake complete.

What Most Engineers Try First (And Why It Fails)

  • Clearing sessions in the session browser: new sessions form over the same asymmetric path and fail the same way
  • Disabling security policies temporarily: policy is not the issue, routing is
  • Rebooting the firewall: sessions rebuild, but the asymmetric paths are still in the routing table
  • Blaming the ISP: the ISP is delivering packets correctly; the issue is in return-path routing
  • Rechecking security policy: the policy is matching and permitting correctly
  • Adding bypass rules: the traffic may show as denied, but the deny is a symptom, not the root cause

These all fail because they address the wrong layer. The firewall is doing its job correctly: it is tracking state and enforcing policy. The problem is that the surrounding network routes the two halves of one conversation through devices that do not share session state.

How TAC Thinks About This Problem

A stateful firewall has one fundamental requirement: it must see both directions of a TCP flow. The outbound SYN and the inbound SYN-ACK must pass through the same firewall instance. If they do not, the firewall that saw the SYN has no record of a SYN-ACK arriving — because it never did. The firewall that received the SYN-ACK has no session record — because it never saw the SYN.

The firewall is not broken. The network is routing different halves of the same conversation through different devices.
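
The requirement above can be illustrated with a toy model. This is a minimal sketch, not how PAN-OS tracks sessions: the `ToyFirewall` class, the `flow_key` helper, and the addresses are all hypothetical, and the only rule modelled is "a SYN creates a session, anything else must match one".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    sport: int
    dport: int
    flags: str  # "SYN", "SYN-ACK", or "ACK"


def flow_key(pkt):
    """Direction-independent key: both halves of a flow map to the same key."""
    a = (pkt.src, pkt.sport)
    b = (pkt.dst, pkt.dport)
    return (min(a, b), max(a, b))


@dataclass
class ToyFirewall:
    name: str
    sessions: set = field(default_factory=set)

    def process(self, pkt):
        key = flow_key(pkt)
        if pkt.flags == "SYN":
            self.sessions.add(key)  # a SYN creates the session on this instance
            return "forward"
        # Any other packet must match a session this instance already holds.
        return "forward" if key in self.sessions else "drop"


# Hypothetical client and server addresses.
syn = Packet("10.1.1.10", "10.2.2.20", 51000, 443, "SYN")
syn_ack = Packet("10.2.2.20", "10.1.1.10", 443, 51000, "SYN-ACK")

# Symmetric: both directions cross firewall A.
fw_a = ToyFirewall("A")
symmetric = (fw_a.process(syn), fw_a.process(syn_ack))

# Asymmetric: the SYN crosses A, the SYN-ACK arrives at B.
fw_a, fw_b = ToyFirewall("A"), ToyFirewall("B")
asymmetric = (fw_a.process(syn), fw_b.process(syn_ack))

print("symmetric: ", symmetric)
print("asymmetric:", asymmetric)
```

In the asymmetric case nothing is misconfigured on either toy firewall; the drop follows purely from which instance each packet was delivered to.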

This is fundamentally a routing design problem, not a firewall configuration problem. The fix is almost always in how traffic is steered (routing, or source NAT on a one-arm load balancer), whether on the firewall, on the upstream device, or both.

The Actual TAC Debug Sequence

Step 1 — Confirm One-Way Traffic with Session Analysis

pan-os-cli
# Find the session for the affected flow
show session all filter source <client-ip> destination <server-ip>

# Get the session ID from output, then examine it
show session id <session-id>

# What to look for in the output:
# c2s flow state: ACTIVE   ← client-to-server packets arriving
# s2c flow state: INIT     ← server-to-client never completed — asymmetric
# OR
# total byte count(c2s): 48576   ← incrementing
# total byte count(s2c): 0       ← frozen at zero — no return traffic

If c2s bytes are growing and s2c bytes are zero or frozen, the firewall is receiving outbound traffic from the client but return traffic from the server is not arriving here. It is arriving somewhere else.
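
When many sessions need checking, the byte-count test can be scripted against saved CLI output. The sketch below assumes the output contains `total byte count(c2s)` and `total byte count(s2c)` lines as shown above; the `SAMPLE` text is a hypothetical excerpt, and the function names are illustrative.

```python
import re

# Hypothetical excerpt in the style of the fields listed above.
# Real "show session id" output contains many more lines.
SAMPLE = """\
        c2s flow:
                state:     ACTIVE
        s2c flow:
                state:     INIT
        total byte count(c2s)  : 48576
        total byte count(s2c)  : 0
"""

BYTE_COUNT = re.compile(r"total byte count\((c2s|s2c)\)\s*:\s*(\d+)")


def byte_counts(session_text):
    """Return {'c2s': int, 's2c': int} for whichever counters are present."""
    return {direction: int(n) for direction, n in BYTE_COUNT.findall(session_text)}


def looks_one_way(session_text):
    """True when the client side has sent data but nothing has come back."""
    counts = byte_counts(session_text)
    return counts.get("c2s", 0) > 0 and counts.get("s2c", 0) == 0


print(byte_counts(SAMPLE))
print(looks_one_way(SAMPLE))
```

A single snapshot only shows that s2c is zero; sampling twice a few seconds apart is what shows that it is frozen while c2s keeps growing.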

Step 2 — Confirm with Packet Capture on Both Directions

pan-os-cli
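# Set up capture filters for both directions of the affected flow, then enable them
debug dataplane packet-diag set filter match source <client-ip> destination <server-ip>
debug dataplane packet-diag set filter match source <server-ip> destination <client-ip>
debug dataplane packet-diag set filter on

# Capture at the receive, firewall, transmit, and drop stages
# (the file argument is a bare file name, not a filesystem path)
debug dataplane packet-diag set capture stage receive file asymm-rx.pcap
debug dataplane packet-diag set capture stage firewall file asymm-fw.pcap
debug dataplane packet-diag set capture stage transmit file asymm-tx.pcap
debug dataplane packet-diag set capture stage drop file asymm-drop.pcap
debug dataplane packet-diag set capture on

# Enable flow logging for the filtered traffic
debug dataplane packet-diag set log feature flow basic
debug dataplane packet-diag set log on

# Have the user reproduce the issue (attempt the connection)
# Wait 20-30 seconds, then stop
debug dataplane packet-diag set capture off
debug dataplane packet-diag set log off

# Review what was captured
view-pcap filter-pcap asymm-rx.pcap
view-pcap filter-pcap asymm-drop.pcap
debug dataplane packet-diag aggregate-logs
less mp-log pan_packet_diag.log

# Clear when done
debug dataplane packet-diag clear all
debug dataplane packet-diag clear log log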

If you see SYN packets arriving from the client in the capture but no SYN-ACK returning from the server direction, that is the confirmation. The server sent the SYN-ACK — it just went somewhere else.

Step 3 — Validate the Return Path Routing

pan-os-cli
# Check what route the firewall would use to reach the server
test routing fib-lookup virtual-router default ip <server-ip>

# Check what route the firewall uses for return traffic
# (source is server, destination is client)
test routing fib-lookup virtual-router default ip <client-ip>

# Show the full routing table — look for ECMP routes or unexpected paths
show routing route

# Check if PBF (Policy Based Forwarding) is involved
show pbf rule all

# If active/passive HA — confirm which unit is active
show high-availability state

# If active/active HA — check session distribution and owner
show session info
show high-availability state
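
The comparison made in this step is between two longest-prefix lookups: one for the forward direction and one for the return direction. The sketch below is a generic illustration of that comparison and does not reproduce `test routing fib-lookup`; the route tables, prefixes, and next-hop names are hypothetical.

```python
import ipaddress


def table(entries):
    """Build a route table from (prefix, next_hop) pairs."""
    return [(ipaddress.ip_network(prefix), hop) for prefix, hop in entries]


def lookup(routes, ip):
    """Longest-prefix match: next hop of the most specific matching route."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical tables: the client-side router sends everything to the firewall,
# while the server-side router has gained a more specific route to the client subnet.
client_side = table([("0.0.0.0/0", "firewall")])
server_side = table([
    ("0.0.0.0/0", "firewall"),
    ("10.1.1.0/24", "core-switch-direct"),
])

forward_hop = lookup(client_side, "10.2.2.20")  # client -> server
return_hop = lookup(server_side, "10.1.1.10")   # server -> client

print("forward:", forward_hop)
print("return: ", return_hop)
print("asymmetric:", forward_hop != return_hop)
```

A single more-specific route on one side is enough to split the conversation, which is why the full routing table is worth reading rather than only the default route.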

Step 4 — Identify the Asymmetric Path

Scenario | Where to Look | Debug Command
Return traffic bypassing firewall entirely | Upstream router routing table: server subnet traffic going direct | Traceroute from server to client, or route table on upstream router
HA active/active session owner mismatch | Firewall B receiving SYN-ACK for session owned by Firewall A | show session id <id>, look for 'session synced from HA peer'
ECMP sending flows on different paths | Load balancer or router using per-packet ECMP instead of per-flow | Check ECMP configuration on upstream device
PBF rule sending return traffic out wrong interface | Policy-based forwarding overriding routing table | show pbf rule all, check match criteria and egress interface
New ISP added without symmetric routing | New default route sending some traffic out new ISP, returns via old ISP | show routing route, look for dual default routes
Load balancer in one-arm mode not NATing | Load balancer not masquerading: server responds directly to client, bypassing LB and firewall | Check load balancer NAT configuration

Step 5 — Check for the Mismatch in Logs

pan-os-cli
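# Look for denied or aged-out sessions for the affected client in the traffic log (newest first)
show log traffic direction equal backward | match <client-ip>

# Check global counters for drops (run twice; delta shows the change since the previous run)
show counter global filter delta yes severity drop

# Some TCP counters may be reported at warn rather than drop severity,
# so repeat without the severity filter if nothing relevant appears
show counter global filter delta yes

# Counters that commonly point to asymmetric routing:
# flow_tcp_non_syn_drop : non-SYN TCP packet arrived with no matching session
# tcp_drop_out_of_wnd   : packet fell outside the expected TCP window
# tcp_out_of_sync       : TCP reassembly lost sync because part of the exchange was never seen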

Root Cause Patterns and How to Confirm Each

Pattern 1: Return Traffic Bypassing the Firewall

This is the most common pattern after a network change. A new route is added, or an existing route is modified, and traffic from a server subnet now returns to clients via a direct path that does not traverse the firewall.

Confirm: run traceroute from the server to the client IP address. If the path does not pass through the firewall's IP, this is the cause.

Fix: modify the routing on the upstream device or server gateway to ensure return traffic routes through the firewall. This may require adding a static route on the server pointing client subnets via the firewall's inside interface.
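
The traceroute check can be automated when it has to be repeated from several servers. This is a rough sketch that assumes Linux-style traceroute output with a hop number at the start of each line; the sample output and the firewall address are hypothetical. A firewall that does not answer traceroute probes will not appear as a hop, so a negative result needs confirming against the upstream route table.

```python
import re

# A hop line starts with a hop number; capture the first IPv4 address on it.
HOP_LINE = re.compile(r"\s*\d+\s+.*?((?:\d{1,3}\.){3}\d{1,3})")


def hops_from_traceroute(text):
    """Return the responding hop addresses in order; timed-out hops are skipped."""
    hops = []
    for line in text.splitlines():
        m = HOP_LINE.match(line)
        if m:
            hops.append(m.group(1))
    return hops


def traverses_firewall(text, firewall_ips):
    """True if any responding hop is one of the firewall's interface addresses."""
    return any(hop in firewall_ips for hop in hops_from_traceroute(text))


# Hypothetical output captured on the server, tracing toward the client.
SAMPLE = """\
traceroute to 10.1.1.10 (10.1.1.10), 30 hops max, 60 byte packets
 1  10.2.2.1  0.412 ms
 2  * * *
 3  10.1.1.10  1.203 ms
"""

print(hops_from_traceroute(SAMPLE))
print(traverses_firewall(SAMPLE, {"10.2.2.254"}))
```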

Pattern 2: HA Active/Active Session Owner Mismatch

In active/active HA, each firewall owns a portion of the session table. When outbound traffic goes through Firewall A but return traffic arrives at Firewall B, Firewall B has no record of that session. It can forward the packet to Firewall A over the HA3 packet-forwarding link, but only if the session was properly synced and HA is functioning. Under load or during a sync delay, this path fails.

Confirm: show session id on both HA peers for the same flow. One will show "session synced from HA peer: True". If neither shows the session, the sync failed.

Fix: Correct upstream routing so that traffic flows consistently through the same HA unit for the same session. This typically requires configuring the upstream router to use the same HA unit for a given source subnet rather than load-balancing per-packet.

Pattern 3: PBF Misconfiguration

Policy-Based Forwarding rules direct traffic out specific interfaces based on match criteria. A common mistake is creating a PBF rule that sends outbound traffic via one interface but no corresponding rule exists to keep return traffic on the same path.

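Confirm: run show pbf rule all and check whether any rule matches the affected source. If a PBF rule exists, verify that its egress interface is the interface on which the return traffic arrives.

Fix: adjust the rule's match criteria or egress interface so that both directions of the flow use the same path. Where the rule handles inbound traffic across multiple uplinks, the rule's Enforce Symmetric Return option may be what is missing.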

pan-os-cli
# Check active PBF rules
show pbf rule all

# Test if PBF applies to this specific traffic
test pbf-policy-match from <zone> source <ip> destination <ip> protocol 6 destination-port 443

Pattern 4: ECMP Per-Packet Load Balancing

ECMP (Equal-Cost Multi-Path) routing is commonly configured on upstream routers. If the router is load-balancing per-packet instead of per-flow, different packets of the same TCP session take different paths. Some arrive at the firewall correctly, others bypass it or arrive at the wrong HA unit.

Confirm: check ECMP configuration on the upstream router. Verify whether load balancing is configured for per-packet or per-flow (source-destination hash). Per-packet ECMP is incompatible with stateful firewalls.

Fix: change upstream ECMP to per-flow hashing (source-destination IP or 5-tuple hash). This ensures all packets of a single TCP session always take the same path.
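
The difference between the two load-balancing modes can be shown in a few lines. This is an illustrative sketch only: real routers use their own hash functions and field selections, and the path names and addresses here are hypothetical.

```python
import hashlib
from itertools import cycle

PATHS = ["via-firewall-A", "via-firewall-B"]  # hypothetical equal-cost next hops


def per_flow_path(src, dst, sport, dport, proto=6):
    """Hash the 5-tuple so every packet of one flow maps to the same path."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return PATHS[int.from_bytes(digest[:4], "big") % len(PATHS)]


def per_packet_paths(packet_count):
    """Round-robin per packet: consecutive packets of one flow alternate paths."""
    rr = cycle(PATHS)
    return [next(rr) for _ in range(packet_count)]


flow = ("10.1.1.10", "10.2.2.20", 51000, 443)

# Six packets of the same flow under each mode.
flow_choices = {per_flow_path(*flow) for _ in range(6)}
packet_choices = set(per_packet_paths(6))

print("per-flow paths used:  ", flow_choices)
print("per-packet paths used:", packet_choices)
```

Per-flow hashing keeps one direction of a flow on one path. Whether the reverse direction lands on the same firewall still depends on how the device on the other side hashes or routes, so both sides need checking.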

The Fix and Validation

After identifying and correcting the routing asymmetry, validate before declaring the issue resolved:

pan-os-cli
# Clear existing broken sessions for affected traffic
# (Do this during a maintenance window or with user awareness)
clear session all filter source <client-ip> destination <server-ip>

# Confirm new sessions form correctly after clearing
# Have user attempt to connect again, then immediately run:
show session all filter source <client-ip> destination <server-ip>
show session id <new-session-id>

# Confirm both byte counts are incrementing
# c2s and s2c should both increase within 30 seconds

# Run counter check — drops should stop
show counter global filter delta yes severity drop

# Final validation — sustained traffic test
# Have user run a continuous operation (file download, ERP login)
# Monitor session byte counts for 2-3 minutes — both directions must grow
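
The sustained-traffic check comes down to comparing byte counts sampled over time. A minimal sketch, assuming the counts have already been collected into dictionaries keyed by direction; the sample values and the function name are hypothetical.

```python
def stalled_directions(samples):
    """Given chronological {'c2s': int, 's2c': int} samples, list directions that did not grow."""
    first, last = samples[0], samples[-1]
    return [d for d in ("c2s", "s2c") if last[d] <= first[d]]


# Hypothetical samples taken about a minute apart for one session.
healthy = [
    {"c2s": 4_800, "s2c": 52_000},
    {"c2s": 9_100, "s2c": 310_000},
    {"c2s": 13_700, "s2c": 655_000},
]
still_broken = [
    {"c2s": 4_800, "s2c": 0},
    {"c2s": 9_100, "s2c": 0},
]

print(stalled_directions(healthy))
print(stalled_directions(still_broken))
```

An empty result over the full 2-3 minute window is the pass condition; any listed direction means the fix has not taken effect for that session.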

Do Not Forget

Existing sessions that formed before the routing fix may still be broken — they were established on the asymmetric path. Clear them so new sessions form on the correct symmetric path. If you cannot clear all sessions (production risk), the issue will self-resolve as existing sessions time out — typically within minutes for idle flows, hours for long-lived connections.