Module 1 of 8

The TAC Mental Model — How NGFW Really Processes a Packet

Trademark Notice

NGFW is a registered trademark of the vendor. This course is independently created from real-world TAC experience for educational purposes and is not affiliated with or endorsed by the vendor.

Why Most Firewall Engineers Debug Wrong

When a production outage hits, most engineers open the GUI and start clicking — checking security policies, looking at NAT rules, refreshing the session browser. Sometimes they find the issue. More often they spend hours going in circles while users and management are waiting.

TAC engineers debug differently. They follow the packet. Every single production issue — asymmetric routing, NAT failures, App-ID shifts, SSL decryption breaks — has a precise point in the processing pipeline where it goes wrong. If you know the pipeline, you know where to look first.

This module builds that mental model. Every module that follows uses it. Without this foundation, the debug sequences in later modules will feel like a list of commands. With it, they will feel like logical steps you could derive yourself.

The Two Processing Paths

The first thing TAC checks when a session has an issue: is this packet on the slow path or the fast path? This single question changes everything about where the problem can be.

Slow Path — New Sessions

When a packet arrives and no matching session exists in the session table, it goes through full policy evaluation. Every engine runs: zone lookup, routing, security policy match, App-ID classification, NAT translation, security profiles. This is where most configuration issues are caught.

Fast Path — Existing Sessions

When a packet arrives and matches an existing session, it bypasses most of the policy stack and forwards at hardware speed. This is efficient — but it creates the most confusing production scenarios.
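The slow-path/fast-path split can be pictured as a table lookup keyed on the flow's 5-tuple. The following is a minimal sketch under simplifying assumptions, not the firewall's actual implementation: `FlowKey`, `classify_path`, and `process` are hypothetical names, NAT is ignored (so the return direction is just the mirrored tuple), and "full policy evaluation" is reduced to installing the session.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple used as the session table key."""
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


def reverse(key: FlowKey) -> FlowKey:
    """Key carried by the server-to-client direction of the same flow (no NAT)."""
    return FlowKey(key.dst, key.src, key.proto, key.dport, key.sport)


def classify_path(session_table: dict, key: FlowKey) -> str:
    """Return 'fast' if either direction of the flow is already known, else 'slow'."""
    if key in session_table or reverse(key) in session_table:
        return "fast"
    return "slow"


def process(session_table: dict, key: FlowKey) -> str:
    """Classify a packet and, on the slow path, install a session for it."""
    path = classify_path(session_table, key)
    if path == "slow":
        # Zone lookup, routing, policy match, NAT and profiles would run here.
        session_table[key] = {"state": "ACTIVE"}
    return path


table = {}
syn = FlowKey("10.1.1.100", "203.0.113.10", 6, 52341, 443)
first = process(table, syn)            # no session yet
second = process(table, syn)           # session now exists
reply = process(table, reverse(syn))   # return direction of the same flow
print(first, second, reply)  # slow fast fast
```

Only the first packet pays for full evaluation. Everything after it is matched against state that was decided earlier, which is why a configuration change may not affect sessions that already exist.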

The Most Misunderstood Fact in NGFW

A session existing in the session table does NOT mean traffic is flowing. A session in ACTIVE state can still be dropping packets. This is the single biggest source of confusion in TAC cases. Engineers see the session, assume traffic is fine, and look elsewhere. The session table is a state machine — not a delivery guarantee.

The Complete Processing Pipeline

Follow every packet through this sequence mentally when you are troubleshooting. The moment you can pinpoint which stage is failing, the fix becomes obvious.

Stage	What Happens	TAC Debug Command
1. Ingress Interface	Packet received, VLAN tag processed, ingress zone determined from interface mapping	show interface ethernet1/x
2. Session Lookup	Check session table — existing session (fast path) or new flow (slow path)?	show session all filter source X.X.X.X
3. Pre-NAT Routing	Route lookup using original destination IP before any NAT translation	test routing fib-lookup virtual-router default ip X.X.X.X
4. Zone Determination	Egress zone identified from route lookup — determines which policy applies	show routing route
5. Security Policy Match	Policy evaluated using pre-NAT source/destination IP, source/destination zone	test security-policy-match from trust to untrust source 10.1.1.1 destination 8.8.8.8 protocol 6 destination-port 443
6. NAT Translation	DNAT and/or SNAT applied to matched session	show running nat-policy
7. App-ID Processing	Application identified over first packets — policy re-evaluated with final app	show session id <id>
8. Content-ID / Profiles	Threat inspection, URL filtering, file blocking on allowed sessions	show log threat direction equal forward
9. Egress & Forwarding	Packet forwarded out egress interface with NAT-translated addresses	show counter global filter delta yes severity drop

The Critical Detail About Pre-NAT Policy Matching

This is one of the most common sources of misconfigured security policies in production. Security policy is always matched against the pre-NAT IP address — the original source and destination before any translation occurs.

→ If your DNAT rule translates destination 203.0.113.10 → 192.168.1.50, the security policy must reference 203.0.113.10 as the destination (or any), not 192.168.1.50.

→ If your SNAT rule translates source 192.168.1.0/24 → 203.0.113.5, the security policy must reference 192.168.1.0/24 as the source, not the translated address.
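The DNAT case above can be checked with a small first-match model. This is an illustrative sketch only: the rule dictionaries, `first_matching_rule`, and the client address 198.51.100.7 are assumptions made for the example, and zones, ports, and applications are left out so that only the address logic is visible.

```python
import ipaddress


def matches(rule_value: str, address: str) -> bool:
    """True if the address falls inside the rule's network, or the rule says 'any'."""
    if rule_value == "any":
        return True
    return ipaddress.ip_address(address) in ipaddress.ip_network(rule_value, strict=False)


def first_matching_rule(rules, src: str, dst: str) -> str:
    """Return the name of the first rule matching the packet, else 'default-deny'."""
    for rule in rules:
        if matches(rule["source"], src) and matches(rule["destination"], dst):
            return rule["name"]
    return "default-deny"


# The packet as it arrives, before DNAT rewrites 203.0.113.10 to 192.168.1.50.
client = "198.51.100.7"          # hypothetical external client
pre_nat_dst = "203.0.113.10"

# Rule written against the translated (post-NAT) address: never matches.
wrong = [{"name": "allow-web-server", "source": "any", "destination": "192.168.1.50"}]
# Rule written against the original (pre-NAT) address: matches.
right = [{"name": "allow-web-server", "source": "any", "destination": "203.0.113.10"}]

result_wrong = first_matching_rule(wrong, client, pre_nat_dst)
result_right = first_matching_rule(right, client, pre_nat_dst)
print(result_wrong, result_right)  # default-deny allow-web-server
```

The policy lookup is always fed the addresses from the packet as received, so a rule that names the translated address is compared against a value it can never see.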

Zone After NAT

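Zone determination happens BEFORE NAT translation is applied to the packet, and it is driven by route lookups. The ingress zone comes from the ingress interface, and the NAT rule itself is matched using the zone found by the route lookup against the original destination. For destination NAT, however, the security policy's destination zone is the zone where the translated address actually resides: pre-NAT IP, post-NAT zone. After NAT, the actual forwarding uses the translated address. This is why NAT hairpin scenarios break in specific ways covered in Module 3.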

App-ID Is Dynamic, Not Static

This is the second most misunderstood concept, and the root cause of an entire class of production outages covered in Module 4.

App-ID does not identify an application from the first packet. It builds a picture over the first several packets of a flow. During this classification period, the traffic is identified as an intermediate application — typically ssl, web-browsing, or unknown — until enough data is available to make a final determination.

Phase	App-ID State	Policy Match
Packets 1-3	unknown / incomplete	Policy must allow 'unknown' or the initial protocol
Packets 4-10	web-browsing or ssl (intermediate)	Policy must allow intermediate application
After threshold	Final app identified (e.g., office365-base)	Policy re-evaluated against final app
Post-identification	App locked in for session lifetime	No further re-evaluation unless session reset

Why Sessions Get Denied Mid-Flow

If your security policy allows ssl but not office365-base, the session starts: App-ID classifies it as ssl and policy permits it. Several packets later, App-ID identifies it as office365-base. Policy is re-evaluated. No rule allows office365-base. Session denied. From the user perspective: the application works for two seconds, then disconnects. From the engineer perspective: there is an allowed session in the session browser AND a deny log at the same time for the same flow. This is not a bug. It is exactly how App-ID is designed to work.
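The mid-flow deny can be reproduced with a toy model that re-runs policy whenever the identified application changes. This is a sketch under assumptions, not the real App-ID engine: `evaluate` and `replay_flow` are hypothetical helpers, and the per-packet application sequence is made up for illustration.

```python
def evaluate(rules, app: str):
    """First-match lookup: return (rule name, action) for the given application."""
    for rule in rules:
        if app in rule["applications"] or "any" in rule["applications"]:
            return rule["name"], rule["action"]
    return "default-deny", "deny"


def replay_flow(rules, app_sequence):
    """Re-evaluate policy each time the identified application changes."""
    events = []
    current = None
    for app in app_sequence:
        if app == current:
            continue  # same application as before, no re-evaluation needed
        current = app
        name, action = evaluate(rules, app)
        events.append((app, name, action))
        if action == "deny":
            break  # the session is torn down at the shift
    return events


rules = [{"name": "allow-web-traffic",
          "applications": {"ssl", "web-browsing"},
          "action": "allow"}]

# Hypothetical per-packet classification: ssl first, final app a few packets later.
events = replay_flow(rules, ["ssl", "ssl", "ssl", "office365-base"])
print(events)
# [('ssl', 'allow-web-traffic', 'allow'), ('office365-base', 'default-deny', 'deny')]
```

The two events correspond to the two artifacts the engineer sees: an allowed session and a deny entry for the same flow. Adding office365-base to the rule's application set turns the second event into an allow.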

Reading the Session Table Like a TAC Engineer

The session table is the primary source of truth during any production debug. Not the GUI — the CLI session output. Here is what each field actually tells you.

pan-os-cli

> show session id 12345

Session          12345
-----------------------
        c2s flow:
                source:      10.1.1.100 [trust]
                dst:         203.0.113.10
                proto:       6
                sport:       52341        dport:      443
                state:       ACTIVE       type:       FLOW
                src user:    domain\john.doe
                dst user:    unknown

        s2c flow:
                source:      203.0.113.10 [untrust]
                dst:         10.1.1.100
                proto:       6
                sport:       443          dport:      52341
                state:       ACTIVE       type:       FLOW

        start time                    : Mon May 26 08:23:11 2026
        timeout                       : 3600 sec
        time to live                  : 3542 sec
        total byte count(c2s)         : 18432
        total byte count(s2c)         : 245760
        layer7 packet count(c2s)      : 24
        layer7 packet count(s2c)      : 186
        vsys                          : vsys1
        application                   : ssl
        rule                          : allow-web-traffic
        session to be logged at end   : True
        session in session ager       : True
        session synced from HA peer   : False
        address/port translation      : source + destination
        nat-rule                      : internet-snat(vsys1)
        layer7 processing             : enabled

What TAC Reads From This Output

→ application: ssl. App-ID has not completed identification yet, or this traffic is staying as ssl. If you expect office365-base, something is preventing deeper inspection.

→ rule: allow-web-traffic. This is the security policy that matched. If the app shifts from ssl to something not allowed by this rule, the session will be denied.

→ state: ACTIVE on both c2s and s2c. Both directions exist. If c2s is ACTIVE but s2c shows INIT or is missing entirely, asymmetric routing is likely: return traffic is not reaching this firewall.

→ session synced from HA peer: False. This session was created on this firewall, not synced from its HA peer. In an HA failover scenario, sessions synced from the peer behave differently.

→ byte counts. If c2s bytes are incrementing but s2c bytes are frozen, traffic is leaving the firewall but responses are not returning. Classic one-way traffic symptom.
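The byte-count check is easy to automate once two outputs have been saved a few seconds apart. The sketch below is illustrative: `parse_summary` and `is_one_way` are hypothetical helpers, the parser handles only the `name : value` summary lines in the layout shown above, and the second snapshot's values are invented for the example.

```python
def parse_summary(text: str) -> dict:
    """Collect the 'name : value' summary lines of a show session id output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # flow lines such as 'source:  10.1.1.100' have no ' : ' and are skipped
            fields[key.strip()] = value.strip()
    return fields


def is_one_way(before: dict, after: dict) -> bool:
    """True when c2s bytes grew between snapshots while s2c bytes did not."""
    def growth(name: str) -> int:
        return int(after[name]) - int(before[name])
    return growth("total byte count(c2s)") > 0 and growth("total byte count(s2c)") == 0


SNAPSHOT_1 = """\
        total byte count(c2s)         : 18432
        total byte count(s2c)         : 245760
        application                   : ssl
"""

# Hypothetical second capture taken a few seconds later.
SNAPSHOT_2 = """\
        total byte count(c2s)         : 19968
        total byte count(s2c)         : 245760
        application                   : ssl
"""

before = parse_summary(SNAPSHOT_1)
after = parse_summary(SNAPSHOT_2)
print(after["application"], is_one_way(before, after))  # ssl True
```

A single snapshot cannot show this. The comparison between two samples is what separates a session that merely exists from one that is carrying traffic in both directions.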

The TAC 5-Minute First Triage Sequence

When a production outage call comes in, before touching any configuration, TAC runs this sequence. It takes under five minutes and immediately narrows the problem space from infinite to one of the six outcomes listed under What Each Step Tells You.

pan-os-cli

# Step 1 — Is the firewall itself healthy?
show system resources
show system state | match "ha."

# Step 2 — Are sessions being created for the affected traffic?
show session all filter source <affected-client-ip>
show session all filter destination <affected-server-ip>

# Step 3 — Are packets arriving at the firewall?
# Enable a quick dataplane packet capture (30-second window)
debug dataplane packet-diag set filter match source <client-ip>
debug dataplane packet-diag set filter on
debug dataplane packet-diag set capture stage firewall file debug.pcap
debug dataplane packet-diag set capture on
# Wait 30 seconds while user reproduces issue, then stop and clean up
debug dataplane packet-diag set capture off
debug dataplane packet-diag set filter off
debug dataplane packet-diag clear all

# Step 4 — Are drops happening and at which stage?
show counter global filter delta yes severity drop

# Step 5 — What does routing say about the destination?
test routing fib-lookup virtual-router default ip <destination-ip>

What Each Step Tells You

Step Result	What It Means	Next Action
No session found for affected traffic	Traffic not reaching firewall, OR firewall dropping before session creation	Check routing, check upstream device, check zone protection
Session exists, both directions ACTIVE, bytes incrementing both ways	Firewall is forwarding — problem is application or server side	Check server, check App-ID, check security profile logs
Session exists, c2s bytes incrementing, s2c frozen	Return traffic not reaching firewall — asymmetric routing	Module 2 — asymmetric routing debug flow
Drops in counter global, flow_policy_deny	Security policy is dropping the traffic	test security-policy-match to identify which rule
Drops in counter global, flow_nat_no_translation	NAT rule not matching	Check NAT rule order and match criteria
No drops, no sessions, no packets in capture	Traffic not arriving at the firewall interface	Check upstream routing, check physical/logical interface
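The table above can be read as a decision function. The sketch below is one possible encoding, not a TAC tool: the `Observation` fields, the `triage` function, and the order in which the rows are checked are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What the five triage steps showed (hypothetical structure)."""
    session_found: bool
    c2s_incrementing: bool = False
    s2c_incrementing: bool = False
    packets_in_capture: bool = False
    drop_counters: frozenset = frozenset()


def triage(obs: Observation) -> str:
    """Map triage observations to the matching row of the table."""
    if "flow_policy_deny" in obs.drop_counters:
        return "security policy drop: run test security-policy-match"
    if "flow_nat_no_translation" in obs.drop_counters:
        return "NAT rule not matching: check NAT rule order and match criteria"
    if not obs.session_found:
        if not obs.packets_in_capture and not obs.drop_counters:
            return "traffic not arriving: check upstream routing and interface"
        return "no session: check routing, upstream device, zone protection"
    if obs.c2s_incrementing and not obs.s2c_incrementing:
        return "return traffic missing: asymmetric routing (Module 2)"
    if obs.c2s_incrementing and obs.s2c_incrementing:
        return "firewall forwarding: check server, App-ID, security profile logs"
    return "inconclusive: repeat the sequence while the issue is reproduced"


one_way = Observation(session_found=True, c2s_incrementing=True)
denied = Observation(session_found=False, packets_in_capture=True,
                     drop_counters=frozenset({"flow_policy_deny"}))
print(triage(one_way))
print(triage(denied))
```

Drop counters are checked first in this sketch because a named drop reason is the most specific evidence available. Session and byte-count observations then separate the remaining cases.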

Global Counters — Reading Drop Reasons

Global counters are the fastest way to understand what the dataplane is doing with traffic. The key is filtering for drops and using the delta flag so you see only what is happening right now, not cumulative counts since last reboot.

pan-os-cli

# Show only drop counters with delta (changes since last check)
show counter global filter delta yes severity drop

# Common drop counters and what they mean:
# flow_policy_deny         — security policy denied the session
# flow_policy_nat_deny     — NAT-translated traffic denied by policy
# flow_nat_no_translation  — no NAT rule matched
# flow_fwd_l3_noroute      — no route to destination
# flow_rcv_dot1q_tag_err   — VLAN tag mismatch
# flow_tcp_rst_mismatch    — TCP RST received in wrong state
# appid_unknown_drop       — App-ID blocking unknown applications
# decrypt_error            — SSL decryption failure
# flow_parse_error         — malformed packet

Delta Flag Is Critical

Always use delta yes during an active outage. Without it, you see numbers that have been accumulating since the last reboot, so a counter showing 50,000 drops might be from three months ago. With delta yes, each run shows only what changed since the previous run of the command. Run it twice, 10 seconds apart, and read the second output: divided by the interval, that is your real-time drop rate.
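The arithmetic behind the two-sample comparison is simple enough to sketch. The example below assumes two sets of cumulative counter values taken a known interval apart. `drop_rates` is a hypothetical helper and the sample numbers are invented.

```python
def drop_rates(first: dict, second: dict, interval_seconds: float) -> dict:
    """Per-second growth of each counter that increased between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates = {}
    for name, value in second.items():
        growth = value - first.get(name, 0)
        if growth > 0:
            rates[name] = growth / interval_seconds
    return rates


# Hypothetical cumulative values sampled 10 seconds apart.
first = {"flow_policy_deny": 50000, "flow_fwd_l3_noroute": 120}
second = {"flow_policy_deny": 50000, "flow_fwd_l3_noroute": 370}

rates = drop_rates(first, second, 10)
print(rates)  # {'flow_fwd_l3_noroute': 25.0}
```

The large flow_policy_deny total disappears from the result because it did not move during the window, while the much smaller flow_fwd_l3_noroute counter turns out to be the one dropping traffic right now.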

HA and Session Ownership — The Hidden Variable

In high-availability deployments, there is a variable that catches even experienced engineers off guard: session ownership. In active/passive HA, all sessions are owned by the active device. In active/active HA, sessions are distributed — and which device owns a session determines which device must see both directions of traffic for that session.

→ In active/active HA: if the SYN packet goes through firewall A but the SYN-ACK returns through firewall B, firewall B has no session for this flow. It drops it. From the user perspective: intermittent connection failures that seem random and impossible to reproduce.

→ This is a design-level asymmetric routing issue that cannot be fixed purely on the firewall. The upstream routing must be corrected to ensure session owner consistency. Module 2 covers this in full.
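The SYN/SYN-ACK split described above can be modelled with two independent session tables. This is a deliberately simplified sketch that follows the description in the text and ignores any session synchronization between peers. `handle` and the flow tuple are hypothetical.

```python
def handle(sessions: set, flow: tuple, is_syn: bool) -> str:
    """Forward if this firewall knows the flow or the packet opens it; otherwise drop."""
    if flow in sessions:
        return "forward"
    if is_syn:
        sessions.add(flow)  # new flow: this firewall becomes its owner
        return "forward"
    return "drop"  # mid-flow packet for a session this firewall never saw


flow = ("10.1.1.100", 52341, "203.0.113.10", 443)
fw_a, fw_b = set(), set()

syn_result = handle(fw_a, flow, is_syn=True)    # SYN enters through firewall A
asymmetric = handle(fw_b, flow, is_syn=False)   # SYN-ACK returns through firewall B
symmetric = handle(fw_a, flow, is_syn=False)    # SYN-ACK returns through firewall A
print(syn_result, asymmetric, symmetric)  # forward drop forward
```

The same return packet is forwarded or dropped depending only on which device it reaches. Whether a connection works is therefore decided by the upstream routing, which is why the failures look random from the user's side.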

pan-os-cli

# Check which device is active and owns sessions
show high-availability state

# In active/active — check session distribution
show session info

# Check if a specific session was synced from peer
show session id <id> | match "synced"

# Check HA path monitoring status
show high-availability path-monitoring
