Threat Hunting Workflow — Hunting Before the Alert Fires
The Difference Between Alert Response and Threat Hunting
Alert response is reactive — an alert fires, an analyst investigates it. Threat hunting is proactive — an analyst decides to look for attacker activity that has not yet triggered any alert. These are fundamentally different workflows that require different skills and different mindsets.
The reason threat hunting exists: detection rules only catch what rule authors anticipated. Sophisticated attackers operate below the threshold of existing rules — using legitimate tools (living off the land), moving slowly to avoid time-based detection, and exploiting gaps in detection coverage that have never been mapped. Threat hunting is how organizations find these attackers before the damage is done.
What Most Teams Call Hunting Is Not Hunting
The Hunt Loop — How Professional Hunters Work
| Phase | What Happens | Output |
|---|---|---|
| 1. Hypothesis Generation | Hunter creates a specific, testable statement: 'An attacker using Kerberoasting would generate specific event IDs from specific service accounts in a specific pattern' | One hypothesis statement with expected data sources and evidence indicators |
| 2. Data Source Identification | Determine which logs contain evidence for or against the hypothesis — Windows Security events, AD logs, network traffic, endpoint telemetry | List of specific log sources, fields, and time ranges needed for the hunt |
| 3. Hunt Execution | Write and run queries against the identified data sources. Iterate as results inform the next query | Raw query results — either evidence of the behavior or absence of evidence |
| 4. Analysis and Escalation | Evaluate results: confirmed malicious activity (escalate to IR), suspicious but unconfirmed (continue hunting), confirmed benign (document and close) | Decision: escalate, continue, or close. Documentation either way |
| 5. Detection Creation | If the hunt finds real attacker behavior, turn the hunt query into a SIEM detection rule so that future occurrences trigger an alert automatically | New SIEM rule that catches the same behavior in real time |
Every completed hunt produces one of two outputs: a confirmed threat escalated to incident response, or a documented coverage gap with a new detection rule. A hunt that produces neither is incomplete. The value of threat hunting is not in the act of hunting — it is in the detection rules that come out the other side.
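The completeness rule above can be made concrete with a small record type. This is only a sketch: the class name, field names, and the ticket and rule identifiers are hypothetical and not part of any particular SIEM or case-management tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    ESCALATE = "escalate"   # confirmed malicious: hand off to IR
    CONTINUE = "continue"   # suspicious but unconfirmed: keep hunting
    CLOSE = "close"         # confirmed benign: document and close


@dataclass
class HuntRecord:
    """One pass through the hunt loop (hypothetical structure)."""
    hypothesis: str
    data_sources: List[str]
    queries: List[str] = field(default_factory=list)
    outcome: Optional[Outcome] = None
    ir_ticket: Optional[str] = None        # hypothetical IR case reference
    detection_rule: Optional[str] = None   # hypothetical SIEM rule name

    def is_complete(self) -> bool:
        # Complete only if a confirmed threat was escalated to IR
        # or the hunt produced a new detection rule.
        escalated = self.outcome is Outcome.ESCALATE and self.ir_ticket is not None
        return escalated or self.detection_rule is not None


hunt = HuntRecord(
    hypothesis="Kerberoasting shows RC4 service ticket requests in volume",
    data_sources=["Windows Security Event ID 4769"],
)
hunt.outcome = Outcome.CLOSE
closed_without_rule = hunt.is_complete()   # benign close, no rule yet
hunt.detection_rule = "kerberoast_rc4_volume"
closed_with_rule = hunt.is_complete()      # rule written from the hunt query
```

A record like this makes it easy to report on hunts that were closed without producing either output.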
Hypothesis Generation — Where Hunters Get Ideas
Sources That Drive Hunt Hypotheses
- MITRE ATT&CK technique reports — 'Do we have coverage for T1055 Process Injection? Let's test it'
- Threat intelligence reports on active campaigns — 'This group uses specific PowerShell obfuscation patterns — do we have logs that would show this?'
- Vendor security advisories — 'This CVE allows remote code execution via a specific HTTP path — are any of our servers being probed for it?'
- Post-incident review findings — 'The last incident used a technique we only detected at the containment stage — can we detect it earlier?'
- Intel from sector ISACs — 'Healthcare sector peers report lateral movement via WMI — what does WMI lateral movement look like in our logs?'
- Hunt team gut instinct — 'We have not validated whether our logging captures parent-child process trees with full command line arguments — let's test it'
Writing a Good Hypothesis
A good hunt hypothesis is specific, testable, and falsifiable. Vague hypotheses produce vague hunts that find nothing actionable.
| Bad Hypothesis | Good Hypothesis | Why Better |
|---|---|---|
| 'Look for malware' | 'An attacker using PowerShell to download a payload would show powershell.exe with Invoke-WebRequest or WebClient in the command line, spawned from a non-standard parent process' | Specific behavior, specific fields, specific parent process context — testable |
| 'Check for lateral movement' | 'SMB lateral movement using Pass-the-Hash shows authentication events with NTLMv2 from unexpected source IPs to high-value targets between 10pm and 6am' | Specific protocol, authentication type, time window, target criteria — falsifiable |
| 'Find exfiltration' | 'Data exfiltration over DNS tunneling produces queries with unusually long subdomain labels (over 50 characters) to domains that were registered within the last 30 days' | Measurable field value, specific domain characteristic — produces a testable SIEM query |
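To show what "testable" means in practice, the DNS tunneling hypothesis from the table can be written as a predicate over a single DNS query. This is a minimal sketch: it assumes the registered domain and its registration date have already been supplied by some enrichment step (for example a WHOIS lookup), and the function names are illustrative only.

```python
from datetime import date, timedelta

MAX_LABEL_LEN = 50         # "over 50 characters" from the hypothesis
MAX_DOMAIN_AGE_DAYS = 30   # "registered within the last 30 days"


def longest_subdomain_label(query_name: str, registered_domain: str) -> int:
    """Length of the longest label to the left of the registered domain."""
    name = query_name.rstrip(".").lower()
    base = registered_domain.rstrip(".").lower()
    if name == base or not name.endswith("." + base):
        return 0
    subdomain = name[: -(len(base) + 1)]
    return max(len(label) for label in subdomain.split("."))


def matches_hypothesis(query_name: str, registered_domain: str,
                       registered_on: date, today: date) -> bool:
    """True when both conditions of the hypothesis hold for this query."""
    long_label = longest_subdomain_label(query_name, registered_domain) > MAX_LABEL_LEN
    age = today - registered_on
    new_domain = timedelta(0) <= age <= timedelta(days=MAX_DOMAIN_AGE_DAYS)
    return long_label and new_domain
```

Because each condition is a measurable comparison, the hypothesis can be confirmed or refuted against real query logs, which is what separates it from "find exfiltration".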
Hunt Execution — Real SIEM Queries
Hunting for PowerShell-Based Execution
| PowerShell download cradle detection
| Hunt hypothesis: attacker using PowerShell to pull down a payload shows
| specific command line patterns in Windows Security logs or Sysmon Event ID 1
index=endpoint (sourcetype=WinEventLog:Security OR sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational)
| where EventCode IN (4688, 1)
| where like(lower(CommandLine), "%invoke-webrequest%")
OR like(lower(CommandLine), "%webclient%")
OR like(lower(CommandLine), "%downloadstring%")
OR like(lower(CommandLine), "%downloadfile%")
OR like(lower(CommandLine), "%-enc %")
OR like(lower(CommandLine), "%-encodedcommand%")
| stats count by ComputerName, ParentImage, Image, CommandLine, _time
| sort -_time
| table _time, ComputerName, ParentImage, Image, CommandLine
| Evaluate results:
| - PowerShell spawned from Office apps (WINWORD, EXCEL, OUTLOOK) = high suspicion
| - Encoded commands (-enc) with no business justification = suspicious
| - Legitimate: IT scripts, Windows Update, software deployment tools
| - Key differentiator: parent process. svchost → powershell is suspicious.
| msiexec → powershell during software install is likely legitimate
Hunting for Kerberoasting
| Kerberoasting detection: attacker requesting service tickets for offline cracking
| Hunt hypothesis: Kerberoasting generates Event ID 4769 (Kerberos Service Ticket Request)
| with encryption type 0x17 (RC4-HMAC) for service accounts
| Field names vary by add-on and event rendering; in the raw 4769 event the requesting
| account is TargetUserName and the client address is IpAddress
index=wineventlog sourcetype=WinEventLog:Security EventCode=4769
| where TicketEncryptionType="0x17"
| where NOT like(ServiceName, "%$")
| stats count as ticket_requests, dc(ServiceName) as unique_services by SubjectUserName, src_ip
| where ticket_requests > 5 OR unique_services > 3
| sort -ticket_requests
| table SubjectUserName, src_ip, ticket_requests, unique_services
| What makes this suspicious:
| - 0x17 encryption type (RC4-HMAC) instead of modern AES = older, crackable tickets
| - Multiple service account tickets in short window from same user
| - Service name not ending in $ (computer accounts typically end in $)
| - Source IP is an unexpected workstation, not an admin system
| Legitimate: service accounts requesting specific tickets for their own services
| Hunting: look for volume (many tickets) from one account in short time
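The aggregation in the Kerberoasting query above can also be sketched outside the SIEM, which helps when tuning thresholds against exported events. The dictionary keys used here (`user`, `src_ip`, `service`, `etype`) are assumptions about how the export was parsed, not field names from any specific product; the thresholds are the ones used in the query.

```python
from collections import defaultdict

RC4_HMAC = "0x17"
MIN_TICKETS = 5     # flag when ticket_requests > 5
MIN_SERVICES = 3    # or when unique_services > 3


def kerberoast_candidates(events):
    """Group RC4 service-ticket requests by (user, source IP) and apply thresholds.

    events: iterable of dicts with the hypothetical keys
    'user', 'src_ip', 'service', and 'etype'.
    """
    tickets = defaultdict(int)
    services = defaultdict(set)
    for ev in events:
        if ev["etype"].lower() != RC4_HMAC:
            continue                      # only RC4-HMAC tickets
        if ev["service"].endswith("$"):
            continue                      # skip computer accounts
        key = (ev["user"], ev["src_ip"])
        tickets[key] += 1
        services[key].add(ev["service"])

    hits = []
    for (user, src_ip), count in tickets.items():
        unique = len(services[(user, src_ip)])
        if count > MIN_TICKETS or unique > MIN_SERVICES:
            hits.append({
                "user": user,
                "src_ip": src_ip,
                "ticket_requests": count,
                "unique_services": unique,
            })
    # Highest volume first, mirroring the sort in the query
    return sorted(hits, key=lambda h: h["ticket_requests"], reverse=True)
```

Running this over a known-good baseline period is one way to see how many legitimate accounts exceed the thresholds before the logic becomes a detection rule.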
Hunting for Living Off the Land — WMI Lateral Movement
| WMI lateral movement hunt
| Hypothesis: wmic.exe or WMI service used to execute commands on remote systems
| shows specific Sysmon events or Windows event patterns
index=endpoint sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
| where EventCode=1
| where like(lower(Image), "%wmic.exe")
| where like(lower(CommandLine), "%/node:%")
| rex field=CommandLine "(?i)/node:(?<target_host>[^\s]+)"
| stats count by ComputerName, target_host, CommandLine, _time
| sort -_time
| table _time, ComputerName, target_host, CommandLine
| WMI subscription persistence: checks for attacker persistence via WMI filters
index=wineventlog sourcetype=WinEventLog:Microsoft-Windows-WMI-Activity/Operational
| where EventCode IN (5857, 5858, 5859, 5860, 5861)
| stats count by EventCode, PossibleCause, _time
| sort -_time
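The target-host extraction in the WMI hunt is easy to get subtly wrong, so it is worth checking the pattern against sample command lines before relying on it. The sketch below is an illustration, not the query's exact behavior: it assumes the `/node:` switch is followed by a bare host, a quoted host, or a bare comma-separated list, and the function name is hypothetical.

```python
import re

# Case-insensitive match on /node:, optional quote, then everything up to
# the next whitespace or quote character.
NODE_RE = re.compile(r'/node:\s*"?([^\s"]+)"?', re.IGNORECASE)


def wmic_targets(command_line: str) -> list:
    """Return the remote hosts named in a wmic /node: switch, if any."""
    match = NODE_RE.search(command_line)
    if not match:
        return []
    return [host for host in match.group(1).split(",") if host]


samples = [
    'wmic /node:"SRV01" process call create "cmd.exe /c whoami"',
    "WMIC.exe /NODE:10.0.0.5,10.0.0.6 os get caption",
    "wmic os get caption",
]
results = [wmic_targets(s) for s in samples]
```

The case-insensitive flag matters because both the binary name and the switch appear in mixed case in real telemetry.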
What to Do When the Hunt Finds Something
A hunt that finds real attacker behavior transitions into incident response. The handoff must preserve the investigative chain — what was found, where it was found, what queries produced the results, and what the evidence means in context.
- Document the exact queries that produced findings — IR needs to reproduce and extend them
- Preserve the raw results before they age out of the SIEM hot tier
- Identify the earliest timestamp in the evidence — this is the dwell time starting point
- Do not alert the attacker by changing firewall rules or disabling accounts before IR has scoped the full compromise
- Hand off to IR with: affected hosts, affected accounts, earliest evidence timestamp, techniques identified by ATT&CK ID, and recommended immediate investigation steps
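The handoff items listed above can be assembled mechanically from the individual findings, which reduces the chance of dropping a host or misreporting the earliest timestamp. This is a sketch under assumptions: the `Finding` fields and the dictionary keys are hypothetical and would need to match whatever format the IR team actually accepts.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Finding:
    host: str
    account: str
    timestamp: datetime
    attack_id: str   # ATT&CK technique ID, e.g. "T1558.003" (Kerberoasting)
    query: str       # exact query that produced this finding


def build_handoff(findings: List[Finding], next_steps: List[str]) -> dict:
    """Summarize findings into the fields IR needs at handoff."""
    if not findings:
        raise ValueError("a handoff needs at least one finding")
    return {
        "affected_hosts": sorted({f.host for f in findings}),
        "affected_accounts": sorted({f.account for f in findings}),
        # Earliest evidence marks the dwell time starting point
        "earliest_evidence": min(f.timestamp for f in findings),
        "attack_techniques": sorted({f.attack_id for f in findings}),
        "queries": sorted({f.query for f in findings}),
        "recommended_steps": list(next_steps),
    }
```

Keeping the exact query text on each finding means IR can reproduce and extend the search without going back to the hunter.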
The Dwell Time Finding