Threat Hunting Workflow — Hunting Before the Alert Fires

The Difference Between Alert Response and Threat Hunting

Alert response is reactive — an alert fires, an analyst investigates it. Threat hunting is proactive — an analyst decides to look for attacker activity that has not yet triggered any alert. These are fundamentally different workflows that require different skills and different mindsets.

The reason threat hunting exists: detection rules only catch what rule authors anticipated. Sophisticated attackers operate below the threshold of existing rules — using legitimate tools (living off the land), moving slowly to avoid time-based detection, and exploiting gaps in detection coverage that have never been mapped. Threat hunting is how organizations find these attackers before the damage is done.

What Most Teams Call Hunting Is Not Hunting

Running a saved SIEM query on a schedule is alert monitoring with extra steps — not threat hunting. Threat hunting starts with a hypothesis about attacker behavior, executes a specific investigation to test that hypothesis, and either finds evidence (new detection) or rules it out (documented coverage gap). A hunt without a hypothesis is an open-ended query session with no defined outcome.

The Hunt Loop — How Professional Hunters Work

Phase	What Happens	Output
1. Hypothesis Generation	Hunter creates a specific, testable statement: 'An attacker using Kerberoasting would generate Event ID 4769 service ticket requests with RC4 encryption (0x17) for several service accounts from one user in a short window'	One hypothesis statement with expected data sources and evidence indicators
2. Data Source Identification	Determine which logs contain evidence for or against the hypothesis — Windows Security events, AD logs, network traffic, endpoint telemetry	List of specific log sources, fields, and time ranges needed for the hunt
3. Hunt Execution	Write and run queries against the identified data sources. Iterate as results inform the next query	Raw query results — either evidence of the behavior or absence of evidence
4. Analysis and Escalation	Evaluate results: confirmed malicious activity (escalate to IR), suspicious but unconfirmed (continue hunting), confirmed benign (document and close)	Decision: escalate, continue, or close. Documentation either way
5. Detection Creation	If the hunt finds real attacker behavior — create a SIEM detection rule from the hunt query. Future incidents trigger automatically	New SIEM rule that catches the same behavior in real time

Every completed hunt produces one of two outputs: a confirmed threat escalated to incident response, or a documented coverage gap with a new detection rule. A hunt that produces neither is incomplete. The value of threat hunting is not in the act of hunting — it is in the detection rules that come out the other side.
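
The completeness rule above can be sketched as a small record type. This is only an illustration: the class and field names are hypothetical and not part of any hunting platform, and the check encodes one reading of the rule, namely that a coverage gap counts as closed only once a detection rule has been written for it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Outcome(Enum):
    ESCALATED = "confirmed threat escalated to incident response"
    COVERAGE_GAP = "documented coverage gap"


@dataclass
class HuntRecord:
    # Hypothetical field names; adapt them to whatever tracker the team uses.
    hypothesis: str
    data_sources: List[str]
    queries: List[str]
    outcome: Optional[Outcome] = None
    detection_rule: Optional[str] = None  # name of the rule created, if any

    def is_complete(self) -> bool:
        """Return True only when the hunt produced one of the two outputs."""
        if self.outcome is Outcome.ESCALATED:
            return True
        if self.outcome is Outcome.COVERAGE_GAP:
            # A gap is only closed once a detection rule exists for it.
            return self.detection_rule is not None
        return False
```

A hunt record with no outcome, or with a coverage gap but no rule name, reports itself as incomplete, which makes unfinished hunts easy to list in a review.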

Hypothesis Generation — Where Hunters Get Ideas

Sources That Drive Hunt Hypotheses

MITRE ATT&CK technique reports — 'Do we have coverage for T1055 Process Injection? Let's test it'
Threat intelligence reports on active campaigns — 'This group uses specific PowerShell obfuscation patterns — do we have logs that would show this?'
Vendor security advisories — 'This CVE allows remote code execution via a specific HTTP path — are any of our servers being probed for it?'
Post-incident review findings — 'The last incident used a technique we only detected at the containment stage — can we detect it earlier?'
Intel from sector ISACs — 'Healthcare sector peers report lateral movement via WMI — what does WMI lateral movement look like in our logs?'
Hunt team gut instinct — 'We have not validated whether our logging captures parent-child process trees with full command line arguments — let's test it'

Writing a Good Hypothesis

A good hunt hypothesis is specific, testable, and falsifiable. Vague hypotheses produce vague hunts that find nothing actionable.

Bad Hypothesis	Good Hypothesis	Why Better
'Look for malware'	'An attacker using PowerShell to download a payload would show powershell.exe with Invoke-WebRequest or WebClient in the command line, spawned from a non-standard parent process'	Specific behavior, specific fields, specific parent process context — testable
'Check for lateral movement'	'SMB lateral movement using Pass-the-Hash shows authentication events with NTLMv2 from unexpected source IPs to high-value targets between 10pm and 6am'	Specific protocol, authentication type, time window, target criteria — falsifiable
'Find exfiltration'	'Data exfiltration over DNS tunneling produces queries with unusually long subdomain labels (over 50 characters) to domains that were registered within the last 30 days'	Measurable field value, specific domain characteristic — produces a testable SIEM query
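
To show what "testable" means in practice, the DNS tunneling hypothesis in the last row can be turned into a predicate. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the domain age is assumed to arrive from some enrichment source such as WHOIS data, and the registered domain is naively taken to be the last two labels, which is wrong for suffixes such as co.uk.

```python
def longest_subdomain_label(query_name: str) -> int:
    """Length of the longest label to the left of the registered domain.

    Simplification: the last two labels are treated as the registered domain.
    """
    labels = query_name.rstrip(".").split(".")
    subdomain_labels = labels[:-2]
    return max((len(label) for label in subdomain_labels), default=0)


def matches_hypothesis(query_name: str, domain_age_days: int,
                       max_label_len: int = 50, max_age_days: int = 30) -> bool:
    """True when a query fits both conditions stated in the hypothesis."""
    return (longest_subdomain_label(query_name) > max_label_len
            and domain_age_days <= max_age_days)
```

Because both conditions are measurable, the hypothesis can be proven wrong: if no query in the hunt window satisfies the predicate, the behavior as described was not observed in the data that was searched.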

Hunt Execution — Real SIEM Queries

Hunting for PowerShell-Based Execution

splunk-spl

| PowerShell download cradle detection
| Hunt hypothesis: attacker using PowerShell to pull down a payload shows
| specific command line patterns in Windows Security Event ID 4688 or Sysmon Event ID 1
| Note: 4688 records the command line only when command-line process auditing is enabled,
| and field names (CommandLine, Image, ParentImage, ComputerName) depend on the add-on
| that parses each sourcetype, so adjust them to your environment

index=endpoint (sourcetype=WinEventLog:Security OR sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational) EventCode IN (4688, 1)
| where like(lower(CommandLine), "%invoke-webrequest%")
    OR like(lower(CommandLine), "%webclient%")
    OR like(lower(CommandLine), "%downloadstring%")
    OR like(lower(CommandLine), "%downloadfile%")
    OR like(lower(CommandLine), "%-enc %")
    OR like(lower(CommandLine), "%-encodedcommand%")
| fillnull value="unknown" ParentImage Image
| stats count by ComputerName, ParentImage, Image, CommandLine, _time
| sort -_time
| table _time, ComputerName, ParentImage, Image, CommandLine

| Evaluate results:
| - PowerShell spawned from Office apps (WINWORD, EXCEL, OUTLOOK) = high suspicion
| - Encoded commands (-enc) with no business justification = suspicious
| - Legitimate: IT scripts, Windows Update, software deployment tools
| - Key differentiator: parent process. svchost → powershell is suspicious.
|   msiexec → powershell during software install is likely legitimate
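
The evaluation notes above amount to a small decision procedure keyed on the parent process. The following sketch encodes them for exported results; the function name, verdict labels, and rule ordering are assumptions made for illustration, not an established scoring scheme.

```python
import ntpath

# Parent processes named in the evaluation notes above.
OFFICE_PARENTS = {"winword.exe", "excel.exe", "outlook.exe"}
PARENT_VERDICTS = {
    "svchost.exe": "suspicious",
    "msiexec.exe": "likely legitimate",
}
ENCODED_FLAGS = ("-enc ", "-encodedcommand")


def triage(parent_image: str, command_line: str) -> str:
    """Return a rough verdict for one PowerShell process-creation event."""
    # ntpath parses Windows paths correctly regardless of the analyst's OS.
    parent = ntpath.basename(parent_image).lower()
    cmd = command_line.lower()
    if parent in OFFICE_PARENTS:
        return "high"
    if any(flag in cmd for flag in ENCODED_FLAGS):
        return "suspicious"
    # Anything without an explicit rule still needs a human look.
    return PARENT_VERDICTS.get(parent, "review")
```

The encoded-command check runs before the parent lookup on purpose, so an encoded command launched by an installer is still flagged rather than waved through.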

Hunting for Kerberoasting

splunk-spl

| Kerberoasting detection — attacker requesting service tickets for offline cracking
| Hunt hypothesis: Kerberoasting generates Event ID 4769 (Kerberos Service Ticket Request)
| with encryption type 0x17 (RC4-HMAC) for service accounts

index=wineventlog sourcetype=WinEventLog:Security EventCode=4769
| where TicketEncryptionType="0x17"
| where NOT like(ServiceName, "%$")
| stats count as ticket_requests, dc(ServiceName) as unique_services by TargetUserName, src_ip
| where ticket_requests > 5 OR unique_services > 3
| sort -ticket_requests
| table TargetUserName, src_ip, ticket_requests, unique_services

| What makes this suspicious:
| - 0x17 encryption type (RC4-HMAC) instead of modern AES = older, crackable tickets
| - Multiple service account tickets in short window from same user
| - Service name not ending in $ (computer accounts typically end in $)
| - Source IP is an unexpected workstation, not an admin system

| Legitimate: service accounts requesting specific tickets for their own services
| Hunting: look for volume (many tickets) from one account in short time
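
For teams that export 4769 events and analyze them offline, the same filter-and-threshold logic can be sketched in Python. The dictionary keys (user, src_ip, service, etype) and the function name are hypothetical, and the default thresholds simply mirror the ones used in the query above.

```python
from collections import defaultdict


def kerberoast_candidates(events, request_threshold=5, service_threshold=3):
    """Flag (user, source IP) pairs with many RC4 service ticket requests.

    events: iterable of dicts with hypothetical keys
    'user', 'src_ip', 'service', and 'etype'.
    """
    requests = defaultdict(int)
    services = defaultdict(set)
    for event in events:
        if event["etype"].lower() != "0x17":   # keep RC4-HMAC tickets only
            continue
        if event["service"].endswith("$"):     # skip computer accounts
            continue
        key = (event["user"], event["src_ip"])
        requests[key] += 1
        services[key].add(event["service"])

    flagged = [
        (user, src_ip, requests[(user, src_ip)], len(services[(user, src_ip)]))
        for (user, src_ip) in requests
        if requests[(user, src_ip)] > request_threshold
        or len(services[(user, src_ip)]) > service_threshold
    ]
    # Highest request volume first.
    return sorted(flagged, key=lambda row: row[2], reverse=True)
```

As with the query, the thresholds only make sense relative to the time range of the exported events, so they should be tuned against a known-quiet baseline.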

Hunting for Living Off the Land — WMI Lateral Movement

splunk-spl

| WMI lateral movement hunt
| Hypothesis: wmic.exe or WMI service used to execute commands on remote systems
| shows specific Sysmon events or Windows event patterns

index=endpoint sourcetype=XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
| where EventCode=1
| where like(lower(Image), "%wmic.exe")
| where like(lower(CommandLine), "%/node:%")
| rex field=CommandLine "(?i)/node:(?<target_host>[^\s]+)"
| stats count by ComputerName, target_host, CommandLine, _time
| sort -_time
| table _time, ComputerName, target_host, CommandLine

| WMI subscription persistence — checks for attacker persistence via WMI filters
index=wineventlog sourcetype=WinEventLog:Microsoft-Windows-WMI-Activity/Operational EventCode IN (5857, 5858, 5859, 5860, 5861)
| fillnull value="-" PossibleCause
| stats count by EventCode, PossibleCause, _time
| sort -_time
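
When reviewing exported command lines outside the SIEM, the remote target of a wmic invocation can be pulled out with a short regular expression. This is a minimal sketch: the function name is hypothetical, and the pattern handles only a single, optionally quoted host, not comma-separated lists or @file references.

```python
import re

# /node: followed by an optionally quoted host name or IP address.
NODE_RE = re.compile(r'/node:\s*"?([^\s"]+)', re.IGNORECASE)


def wmic_target(command_line):
    """Return the remote host named in a wmic /node: switch, or None."""
    match = NODE_RE.search(command_line)
    return match.group(1) if match else None
```

Grouping results by the extracted target makes it easier to spot one workstation fanning out to many hosts, which is the pattern the hypothesis describes.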

What to Do When the Hunt Finds Something

A hunt that finds real attacker behavior transitions into incident response. The handoff must preserve the investigative chain — what was found, where it was found, what queries produced the results, and what the evidence means in context.

Document the exact queries that produced findings — IR needs to reproduce and extend them
Preserve the raw results before they age out of the SIEM hot tier
Identify the earliest timestamp in the evidence — this is the dwell time starting point
Do not alert the attacker by changing firewall rules or disabling accounts before IR has scoped the full compromise
Hand off to IR with: affected hosts, affected accounts, earliest evidence timestamp, techniques identified by ATT&CK ID, and recommended immediate investigation steps

The Dwell Time Finding

The most valuable output of a proactive hunt is the dwell time number — how long the attacker was in the environment before detection. If the hunt finds evidence from 30 days ago that was never alerted, the dwell time is at minimum 30 days. That number drives the scope of the investigation and the severity of the incident. Every hunt that finds real activity should report dwell time as a primary finding.
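
The dwell time arithmetic is simple enough to sketch directly. The function name is hypothetical, and the inputs are assumed to be timezone-aware datetimes taken from the hunt evidence.

```python
from datetime import datetime, timezone


def minimum_dwell_days(evidence_timestamps, detected_at):
    """Whole days between the earliest evidence and the moment of detection.

    This is a lower bound: the attacker may have been present before the
    earliest event that the available logs still retain.
    """
    earliest = min(evidence_timestamps)  # raises ValueError if no evidence
    return (detected_at - earliest).days


evidence = [
    datetime(2024, 3, 15, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 1, 8, 0, tzinfo=timezone.utc),
]
found = datetime(2024, 3, 31, 8, 0, tzinfo=timezone.utc)
print(minimum_dwell_days(evidence, found))  # 30
```

Reporting the figure as a minimum keeps the handoff honest when log retention is shorter than the actual intrusion.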
