DHCP Exhaustion & DNS Failures — 'Network Is Up But Users Cannot Connect'
The Complaint Pattern — Infrastructure Looks Fine, Users Cannot Work
- "Network is up, switches are fine, but I cannot get an IP address"
- "Some users are working, others are stuck at 169.254.x.x — APIPA address"
- "DNS is not resolving anything — websites time out but IP addresses work fine"
- "After power came back from the outage, half the floor cannot get on the network"
- "New employees cannot connect but existing users are fine"
- "VoIP phones are dropping calls — phones show IP but cannot reach call manager"
DHCP and DNS failures produce a specific pattern: the physical network infrastructure appears completely healthy. Switches are up, routing is functioning, firewalls are passing traffic. But users cannot access anything because they either cannot get an address or cannot resolve names. The infrastructure is working — but the services that enable users to use it are not.
DHCP Exhaustion — When the Pool Runs Out
What Causes Pool Exhaustion
| Cause | How to Identify | Frequency |
|---|---|---|
| Lease time set too long, so stale leases pile up | Pool utilization near 100% even though fewer devices than addresses exist | Common in environments where lease time was raised, or left at a long default, without adjusting pool size |
| Rogue DHCP client requesting many leases with spoofed MACs | Server log shows massive lease requests from rotating MAC addresses | Less common but causes pool depletion in minutes |
| IoT or medical device explosion — more devices than planned | Lease count exceeds original pool design capacity | Very common as device counts grow over years without pool expansion |
| DHCP relay misconfiguration — requests going to wrong server or pool | Clients getting addresses from wrong subnet or APIPA addresses | Common after network changes or VLAN additions |
| Abandoned leases from devices that did not properly release | Leases still active for IP addresses that no longer have active devices | Common — printers, scanners, equipment that is removed without graceful disconnection |
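The "rotating MAC addresses" signature in the table can be checked mechanically. The following is a minimal Python sketch, assuming you have already extracted DISCOVER events from the server log as `(timestamp_seconds, mac)` pairs; that input format, the function name `starvation_windows`, and the thresholds are illustrative assumptions, not part of any DHCP server's tooling.

```python
from collections import defaultdict

def starvation_windows(events, window_s=60, max_new_macs=20):
    """Return start times of windows with more first-seen MACs than max_new_macs.

    events: iterable of (timestamp_seconds, mac) pairs for DHCPDISCOVER messages.
    The tuple format and both thresholds are assumptions for this sketch.
    """
    seen = set()
    new_per_window = defaultdict(int)
    for ts, mac in sorted(events):
        mac = mac.lower()
        if mac not in seen:
            # Count a MAC only the first time it appears anywhere in the log.
            seen.add(mac)
            new_per_window[int(ts // window_s) * window_s] += 1
    return sorted(w for w, n in new_per_window.items() if n > max_new_macs)

# Ten ordinary clients spread over five minutes, then 100 never-seen MACs in 50 seconds.
normal = [(i * 30, f"aa:bb:cc:00:00:{i:02x}") for i in range(10)]
burst = [(600 + i * 0.5, f"02:00:00:00:00:{i:02x}") for i in range(100)]
print(starvation_windows(normal + burst))
```

A burst of previously unseen MACs inside one window is only a hint. A large shift change or a classroom powering on can look similar, so the threshold needs tuning to the site.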
Debug Sequence — Confirming DHCP Exhaustion
! On Cisco IOS DHCP server:
! Step 1: Check pool status immediately
show ip dhcp pool
! Shows for each pool:
!   Total addresses: 254 (or whatever your pool size is)
!   Leased addresses: 253 ← near pool maximum = exhaustion
!   Available addresses: 1 ← one address remaining
! Step 2: Check binding details
show ip dhcp binding
! Lists every active lease with:
!   IP address, hardware address (MAC), lease expiry time
! Count the entries: if the count equals the pool size, the pool is full
! Step 3: Find leases that may be stale (expired but still showing)
show ip dhcp binding | include Expired
! These are leases that should have been cleaned up but were not
! Step 4: Check for lease conflicts
show ip dhcp conflict
! A conflict means an IP was discovered on the network that was NOT assigned by this server
! This removes that IP from the pool automatically, so conflicts reduce available addresses
! Step 5: See DHCP server statistics
show ip dhcp server statistics
! Look for:
!   Messages received: DISCOVER, REQUEST, RELEASE, DECLINE counts
!   High DECLINE count = address conflicts (clients are declining offered addresses)
!   Zero RELEASE count = clients are not properly releasing addresses when disconnecting
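Step 1 is easy to automate once the command output has been captured as text. Below is a rough Python sketch that computes utilization per pool. The `SAMPLE` text is an assumed, trimmed approximation of `show ip dhcp pool` output (field layout differs between IOS releases), and `pool_utilization` and `exhausted` are hypothetical helper names.

```python
import re

# Assumed, abbreviated output; real output has more fields and varies by release.
SAMPLE = """\
Pool VLAN10 :
 Total addresses                : 254
 Leased addresses               : 253
Pool VLAN20 :
 Total addresses                : 254
 Leased addresses               : 97
"""

def pool_utilization(text):
    """Map pool name -> leased/total ratio, parsed from captured CLI text."""
    pools = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^Pool\s+(\S+)\s*:", line)
        if header:
            current = header.group(1)
            pools[current] = {}
            continue
        field = re.match(r"^\s*(Total|Leased) addresses\s*:\s*(\d+)", line)
        if field and current:
            pools[current][field.group(1).lower()] = int(field.group(2))
    return {
        name: p.get("leased", 0) / p["total"]
        for name, p in pools.items()
        if p.get("total")
    }

def exhausted(text, threshold=0.95):
    """Names of pools at or above the utilization threshold."""
    return sorted(n for n, u in pool_utilization(text).items() if u >= threshold)

print(exhausted(SAMPLE))
```

The 95% threshold is arbitrary. Pick a value that gives enough warning to act before the pool reaches zero free addresses.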
Emergency Recovery — Free Up Pool Space
! Option 1: Clear all bindings immediately
clear ip dhcp binding *
! WARNING: This clears ALL bindings, active ones included; clients will need to request new addresses
! Users may experience a brief disconnection while DHCP renews
! Best done outside business hours if possible
! If during business hours, clear one pool at a time:
clear ip dhcp pool VLAN10 binding *
! Option 2: Clear only confirmed stale leases
! First identify IPs that are leased but do not respond to ping:
! (Run from DHCP server or a device that can reach all subnets)
! Then clear specific bindings:
clear ip dhcp binding 10.1.1.150
! Option 3: Temporarily expand the pool (immediate relief)
ip dhcp pool VLAN10
 network 10.1.0.0 255.255.254.0
! Expands from /24 (254 addresses) to /23 (510 addresses)
! The /23 that contains 10.1.1.0 starts at 10.1.0.0 and covers 10.1.0.0 through 10.1.1.255
! Requires IP space to be available and the gateway interface mask to match; coordinate with network planning
! Option 4: Reduce lease time to force faster recycling:
ip dhcp pool VLAN10
 lease 0 4
! Sets lease time to 4 hours instead of 1 day
! Stale leases will recycle 6x faster
! Tradeoff: increases DHCP traffic as clients renew more frequently
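The arithmetic behind Options 3 and 4 can be checked with Python's standard `ipaddress` module before touching the router. This is a small sketch with hypothetical helper names. It shows that the /23 enclosing 10.1.1.0/24 begins at 10.1.0.0, and that a 4-hour lease expires six times sooner than a 24-hour one.

```python
import ipaddress

def usable_hosts(cidr):
    """Usable host addresses in an IPv4 subnet (excludes network and broadcast)."""
    return ipaddress.ip_network(cidr).num_addresses - 2

def widen_by_one_bit(cidr):
    """Return the enclosing subnet with a prefix one bit shorter, as a string."""
    return str(ipaddress.ip_network(cidr).supernet(prefixlen_diff=1))

def recycle_speedup(old_lease_hours, new_lease_hours):
    """How many times sooner an abandoned lease expires after shortening the lease."""
    return old_lease_hours / new_lease_hours

bigger = widen_by_one_bit("10.1.1.0/24")
print(bigger, usable_hosts(bigger))   # the /23 and its usable host count
print(recycle_speedup(24, 4))         # 1-day lease versus 4-hour lease
```

`ipaddress.ip_network("10.1.1.0/23")` raises `ValueError` because host bits are set, which is a quick way to catch a misaligned network statement before it reaches a config.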
DNS Resolution Failures — The Harder Problem to Diagnose
DNS failures are deceptive because they look like general network failures to users. "The internet is down" is the report — but the internet is not down. Name resolution is broken. The distinction matters because the fix is completely different.
The IP vs Name Test
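The test compares two things from the affected client: can it reach a known IP address directly, and can it turn a name into an address? A minimal Python sketch of that decision follows. The probe helpers use only the standard `socket` module; the function names and the wording of each verdict are assumptions made for illustration.

```python
import socket

def can_connect(host, port=443, timeout=3.0):
    """True if a TCP connection to host:port succeeds; host may be an IP literal."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def can_resolve(name):
    """True if the system resolver returns at least one address for name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False

def diagnose(ip_reachable, name_resolves):
    """Turn the two probe results into a first-pass verdict."""
    if ip_reachable and not name_resolves:
        return "DNS failure: connectivity works, name resolution does not"
    if not ip_reachable and not name_resolves:
        return "connectivity failure: check addressing (DHCP/APIPA), gateway, routing"
    if ip_reachable and name_resolves:
        return "both work: look at the application or a specific record"
    return "name resolves but IP is unreachable: check routing or firewall"
```

On a client you might run `diagnose(can_connect("8.8.8.8", 53), can_resolve("example.com"))`; the target address, port, and name are placeholders to replace with hosts that matter in your environment. A cached name can make `can_resolve` succeed even while the DNS server is down, so treat the verdict as a starting point.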
Debug Sequence — Tracing DNS Failures
! Step 1: Test DNS resolution from the affected client
nslookup google.com
! If this times out: the DNS server is unreachable or not answering
! "Server: UnKnown" on its own only means the DNS server's IP has no reverse (PTR) record; check whether an answer follows
! If this returns the IP: DNS works, problem is elsewhere
! Specify the DNS server explicitly to test connectivity:
nslookup google.com 8.8.8.8
nslookup google.com 10.1.1.10
! If external DNS works but internal DNS fails: internal DNS server problem
! If both fail: DNS traffic is being blocked (firewall rule or routing issue)
! From Linux/macOS:
dig google.com @10.1.1.10
! Shows full DNS response including query time and server used
! Query time of 3000ms+ = DNS server is responding but very slowly
! Step 2: Check if DNS server is reachable at all
ping 10.1.1.10
! If ping fails: routing issue (or ICMP is filtered), not DNS itself
! If ping succeeds but queries fail: DNS service is down or not responding on port 53
! Test specifically on port 53:
nslookup -timeout=5 -retry=1 google.com 10.1.1.10
! If this fails but ping succeeds: DNS service is not running on that server, or port 53 is blocked on the path
! Step 3: Check DHCP-assigned DNS servers
! On Windows client:
ipconfig /all | findstr DNS
! Verify DNS server IPs are correct
! A wrong DNS server IP (from DHCP misconfiguration) causes all resolution failures
! Step 4: On Cisco IOS acting as DNS forwarder:
show ip dns view
show hosts
! Shows what the router knows about DNS (configured name servers, cached entries) and whether forwarding is configured correctly
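When `nslookup` and `dig` are not available, the port 53 check in Step 2 can be approximated with a hand-built query. The following Python sketch builds a minimal A-record query in the RFC 1035 wire format and times one UDP exchange. The function names, the fixed query ID, and the 5-second timeout are assumptions; it does no retries, no TCP fallback, and no answer parsing beyond the header.

```python
import socket
import struct
import time

def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def parse_header(response):
    """Return (query_id, rcode, answer_count) from a DNS response header."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return query_id, flags & 0x000F, ancount

def timed_query(server, name, timeout=5.0):
    """Send one UDP query to server:53.

    Returns (elapsed_ms, rcode, answer_count), or None if nothing came back.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(name), (server, 53))
            response, _addr = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-port errors
            return None
    elapsed_ms = (time.monotonic() - start) * 1000
    _qid, rcode, answers = parse_header(response)
    return elapsed_ms, rcode, answers
```

Calling `timed_query("10.1.1.10", "google.com")` returns `None` when nothing answers on port 53, which together with a successful ping points at the DNS service rather than the path. An `rcode` of 0 with at least one answer means the server resolved the name.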
Internal DNS Server Failures
! If the DNS server is a Windows Server running Active Directory DNS:
! On the DNS server itself:
! Check DNS service status:
Get-Service -Name DNS
! Should be: Status = Running
! If stopped: Start-Service DNS — and investigate why it stopped
! Check DNS server logs:
Get-WinEvent -LogName "DNS Server" -MaxEvents 50 | Where-Object {$_.LevelDisplayName -ne "Information"}
! Look for: zone transfer failures, database corruption errors, forwarder failures
! Test DNS resolution from the server to itself:
nslookup google.com 127.0.0.1
! If this fails on the DNS server: the service itself is the problem
! If this succeeds: problem is network between clients and DNS server
! Check if Active Directory replication is affecting DNS:
repadmin /showrepl
! AD DNS zones replicate via AD replication — if replication is broken, DNS zones can be stale
! Check forwarder configuration:
dnscmd /info /forwarders
! If forwarders are unreachable: external DNS resolution fails while internal resolution works
! This creates partial failure: internal resources resolve, external do not
The Combined Failure — DHCP + DNS After Power Restoration
The most disruptive scenario: a facility loses power, the UPS runs out, and everything shuts down. Power is restored and the network comes back, but dozens of users cannot work. This is the most common post-outage complaint, and it has a specific cause pattern.
! What happens during unplanned power loss:
! 1. DHCP server loses power while holding active leases in memory
! 2. If DHCP server does not persist leases to disk (or the write is not flushed):
!    all active leases are lost when power is cut
! 3. After power restore:
!    - DHCP server starts with an empty lease database
!    - Clients that are still on (laptops on battery, VoIP phones with PoE backup)
!      try to renew their existing lease (they have an IP but it is no longer in the server database)
!    - Server responds with NAK (negative acknowledgment): lease not recognized
!    - Client falls back to discovery, which works if the pool has space
!    - Devices that were powered off come back and also request leases
! Symptom pattern:
! - Some users work (got new leases), others do not (lease conflict or pool near full)
! - VoIP phones may have IP but call manager is unreachable (route to call manager through
!   down link, or call manager itself not fully restarted)
! Post-power-outage DHCP recovery:
! Step 1: Check pool utilization
show ip dhcp pool
show ip dhcp binding
! Step 2: Clear all bindings and let everyone re-acquire
clear ip dhcp binding *
! All clients will get fresh leases; some users will briefly disconnect
! Step 3: Check the DHCP conflict list and clean it
show ip dhcp conflict
clear ip dhcp conflict *
! Conflicts happen when a device has a static IP in the DHCP range
! Step 4: Verify DHCP lease persistence is configured
! On Cisco IOS:
ip dhcp database flash:dhcp-bindings.txt write-delay 60
! This writes lease database changes to flash storage after a 60-second delay
! After power restore, server reloads from the saved file, so leases survive
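The renew-then-NAK sequence described above is easier to see in a toy model. The Python sketch below is not a DHCP implementation; `ToyDhcpServer` and its methods are invented for illustration and model only the lease table, with an optional saved copy standing in for the lease file.

```python
class ToyDhcpServer:
    """Toy model of a DHCP server's lease table; not a protocol implementation."""

    def __init__(self, pool, persist=False):
        self.pool = list(pool)
        self.persist = persist
        self.leases = {}   # mac -> ip, held in memory only
        self.saved = {}    # stands in for the lease file on flash or disk

    def discover(self, mac):
        """Return the existing lease for mac, else the first free address, else None."""
        if mac in self.leases:
            return self.leases[mac]
        used = set(self.leases.values())
        for ip in self.pool:
            if ip not in used:
                self.leases[mac] = ip
                if self.persist:
                    self.saved = dict(self.leases)
                return ip
        return None

    def renew(self, mac, ip):
        """ACK only if the server still knows this exact lease."""
        return "ACK" if self.leases.get(mac) == ip else "NAK"

    def power_cycle(self):
        """Unplanned restart: memory is lost; only the saved copy survives."""
        self.leases = dict(self.saved) if self.persist else {}

server = ToyDhcpServer(["10.1.1.10", "10.1.1.11"])
laptop_ip = server.discover("aa:aa")
server.power_cycle()
print(server.renew("aa:aa", laptop_ip))   # the laptop's renewal is refused
print(server.discover("bb:bb"))           # a new client is offered the laptop's address
```

Without persistence, the second client is offered the address the laptop still believes it holds, which is where post-outage conflicts come from. A real server typically probes an address before offering it, so the practical result is usually a conflict entry rather than two hosts sharing an IP.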
Persistent DHCP lease storage is not enabled by default on many DHCP servers. Without it, any unplanned restart — server crash, power loss, service restart — wipes all lease knowledge and causes a network-wide re-acquisition event. Configure lease database persistence during initial deployment, not after the first power outage.
DHCP Snooping — Prevention for Rogue DHCP Servers
A rogue DHCP server on the network hands out incorrect DNS, gateway, or IP addresses to clients — causing partial or total connectivity failure that looks identical to a legitimate DHCP problem.
! Enable DHCP snooping to block rogue DHCP servers:
ip dhcp snooping
ip dhcp snooping vlan 1,10,20,30
! Mark only the legitimate DHCP server uplink as trusted:
interface GigabitEthernet1/0/1
 ip dhcp snooping trust
! This port can send DHCP offers
! All other ports: DHCP server responses (OFFER, ACK, NAK) are dropped
! Access ports are untrusted by default after enabling snooping:
! Any device connected to an access port that tries to act as DHCP server
! will have its responses dropped
! Verify DHCP snooping is working:
show ip dhcp snooping
show ip dhcp snooping binding
! Shows legitimate DHCP bindings learned through the trusted port
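The trusted/untrusted rule reduces to one decision per DHCP message. Here is a deliberately simplified Python sketch of that decision. Real snooping also validates client messages against the binding table and can rate-limit ports, none of which is modeled here, and `snooping_forwards` is a hypothetical name.

```python
# Message types that only a DHCP server should originate.
SERVER_MESSAGES = {"OFFER", "ACK", "NAK"}

def snooping_forwards(message_type, port_trusted):
    """Return True if a switch with DHCP snooping would forward this message.

    Simplified rule: server-originated messages pass only on trusted ports;
    client-originated messages (DISCOVER, REQUEST, ...) pass on any port.
    """
    if message_type.upper() in SERVER_MESSAGES:
        return port_trusted
    return True

# A rogue server on an access port: its OFFER is dropped, client traffic still flows.
print(snooping_forwards("OFFER", port_trusted=False))
print(snooping_forwards("DISCOVER", port_trusted=False))
print(snooping_forwards("OFFER", port_trusted=True))
```

This is also why a missing `ip dhcp snooping trust` on the real server's uplink breaks DHCP for everyone: legitimate OFFERs arrive on a port the switch treats as untrusted and are dropped along with rogue ones.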