Building a Windows Failure-Path Test Foundation with Application Verifier
Go Komura · Windows Development, Bug Investigation, Industrial Camera, Application Verifier, Failure-Path Testing, Handle Leak
Application Verifier is a powerful tool when you want to surface, ahead of time, the anomalies that occur in Windows native code and at the Win32 boundary. Especially when you want to test handle anomalies, heap corruption, and low-resource failure paths, it can bring out problems quite quickly that normal-path testing alone would never show.
In Part 1, “When an Industrial Camera Control App Suddenly Crashes After One Month (Part 1) - Finding Handle Leaks and Designing Logs for Long-Running Operation,” we investigated a control app that crashed after long-running operation and found a handle leak as the cause. But strengthening the logs is only half the job. What you really want is to test, in advance, whether you would still be able to tell what happened if a future programming mistake caused a memory leak, a handle leak, a partial failure, or a missed release.
That is where we used Application Verifier. It is a tool that adds runtime checks and fault injection to Windows native code and to code at the Win32 boundary. What is especially convenient in practice is that you can trigger memory-exhaustion-like and resource-exhaustion-like failure modes ahead of time, without actually consuming the machine’s memory.
In this second part, we organize what Application Verifier is, what it can do, and how to build it into a failure-path test foundation, in the context of an industrial camera control app.
Table of Contents
- The Conclusion First (In One Line)
- What Is Application Verifier?
- 2.1. In One Sentence
- 2.2. Where It Shines
- 2.3. What You Gain
- What Application Verifier Can Do
- 3.1. Basics: Handles / Heaps / Locks / Memory / TLS, etc.
- 3.2. Low Resource Simulation: Front-Loading Memory and Resource Exhaustion
- 3.3. Page Heap and the Debugger
- 3.4. !avrf / !htrace / Logs
- Why We Introduced It This Time
- 4.1. The Goal Is Not Just “Finding Bugs”
- 4.2. Triggering Memory-Exhaustion-Like Phenomena
- 4.3. Verifying We Can Trace Handle Anomalies When They Occur
- How to Trigger Memory- and Resource-Exhaustion-Like Phenomena
- 5.1. The Idea Behind Low Resource Simulation
- 5.2. What You Can Make Fail
- 5.3. How to Apply It in Practice
- How to Look at Handle Anomalies
- 6.1. The Handles Check
- 6.2. Viewing Open / Close Stacks with !htrace
- 6.3. How to Combine It with Your Own Logs
- How to Build a Failure-Path Test Foundation
- 7.1. Move the Execution Unit into a Harness
- 7.2. Split the Test Menu
- 7.3. What to Collect
- 7.4. Acceptance Criteria
- 7.5. Caveats
- A Rough Decision Guide
- Summary
- References
1. The Conclusion First (In One Line)
- Application Verifier is a tool that makes misuse at Windows’ unmanaged / native boundary easier to catch at runtime
- Its value is not only “finding bugs,” but forcing rarely seen failure paths to occur ahead of time
- Handles detects invalid handles, Heaps exposes heap corruption, and Low Resource Simulation performs fault injection of memory-exhaustion-like and resource-exhaustion-like situations
- Delegating the leak investigation of a long-running resident EXE entirely to Application Verifier is a bad approach; combining it with your own Handle Count and resource-lifecycle logs is the realistic path
- In a failure-path test foundation, it is easier to read results if you run a normal-path verifier run and a fault injection run separately
- Even when you want to test a DLL, what you enable Application Verifier on is the test EXE that actually exercises that DLL
In short, Application Verifier is a tool for dragging the “nasty bugs” living around Windows’ native / Win32 boundary out into the open. It is an especially good fit in worlds like equipment control apps, where native SDKs, P/Invoke, and Win32 APIs routinely mix.
2. What Is Application Verifier?
2.1. In One Sentence
Application Verifier is a runtime verification tool for Windows user-mode applications. It monitors how a running app uses OS APIs and handles resources, detecting suspicious usage and letting you deliberately inject failures.
Unlike “static analysis” or “unit testing,” it is a tool for seeing how things break when that code path is actually exercised. That makes it well suited for flushing out failure paths that routine functional testing never reaches.
flowchart LR
A[Test harness] --> B[Control app / SDK wrapper]
B --> C[Application Verifier]
C --> D[Win32 API / native DLL / OS resources]
C --> E[verifier stop]
C --> F[debugger output]
C --> G[AppVerifier logs]
B --> H[Own structured log]
2.2. Where It Shines
It tends to be especially effective in situations like these.
- You call native DLLs or a camera SDK
- You cross P/Invoke or COM boundaries
- You use handles, heaps, locks, and virtual memory heavily, directly or indirectly
- The app rarely crashes on the normal path, but lifetime management looks fragile on the failure paths
- “Occasionally returns strange failures” shows up before “crashes”
Conversely, it is not a tool for tracing object graphs in the purely managed world. So even in a C# app, it pays off considerably if the native SDK or Win32 boundary is thick — but it is not a single tool for fully investigating pure managed heap leaks.
2.3. What You Gain
In practice, the benefits boil down to roughly these three.
- Stop native-boundary misuse early
  - invalid handles
  - heap corruption
  - lock misuse
  - virtual memory API misuse, etc.
- Front-load failure modes that only appear under low resources
  - malloc-equivalents occasionally fail
  - CreateEvent and CreateFile occasionally fail
  - VirtualAlloc fails
- Easier tracing when combined with a debugger
  - !avrf
  - !htrace
  - !heap -p -a
  - verifier stop logs
What hurts in equipment control apps is “not knowing what happened on the failure path.” Application Verifier is quite effective at reducing that “not knowing.”
3. What Application Verifier Can Do
3.1. Basics: Handles / Heaps / Locks / Memory / TLS, etc.
Application Verifier’s basic set is Basics.
The checks you use most in practice are gathered here.
| Layer | What it watches | How it applies in this context |
|---|---|---|
| Handles | Use of invalid handles | Whether you are stepping on closed / corrupted handles |
| Heaps | Heap corruption | Flushing out buffer corruption and use-after-free at the native SDK boundary |
| Leak | Resources not released at DLL unload | Tests of short-lived harnesses, and cases that include unloads |
| Locks / SRWLock | Lock misuse | Checking races between reconnect and shutdown |
| Memory | Misuse of VirtualAlloc / MapViewOfFile, etc. | Checking anomalies around large buffers and shared memory |
| TLS | Misuse of Thread Local Storage APIs | Insurance for native code with complex thread boundaries |
| Threadpool | Consistency of threadpool APIs and worker state | Backup when callbacks and async processing are abundant |
The point is to stop suspicious usage on the spot, rather than “read about it after the crash.” For long-running defects, this front-loading pays off considerably.
3.2. Low Resource Simulation: Front-Loading Memory and Resource Exhaustion
This is the genuinely convenient part in practice: you can trigger phenomena close to memory exhaustion and resource exhaustion without actually consuming the machine’s RAM.
The idea is simple.
- Take a certain API call
- With a certain probability
- Make it fail on purpose
This lets you exercise error paths that are practically never taken otherwise.
Concretely, it becomes easy to trigger phenomena like these on purpose.
- HeapAlloc and VirtualAlloc fail
- CreateFile fails
- CreateEvent fails
- MapViewOfFile fails
- OLE/COM allocations like SysAllocString fail
This is far more manageable than trying to genuinely exhaust memory and torturing the whole machine. What is more, you can target fault injection at specific DLLs only. For configurations like equipment control apps where your own wrappers mix with vendor SDKs, this is quite practical.
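To make the three-step idea concrete, here is a minimal Python sketch of probabilistic fault injection. It only illustrates the concept and is not how Application Verifier is implemented; `FaultInjector`, `allocate_buffer`, `run_scenario`, and the parts-per-million rate are hypothetical names and choices made for this example.

```python
import random


class FaultInjector:
    """Fails a wrapped call with a fixed probability (illustrative only)."""

    def __init__(self, rate_ppm: int, seed: int = 0) -> None:
        # rate_ppm: failures per million calls (assumed unit for this sketch).
        self.rate_ppm = rate_ppm
        self._rng = random.Random(seed)  # seeded so a run can be replayed
        self.injected = 0

    def call(self, func, *args, **kwargs):
        # Take a call, roll the dice, and fail on purpose if the roll says so.
        if self._rng.randrange(1_000_000) < self.rate_ppm:
            self.injected += 1
            raise MemoryError(f"injected failure in {func.__name__}")
        return func(*args, **kwargs)


def allocate_buffer(size: int) -> bytearray:
    """Stand-in for an allocation that almost never fails in normal testing."""
    return bytearray(size)


def run_scenario(injector: FaultInjector, attempts: int) -> dict:
    """Counts how often the normal path and the error path are exercised."""
    stats = {"ok": 0, "failed": 0}
    for _ in range(attempts):
        try:
            injector.call(allocate_buffer, 1024)
            stats["ok"] += 1
        except MemoryError:
            stats["failed"] += 1  # the error path finally gets exercised
    return stats
```

With a rate of zero the error path is never taken, which is roughly the situation in ordinary normal-path testing; raising the rate is what makes the `except` branch reachable at all.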
3.3. Page Heap and the Debugger
For heap corruption, the combination of Heaps and page heap is strong.
Full page heap in particular has the advantage of using guard pages to stop close to the moment of corruption.
However, it is quite heavy. Rather than long brute-force runs, it is more usable to narrow down to scenarios close to the repro and run them under the debugger.
So as an operating practice, a split like this is realistic.
- First apply Basics broadly
- Once the heap looks suspicious, use full page heap
- If it is too heavy, fall back to light page heap
- For production-like long-run testing, rely primarily on your own logs
Ultimately, AppVerifier is not a magic wand but a tool whose blade you swap per situation.
3.4. !avrf / !htrace / Logs
Application Verifier does not just raise a stop and walk away. With its debugger extensions and logs, what happened becomes easier to chase.
- !avrf
  - View the current verifier settings and the stop currently raised
- !htrace
  - View the stacks of a handle’s open / close / invalid references
- !heap -p -a
  - Combined with page heap, trace the corrupted heap block
- AppVerifier logs
  - Logs can be kept for when a stop occurs
It is especially welcome that enabling Handles automatically enables handle tracing.
This makes it much easier to trace, after the fact, “where this handle was opened and where it was closed.”
4. Why We Introduced It This Time
4.1. The Goal Is Not Just “Finding Bugs”
Our goal this time was not simply “find one bug with AppVerifier.” Put more practically, what we wanted to verify was the following.
- When a resource leak happens again on some other failure path in the future
- Will the logs properly retain the context?
- Can we chase it down to the end, together with debugger information?
- Will we avoid ending up in a “no idea what happened” state?
In other words, we used it not only as a detector, but as a test of our observation infrastructure.
4.2. Triggering Memory-Exhaustion-Like Phenomena
Genuinely causing memory exhaustion on a regular development machine is fairly tedious. Worse, once the whole machine becomes unstable, the test itself fills with noise.
So we used Low Resource Simulation to deliberately exercise the failure paths that memory or resource exhaustion would likely trigger.
This makes it much easier to answer questions like these.
- If CreateEvent fails, do cameraId and phase remain in the logs?
- After a half-finished initialization, does cleanup actually run?
- If VirtualAlloc fails, does the retry avoid corrupting state?
- If CreateFile fails on the save path, does the handle come back?
What we want to emphasize is that causing the anomaly is not the goal; the goal is that the failure mode is readable when the anomaly occurs.
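The question of whether cleanup runs after a half-finished initialization can be sketched as follows. This is a Python illustration under stated assumptions, not the code of the app discussed here: `FakeResource`, `open_camera`, and the `(phase, should_fail)` step list are hypothetical, and the failure is simulated rather than injected by Application Verifier.

```python
import contextlib
import logging

log = logging.getLogger("camera")


class FakeResource:
    """Hypothetical stand-in for an event, buffer, or file handle."""

    def __init__(self, name: str, released: list) -> None:
        self.name = name
        self._released = released

    def close(self) -> None:
        self._released.append(self.name)


def open_camera(camera_id: str, steps, released: list) -> list:
    """Acquires resources step by step; on failure, releases what was acquired.

    `steps` is a list of (phase, should_fail) pairs simulating injected faults.
    """
    with contextlib.ExitStack() as stack:
        acquired = []
        for phase, should_fail in steps:
            try:
                if should_fail:
                    raise OSError(f"simulated failure in {phase}")
                res = FakeResource(phase, released)
                stack.callback(res.close)  # registered right after acquiring
                acquired.append(res)
            except OSError as exc:
                # Keep the context (cameraId, phase) next to the error itself.
                log.error("init failed cameraId=%s phase=%s error=%s",
                          camera_id, phase, exc)
                raise  # ExitStack unwinds in reverse order on the way out
        stack.pop_all()  # success: the caller now owns the resources
        return acquired
```

The design point is that each release is registered at the moment of acquisition, so a failure at any later phase leaves nothing half-initialized behind, and the log line carries the phase in which it failed.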
4.3. Verifying We Can Trace Handle Anomalies When They Occur
As with the handle leak in Part 1, with handles the place that finally crashes and the true cause easily drift apart.
So what we wanted to confirm was this.
- When an invalid handle stop is raised, can we trace the open / close with !htrace?
- Does it tie back to the resourceId / sessionId / phase in our own logs?
- Does the handle count come back down after the failure?
- When the harness is a short-lived process, are the leak deltas easy to read?
Once you can see this far, you can move from merely knowing that “a bug appeared” to knowing which component’s lifetime management broke down.
5. How to Trigger Memory- and Resource-Exhaustion-Like Phenomena
5.1. The Idea Behind Low Resource Simulation
Low Resource Simulation is, in plain terms, fault injection. Rather than faithfully recreating a low-resource environment, the idea is to artificially mix in the representative API failures that occur under low resources.
So its use cases are quite clear-cut.
- Verifying cleanup on failure paths
- Verifying the robustness of retry / reconnect
- Verifying initialization where partial successes and partial failures mix
- Verifying that logs remain even for “failures that normally never happen”
The trick here is to not fail everything from the start. If you turn everything on at once, the logs explode and you lose track of “what you are even looking at.”
5.2. What You Can Make Fail
With Low Resource Simulation, you can probabilistically fail the following representative classes of APIs.
| Class | Examples | Examples in an equipment control app |
|---|---|---|
| Heap_Alloc | Heap allocation | Temporary buffers, image metadata, SDK-wrapper internal allocations |
| Virtual_Alloc | Virtual memory allocation | Larger frame buffers, ring buffers |
| File | CreateFile, etc. | Opens of save paths and log files |
| Event | CreateEvent, etc. | Frame-ready notification, stop/reconnect synchronization |
| MapView | MapViewOfFile, etc. | Shared memory and memory-mapped files |
| Ole_Alloc | SysAllocString, etc. | COM / OLE boundary |
| Wait | WaitForXXX family | Around synchronization wait failures |
| Registry | Registry access | Reading/writing settings and driver-adjacent configuration |
In practice, rather than opening everything at once, the key is to start narrow, with the classes closest to the failure path you want to look at this time.
5.3. How to Apply It in Practice
As a command-line sketch, it looks like this, for example.
appverif /verify CameraHarness.exe
appverif /verify CameraHarness.exe /faults
appverif -enable lowres -for CameraHarness.exe -with heap_alloc=20000 virtual_alloc=20000 file=20000 event=20000
appverif -query lowres -for CameraHarness.exe
The approach goes like this.
- First run the normal path with Basics alone
- Then add Low Resource Simulation and run with fault injection
- If needed, assign probabilities only to the failures you want to see, such as file or event
- If you want to target a specific DLL, scope the injection to that DLL
The /faults shortcut is convenient, but on its own it is centered on OLE_ALLOC and HEAP_ALLOC.
If you want to look at the failure paths of CreateFile or CreateEvent, it is more reliable to spell out -enable lowres -with file=... event=....
In equipment control apps, it is often easier to read results when you scope to the camera wrapper or the save-path DLL, rather than scattering faults across the whole app.
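If the runs are scripted, the command forms shown earlier in this section can be assembled from a small helper. The following Python sketch only builds argument lists and uses nothing beyond the options that already appear in this article; the function names are hypothetical, and the values are passed through as given.

```python
def verify_command(target: str, faults: bool = False) -> list:
    """Builds `appverif /verify TARGET`, optionally with the /faults shortcut."""
    cmd = ["appverif", "/verify", target]
    if faults:
        cmd.append("/faults")
    return cmd


def lowres_command(target: str, rates: dict) -> list:
    """Builds `appverif -enable lowres -for TARGET -with class=value ...`."""
    if not rates:
        raise ValueError("name at least one fault class, e.g. file or event")
    cmd = ["appverif", "-enable", "lowres", "-for", target, "-with"]
    # Only the classes named in `rates` are spelled out, in insertion order.
    cmd += [f"{fault_class}={value}" for fault_class, value in rates.items()]
    return cmd


def query_command(target: str) -> list:
    """Builds `appverif -query lowres -for TARGET` to confirm the settings."""
    return ["appverif", "-query", "lowres", "-for", target]
```

For example, `lowres_command("CameraHarness.exe", {"file": 20000, "event": 20000})` yields only the file and event classes, which keeps a run focused on the save path and reconnect synchronization. The resulting list can be handed to `subprocess.run` on the test machine.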
For example, you can build scenarios like these.
- CreateEvent failure right after a reconnect starts
- CreateFile failure at the start of saving
- Temporary buffer allocation failure
- SysAllocString failure during COM conversion
- Verifying the failure paths of the wait APIs
These are practically never reached by routine normal-path testing alone. That is exactly why deliberately stepping on them is worth it.
6. How to Look at Handle Anomalies
6.1. The Handles Check
For everything handle-related, start with Handles.
This makes the use of invalid handles easier to detect.
The accidents it typically catches are these.
- Using a handle again after it was closed
- Passing a corrupted handle value
- Using a handle left uninitialized by a partial failure
- A broken lifetime leading to access from another thread
Where long-run operation would only show “an odd error appears occasionally,” under the verifier it can stop right on the spot. This front-loading helps a great deal.
6.2. Viewing Open / Close Stacks with !htrace
What makes Handles so welcome is that it pairs well with handle tracing.
windbg -xd av -xd ch -xd sov CameraHarness.exe
!avrf
!htrace 0x00000ABC
What you want to see with !htrace is roughly this.
- Where that handle was opened
- Where it was closed
- Whether it was referenced as an invalid handle
- Whether opens are piling up more than expected
What makes handle leaks and handle misuse troublesome is that the API that finally fell over is usually not where the true cause lies.
With !htrace, you can trace that handle’s history quite concretely.
6.3. How to Combine It with Your Own Logs
That said, Application Verifier alone is not enough. In particular, doing the leak investigation of a long-running resident EXE with it alone is quite painful.
So in practice we combine the following.
- Periodic Handle Count
- sessionId
- resourceId
- phase
- Lifecycle logs of create/open and close/dispose
- Dumps and debugger output at verifier stops
With this, you can chase the problem like so, for example.
- The heartbeat shows the slope of Handle Count is suspicious
- The lifecycle logs narrow down the resource that has a Create but no Close
- A verifier run surfaces the invalid handle or misuse ahead of time
- !htrace shows the open / close stacks
This combination makes things dramatically easier to chase.
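The first two steps of that chase, spotting a suspicious Handle Count slope and narrowing down a Create with no Close, can be sketched in Python as below. The record layout (`resourceId`, `action`) and both function names are assumptions made for this example, not the log format of the actual app.

```python
def unclosed_resources(events):
    """Returns resourceIds that have a create/open event but no close/dispose.

    `events` is an iterable of dicts with `resourceId` and `action` keys
    (hypothetical field layout for this sketch).
    """
    open_ids = {}
    for ev in events:
        rid = ev["resourceId"]
        if ev["action"] in ("create", "open"):
            open_ids[rid] = ev           # remember the opening record
        elif ev["action"] in ("close", "dispose"):
            open_ids.pop(rid, None)      # a close with no open is ignored here
    return sorted(open_ids)


def handle_count_slope(samples):
    """Least-squares slope of (seconds, handle_count) samples, in handles/second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    num = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one timestamp")
    return num / den
```

A slope that stays positive across heartbeats says that something is leaking; the unclosed list says which resource to look up next in the lifecycle logs and, from there, in !htrace.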
7. How to Build a Failure-Path Test Foundation
7.1. Move the Execution Unit into a Harness
Application Verifier cannot be enabled retroactively on an already-running process. You configure first, then launch.
Moreover, the settings persist until you explicitly remove them. So in practice, it is easier to handle if you target a test harness EXE rather than the production app itself.
For example, a configuration like this.
flowchart LR
A[Scenario Runner] --> B[CameraHarness.exe]
B --> C[CameraSdkWrapper.dll]
C --> D[Vendor SDK]
B --> E[Structured Log]
B --> F[Dump / Debugger]
With this, you get the advantages of:
- Running one scenario per process
- Leak deltas being easy to read
- Easy toggling of the AppVerifier settings ON/OFF
- Being able to test DLLs through the EXE side
The commands look like this.
appverif /verify CameraHarness.exe
appverif /n CameraHarness.exe
Enable before launch; disable explicitly. Running this with a harness as the premise also helps prevent configuration accidents.
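Because the settings persist, a scenario runner can wrap the enable and disable commands around each harness launch. The following Python sketch assumes a Windows test machine with `appverif` on the PATH; `run_verified_scenario` and the harness arguments are hypothetical, and the `run` parameter exists only so the flow can be exercised without Application Verifier installed.

```python
import subprocess


def run_verified_scenario(harness: str, scenario_args: list,
                          run=subprocess.run) -> int:
    """Enables the verifier, runs one scenario in a fresh process, then disables.

    Returns the harness exit code. `run` defaults to subprocess.run.
    """
    run(["appverif", "/verify", harness], check=True)   # configure first
    try:
        # One scenario per process keeps leak deltas easy to read.
        result = run([harness, *scenario_args], check=False)
        return result.returncode
    finally:
        # Settings persist until removed, so remove them even on failure.
        run(["appverif", "/n", harness], check=True)
```

Putting the disable step in `finally` is what guards against the configuration accident of leaving the verifier enabled after a harness run that failed to start or was interrupted.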
7.2. Split the Test Menu
In a failure-path test foundation, it is better not to do everything in one run. Splitting into roughly these three tracks keeps things readable.
- Normal path + Basics
  - Inject no failures
  - Confirm that no verifier stops occur
- Fault injection track
  - Low Resource Simulation
  - Target failures at event / file / heap_alloc / virtual_alloc, etc.
- Heap deep-dive track
  - Heaps
  - full page heap
  - Reproduce locally under the debugger
Splitting these keeps “is it broken under normal usage” and “does it only break under low resources” from getting tangled.
The presence or absence of fault injection in particular changes the code paths taken considerably. So you should run both the no-fault run and the with-fault run.
7.3. What to Collect
At minimum, you want to capture these.
| Category | What you want |
|---|---|
| App logs | cameraId, sessionId, phase, handleCount, error code |
| Process state | Handle Count, Private Bytes, Thread Count |
| Debugger info | !avrf, !htrace, and !heap -p -a as needed |
| Dumps | At verifier stops, or on abnormal termination |
| AppVerifier logs | Records of stops, exported to XML for aggregation if needed |
If needed, the AppVerifier-side logs can also be exported to XML and aggregated. But the cause rarely closes from those alone, so the practical premise is reading them side by side with your own logs.
A large volume of logs is not, in itself, a virtue. What matters is that the causality can be connected later.
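One way to keep causality connectable is to write every app log record with the same few fields. The sketch below shows one possible shape in Python, using the field names from the App logs row of the table; the `LifecycleEvent` class, the JSON-lines format, and the `timestamp` field are assumptions for illustration.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LifecycleEvent:
    """One structured log record (field names follow the App logs row)."""
    cameraId: str
    sessionId: str
    phase: str
    handleCount: int
    errorCode: Optional[int] = None   # None when the step succeeded
    timestamp: float = 0.0            # 0.0 means "fill in at write time"


def to_log_line(event: LifecycleEvent) -> str:
    """Serializes to one JSON object per line so runs can be joined later."""
    record = asdict(event)
    if not record["timestamp"]:
        record["timestamp"] = time.time()
    return json.dumps(record, sort_keys=True)
```

Because every line carries sessionId and phase, a verifier stop or a dump taken at a known time can be matched to the surrounding records without guessing.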
7.4. Acceptance Criteria
“It didn’t crash” is too weak as an acceptance criterion. In this context, we needed at least the following.
- No verifier stops in the normal path + Basics run
- Even with fault injection, the expected failures remain in the logs
- Half-initialized resources get cleaned up properly
- After reconnect / retry, Handle Count returns near the baseline
- When a verifier stop occurs, it can be traced via sessionId / phase / stack
- No failure ends up as “no idea what happened”
What matters here is to evaluate “not breaking” and “being traceable when broken” as two separate things.
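A per-run check along these lines can report the two verdicts separately. This Python sketch is one possible encoding of the criteria above, not the actual test foundation; `evaluate_run`, the failure labels, and the default tolerance of 5 handles are assumptions to be tuned per application.

```python
def evaluate_run(baseline_handles: int, final_handles: int,
                 expected_failures: set, logged_failures: set,
                 verifier_stops: int, tolerance: int = 5) -> dict:
    """Scores one run, keeping "not broken" and "traceable" as separate results.

    `tolerance` (in handles) is an assumed threshold for "near the baseline".
    """
    # Only growth counts against the run; ending below the baseline is fine.
    handles_ok = (final_handles - baseline_handles) <= tolerance
    # Every failure we injected on purpose must be visible in our own logs.
    missing = sorted(expected_failures - logged_failures)
    return {
        "not_broken": verifier_stops == 0 and handles_ok,
        "traceable": not missing,
        "missing_from_logs": missing,
    }
```

A run can pass one verdict and fail the other: a fault injection run that stays stable but logs nothing about the injected CreateEvent failure is "not broken" yet not "traceable", and that is exactly the case this foundation is meant to catch.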
7.5. Caveats
Application Verifier is quite convenient, but it is not magic.
- Code paths not actually exercised are not verified
- Full page heap is heavy
- Stops can also occur inside third-party SDKs
- The code paths taken differ considerably with and without fault injection
- It is not a single tool for investigating pure managed heap leaks
So its position is this.
- Long-run slopes: your own logs and counters
- Native-boundary misuse: Application Verifier
- Reconstructing causality on failure: structured logs + dumps + debugger
This division of labor is the most practical.
8. A Rough Decision Guide
- Invalid handles or double closes are suspected
  - Handles + !htrace
- Heap corruption / use-after-free is suspected
  - Heaps + full page heap + !heap -p -a
- You want to trigger memory- or resource-exhaustion-like phenomena
  - Low Resource Simulation
- Things break gradually under long-running operation
  - Start with your own Handle Count / Private Bytes / lifecycle logs
- You want to test a DLL
  - Enable Application Verifier on the harness EXE that calls that DLL
Turning everything on from the start usually just produces a fog of logs. Applying the blade closest to the failure path you want to see is far clearer.
9. Summary
Application Verifier’s position is that of a runtime verifier for Windows’ native / Win32 boundary. Using Handles / Heaps / Locks / Memory / TLS / Low Resource Simulation and the rest, you can force rarely seen failure paths to be exercised ahead of time.
What paid off in this context was that handle anomalies became easy to trace with !htrace when they occurred, that memory- and resource-exhaustion-like phenomena could be triggered without wrecking the whole machine, and that we could confirm whether our own logs would genuinely be useful at that moment.
As for how to run it in practice: split the normal path + Basics run from the fault injection runs, prepare a harness EXE, and cycle scenarios through short-lived processes. On top of that, combine it with your own logs, dumps, and debugger information, while watching the slope of long-run leaks itself with your own counters — that is the division of labor.
Application Verifier is a tool for “going out to meet” rare anomalies, rather than “waiting around” for them to happen.
In equipment control apps, not breaking matters, but being able to explain what happened when things break matters just as much. In that sense, we think it is a thoroughly practical tool.
10. References
- Part 1: When an Industrial Camera Control App Suddenly Crashes After One Month (Part 1) - Finding Handle Leaks and Designing Logs for Long-Running Operation
- Application Verifier - Overview
- Application Verifier - Testing Applications
- Application Verifier - Tests within Application Verifier
- Application Verifier - Debugging Application Verifier Stops
- Application Verifier - Features
- !htrace (WinDbg)
- GetProcessHandleCount function (processthreadsapi.h)