Building a Windows Failure-Path Test Foundation with Application Verifier

Application Verifier is a powerful tool for surfacing, ahead of time, anomalies in Windows native code and at the Win32 boundary. When you want to test handle anomalies, heap corruption, and low-resource failure paths, it can quickly expose problems that normal-path testing alone rarely reveals.

Part 1, “When an Industrial Camera Control App Suddenly Crashes After One Month (Part 1) - Finding Handle Leaks and Designing Logs for Long-Running Operation,” covered a case in which investigating a control app that crashed after long-running operation revealed a handle leak as the cause. But strengthening the logs is only half the job. What you really want is to test, in advance, whether you can still tell what happened if a future programming mistake causes a memory leak, a handle leak, a partial failure, or a missed release.

That is where we used Application Verifier. It lets you add runtime checks and fault injection to code running in Windows native code and at the Win32 boundary. What is especially convenient in practice is that you can trigger memory-exhaustion-like and resource-exhaustion-like failure modes ahead of time, without actually consuming the machine’s memory.

In this second part, we organize what Application Verifier is, what it can do, and how to build it into a failure-path test foundation, in the context of an industrial camera control app.

1. The Conclusion First (In One Line)
2. What Is Application Verifier?
- 2.1. In One Sentence
- 2.2. Where It Shines
- 2.3. What You Gain
3. What Application Verifier Can Do
- 3.1. Basics: Handles / Heaps / Locks / Memory / TLS, etc.
- 3.2. Low Resource Simulation: Front-Loading Memory and Resource Exhaustion
- 3.3. Page Heap and the Debugger
- 3.4. !avrf / !htrace / Logs
4. Why We Introduced It This Time
- 4.1. The Goal Is Not Just “Finding Bugs”
- 4.2. Triggering Memory-Exhaustion-Like Phenomena
- 4.3. Verifying We Can Trace Handle Anomalies When They Occur
5. How to Trigger Memory- and Resource-Exhaustion-Like Phenomena
- 5.1. The Idea Behind Low Resource Simulation
- 5.2. What You Can Make Fail
- 5.3. How to Apply It in Practice
6. How to Look at Handle Anomalies
- 6.1. The Handles Check
- 6.2. Viewing Open / Close Stacks with !htrace
- 6.3. How to Combine It with Your Own Logs
7. How to Build a Failure-Path Test Foundation
- 7.1. Move the Execution Unit into a Harness
- 7.2. Split the Test Menu
- 7.3. What to Collect
- 7.4. Acceptance Criteria
- 7.5. Caveats
8. A Rough Decision Guide
9. Summary

1. The Conclusion First (In One Line)

Application Verifier is a tool that makes misuse at Windows’ unmanaged / native boundary easier to catch at runtime
Its value is not only “finding bugs,” but forcing rarely seen failure paths to occur ahead of time
Handles detects invalid handles, Heaps exposes heap corruption, and Low Resource Simulation performs fault injection of memory-exhaustion-like and resource-exhaustion-like situations
Delegating the leak investigation of a long-running resident EXE entirely to Application Verifier is a bad approach; combining it with your own Handle Count and resource-lifecycle logs is the realistic path
In a failure-path test foundation, it is easier to read results if you run a normal-path verifier run and a fault injection run separately
Even when you want to test a DLL, what you enable Application Verifier on is the test EXE that actually exercises that DLL

In short, Application Verifier is a tool for dragging the “nasty bugs” living around Windows’ native / Win32 boundary out into the open. It is an especially good fit in worlds like equipment control apps, where native SDKs, P/Invoke, and Win32 APIs routinely mix.

2. What Is Application Verifier?

2.1. In One Sentence

Application Verifier is a runtime verification tool for Windows user-mode applications. It monitors how a running app uses OS APIs and handles resources, detecting suspicious usage and letting you deliberately inject failures.

Unlike static analysis or unit testing, it shows how things break when a code path is actually exercised. That makes it well suited for flushing out failure paths that routine functional testing never reaches.

flowchart LR
    A[Test harness] --> B[Control app / SDK wrapper]
    B --> C[Application Verifier]
    C --> D[Win32 API / native DLL / OS resources]
    C --> E[verifier stop]
    C --> F[debugger output]
    C --> G[AppVerifier logs]
    B --> H[Own structured log]

2.2. Where It Shines

It tends to be especially effective in situations like these.

You call native DLLs or a camera SDK
You cross P/Invoke or COM boundaries
You use handles, heaps, locks, and virtual memory heavily, directly or indirectly
The app rarely crashes on the normal path, but lifetime management looks fragile on the failure paths
Strange intermittent failures show up before outright crashes

Conversely, it is not a tool for tracing object graphs in the purely managed world. Even in a C# app it pays off considerably when the native SDK or Win32 boundary is thick, but it cannot, on its own, fully investigate pure managed heap leaks.

2.3. What You Gain

In practice, the benefits boil down to roughly these three.

Stop native-boundary misuse early
- invalid handles
- heap corruption
- lock misuse
- virtual memory API misuse, etc.
Front-load failure modes that only appear under low resources
- malloc-equivalents occasionally fail
- CreateEvent and CreateFile occasionally fail
- VirtualAlloc fails
Easier tracing when combined with a debugger
- !avrf
- !htrace
- !heap -p -a
- verifier stop logs

What hurts in equipment control apps is “not knowing what happened on the failure path.” Application Verifier is quite effective at reducing that “not knowing.”

3. What Application Verifier Can Do

3.1. Basics: Handles / Heaps / Locks / Memory / TLS, etc.

Application Verifier’s basic set is Basics. The checks you use most in practice are gathered here.

Layer	What it watches	How it applies in this context
`Handles`	Use of invalid handles	Whether you are stepping on closed / corrupted handles
`Heaps`	Heap corruption	Flushing out buffer corruption and use-after-free at the native SDK boundary
`Leak`	Resources not released at DLL unload	Tests of short-lived harnesses, and cases that include unloads
`Locks` / `SRWLock`	Lock misuse	Checking races between reconnect and shutdown
`Memory`	Misuse of `VirtualAlloc` / `MapViewOfFile`, etc.	Checking anomalies around large buffers and shared memory
`TLS`	Misuse of Thread Local Storage APIs	Insurance for native code with complex thread boundaries
`Threadpool`	Consistency of threadpool APIs and worker state	Backup when callbacks and async processing are abundant

The point is to stop suspicious usage on the spot, rather than “read about it after the crash.” For long-running defects, this front-loading pays off considerably.

3.2. Low Resource Simulation: Front-Loading Memory and Resource Exhaustion

This is the genuinely convenient part in practice, because you can trigger conditions close to memory and resource exhaustion without actually consuming the machine’s RAM.

The idea is simple.

Take a certain API call and, with a certain probability, make it fail on purpose.

This lets you exercise error paths that are practically never taken otherwise.

Concretely, it becomes easy to trigger phenomena like these on purpose.

HeapAlloc and VirtualAlloc fail
CreateFile fails
CreateEvent fails
MapViewOfFile fails
OLE/COM allocations like SysAllocString fail

This is far more manageable than trying to genuinely exhaust memory and torturing the whole machine. What is more, you can target fault injection at specific DLLs only. For configurations like equipment control apps where your own wrappers mix with vendor SDKs, this is quite practical.
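The idea of failing a call with a given probability can be sketched in a few lines of Python. This only illustrates the concept and is not how Application Verifier is implemented: `FaultInjector`, `InjectedFault`, and `allocate_frame_buffer` are hypothetical names, and expressing the probability in parts per million is an assumption of this sketch.

```python
import random


class InjectedFault(Exception):
    """Raised in place of a real resource failure (hypothetical type)."""


class FaultInjector:
    """Fails calls of a given class with a configured probability.

    Probabilities are parts per million (an assumption of this sketch).
    """

    def __init__(self, ppm_by_class, seed=None):
        self._ppm = dict(ppm_by_class)
        self._rng = random.Random(seed)  # seeded so a run can be replayed

    def check(self, api_class):
        """Raise InjectedFault if this call is chosen to fail."""
        ppm = self._ppm.get(api_class, 0)
        if self._rng.randrange(1_000_000) < ppm:
            raise InjectedFault(api_class)


def allocate_frame_buffer(injector, size):
    """Hypothetical allocation site guarded by the injector."""
    injector.check("virtual_alloc")
    return bytearray(size)
```

Classes that are not configured never fail, which mirrors the advice to start narrow and open only the failure classes you want to observe.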

3.3. Page Heap and the Debugger

For heap corruption, the combination of Heaps and page heap is strong. Full page heap in particular has the advantage of using guard pages to stop close to the moment of corruption.

However, it is quite heavy. Rather than long brute-force runs, it is more usable to narrow down to scenarios close to the repro and run them under the debugger.

So as an operating practice, a split like this is realistic.

First apply Basics broadly
Once the heap looks suspicious, use full page heap
If it is too heavy, fall back to light page heap
For production-like long-run testing, rely primarily on your own logs

Ultimately, AppVerifier is not a magic wand but a tool whose blade you swap per situation.

3.4. `!avrf` / `!htrace` / Logs

Application Verifier does not just raise a stop and walk away. With its debugger extensions and logs, what happened becomes easier to chase.

!avrf
- View the current verifier settings and the stop currently raised
!htrace
- View the stacks of a handle’s open / close / invalid references
!heap -p -a
- Combined with page heap, trace the corrupted heap block
AppVerifier logs
- A record is kept in the log when a stop occurs

It is especially welcome that enabling Handles automatically enables handle tracing. This makes it much easier to trace, after the fact, “where this handle was opened and where it was closed.”

4. Why We Introduced It This Time

4.1. The Goal Is Not Just “Finding Bugs”

Our goal this time was not simply “find one bug with AppVerifier.” Put more practically, what we wanted to verify was the following.

When a resource leak happens again on some other failure path in the future:
- Will the logs properly retain the context?
- Can we chase it down to the end, together with debugger information?
- Will we avoid ending up in a “no idea what happened” state?

In other words, we used it not only as a detector, but as a test of our observation infrastructure.

4.2. Triggering Memory-Exhaustion-Like Phenomena

Genuinely causing memory exhaustion on a regular development machine is fairly tedious. Worse, once the whole machine becomes unstable, the test itself fills with noise.

So we used Low Resource Simulation to deliberately exercise the failure paths that memory or resource exhaustion would likely trigger.

This makes it much easier to answer questions like these.

If CreateEvent fails, do cameraId and phase remain in the logs?
After a half-finished initialization, does cleanup actually run?
If VirtualAlloc fails, does the retry avoid corrupting state?
If CreateFile fails on the save path, does the handle come back?

What we want to emphasize is that causing the anomaly is not the goal; the goal is that the failure mode is readable when the anomaly occurs.
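The questions above about half-finished initialization can be made concrete with a small sketch. It assumes each initialization step returns an object with `close()`; `FakeHandle`, `open_camera`, and the log field `event` are hypothetical names chosen for illustration, while `cameraId` and `phase` follow the log fields mentioned in the text.

```python
import contextlib
import json


class FakeHandle:
    """Stand-in for an OS resource; records whether it was released."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


def open_camera(camera_id, steps, log):
    """Run init steps in order; on any failure, release what was acquired.

    `steps` is a list of (phase, factory) pairs; each factory returns an
    object with close(). Returns (handles, release) on success.
    """
    with contextlib.ExitStack() as stack:
        handles = []
        for phase, factory in steps:
            try:
                handle = factory()
            except OSError as exc:
                # Record the context before the stack unwinds.
                log.append(json.dumps(
                    {"cameraId": camera_id, "phase": phase,
                     "event": "init_failed", "error": str(exc)}))
                raise
            stack.callback(handle.close)
            handles.append(handle)
        # Success: hand cleanup responsibility to the caller.
        release = stack.pop_all().close
        return handles, release
```

If the second step fails, the first handle is closed as the stack unwinds, and the log line still carries the camera and the phase that failed. That is the readable failure mode the fault injection run is meant to confirm.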

4.3. Verifying We Can Trace Handle Anomalies When They Occur

As with the handle leak in Part 1, with handles the place that finally crashes is often far from the true cause.

So what we wanted to confirm was this.

When an invalid handle stop is raised, can we trace the open / close with !htrace?
Does it tie back to the resourceId / sessionId / phase in our own logs?
Does the handle count come back down after the failure?
When the harness is a short-lived process, are the leak deltas easy to read?

Once you can see this far, you can move from “a bug appeared” to “which component’s lifetime management broke down.”

5. How to Trigger Memory- and Resource-Exhaustion-Like Phenomena

5.1. The Idea Behind Low Resource Simulation

Low Resource Simulation is, in plain terms, fault injection. Rather than faithfully recreating a low-resource environment, the idea is to artificially mix in the representative API failures that occur under low resources.

So its use cases are quite clear-cut.

Verifying cleanup on failure paths
Verifying the robustness of retry / reconnect
Verifying initialization where partial successes and partial failures mix
Verifying that logs remain even for “failures that normally never happen”

The trick here is to not fail everything from the start. If you turn everything on at once, the logs explode and you lose track of “what you are even looking at.”

5.2. What You Can Make Fail

With Low Resource Simulation, you can probabilistically fail the following representative classes of APIs.

Class	Examples	Examples in an equipment control app
`Heap_Alloc`	Heap allocation	Temporary buffers, image metadata, SDK-wrapper internal allocations
`Virtual_Alloc`	Virtual memory allocation	Larger frame buffers, ring buffers
`File`	`CreateFile`, etc.	Opens of save paths and log files
`Event`	`CreateEvent`, etc.	Frame-ready notification, stop/reconnect synchronization
`MapView`	`MapViewOfFile`, etc.	Shared memory and memory-mapped files
`Ole_Alloc`	`SysAllocString`, etc.	COM / OLE boundary
`Wait`	`WaitForXXX` family	Around synchronization wait failures
`Registry`	Registry access	Reading/writing settings and driver-adjacent configuration

In practice, rather than opening everything at once, the key is to start narrow, with the classes closest to the failure path you want to look at this time.

5.3. How to Apply It in Practice

As a command-line sketch, it looks like this, for example.

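REM Basics only
appverif /verify CameraHarness.exe
REM Basics plus the /faults shortcut
appverif /verify CameraHarness.exe /faults
REM Low Resource Simulation with explicit per-class probabilities
appverif -enable lowres -for CameraHarness.exe -with heap_alloc=20000 virtual_alloc=20000 file=20000 event=20000
REM Query the current lowres settings
appverif -query lowres -for CameraHarness.exe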

The approach goes like this.

First run the normal path with Basics alone
Then add Low Resource Simulation and run with fault injection
If needed, assign probabilities only to the failures you want to see, such as file or event
If you want to target a specific DLL, scope the injection to that DLL

The /faults shortcut is convenient, but on its own it is centered on OLE_ALLOC and HEAP_ALLOC. If you want to look at the failure paths of CreateFile or CreateEvent, it is more reliable to spell out -enable lowres -with file=... event=....

In equipment control apps, it is often easier to read results when you scope to the camera wrapper or the save-path DLL, rather than scattering faults across the whole app.
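When a scenario runner has to switch between narrow fault sets, it can help to build the `appverif -enable lowres` line from data rather than by hand. The following is a minimal sketch: `lowres_command` and `SAVE_PATH_RUN` are hypothetical names, and it uses only the options that already appear in the command-line sketch in this section.

```python
def lowres_command(target, probabilities):
    """Build an `appverif -enable lowres` command line as a list of arguments.

    `probabilities` maps a fault class (e.g. "file") to its value.
    Only options that appear in this section's commands are used.
    """
    if not probabilities:
        raise ValueError("at least one fault class is required")
    args = ["appverif", "-enable", "lowres", "-for", target, "-with"]
    args += [f"{name}={value}" for name, value in probabilities.items()]
    return args


# Narrow run: only the save path and event creation are made to fail.
SAVE_PATH_RUN = lowres_command("CameraHarness.exe", {"file": 20000, "event": 20000})
```

Keeping each fault set as a small dictionary makes it easy to record, next to the logs, exactly which classes were open in a given run.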

For example, you can build scenarios like these.

CreateEvent failure right after a reconnect starts
CreateFile failure at the start of saving
Temporary buffer allocation failure
SysAllocString failure during COM conversion
Verifying the failure paths of the wait APIs

These are practically never reached by routine normal-path testing alone. That is exactly why deliberately stepping on them is worth it.

6. How to Look at Handle Anomalies

6.1. The `Handles` Check

For everything handle-related, start with Handles. This makes the use of invalid handles easier to detect.

The accidents it typically catches are these.

Using a handle again after it was closed
Passing a corrupted handle value
Using a handle left uninitialized by a partial failure
Another thread accessing a handle whose lifetime has already ended

Where long-run operation would only show “an odd error appears occasionally,” under the verifier it can stop right on the spot. This front-loading helps a great deal.

6.2. Viewing Open / Close Stacks with `!htrace`

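What makes Handles so welcome is that it pairs well with handle tracing. In the sketch below, the first line launches the harness under WinDbg; the other two are entered at the debugger prompt, and the handle value is a placeholder.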

windbg -xd av -xd ch -xd sov CameraHarness.exe
!avrf
!htrace 0x00000ABC

What you want to see with !htrace is roughly this.

Where that handle was opened
Where it was closed
Whether it was referenced as an invalid handle
Whether opens are piling up more than expected

What makes handle leaks and handle misuse troublesome is that the API that finally fell over is often not the true cause. With !htrace, you can trace that handle’s history quite concretely.

6.3. How to Combine It with Your Own Logs

That said, Application Verifier alone is not enough. In particular, doing the leak investigation of a long-running resident EXE with it alone is quite painful.

So in practice we combine the following.

Periodic Handle Count
sessionId
resourceId
phase
Lifecycle logs of create/open and close/dispose
Dumps and debugger output at verifier stops

With this, you can chase the problem like so, for example.

The heartbeat shows the slope of Handle Count is suspicious
The lifecycle logs narrow down the resource that has a Create but no Close
A verifier run surfaces the invalid handle or misuse ahead of time
!htrace shows the open / close stacks

This combination makes things dramatically easier to chase.
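The first two steps of that chase can be sketched in Python. This is a minimal illustration under stated assumptions: heartbeat samples are `(seconds, handle_count)` pairs, lifecycle records are dictionaries with `resourceId` and an `event` field whose values `"create"` and `"close"` are hypothetical, and both function names are invented for this example.

```python
def handle_count_slope(samples):
    """Least-squares slope of handle count over time.

    `samples` is a list of (seconds, handle_count) heartbeat readings.
    Returns handles per second.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    return cov / var_t


def unclosed_resources(events):
    """Return resourceIds that were created but never closed.

    `events` is an iterable of dicts with "resourceId" and "event"
    ("create" or "close"), in log order.
    """
    open_ids = {}
    for record in events:
        rid = record["resourceId"]
        if record["event"] == "create":
            open_ids[rid] = record
        elif record["event"] == "close":
            open_ids.pop(rid, None)
    return sorted(open_ids)
```

A persistently positive slope says that something leaks; the unmatched `resourceId` values say where to point the verifier run and `!htrace` next.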

7. How to Build a Failure-Path Test Foundation

7.1. Move the Execution Unit into a Harness

Application Verifier cannot be enabled retroactively on an already-running process. You configure first, then launch.

Moreover, the settings persist until you explicitly remove them. So in practice, it is easier to handle if you target a test harness EXE rather than the production app itself.

For example, a configuration like this.

flowchart LR
    A[Scenario Runner] --> B[CameraHarness.exe]
    B --> C[CameraSdkWrapper.dll]
    C --> D[Vendor SDK]
    B --> E[Structured Log]
    B --> F[Dump / Debugger]

With this, you get the advantages of:

Running one scenario per process
Leak deltas being easy to read
Easy toggling of the AppVerifier settings ON/OFF
Being able to test DLLs through the EXE side

The commands look like this.

appverif /verify CameraHarness.exe
appverif /n CameraHarness.exe

Enable before launch; disable explicitly. Building the workflow around a harness also helps prevent configuration accidents, since the settings never need to touch the production EXE.
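The enable, run, disable sequence is easy to get wrong by hand, so a scenario runner might wrap it as follows. This is a sketch, not a finished tool: `run_scenario` and the scenario arguments are hypothetical, and the `run` parameter is injectable so the sequence can be checked on a machine without `appverif`.

```python
import subprocess

HARNESS = "CameraHarness.exe"


def run_scenario(scenario_args, run=subprocess.run):
    """Enable the verifier, run one scenario in a fresh process, then disable.

    `run` is injectable so the sequence can be exercised without appverif.
    Returns the harness exit code.
    """
    run(["appverif", "/verify", HARNESS], check=True)
    try:
        # One scenario per process keeps leak deltas easy to read.
        result = run([HARNESS, *scenario_args], check=False)
        return result.returncode
    finally:
        # Settings persist until removed, so always disable explicitly.
        run(["appverif", "/n", HARNESS], check=True)
```

The `finally` clause is the point of the sketch: even if the harness fails to launch, the verifier settings are removed before the runner moves on.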

7.2. Split the Test Menu

In a failure-path test foundation, it is better not to do everything in one run. Splitting into roughly these three tracks keeps things readable.

Normal path + Basics
- Inject no failures
- Confirm that no verifier stops occur
Fault injection track
- Low Resource Simulation
- Target failures at event / file / heap_alloc / virtual_alloc, etc.
Heap deep-dive track
- Heaps
- full page heap
- Reproduce locally under the debugger

Splitting these keeps “is it broken under normal usage” and “does it only break under low resources” from getting tangled.

The presence or absence of fault injection in particular changes the code paths taken considerably. So you should run both the no-fault run and the with-fault run.

7.3. What to Collect

At minimum, you want to capture these.

Category	What you want
App logs	`cameraId`, `sessionId`, `phase`, `handleCount`, `error code`
Process state	`Handle Count`, `Private Bytes`, `Thread Count`
Debugger info	`!avrf`, `!htrace`, and `!heap -p -a` as needed
Dumps	At verifier stops, or on abnormal termination
AppVerifier logs	Records of stops, exported to XML for aggregation if needed

If needed, the AppVerifier-side logs can also be exported to XML and aggregated. But those alone rarely pin down the cause, so the practical premise is reading them side by side with your own logs.

A large volume of logs is not, in itself, a virtue. What matters is that the causality can be connected later.
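One way to keep that causality connectable is to emit every app log entry as a single structured line with the same fields. The sketch below assumes JSON lines; `LifecycleRecord`, `to_log_line`, and the `event` field with its example values are hypothetical, while the other field names follow the table in this section.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LifecycleRecord:
    """One line of the app log; field names follow the table in this section."""

    cameraId: str
    sessionId: str
    phase: str
    handleCount: int
    errorCode: int
    event: str  # e.g. "create", "close", "init_failed" (hypothetical values)


def to_log_line(record):
    """Serialize a record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because every line carries `sessionId` and `phase`, a verifier stop or dump taken at a known time can be matched to the surrounding log lines without guesswork.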

7.4. Acceptance Criteria

“It didn’t crash” is too weak as an acceptance criterion. In this context, we needed at least the following.

No verifier stops in the normal path + Basics run
Even with fault injection, the expected failures remain in the logs
Half-initialized resources get cleaned up properly
After reconnect / retry, Handle Count returns near the baseline
When a verifier stop occurs, it can be traced via sessionId / phase / stack
No failure ends up as “no idea what happened”

What matters here is to evaluate “not breaking” and “being traceable when broken” as separate things.
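Several of these criteria can be checked mechanically after each harness run. The following is a sketch under stated assumptions: `evaluate_run`, the shape of the `run` dictionary, and the default tolerance of 5 handles are all hypothetical choices for illustration, not values from the project.

```python
def evaluate_run(run, baseline_handles, tolerance=5):
    """Return a list of acceptance failures for one harness run (empty = pass).

    `run` is a dict with:
      fault_injection   bool, whether faults were injected
      verifier_stops    number of verifier stops observed
      final_handles     Handle Count after reconnect / retry settled
      expected_failures set of failure names the scenario should log
      logged_failures   set of failure names found in the app log
    """
    problems = []
    if not run["fault_injection"] and run["verifier_stops"] > 0:
        problems.append("verifier stop on the normal path")
    # Only growth above the baseline counts as a suspected leak.
    if run["final_handles"] - baseline_handles > tolerance:
        problems.append("Handle Count did not return near the baseline")
    missing = run["expected_failures"] - run["logged_failures"]
    if missing:
        problems.append("failures missing from the logs: " + ", ".join(sorted(missing)))
    return problems
```

Returning a list of named problems rather than a single pass/fail flag keeps the two questions apart: whether the run broke, and whether the breakage was traceable.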

7.5. Caveats

Application Verifier is quite convenient, but it is not magic.

Code paths not actually exercised are not verified
Full page heap is heavy
Stops can also occur inside third-party SDKs
The code paths taken differ considerably with and without fault injection
It cannot, on its own, investigate pure managed heap leaks

So its position is this.

Long-run slopes: your own logs and counters
Native-boundary misuse: Application Verifier
Reconstructing causality on failure: structured logs + dumps + debugger

This division of labor is the most practical.

8. A Rough Decision Guide

Invalid handles or double closes are suspected
- Handles + !htrace
Heap corruption / use-after-free is suspected
- Heaps + full page heap + !heap -p -a
You want to trigger memory- or resource-exhaustion-like phenomena
- Low Resource Simulation
Things break gradually under long-running operation
- Start with your own Handle Count / Private Bytes / lifecycle logs
You want to test a DLL
- Enable Application Verifier on the harness EXE that calls that DLL

Turning everything on from the start usually just produces a fog of logs. Applying the blade closest to the failure path you want to see is far clearer.

9. Summary

Application Verifier’s position is that of a runtime verifier for Windows’ native / Win32 boundary. Using Handles / Heaps / Locks / Memory / TLS / Low Resource Simulation and the rest, you can force rarely seen failure paths to be exercised ahead of time.

What paid off in this context was that handle anomalies became easy to trace with !htrace when they occurred, that memory- and resource-exhaustion-like phenomena could be triggered without wrecking the whole machine, and that we could confirm whether our own logs would genuinely be useful at that moment.

As for how to run it in practice: split the normal path + Basics run from the fault injection runs, prepare a harness EXE, and cycle scenarios through short-lived processes. On top of that, combine it with your own logs, dumps, and debugger information, while watching the slope of long-run leaks itself with your own counters. That is the division of labor.

Application Verifier is a tool for “going out to meet” rare anomalies, rather than “waiting around” for them to happen.

In equipment control apps, not breaking matters, but being able to explain what happened when things break matters just as much. In that sense, we think it is a thoroughly practical tool.

Part 1: When an Industrial Camera Control App Suddenly Crashes After One Month (Part 1) - Finding Handle Leaks and Designing Logs for Long-Running Operation

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
