A Decision Table for Whether to Exit or Continue After an Unexpected Exception

When the topic of unexpected exceptions comes up, it is tempting to frame it as a binary choice: crash, or catch and keep going. In practice, though, that framing is a little crude.

What you really want to know is whether you can bound the extent of what may have been corrupted.

Can you fail just that one operation and stop there?
Is it enough to reinitialize just that screen / connection / worker?
Or is the integrity of the entire process now in question?

Looking at it in that order makes things much easier to sort out.

In this article, assuming C# / .NET Windows apps, resident apps, Windows services, and device-integration tools, we put together a decision table for the conditions under which it is acceptable to continue after an unexpected exception, and the conditions under which it is better to exit.

1. The Conclusion First

Swallowing everything with catch (Exception) and carrying on is dangerous in most cases.
Continuing is acceptable only when three things hold together: you can discard the failed unit, you can restore shared state, and you can account for external side effects.
If the processing boundary is clear—one UI operation, one input record, one job—continuation is sometimes possible.
Conversely, if shared mutable state, resident loops, the main thread, startup code, native boundaries, or signs of memory corruption are involved, lean toward exiting.
Exceptions that call the health of the entire process into question (StackOverflowException, AccessViolationException, OutOfMemoryException) are best treated as something you do not continue from.
WPF and Windows Forms do offer ways to catch unhandled exceptions and appear to keep running, but being able to continue and being safe to continue are different things.
For long-running services and monitoring apps, crashing and being restarted is often safer—and easier to diagnose—than limping along half-broken.

In short, the axis of the decision is whether you can restore your invariants.

2. What “Unexpected Exception” Means in This Article

2.1 Separating Expected from Unexpected

First, a rare exception and an unexpected exception are not the same thing.

For example, these can be treated as expected even if they are infrequent:

The user selected a file that does not exist
The remote endpoint timed out temporarily
One row of an imported CSV was malformed
An OperationCanceledException was thrown by a cancel operation
A business-rule violation that should fail just that one operation

These are the kinds of failures whose handling can be decided up front in the design.

By contrast, the unexpected exceptions this article mainly deals with look like this:

An assumption in your own code broke and a NullReferenceException or InvalidOperationException was thrown
An exception escaped midway through an update of shared state, and it is unclear how much was applied
The parent loop that drives monitoring or message processing died
Something went wrong at a COM / P/Invoke / vendor SDK boundary
The health of the process itself is in doubt, as with AccessViolationException or StackOverflowException

In other words, these are cases where “after this exception, you no longer know whether the app’s state can still be trusted.”

2.2 It Looks Like Two Choices, But There Are Really Three

The culprit that makes this discussion confusing is treating “continue” as a single option.

In practice, it usually breaks down into three levels.

Choice	Meaning
Fail only that operation and continue	Keep the screen, but treat just this save or import as failed
Stop only the subsystem and continue	Reinitialize only the connection, screen, worker, or child process
Exit the process	The extent of state corruption cannot be determined, so assume a restart

Saying “the app continues” covers two very different things: carrying on as if nothing happened, and continuing after isolating the broken part.

3. The Decision Table to Look at First

3.1 The Big Picture

Start with this table and the general direction is usually settled.

Situation	First choice	Reason
Only one input, one screen operation, or one job failed, and its state can be discarded	Lean toward continuing	The failed unit can be contained
After the exception, the affected object or connection can be disposed and recreated	Lean toward subsystem reinitialization	The damaged area can be localized
Shared state was partially updated and it is unclear how much was applied	Lean toward exiting	Invariants may have been broken
External side effects—DB / files / device commands—are half-done and you cannot account for duplicates or missing writes	Lean toward exiting	Consistency with the outside world cannot be determined
The monitoring loop, reconnection loop, or parent message-processing loop died from an unexpected exception	Lean toward exiting	Silently losing part of the functionality tends to create a zombie process
Startup, configuration loading, DI composition, or initialization of a required dependency failed	Lean toward exiting as a startup failure	Starting half-initialized is more dangerous
`AccessViolationException`, `StackOverflowException`, a severe `OutOfMemoryException`, or signs of corruption on the native side	Lean toward immediate exit	The health of the entire process is in question
The dangerous work is isolated in a separate process and the parent process is untouched	Parent continues, restart the child	The fault domain is already isolated

flowchart TD
    A["Unexpected exception"] --> B{"Signs of memory corruption / stack exhaustion / fatal resource exhaustion?"}
    B -- "Yes" --> Z["Exit / FailFast / restart"]
    B -- "No" --> C{"Can the failed unit be discarded?"}
    C -- "No" --> Y["Lean toward exiting"]
    C -- "Yes" --> D{"Can shared state be rolled back / reinitialized?"}
    D -- "No" --> X["Stop the subsystem or exit"]
    D -- "Yes" --> E{"Can external side effects be accounted for?"}
    E -- "No" --> X
    E -- "Yes" --> W["Continue, failing only that operation"]
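
As a rough sketch, the flowchart above can be written as a small C# function. The names Recommendation, FailureAssessment, and ContinuationPolicy are made up for this illustration and do not come from any library. Answering the four questions honestly is the hard part, and this code does not do that for you.

```csharp
// Outcomes mirror the leaf nodes of the flowchart.
public enum Recommendation
{
    ContinueFailingOperation, // "Continue, failing only that operation"
    StopSubsystemOrExit,      // "Stop the subsystem or exit"
    LeanTowardExit,           // "Lean toward exiting"
    ExitImmediately           // "Exit / FailFast / restart"
}

// Hypothetical container for the answers to the flowchart's questions.
public sealed class FailureAssessment
{
    public bool ProcessHealthSuspect { get; set; }    // memory corruption, stack exhaustion, fatal resource exhaustion
    public bool FailedUnitDiscardable { get; set; }
    public bool SharedStateRestorable { get; set; }
    public bool SideEffectsAccountedFor { get; set; }
}

public static class ContinuationPolicy
{
    // Walks the questions in the same order as the flowchart.
    public static Recommendation Decide(FailureAssessment a)
    {
        if (a.ProcessHealthSuspect) return Recommendation.ExitImmediately;
        if (!a.FailedUnitDiscardable) return Recommendation.LeanTowardExit;
        if (!a.SharedStateRestorable) return Recommendation.StopSubsystemOrExit;
        if (!a.SideEffectsAccountedFor) return Recommendation.StopSubsystemOrExit;
        return Recommendation.ContinueFailingOperation;
    }
}
```

The order of the checks matters: process health is asked first, so a suspect process is never talked into continuing by favorable answers further down.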

3.2 What to Check Before the Exception Type

It is better not to decide on the exception type alone. These are the things to check first.

Aspect	What to confirm
Where it happened	A UI event, a single job, a parent loop, startup code, or a native boundary
How far it got	Whether in-memory state, the DB, files, or device state changed partway through
Possible blast radius	Just that object, the whole screen, or the whole process
Rollback possible?	Can it be disposed and recreated, or rolled back with a transaction
External side effects	Sent or not sent, whether double execution is safe, whether compensation is possible
Monitoring / restart	Whether there is automatic restart or a recovery path after exiting

3.3 High-Risk Exceptions

You do not need to go through every exception type in detail, but for the following, do not start from the assumption that you can continue.

Exception / symptom	First choice	Why it matters
`StackOverflowException`	Lean toward immediate exit	The call stack is exhausted; the .NET runtime normally terminates the process, and a try/catch cannot intercept it
`AccessViolationException`	Lean toward immediate exit	Illegal access to protected memory; native boundaries or memory corruption are suspect
`OutOfMemoryException`	Lean toward exiting	Recovery code that itself needs further allocations tends to be unstable
Unexpected `NullReferenceException` / `InvalidOperationException`	Context-dependent, but lean toward exiting	Your own assumptions broke, and partial changes may remain
An unexpected exception that escaped a parent loop	Lean toward exiting	The core of the feature is dead while the process risks staying alive
Failures originating in COM / P/Invoke / vendor SDK callbacks	Immediate exit, or lean strongly toward exiting	Safety is hard to judge from the managed side alone
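
One way to apply the first rows of this table is a small triage helper that a boundary-level handler calls before it considers continuing. This is a minimal sketch: ExceptionTriage and IsProcessHealthSuspect are hypothetical names, and which types belong in the check is a policy decision for each application.

```csharp
using System;

public static class ExceptionTriage
{
    // Returns true when the exception, or one nested in its InnerException
    // chain, is a type that calls the health of the whole process into question.
    // Note: for AggregateException this follows only the first inner exception.
    public static bool IsProcessHealthSuspect(Exception ex)
    {
        for (var e = ex; e != null; e = e.InnerException)
        {
            // StackOverflowException is listed for completeness; the runtime
            // normally terminates the process before any handler sees it.
            if (e is StackOverflowException ||
                e is AccessViolationException ||
                e is OutOfMemoryException)
            {
                return true;
            }
        }
        return false;
    }
}
```

The helper only covers the type-based rows. The context-dependent rows (an unexpected NullReferenceException, an escape from a parent loop) still need the questions from section 3.2.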

4. Deciding by Where It Happened

4.1 UI Events

UI events such as a button click, screen navigation, search, or file selection leave relatively more room for continuation. There are conditions, however.

Continuation is easier in cases like these:

The failure happened before loading, and business state has not been touched yet
Only transient state inside a dialog is broken, and closing it discards everything
The ViewModel or connection can be recreated after the exception
You can honestly tell the user “this operation failed”

Conversely, you should lean toward exiting once things look like this:

Both the screen and the domain state were partially updated
Shared state visible to other screens—static / singleton / caches—was touched
After the exception, button enablement or selection state is left over and consistency is unclear
An unexpected exception occurred on the UI thread, and it is unclear how far rendering or notifications progressed

4.2 Jobs / Requests Processed One at a Time

This is a boundary where continuation is easy.

One message
One file
One HTTP request
One import job
One batch item

If units like these are well defined, you can fail just that one item and move on to the next.

There are prerequisites, though:

The unit of failure is clear from the outside
Partial changes are tidied up by transactions or compensation
Running the same processing again does not corrupt the result
Failures can be routed to a quarantine queue or error log
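
A minimal sketch of this boundary, assuming expected failures are signaled with a dedicated exception type. ExpectedFailureException and ItemProcessor are hypothetical names for this illustration, and the in-memory list stands in for a real quarantine queue or error log.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical marker for failures the design has already decided how to handle.
public sealed class ExpectedFailureException : Exception
{
    public ExpectedFailureException(string message) : base(message) { }
}

public static class ItemProcessor
{
    // Processes items one at a time and returns the ones that were quarantined.
    public static List<(T Item, string Reason)> ProcessAll<T>(
        IEnumerable<T> items, Action<T> handle)
    {
        var quarantined = new List<(T Item, string Reason)>();
        foreach (var item in items)
        {
            try
            {
                handle(item);
            }
            catch (ExpectedFailureException ex)
            {
                // Fail only this item and move on to the next one.
                quarantined.Add((item, ex.Message));
            }
            // Deliberately no catch (Exception): anything unexpected
            // propagates to the caller, which owns the exit decision.
        }
        return quarantined;
    }
}
```

The sketch assumes that handle leaves no partial changes behind when it throws, for example because it works inside a transaction. Without that prerequisite, skipping to the next item is not safe.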

4.3 Resident Loops / Monitoring / Queue Processing

This is the worst place to continue carelessly.

For example:

Reconnection loops
Monitoring loops
Queue-consumption loops
Periodic polling
Device status monitoring
Background processing in a tray app

The scary failure mode here is that the parent loop dies from a single unexpected exception while the process alone survives.

Here it pays to split the policy:

Catch expected exceptions at the boundary of each item’s processing
If an unexpected exception escapes the parent loop, lean toward terminating the process
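
The split policy might look roughly like the following. MonitorLoop is a hypothetical name, and TimeoutException is only a stand-in for whatever the design classifies as expected. The fatal action is injectable so the policy can be tested; the default calls Environment.FailFast, which terminates the process without running finally blocks or finalizers.

```csharp
using System;
using System.Threading;

public static class MonitorLoop
{
    // Default policy: an unexpected escape from the parent loop ends the process.
    public static void Run(Action pollOnce, CancellationToken stop)
    {
        Run(pollOnce, stop, (message, ex) => Environment.FailFast(message, ex));
    }

    public static void Run(
        Action pollOnce, CancellationToken stop, Action<string, Exception> onFatal)
    {
        try
        {
            while (!stop.IsCancellationRequested)
            {
                try
                {
                    pollOnce();
                }
                catch (TimeoutException ex)
                {
                    // Expected, per-iteration failure: record it and keep polling.
                    Console.Error.WriteLine("Poll timed out: " + ex.Message);
                }
            }
        }
        catch (Exception ex)
        {
            // Unexpected: do not leave a process that looks alive but monitors nothing.
            onFatal("Monitoring loop died from an unexpected exception", ex);
        }
    }
}
```

This only makes sense when something restarts the process afterward, such as service recovery settings or a watchdog.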

4.4 Startup

Treating a startup failure as “start up anyway and figure it out later” almost always ends in tears.

Required configuration cannot be read
Version migration failed
A required folder or certificate is missing
Initialization of a core service failed
The dependency configuration is broken

In cases like these, exiting as a startup failure is the clearer choice.
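
A minimal sketch of that choice, assuming a console or service entry point. Startup, the two delegates, and the exit code value 2 are hypothetical; the point is only that the main work never begins when initialization fails.

```csharp
using System;

public static class Startup
{
    public const int ExitOk = 0;
    public const int ExitStartupFailure = 2; // hypothetical value; pick one your supervisor understands

    // Runs initialization first and refuses to start the app half-initialized.
    public static int Run(Action initialize, Action runApp)
    {
        try
        {
            initialize();
        }
        catch (Exception ex)
        {
            // Record the cause and report a startup failure to the caller.
            Console.Error.WriteLine("Startup failed: " + ex);
            return ExitStartupFailure;
        }

        runApp();
        return ExitOk;
    }
}
```

A Main method could then be as short as `static int Main() => Startup.Run(LoadRequiredConfiguration, RunMainLoop);`, where both method names are placeholders for the application's own code.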

4.5 Native Boundaries / COM / P/Invoke / unsafe

This area deserves its own category and a somewhat stricter eye.

COM
P/Invoke
Code on the far side of a C++/CLI wrapper
Vendor SDKs
Native code calling back into managed code through callbacks
Anything involving unsafe

Lean toward exiting especially when you see any of these:

AccessViolationException
Symptoms suggesting heap corruption or a double free
Handle anomalies, signs of use-after-free
Sudden death at a callback boundary

5. Conditions Under Which Continuing Is Acceptable

Summarized, the conditions under which continuation is acceptable look like this. The premise is that most of them hold at the same time.

Condition	Meaning
The unit of failure is clear	You know what to discard: one operation, one screen, one job, one connection
State can be discarded	It can be disposed and recreated, or treated as never applied
Shared state is protected	The contamination does not spread to other features
External side effects can be accounted for	You know whether it was sent / not sent / safe to resend
You can be honest with the user	You can display “this operation failed”
It is observable	Logs, metrics, and dumps allow follow-up investigation
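
One common way to make state discardable, in the sense of "treated as never applied", is to apply changes to a private copy and publish it only after the whole update succeeds. The sketch below illustrates that idea; SettingsStore is a hypothetical class, and the code assumes a single writer thread.

```csharp
using System;
using System.Collections.Generic;

public sealed class SettingsStore
{
    private Dictionary<string, string> _current = new Dictionary<string, string>();

    public IReadOnlyDictionary<string, string> Current => _current;

    // Applies an update to a draft copy and publishes it only on success.
    public void ApplyUpdate(Action<Dictionary<string, string>> mutate)
    {
        var draft = new Dictionary<string, string>(_current);
        mutate(draft);    // if this throws, the draft is simply dropped
        _current = draft; // publish with a single reference assignment
    }
}
```

If mutate throws halfway through, Current still returns the previous dictionary, so the failed update leaves nothing to clean up. With readers on other threads, the publish step would need Volatile.Write or Interlocked.Exchange.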

6. Conditions Under Which Exiting Is Better

Conversely, if any of these apply, lean toward exiting.

You do not know what was changed partway through
Shared mutable state was touched and consistency cannot be determined
Lifetime management of locks, queues, threads, or monitoring loops is broken
Duplicated / missing / half-done external side effects cannot be accounted for
Startup or initialization of core infrastructure failed
Native boundaries or memory corruption are suspect

At this level, engineering for an easy recovery after crashing beats engineering a graceful continuation.

7. Recommendations by Typical Pattern

Pattern	Recommendation	Reason
A nonexistent path was specified via the file-open button	Continue, failing only that operation	The state damage is local
Only one row of a CSV import was malformed	Continue with one row failed or one file failed	The unit of failure is easy to contain
An unexpected `NullReferenceException` occurred midway through saving a screen	Recreate the screen, or lean toward exiting	It is unclear how much of the ViewModel / business state changed
One queue message violated a business rule	Continue, failing only that message	It can be routed to a quarantine queue
The parent queue-consumption loop died from an unexpected exception	Lean toward exiting the process	The lifetime of the entire worker is broken
Required configuration cannot be read at startup	Exit as a startup failure	A half-initialized start is more dangerous
An `AccessViolationException` around a vendor SDK callback	Lean toward immediate exit	The possibility of memory corruption cannot be ignored
Only a non-essential telemetry send failed	Disable just that feature and continue	The fault domain can be separated from the main functionality

8. Common Anti-Patterns

8.1 `catch (Exception)` That Just Logs and Continues

This is quite dangerous. It hides the cause while keeping the broken state alive.

8.2 Trying to Recover in the Last-Chance Unhandled-Exception Handler

AppDomain.UnhandledException, Application.ThreadException, DispatcherUnhandledException, and the like are useful as the place to record things last, but they are not magic recovery points.

8.3 Casually Retrying When External Side Effects Are Involved

If you retry device commands, email sends, billing, file moves, or DB updates without re-execution safety, double-execution incidents become the new headline.
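
Re-execution safety usually means giving each operation an identity and checking it before acting. The sketch below shows only the shape of that check: IdempotentExecutor and the operation ID format are hypothetical, the record is in memory and not thread-safe, and a real system would persist it together with the side effect.

```csharp
using System;
using System.Collections.Generic;

public sealed class IdempotentExecutor
{
    private readonly HashSet<string> _completed = new HashSet<string>();

    // Returns true if the side effect ran now, false if it was skipped
    // because the same operation ID had already completed.
    public bool ExecuteOnce(string operationId, Action sideEffect)
    {
        if (_completed.Contains(operationId)) return false;

        sideEffect();

        // Recorded only after success. If sideEffect throws partway through,
        // whether it took effect is unknown, and this class cannot answer that.
        _completed.Add(operationId);
        return true;
    }
}
```

The comment in the middle is the important part: when the side effect fails midway, a blind retry is safe only if the receiving side also deduplicates by the same ID.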

8.4 Keeping the UI Alive After the Monitoring Loop Died

An app that looks alive but is doing no work is a serious nuisance.

8.5 Saying “We Don’t Want It to Crash” Without Designing for Crashes

If you do not want it to crash, there are things to put in place first.

Automatic restart
Session restore
Saving intermediate results
Re-execution safety
Fault-domain isolation

9. Points to Sort Out at Implementation Time

9.1 Push catch Sites to Boundaries

Rather than catching everything in deep layers, it is easier to keep things organized by catching at places where a unit of failure can be defined, such as:

UI operation boundaries
Per-request boundaries
Per-job boundaries
Per-connection boundaries
The process boundary

9.2 Separate Expected from Unexpected Exceptions

Expected: validation, not found, timeout, cancel, business-rule violations
Unexpected: broken assumptions, escapes from parent loops, native-boundary failures, signs of memory corruption

9.3 Keep Shared State Small

The larger your shared mutable state, the harder the continuation decision becomes. Conversely, the more you can confine state inside one screen, one session, one worker, the easier it is to confine failures as well.

9.4 Move Dangerous Work to a Separate Process

For anything where you do not want a crash to spread—COM / ActiveX / vendor SDKs / unsafe code / heavy image processing / external device control—putting it in a separate process pays off considerably.
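
On the parent side, the supervision can be as simple as the loop below. This is a sketch under assumptions: ChildSupervisor is a hypothetical name, exit code 0 is taken to mean the child finished normally, and the restart budget is arbitrary. RunProcess uses System.Diagnostics.Process to start the child and wait for it.

```csharp
using System;
using System.Diagnostics;

public static class ChildSupervisor
{
    // Runs the child until it exits with code 0 and returns how many restarts
    // were needed. Throws once the restart budget is used up.
    public static int Supervise(Func<int> runChild, int maxRestarts)
    {
        int restarts = 0;
        while (true)
        {
            int exitCode = runChild();
            if (exitCode == 0) return restarts;

            if (restarts == maxRestarts)
                throw new InvalidOperationException(
                    "Child kept failing; last exit code " + exitCode);

            restarts++; // the parent's own state is untouched, so start a fresh child
        }
    }

    // Starts the child process, waits for it, and returns its exit code.
    public static int RunProcess(string fileName, string arguments)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments) { UseShellExecute = false };
        using (var process = Process.Start(startInfo))
        {
            if (process == null) throw new InvalidOperationException("Child did not start");
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A caller might write `ChildSupervisor.Supervise(() => ChildSupervisor.RunProcess("worker.exe", "--job 42"), 3)`, where the executable name and arguments are placeholders.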

9.5 Unhandled-Exception Handlers Are for “Recording,” Not “Recovery”

Exception details
The operation context
The last important log entries
Configuration / version / connection targets
A path to collecting dumps

Getting these in place and prioritizing a setup where you can investigate after the crash leads to better stability in the end.
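
A record-only handler for AppDomain.UnhandledException could look like the following sketch. CrashRecorder, the log path parameter, and the line format are assumptions made for this example; a real application would also write the operation context and version information listed above.

```csharp
using System;
using System.IO;

public static class CrashRecorder
{
    // Builds one log line from what the unhandled-exception event provides.
    public static string Format(object exceptionObject, bool isTerminating, DateTimeOffset now)
    {
        return $"{now:O} terminating={isTerminating} {exceptionObject}";
    }

    // Registers a last-chance handler that records and does nothing else.
    public static void Install(string logPath)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            try
            {
                string line = Format(e.ExceptionObject, e.IsTerminating, DateTimeOffset.UtcNow);
                File.AppendAllText(logPath, line + Environment.NewLine);
            }
            catch
            {
                // Recording is best effort; never throw from the last-chance handler.
            }
            // No recovery attempt here: the process is allowed to terminate.
        };
    }
}
```

Calling `CrashRecorder.Install(path)` once at startup is enough. The handler makes no attempt to keep the process alive.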

9.6 Do Not Over-Trust the WPF / WinForms Unhandled-Exception Events

In WPF, setting Handled = true in DispatcherUnhandledException does let you keep running after an unhandled exception on the UI thread. In Windows Forms, exceptions on the main UI thread can be routed to Application.ThreadException, and Application.SetUnhandledExceptionMode selects whether they go to that handler or are left unhandled so that the app terminates.

But whether you can keep running and whether the conditions for recovery are met are separate questions.

10. Summary

When an unexpected exception occurs, the question to ask is not “can this exception be caught” but whether the app’s state can still be trusted afterward.

As a decision sequence, this is usually enough:

Can the failed unit be discarded?
Can shared state be restored or recreated?
Can external side effects be accounted for?
Can the health of memory / threads / native boundaries be trusted?

If you are confident in all four, you can continue. If you are not, lean toward exiting.

Especially for long-running apps, monitoring apps, services, and device integration, there are plenty of situations where staying alive broken is more dangerous than crashing honestly.

Exception handling is not the art of never crashing. It is designing so that failures stay small, the app stops honestly when broken, and recovery is easy.

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.