Telling GC Lag from a Memory Leak in .NET — A Practical Procedure for Observing, Comparing, and Proving Memory Growth

.NET, C#, GC, Memory Leak, Diagnostics, dotnet-counters, dotnet-dump, Operations, Legacy Asset Reuse

1. What to Understand First

When operating a .NET application, you sometimes see memory usage creep steadily upward.

Task Manager or top shows the process’s memory growing. The container’s memory usage is growing too. The Working Set or RSS graph in monitoring trends up and to the right.

Seeing this, the immediate temptation is to think “memory leak.” But in .NET, a process’s memory growing and a memory leak are not the same thing.

.NET has garbage collection. Memory is not returned to the OS the instant an object becomes unnecessary. The GC operates based on allocation patterns, heap thresholds, memory pressure, generations, and workload conditions.

As a result, states like these occur:

  • Objects that are no longer needed have not been collected yet
  • The GC has run, but the process’s Working Set does not drop right away
  • Memory grows once due to first access, JIT, caches, or connection pools, then stabilizes
  • The managed heap is stable, but native memory, threads, sockets, or an image processing library is growing
  • Objects that genuinely should be unnecessary are still being referenced from somewhere

This article covers how to identify the last case — the genuine leak. What you need to look at is not raw memory usage, but these three things:

  1. Is the memory that survives GC growing?
  2. Which types are growing?
  3. Who is referencing those objects?

A .NET memory leak investigation is the work of going beyond “memory is growing” to “objects of this type are growing, and they are still referenced from this root.”

The code in this article is published on GitHub as a complete sample set that builds and runs as-is: a library of typical leak patterns, a demo for observing the difference between GC lag and survivors, and unit tests that verify retention and collection with WeakReference.

dotnet-gc-or-memory-leak - komurasoft-blog-samples (GitHub)

2. First, Agree on What “Memory Leak” Means

A memory leak in .NET is not only the C/C++ shape of “forgot to free allocated memory.”

In managed code, the GC collects objects. Whether the GC can collect an object is determined by whether any reachable reference to it still exists.

In other words, the typical .NET memory leak is this:

An object is no longer needed by the business logic, yet it is still referenced — from a static field, a cache, an event, a Timer, a collection, a DI lifetime, an async context — so from the GC’s point of view it still looks in use.

The GC is smart, but it cannot know whether something is needed by the business. If it is referenced, the GC judges it alive.

That is why, in .NET, it is clearer to think of “unintended retention” rather than “leak.”
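The idea that "referenced means alive" can be checked directly with a WeakReference. The following is a minimal sketch, not code from the sample repository: RetentionDemo, its Retained list, and the Allocate and FullGc helpers are hypothetical names chosen for illustration, and it assumes a recent .NET SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class RetentionDemo
{
    // Hypothetical long-lived holder standing in for a cache, event, or singleton.
    private static readonly List<object> Retained = new();

    // NoInlining confines the temporary strong reference to this frame,
    // so it is gone by the time the caller forces a GC.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate(bool retain)
    {
        var payload = new byte[1024];
        if (retain)
        {
            Retained.Add(payload); // still reachable from a static root
        }
        return new WeakReference(payload);
    }

    // Diagnostics only: gives the GC every opportunity to collect.
    public static void FullGc()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }
}
```

Calling Allocate(retain: false) and Allocate(retain: true), then FullGc(), and reading IsAlive on both results should show the unreferenced object collected and the statically referenced one still alive. Nothing about the second object is "in use" from the business point of view; it is simply reachable.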

On the other hand, the following states do not immediately indicate a memory leak:

State | Why it is not necessarily a leak
Working Set / RSS is growing | This is memory the OS has assigned to the process; it does not match the amount of live objects on the managed heap
Total Allocated is growing | This is the cumulative amount allocated since startup; it grows whenever the app does anything
GC Heap spikes momentarily | Uncollected objects may simply remain until the next GC
Memory grows right after startup | Commonly caused by JIT, type loading, initial caches, connection pools, template expansion
LOH is large | May reflect reuse of large arrays and buffers, fragmentation, or pooling strategy
Memory does not go down | Even after the GC collects, the process does not necessarily return memory to the OS immediately

Conversely, the more of the following you observe together, the stronger the suspicion of a memory leak:

Observation | Meaning
The post-GC heap grows every time you repeat the same operation | Surviving objects are increasing
Gen 2 or LOH size keeps growing | Long-lived objects, or large objects, are sticking around
The same type’s Count / Size grows across multiple dumps | You can identify the growing type
gcroot shows references from statics, events, caches, or long-lived services | You can explain why the GC cannot collect
Memory does not return after stopping load, even after enough time or a verification GC | Likely not just temporary allocation

3. Distinguish Which “Memory” You Are Looking At

The first source of confusion in a memory investigation is that different memory metrics get mixed up. They are all “memory,” but they mean different things.

Metric | What it measures | How to read it
Working Set / RSS | The process’s pages resident in physical memory | The OS’s view of memory; not the GC heap itself
Private Bytes / Commit | Committed memory privately owned by the process | Also includes native memory, stacks, JIT code, GC segments
GC Heap Size | The amount of objects on the managed heap | The entry point for looking at GC-managed memory in .NET
Total Allocated | Cumulative amount allocated since startup | Always grows; never use it alone to judge a leak
Gen 0 / Gen 1 / Gen 2 | Heaps by generation | What remains in Gen 2 is long-lived
LOH | The heap for large objects of 85,000 bytes or more | Tends to grow with large arrays, strings, buffers
POH | The heap for pinned objects | A clue for the effects of native interop and pinning
Finalization Queue | Objects awaiting finalization | A clue for missed Dispose calls or a clogged finalizer

You do not need to examine all of these in detail from the start. First, break the question down:

The process's memory is growing
  ↓
Is the managed heap also growing?
  ↓
Is the amount that survives GC growing?
  ↓
Which types are growing?
  ↓
Who is referencing them?

Following this order makes it much harder to confuse “memory growth that merely looks bad” with “a real leak.”
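Several of these metrics can also be read from inside the process, which helps when you want to log them next to your own reproduction steps. The sketch below is illustrative only: MemorySnapshot and its property names are hypothetical, and it assumes .NET 5 or later for GC.GetGCMemoryInfo and its per-generation data.

```csharp
using System;

public sealed record MemorySnapshot(
    long WorkingSetBytes,
    long HeapAfterLastGcBytes,
    long Gen2Bytes,
    long LohBytes,
    long TotalAllocatedBytes,
    int Gen2Collections)
{
    public static MemorySnapshot Capture()
    {
        // Describes the most recent GC; sizes may be zero until a GC has run.
        GCMemoryInfo info = GC.GetGCMemoryInfo();

        // GenerationInfo is ordered gen0, gen1, gen2, LOH, POH.
        ReadOnlySpan<GCGenerationInfo> generations = info.GenerationInfo;

        return new MemorySnapshot(
            WorkingSetBytes: Environment.WorkingSet,              // the OS's view
            HeapAfterLastGcBytes: info.HeapSizeBytes,              // managed heap after the last GC
            Gen2Bytes: generations[2].SizeAfterBytes,
            LohBytes: generations[3].SizeAfterBytes,
            TotalAllocatedBytes: GC.GetTotalAllocatedBytes(precise: false), // cumulative
            Gen2Collections: GC.CollectionCount(2));
    }
}
```

Printing MemorySnapshot.Capture() before and after a batch of operations puts the OS view, the post-GC heap, and the cumulative allocation total side by side, which is the same breakdown the question chain above asks for.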

4. The Decision Flow

In practice, the following flow makes the triage easier:

1. Define the reproduction conditions
   - Which API, screen, job, or batch makes it grow
   - How many executions make it grow
   - What happens when the load stops

2. Look at trends with dotnet-counters
   - Working Set
   - GC Heap
   - Gen 2 / LOH
   - Total Allocated
   - GC counts

3. Compare over time
   - Right after startup
   - After warm-up
   - Under load
   - After load stops
   - After repeating the same operation N times

4. Take two or more dumps
   - before
   - after
   - if possible, also after load stops

5. Find the growing types
   - dumpheap -stat
   - gcdump report
   - Visual Studio / PerfView

6. Check who references them
   - gcroot
   - gchandles
   - finalizequeue

7. Make a determination
   - Waiting on GC
   - Normal cache growth
   - Managed memory leak
   - Native memory problem
   - LOH fragmentation or temporary large allocations

What matters is that you do not judge from a single number. A memory leak is a trend of continuous growth, so compare measurements over time under the same conditions rather than reading a single point.

5. Tools to Use

This article mainly uses these tools:

Tool | When to use it
dotnet-counters | View GC and Working Set trends in a running process
dotnet-gcdump | Take lightweight statistics of live managed objects
dotnet-dump | Inspect the heap in detail and trace references with dumpheap and gcroot
Visual Studio Memory Usage | When you want GUI comparison on Windows
PerfView | When you want deep GC / heap / trace analysis on Windows
dotnet-trace | When you want to follow allocations and GC events over time

First, install the CLI tools.

dotnet tool install --global dotnet-counters
dotnet tool install --global dotnet-dump
dotnet tool install --global dotnet-gcdump
dotnet tool install --global dotnet-trace

If they are already installed, update them.

dotnet tool update --global dotnet-counters
dotnet tool update --global dotnet-dump
dotnet tool update --global dotnet-gcdump
dotnet tool update --global dotnet-trace

Find the process to investigate.

dotnet-counters ps

In the examples that follow, the target process ID is written as <PID>.

On Linux, macOS, and in container environments, the diagnostic tools must run as the same user as the target process, or as root. Depending on the environment, you may also be affected by TMPDIR, the diagnostic port, and the container’s PID namespace.

When working against production, do not jump straight to taking dumps — first verify the load and impact in a test environment.

6. Look at Trends with dotnet-counters

The first thing to look at is not a detailed dump, but the trend.

dotnet-counters monitor \
  --process-id <PID> \
  --refresh-interval 3 \
  --counters System.Runtime

The output varies slightly by .NET version. On .NET 9 and later, counters may appear under the System.Runtime Meter names; on .NET 8 and earlier, under the traditional EventCounter names.

The main items to watch:

Item | What to look at
dotnet.process.memory.working_set | The process’s resident memory from the OS’s perspective
dotnet.gc.last_collection.heap.size | Per-generation heap sizes after the most recent GC
dotnet.gc.last_collection.memory.committed_size | The amount of memory the GC has committed
dotnet.gc.heap.total_allocated | Cumulative allocations since startup
dotnet.gc.collections | GC counts by generation
dotnet.gc.pause.time | Cumulative GC pause time

You can also narrow the monitoring to specific counters:

dotnet-counters monitor \
  --process-id <PID> \
  --refresh-interval 3 \
  --counters 'System.Runtime[dotnet.process.memory.working_set,dotnet.gc.last_collection.heap.size,dotnet.gc.last_collection.memory.committed_size,dotnet.gc.heap.total_allocated,dotnet.gc.collections]'

To review later, save to CSV:

dotnet-counters collect \
  --process-id <PID> \
  --refresh-interval 5 \
  --format csv \
  --output counters.csv \
  --counters System.Runtime

At this stage, what you want to distinguish are the following patterns.

6.1 Only Total Allocated grows

dotnet.gc.heap.total_allocated is a cumulative value. If the application processes requests, it allocates objects, and even if those objects immediately become garbage and are collected, the cumulative total of allocated bytes still grows.

So Total Allocated growing on its own does not mean a memory leak. What matters is whether allocations stick around afterward.

Total Allocated: grows
GC Heap Size:    fluctuates somewhat but stays stable
Gen 2 / LOH:     does not keep growing

In this case, it is not a leak — it is an application that allocates a lot.

The remedy is not a leak fix, but allocation reduction: reusing buffers, reviewing heavy LINQ usage, reducing string creation, revisiting serialization code, and so on.
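To see the difference between "allocates a lot" and "retains a lot" in isolation, a small experiment is enough. The sketch below is an assumption-laden illustration: AllocationChurnDemo, Run, and Consume are hypothetical names, and the 16 KB buffer size is arbitrary.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class AllocationChurnDemo
{
    // Allocates short-lived buffers and reports how much was allocated
    // versus how much is still alive after a full GC.
    public static (long AllocatedBytes, long SurvivingBytes) Run(int iterations)
    {
        long liveBefore = GC.GetTotalMemory(forceFullCollection: true);
        long allocatedBefore = GC.GetTotalAllocatedBytes(precise: true);

        for (int i = 0; i < iterations; i++)
        {
            // Each buffer becomes garbage as soon as Consume returns.
            Consume(new byte[16 * 1024]);
        }

        long allocatedAfter = GC.GetTotalAllocatedBytes(precise: true);
        long liveAfter = GC.GetTotalMemory(forceFullCollection: true);
        return (allocatedAfter - allocatedBefore, liveAfter - liveBefore);
    }

    // NoInlining keeps the JIT from eliding the allocation entirely.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Consume(byte[] buffer) => buffer[0] = 1;
}
```

With Run(10_000), the allocated figure should exceed 160 MB while the surviving figure stays close to zero. That is the counter pattern described above: Total Allocated climbs, but nothing sticks.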

6.2 Working Set grows but GC Heap stays stable

Sometimes Working Set or RSS grows even though the GC Heap is stable. In this case, it is not necessarily a managed object leak.

Possible factors include:

  • JIT-compiled code
  • Loaded assemblies
  • Thread stacks
  • Memory used by native libraries
  • Unmanaged memory such as Marshal.AllocHGlobal
  • Native-side buffers in imaging, compression, cryptography, DB drivers
  • Internal buffers for sockets, file handles, SSL, HTTP/2, gRPC
  • The OS simply not reclaiming physical pages from the process yet

In this state, you can stare at dumpheap all day and not find a culprit.

The rule of thumb:

Working Set / RSS: grows
GC Heap Size:      stable
Gen 2 / LOH:       stable

In this case, suspect native memory, handles, thread count, sockets, and external libraries rather than a .NET managed heap leak.

Do not stop at dotnet-counters — also look at OS tools, container metrics, handle counts, thread counts, the native heap, and external library metrics.
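The following sketch reproduces this pattern on purpose, so you can see what it looks like in the numbers. It is an illustration under assumptions: NativeGrowthDemo and Run are hypothetical names, a 4096-byte page size is assumed for the touch loop, and the exact Working Set delta depends on the OS.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeGrowthDemo
{
    // Allocates unmanaged memory and measures both views while it is held.
    public static (long ManagedDelta, long WorkingSetDelta) Run(int megabytes)
    {
        long managedBefore = GC.GetTotalMemory(forceFullCollection: true);
        long workingSetBefore = Environment.WorkingSet;

        int size = megabytes * 1024 * 1024;
        IntPtr block = Marshal.AllocHGlobal(size);
        try
        {
            // Touch one byte per (assumed) 4 KB page so the OS backs the block with physical memory.
            for (int offset = 0; offset < size; offset += 4096)
            {
                Marshal.WriteByte(block, offset, 1);
            }

            long managedAfter = GC.GetTotalMemory(forceFullCollection: true);
            long workingSetAfter = Environment.WorkingSet;
            return (managedAfter - managedBefore, workingSetAfter - workingSetBefore);
        }
        finally
        {
            Marshal.FreeHGlobal(block); // the GC would never release this for us
        }
    }
}
```

With Run(64), the Working Set delta should be in the tens of megabytes while the managed delta stays near zero. A heap dump taken at that moment would show nothing unusual, which is why the managed tools alone cannot explain this kind of growth.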

6.3 GC Heap grows under load but returns after load stops

It is natural for the GC Heap to grow under load.

Many requests. Many temporary objects. Handling large JSON. Building temporary lists and arrays.

In such cases, the heap grows until the next GC. When the load stops, a GC runs and the heap may return.

Under load:       GC Heap grows
After load stops: GC Heap drops, or returns to a steady value
After repeats:    the baseline does not keep rising

In this case, you can conclude “it just hadn’t been GC’d yet” or “there are many temporary allocations.”

That said, if temporary allocations under load are excessive, GC counts and pause times rise and become a performance problem. Even if it is not a leak, it is still a target for performance improvement.

6.4 Post-GC Gen 2 / LOH keeps growing

This is the pattern to watch out for:

Repeat the same operation
  ↓
Gen 2 grows
  ↓
LOH grows
  ↓
Stopping the load does not bring it back
  ↓
The next measurement shows it even higher

Gen 2 is the generation where long-lived objects reside. The LOH is where large arrays and strings tend to land.

If these keep growing, suspect a leak, an unbounded cache, retained giant buffers, missed event unsubscriptions, static collections, or retention by long-lived services.

At this stage, move on to the next step.

7. How to Think About “Is It Just Waiting on GC?”

To check whether something “just hasn’t been GC’d yet,” look at the state after the GC has had ample opportunity to run.

However, do not casually put GC.Collect() into production code.

GC.Collect() forces a GC. A full blocking GC across all generations in particular creates application pause time. In normal operation, leaving it to the GC is the rule.

Even so, during an investigation you may check “does it survive a forced GC?” in a controlled verification environment.

In a verification console app or reproduction environment, code like this lets you inspect the state after a full GC:

static void ForceFullGcForDiagnosticsOnly()
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();
}

The point is to never use this as a solution. It is for investigation only.

What you want to confirm is this flow:

Before the operation
  ↓
Repeat the operation N times
  ↓
Stop the load
  ↓
Wait long enough, or induce a full GC in the verification environment
  ↓
Does the post-GC heap return close to the pre-operation value?

If it returns, GC lag or temporary allocation is likely. If it does not return — and the baseline rises every time you repeat the same operation — something is surviving. You then go looking for that “something” in a dump.
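In a reproduction environment, this flow can be automated as a small probe. The sketch below is one possible shape, not the article's sample code: HeapFloorProbe, MeasureFloors, and FloorKeepsRising are hypothetical names, and the tolerance you pass is a judgment call for your workload.

```csharp
using System;

public static class HeapFloorProbe
{
    // Runs `operation` in rounds and records the managed heap size
    // that survives a full GC after each round. Diagnostics only.
    public static long[] MeasureFloors(Action operation, int rounds, int repeatsPerRound)
    {
        var floors = new long[rounds];
        for (int round = 0; round < rounds; round++)
        {
            for (int i = 0; i < repeatsPerRound; i++)
            {
                operation();
            }

            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
            floors[round] = GC.GetTotalMemory(forceFullCollection: false);
        }
        return floors;
    }

    // True when every round ends at least `toleranceBytes` above the previous one.
    public static bool FloorKeepsRising(long[] floors, long toleranceBytes)
    {
        if (floors.Length < 2)
        {
            return false;
        }
        for (int i = 1; i < floors.Length; i++)
        {
            if (floors[i] - floors[i - 1] < toleranceBytes)
            {
                return false;
            }
        }
        return true;
    }
}
```

If FloorKeepsRising returns true for an operation that should leave nothing behind, something is surviving each round, and that is the signal to move on to dumps. A flat or oscillating series points toward GC lag or temporary allocation instead.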

8. Lightweight Comparison with dotnet-gcdump

For the initial comparison, dotnet-gcdump is convenient.

dotnet-gcdump captures a GC dump from a running .NET process, useful for viewing per-type statistics of the heap.

dotnet-gcdump collect --process-id <PID> --output before.gcdump

After applying load, capture again.

dotnet-gcdump collect --process-id <PID> --output after.gcdump

You can produce a simple report from the CLI:

dotnet-gcdump report before.gcdump > before-heap.txt
dotnet-gcdump report after.gcdump  > after-heap.txt

What you look at is each type’s Count and Size.

For example, if types like the following grew sharply in after, they become investigation targets:

Size (Bytes)   Count       Type
============   =====       ====
180,000,000    2,000,000   System.String
120,000,000    1,000,000   MyApp.Models.Customer
 90,000,000       25,000   System.Byte[]

What matters is not “the big types” but “the types that grew.”

System.String and System.Byte[] rank high in most applications. Being at the top does not make them the culprit.

The comparison lens:

before → after | Reading
Count roughly unchanged | That type is probably not the main culprit
Both Count and Size grow | A candidate
MyApp.* types grow | Suspect retention in business logic
System.Byte[] grows | Suspect buffers, serialization, images, compression, HTTP, DB
System.String grows | Suspect caches, logging, JSON, dictionary keys, duplicate strings
Task, Timer, CancellationTokenSource grow | Suspect async work, Timers, missed cancellations/unsubscriptions
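Once you have per-type counts and sizes from two captures, ranking by growth rather than by size is a small amount of code. The sketch below assumes you have already loaded the statistics into memory by whatever means you prefer; TypeStat, TypeGrowth, and HeapDiff are hypothetical names, and no parsing of the report format is attempted here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record TypeStat(string Type, long Count, long Bytes);
public sealed record TypeGrowth(string Type, long CountDelta, long BytesDelta);

public static class HeapDiff
{
    // Returns the types whose count and size both grew, largest byte growth first.
    public static List<TypeGrowth> Growth(IEnumerable<TypeStat> before, IEnumerable<TypeStat> after)
    {
        Dictionary<string, TypeStat> baseline = before.ToDictionary(s => s.Type);

        return after
            .Select(a =>
            {
                // Types absent from the baseline count as growing from zero.
                long oldCount = baseline.TryGetValue(a.Type, out var b) ? b.Count : 0;
                long oldBytes = b is null ? 0 : b.Bytes;
                return new TypeGrowth(a.Type, a.Count - oldCount, a.Bytes - oldBytes);
            })
            .Where(g => g.CountDelta > 0 && g.BytesDelta > 0)
            .OrderByDescending(g => g.BytesDelta)
            .ToList();
    }
}
```

With this ordering, a large but unchanged System.Byte[] row drops out entirely, while a business type that went from a handful of instances to a million rises to the top even though System.String is bigger in absolute terms.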

dotnet-gcdump is an easy entry point for comparison, but capturing one induces a Gen 2 GC. In environments with large heaps or strict latency requirements, watch out for pause time and additional memory consumption.

On Windows, you can open .gcdump files in Visual Studio or PerfView and compare them. On non-Windows environments, the practical path is to view type statistics with the CLI report and move to dotnet-dump for digging into references.

9. Inspect the Heap and References with dotnet-dump

Once “the growing types” are visible, the next question is “why are they not being collected?”

For that, capture a dump with dotnet-dump and analyze it with SOS commands.

dotnet-dump collect \
  --process-id <PID> \
  --type Heap \
  --output myapp-1.dmp

Wait a while and capture again.

dotnet-dump collect \
  --process-id <PID> \
  --type Heap \
  --output myapp-2.dmp

Dump capture is a heavy operation. Full / Heap dumps in particular are large and put load on the process and the container. When taking them in production, mind the time of day, disk space, container memory limits, and the possibility of personal or confidential data ending up in the dump.

Analyze the captured dump:

dotnet-dump analyze myapp-2.dmp

First look at overall heap statistics:

> dumpheap -stat

The output is counts and sizes per type:

MT               Count       TotalSize Class Name
00007f...        120000      3840000   MyApp.Models.Order
00007f...        250000      8000000   System.String
00007f...         10000     40000000   System.Byte[]

Narrow down to a specific type:

> dumpheap -stat -type MyApp.Models.Order

Or filter by a specific MethodTable:

> dumpheap -mt <MT>

Once you have an instance address, look up its references:

> gcroot <OBJECT_ADDRESS>

This is the most important part. With gcroot, you confirm why that object is alive.

Say you see a reference path like this:

static MyApp.CustomerCache._items
  -> System.Collections.Concurrent.ConcurrentDictionary<string, Customer>
  -> MyApp.Models.Customer
  -> System.String

In this case, the reason the GC won’t collect is clear: Customer is referenced from a static cache, so from the GC’s perspective it is still in use.

Only now can you make the next set of judgments:

  • Is that cache really needed?
  • Does it have a size limit?
  • Does it have an expiration?
  • Is the design such that keys keep growing?
  • Is it keyed on tenant, user, date, or request ID and growing without bound?

The crucial thing in a leak investigation is not to stop at dumpheap -stat. dumpheap -stat tells you “what there is a lot of”; gcroot tells you “why it remains.” It is the latter that leads to a fix.

10. Quick Reference for Diagnosis

Here are the patterns that come up most in practice:

Observation | Possibility | What to look at next
Only Total Allocated grows | Normal allocation, or excessive allocation | Allocation Rate, GC counts, CPU, dotnet-trace
Working Set grows but GC Heap is stable | Native memory, JIT, stacks, OS-side retention | Thread count, handle count, native tools, external libraries
GC Heap grows only under load, returns after | Waiting on GC, temporary allocations | Gen 2 / LOH after load stops, GC counts
Post-GC Gen 2 keeps growing | Retention of long-lived objects | dumpheap -stat, gcroot
LOH keeps growing | Large arrays, buffers, fragmentation, giant strings | System.Byte[], System.Char[], LOH, Free regions
System.String is large | String caches, JSON, logging, dictionary keys | Find your own types holding the strings
System.Byte[] is large | Buffers, serialization, images, compression, networking | Owning types, missed ArrayPool returns, native interop
Task grows | Async work that never completes, waiting queues | async coordination, cancellation, channels, queues
Timer grows | Missed Timer disposal | Dispose, deregistration, long-lived services
CancellationTokenSource grows | Missed CTS disposal, too many linked tokens | Dispose, unlinking, where timeouts are created
EventHandler or delegates remain | Missed event unsubscription | Lifetime gap between publisher / subscriber
Finalization Queue grows | Missed Dispose, clogged finalizer | finalizequeue, the finalizer thread
Many pinned handles | Pinned buffers, native interop | gchandles, POH, pinning locations

11. Common Leak Shapes

11.1 Static collections

The most straightforward shape.

public static class CustomerStore
{
    private static readonly List<Customer> Customers = new();

    public static void Add(Customer customer)
    {
        Customers.Add(customer);
    }
}

In this code, every Customer added to Customers stays around as long as the process lives. Even if it was meant as temporary storage, the GC will not collect anything referenced from a static.

The direction of the fix depends on the use case:

  • Impose a size limit
  • Add an expiration
  • Use a caching mechanism such as MemoryCache
  • Remove items explicitly
  • Drop the static and move to a service with the appropriate lifetime
  • If the goal is persistence, move to a DB or external storage

The point is not that “statics are bad” — it is understanding and respecting the property that whatever you put in a static becomes long-lived.

11.2 Unbounded caches

A cache uses memory intentionally, so growth as designed is not a leak. But a cache without a limit or expiration becomes a de facto memory leak.

public sealed class ReportCache
{
    private readonly Dictionary<string, Report> _cache = new();

    public Report GetOrCreate(string userId, DateTime date)
    {
        var key = $"{userId}:{date:O}";

        if (_cache.TryGetValue(key, out var report))
        {
            return report;
        }

        report = BuildReport(userId, date);
        _cache[key] = report;
        return report;
    }
}

In this example, if the combinations of userId and date keep growing, the cache keeps growing too.

Especially dangerous are keys that include values like these:

  • Request IDs
  • The current time
  • GUIDs
  • Session IDs
  • User input used as a string without normalization
  • SQL or search conditions stringified verbatim

For any cache, settle these conditions:

Condition | Example
Maximum count | Up to 10,000 entries
Maximum size | Up to 256MB
Sliding expiration | 30 minutes after last access
Absolute expiration | 6 hours after creation
Eviction triggers | Tenant deletion, user deletion, configuration change
Monitoring items | Count, estimated size, hit rate, eviction count

The rule is not “it’s a cache, so it may grow” but “decide how far it may grow.”
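As one way to make "how far it may grow" concrete, here is a minimal count-bounded cache with least-recently-used eviction. It is a sketch under assumptions, not a replacement for a real caching library: BoundedCache is a hypothetical name, only the maximum-count condition is implemented (no expiration or size accounting), and the factory runs under the lock for simplicity.

```csharp
using System;
using System.Collections.Generic;

public sealed class BoundedCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<(TKey Key, TValue Value)>> _map = new();
    private readonly LinkedList<(TKey Key, TValue Value)> _order = new(); // most recently used first
    private readonly object _gate = new();

    public BoundedCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    public int Count
    {
        get { lock (_gate) { return _map.Count; } }
    }

    public TValue GetOrCreate(TKey key, Func<TKey, TValue> factory)
    {
        lock (_gate)
        {
            if (_map.TryGetValue(key, out var node))
            {
                // Hit: move the entry to the front so it is evicted last.
                _order.Remove(node);
                _order.AddFirst(node);
                return node.Value.Value;
            }

            TValue value = factory(key);
            _map[key] = _order.AddFirst((key, value));

            if (_map.Count > _capacity)
            {
                // Over the limit: drop the least recently used entry.
                var oldest = _order.Last;
                if (oldest != null)
                {
                    _order.RemoveLast();
                    _map.Remove(oldest.Value.Key);
                }
            }
            return value;
        }
    }
}
```

Even with unbounded key variety such as userId plus timestamp, the entry count can no longer exceed the capacity. The hit rate may be poor with such keys, but that becomes a visible performance question instead of silent memory growth.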

11.3 Missed event unsubscriptions

Events leak when a long-lived publisher keeps referencing a short-lived subscriber.

public sealed class OrderViewModel
{
    private readonly OrderService _service;

    public OrderViewModel(OrderService service)
    {
        _service = service;
        _service.OrderChanged += OnOrderChanged;
    }

    private void OnOrderChanged(object? sender, OrderChangedEventArgs e)
    {
        // update view model
    }
}

If OrderService is a singleton and an OrderViewModel is created per screen, the OrderService event keeps referencing the OrderViewModel. Closing the screen does not free the ViewModel unless it unsubscribes.

A fix:

public sealed class OrderViewModel : IDisposable
{
    private readonly OrderService _service;

    public OrderViewModel(OrderService service)
    {
        _service = service;
        _service.OrderChanged += OnOrderChanged;
    }

    public void Dispose()
    {
        _service.OrderChanged -= OnOrderChanged;
    }

    private void OnOrderChanged(object? sender, OrderChangedEventArgs e)
    {
        // update view model
    }
}

In gcroot, this can appear as references via delegates or event handlers.

This pattern shows up frequently in WPF, WinForms, long-lived services, message brokers, and event aggregators.
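A fix like this is worth pinning down with a retention check, so that a later refactoring cannot quietly reintroduce the leak. The sketch below uses simplified stand-ins: Publisher, Subscriber, and EventRetentionCheck are hypothetical types that mirror the OrderService and OrderViewModel relationship, not the article's actual classes.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class Publisher
{
    public event EventHandler? Changed;
}

public sealed class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.Changed += OnChanged;
    }

    public void Dispose() => _publisher.Changed -= OnChanged;

    private void OnChanged(object? sender, EventArgs e) { }
}

public static class EventRetentionCheck
{
    // NoInlining keeps the only strong local reference inside this frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Subscribe(Publisher publisher, bool dispose)
    {
        var subscriber = new Subscriber(publisher);
        if (dispose)
        {
            subscriber.Dispose();
        }
        return new WeakReference(subscriber);
    }

    // Reports whether the subscriber is still alive after a full GC
    // while the publisher itself stays alive.
    public static bool SurvivesGc(bool dispose)
    {
        var publisher = new Publisher();
        WeakReference weak = Subscribe(publisher, dispose);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        bool alive = weak.IsAlive;
        GC.KeepAlive(publisher); // the publisher must outlive the measurement
        return alive;
    }
}
```

SurvivesGc(dispose: false) should report true, because the publisher's delegate list still references the subscriber, and SurvivesGc(dispose: true) should report false. Turned into a unit test, the second case guards the unsubscription.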

11.4 Missed Timer disposal

System.Threading.Timer, PeriodicTimer, and Reactive Extensions subscriptions also stick around if not disposed.

public sealed class PollingWorker
{
    private readonly Timer _timer;

    public PollingWorker()
    {
        _timer = new Timer(_ => Poll(), null, TimeSpan.Zero, TimeSpan.FromSeconds(10));
    }

    private void Poll()
    {
        // polling
    }
}

If this PollingWorker is meant to be a temporary object, the design must dispose the Timer.

public sealed class PollingWorker : IDisposable
{
    private readonly Timer _timer;

    public PollingWorker()
    {
        _timer = new Timer(_ => Poll(), null, TimeSpan.Zero, TimeSpan.FromSeconds(10));
    }

    public void Dispose()
    {
        _timer.Dispose();
    }

    private void Poll()
    {
        // polling
    }
}

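The runtime keeps an active timer’s callback delegate reachable, and that delegate (here a lambda that captures this) references the PollingWorker, so the worker stays alive until the Timer is disposed.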

11.5 Missed IDisposable releases

Missed IDisposable releases do not necessarily show up as managed heap leaks.

They can surface as resource problems with files, sockets, DB connections, native handles, and buffers.

public async Task<string> ReadAsync(string path)
{
    var stream = File.OpenRead(path);
    using var reader = new StreamReader(stream);
    return await reader.ReadToEndAsync();
}

In this example, disposing the StreamReader also closes the underlying stream, so it is often not a major problem. In code where ownership is ambiguous, however, leaks happen.

The rule is to make ownership explicit with using / await using:

public async Task<string> ReadAsync(string path)
{
    await using var stream = File.OpenRead(path);
    using var reader = new StreamReader(stream);
    return await reader.ReadToEndAsync();
}

Missed Dispose shows up as symptoms like these:

  • Handle counts increase
  • Sockets accumulate
  • Files never get closed
  • Native memory grows
  • The Finalization Queue grows
  • Process memory grows while the GC Heap stays stable

In this case, dumpheap alone is not enough. Also look at OS handles, sockets, and the state of external libraries.

11.6 AsyncLocal and context retention

AsyncLocal<T> is convenient, but if what you store in it is large, it can persist for a long time.

A small value like a log correlation ID rarely causes problems. But putting user info, request bodies, large DTOs, or DB contexts in it leads to unintended retention.

public static class RequestContext
{
    public static readonly AsyncLocal<RequestInfo?> Current = new();
}

Because AsyncLocal rides the async flow, it can be harder to spot than a plain static field.

Keep what you store small and well-defined, and consider a design that resets it to null when no longer needed.
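One way to make the reset hard to forget is to hand out a scope object that restores the previous value when disposed. This is a sketch of that idea only: ScopedRequestContext, Begin, and the single-property RequestInfo record are hypothetical, and it assumes Begin and Dispose run in the same async flow (for example inside one using block).

```csharp
using System;
using System.Threading;

public sealed record RequestInfo(string CorrelationId);

public static class ScopedRequestContext
{
    private static readonly AsyncLocal<RequestInfo?> CurrentInfo = new();

    public static RequestInfo? Current => CurrentInfo.Value;

    // Sets the ambient value and returns a scope that restores the previous one.
    public static IDisposable Begin(RequestInfo info)
    {
        RequestInfo? previous = CurrentInfo.Value;
        CurrentInfo.Value = info;
        return new Scope(previous);
    }

    private sealed class Scope : IDisposable
    {
        private readonly RequestInfo? _previous;

        public Scope(RequestInfo? previous) => _previous = previous;

        // Drops the reference so the stored object does not outlive the request.
        public void Dispose() => CurrentInfo.Value = _previous;
    }
}
```

Wrapping the request handling in using (ScopedRequestContext.Begin(info)) { ... } means the value is cleared on every exit path, including exceptions. Keeping the stored record as small as a correlation ID limits the damage even if a scope is missed.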

11.7 Mismatched DI lifetimes

In DI containers such as ASP.NET Core’s, singleton, scoped, and transient have different lifetimes.

If a long-lived singleton holds per-request data, objects can remain after the request ends.

public sealed class AuditBuffer
{
    private readonly List<RequestAudit> _items = new();

    public void Add(RequestAudit item)
    {
        _items.Add(item);
    }
}

If this is a singleton, _items lives as long as the application.

If buffering is intentional, the design needs a limit, flushing, removal, and backpressure. If it is merely “we might look at it later,” it should go to logs or external storage instead.
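If the buffering is intentional, a bounded channel gives you the limit and the flush path in a few lines. The sketch below is one possible design, not the article's: BoundedAuditBuffer and the single-property RequestAudit record are hypothetical, and dropping the oldest entry when full is an assumption about what the business can tolerate.

```csharp
using System;
using System.Threading.Channels;

public sealed record RequestAudit(string Path);

public sealed class BoundedAuditBuffer
{
    private readonly Channel<RequestAudit> _channel;

    public BoundedAuditBuffer(int capacity)
    {
        _channel = Channel.CreateBounded<RequestAudit>(new BoundedChannelOptions(capacity)
        {
            // When full, discard the oldest entry instead of growing.
            FullMode = BoundedChannelFullMode.DropOldest,
        });
    }

    // Never blocks and never grows past the capacity.
    public void Add(RequestAudit item) => _channel.Writer.TryWrite(item);

    // Flushes everything currently buffered to `sink` and returns the number of items.
    public int Drain(Action<RequestAudit> sink)
    {
        int drained = 0;
        while (_channel.Reader.TryRead(out var item))
        {
            sink(item);
            drained++;
        }
        return drained;
    }
}
```

Registered as a singleton, this buffer holds at most capacity items no matter how many requests arrive. If losing entries is not acceptable, BoundedChannelFullMode.Wait combined with WriteAsync applies backpressure to producers instead.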

12. The LOH Is Especially Easy to Misread

LOH stands for Large Object Heap. In .NET, large objects are placed on a separate heap from ordinary small objects. The classic example is a large array.

var buffer = new byte[1024 * 1024 * 10]; // 10MB

The three common LOH problems:

  1. Creating large objects frequently
  2. Holding large objects for a long time
  3. Fragmentation from creating and discarding large objects

A growing LOH does not immediately mean a leak. With a design that reuses large buffers, it may grow to a certain size and then stabilize — and even when the GC collects, the Working Set does not necessarily drop right away.

However, you should be suspicious of these states:

  • System.Byte[] grows with each operation
  • System.Char[] or giant String instances grow
  • Memory does not return after image, PDF, Excel, ZIP, cryptography, or compression work
  • Arrays from ArrayPool<T>.Rent are not being returned
  • Entire large responses are being loaded into memory
  • Heavy use of MemoryStream.ToArray()

When using ArrayPool<T>, always return what you rent:

var pool = ArrayPool<byte>.Shared;
var buffer = pool.Rent(1024 * 1024);

try
{
    // use buffer
}
finally
{
    pool.Return(buffer);
}

That said, returning to the pool does not necessarily reduce the process’s memory right away. Pools may retain memory for reuse.

Here too, what you should look at is: does it keep growing, is there a limit, and is the memory being reused?
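If you are unsure whether a particular buffer size ends up on the LOH, you can ask the runtime. The sketch below relies on GC.GetGeneration reporting generation 2 for LOH objects and assumes the default 85,000-byte threshold has not been changed by configuration; LohThresholdDemo is a hypothetical name.

```csharp
using System;

public static class LohThresholdDemo
{
    // Returns the generation the runtime reports for a freshly allocated byte array.
    // Small arrays start in generation 0; LOH allocations are reported as generation 2.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }
}
```

GenerationOfNewArray(1_000) normally returns 0, while GenerationOfNewArray(100_000) returns 2 because the array is placed on the LOH from the start. That is why code that repeatedly creates buffers of this size shows up in the Gen 2 / LOH numbers without ever passing through Gen 0.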

13. How to Read gcroot

gcroot shows where an object is referenced from.

Typical roots, summarized:

Root | Meaning
static field | Referenced from a type’s static field
local variable / stack | Referenced from a running thread’s stack
GC handle | Referenced via GCHandle, pinning, delegates, interop
finalization queue | Held pending finalization
thread / async state machine | Held by running or awaiting async work

The key thing to look for is the lifetime gap:

Long-lived object
  -> object that should be short-lived

When this shape appears, you have a leak candidate.

For example, this is suspicious:

SingletonService
  -> List<RequestContext>
  -> RequestContext
  -> LargeDto

SingletonService lives for the whole application. If per-request RequestContext objects are piling up inside it, the design needs rethinking.

On the other hand, a root like this can be perfectly normal depending on timing:

Thread stack
  -> Controller action local variable
  -> RequestDto

If a request is in flight, of course local variables are still alive.

That is why dump timing matters.

Take dumps not only under load, but also after load stops, after queues drain, and after a period of idleness — that makes the judgment far easier.

14. “It Dropped After a Forced GC” Does Not Mean “Solved”

During an investigation, you call GC.Collect() and memory drops. It is dangerous to conclude “then we should just call GC.Collect() periodically.”

A forced GC does not remove the root cause. It merely collected, on the spot, objects that had not yet been collected.

If a high allocation rate is the problem, forced GC increases pause time and worsens performance. If it is a real leak, objects with surviving references are not collected by a forced GC either.

What to look at during the investigation is this distinction:

After a forced GC | Judgment
Drops sharply, then the baseline stays stable | GC lag or temporary allocations were the main cause
Drops a little, but the floor rises with each repetition | Some objects are surviving. Leak candidate
Barely drops | Still referenced, or the main cause is outside the GC heap
GC Heap drops but Working Set does not | Possible OS / GC segment / native-side retention

Before scheduling GC.Collect() in production, always identify what is actually growing.

15. A Field Procedure for the Investigation

From here, let’s lay this out as the procedure you actually follow.

15.1 Fix the reproduction scenario

First, pin down the investigation conditions.

Target:          /api/report/export
Operation:       run 100 times with identical conditions
Sample interval: 5 seconds
Window:          5 min warm-up + 10 min load + 5 min idle
Environment:     staging / Release build / production-equivalent settings

You cannot draw conclusions from a memory investigation while doing something different each time. Pin down “what we did when it grew.”

15.2 Take a baseline

Use the post-warm-up state as the baseline, not right after startup.

The reason is that one-time growth happens right after startup:

  • JIT
  • DI container construction
  • Configuration loading
  • First DB connection
  • First TLS / HTTP connection
  • JSON serializer metadata generation
  • Razor / template initialization
  • Logger and metrics initialization

The order:

1. Start the app
2. Hit health checks and representative APIs a few times
3. Wait 1-5 minutes
4. Capture counters and a dump as the baseline

15.3 Collect counters under load

dotnet-counters collect \
  --process-id <PID> \
  --refresh-interval 5 \
  --format csv \
  --output report-export-counters.csv \
  --counters System.Runtime

Run the reproduction in parallel. What you want is the shape of the graph.

Near-normal shape:
  grows under load
  oscillates with GCs
  returns after load stops
  the baseline does not keep rising

Suspicious shape:
  grows in proportion to operation count
  the Gen 2 / LOH floor rises
  does not return after load stops
  the floor rises further with the next load
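
The two shapes can also be told apart numerically once the heap size samples are out of the CSV. The sketch below is only one possible heuristic, with hypothetical names (HeapShape, WindowFloors, FloorKeepsRising): it treats the minimum of each fixed-size window as an approximation of the post-GC floor and checks whether that floor rises from window to window.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeapShape
{
    // Splits the samples into consecutive windows and returns the minimum of each.
    // A trailing partial window is ignored.
    public static List<double> WindowFloors(IReadOnlyList<double> samples, int windowSize)
    {
        if (windowSize <= 0) throw new ArgumentOutOfRangeException(nameof(windowSize));

        var floors = new List<double>();
        for (int start = 0; start + windowSize <= samples.Count; start += windowSize)
        {
            floors.Add(samples.Skip(start).Take(windowSize).Min());
        }
        return floors;
    }

    // True only when every floor exceeds the previous one by more than `tolerance`.
    public static bool FloorKeepsRising(IReadOnlyList<double> floors, double tolerance)
    {
        if (floors.Count < 2) return false;

        for (int i = 1; i < floors.Count; i++)
        {
            if (floors[i] - floors[i - 1] <= tolerance) return false;
        }
        return true;
    }
}
```

With a 5-second sample interval, a window of 12 samples covers one minute. The window size and tolerance are judgment calls: the window should be long enough to contain at least one GC, and the tolerance should exceed normal noise.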

15.4 Take two dumps

Take them before and after the load.

dotnet-dump collect --process-id <PID> --type Heap --output before.dmp
# apply load
dotnet-dump collect --process-id <PID> --type Heap --output after.dmp

If you can afford it, take a third dump after the load stops.

# after load stops, queues are empty, and you have waited a while
dotnet-dump collect --process-id <PID> --type Heap --output idle-after.dmp

When comparing, idle-after matters as much as before and after.

Even if memory grew under load, it is probably not a leak if it returns to the baseline once the process is idle.

15.5 Look at the types that grew

dotnet-dump analyze after.dmp
> dumpheap -stat

Inspect the before side the same way. Doing this by hand is fine — start by comparing the top types.

Things to look for:

  • Are types in your own namespaces growing?
  • Is one of your own types hiding behind System.String?
  • Who owns the System.Byte[] instances?
  • Are List<T> or Dictionary<TKey,TValue> growing?
  • Are Task instances or async state machines growing?
  • Are Timer or CancellationTokenSource instances growing?
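
When the manual comparison gets tedious, the two dumpheap -stat outputs can be saved as text and diffed. The following is a rough sketch, not part of dotnet-dump: HeapStatDiff is a hypothetical helper, and the assumed row layout (method table in hex, count, total size, class name, with optional digit-grouping commas) should be checked against what your SOS version actually prints.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class HeapStatDiff
{
    // Assumed row layout: <MT (hex)> <Count> <TotalSize> <Class Name>.
    private static readonly Regex Row =
        new Regex(@"^\s*([0-9a-fA-F]+)\s+([\d,]+)\s+([\d,]+)\s+(.+?)\s*$");

    public static Dictionary<string, (long Count, long Bytes)> Parse(string statOutput)
    {
        var totals = new Dictionary<string, (long Count, long Bytes)>();
        foreach (var rawLine in statOutput.Split('\n'))
        {
            var match = Row.Match(rawLine.TrimEnd('\r'));
            if (!match.Success) continue; // header, "Total ..." line, blank lines

            long count = ParseNumber(match.Groups[2].Value);
            long bytes = ParseNumber(match.Groups[3].Value);
            string type = match.Groups[4].Value;

            // The same class name can appear under several method tables; sum them.
            totals.TryGetValue(type, out var existing);
            totals[type] = (existing.Count + count, existing.Bytes + bytes);
        }
        return totals;
    }

    // Returns per-type growth, largest byte increase first.
    public static List<(string Type, long CountDelta, long BytesDelta)> Diff(
        Dictionary<string, (long Count, long Bytes)> before,
        Dictionary<string, (long Count, long Bytes)> after)
    {
        var deltas = new List<(string Type, long CountDelta, long BytesDelta)>();
        foreach (var pair in after)
        {
            before.TryGetValue(pair.Key, out var old); // a type missing in `before` counts as zero
            deltas.Add((pair.Key, pair.Value.Count - old.Count, pair.Value.Bytes - old.Bytes));
        }
        return deltas.OrderByDescending(d => d.BytesDelta).ToList();
    }

    private static long ParseNumber(string text) =>
        long.Parse(text.Replace(",", ""), CultureInfo.InvariantCulture);
}
```

The top few entries of the diff are the types to take into gcroot. A large System.String or System.Byte[] delta on its own says little; it becomes useful once you know which of your own types grew alongside it.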

15.6 Look at the references

Pick a few candidate object addresses and run gcroot on them.

> dumpheap -type MyApp.Models.ReportResult
> gcroot <OBJECT_ADDRESS>

From the gcroot output, find the retaining parent.

MyApp.Services.ReportCache
  -> Dictionary<string, ReportResult>
  -> ReportResult

At this point, the code review targets become visible:

  • Is ReportCache a singleton?
  • Does it have a limit?
  • Do entries get removed?
  • Do the keys keep growing?
  • Is ReportResult too large?
  • Should this go to a DB or files instead of a cache?

16. When to Use dotnet-trace

dotnet-dump is a point-in-time snapshot, suited for seeing “what ended up remaining.” When you instead want to see “when and where the heavy allocations happen,” use dotnet-trace.

For example, trace with GC-related events included:

dotnet-trace collect \
  --process-id <PID> \
  --duration 00:00:01:00 \
  --clrevents gc+gchandle \
  --clreventlevel informational \
  --output gc-trace.nettrace

If you want allocation sampling too, the event volume increases, so start with short runs in a verification environment:

dotnet-trace collect \
  --process-id <PID> \
  --duration 00:00:00:30 \
  --clrevents gc+gcsampledobjectallocationhigh \
  --clreventlevel informational \
  --output allocation-trace.nettrace

Traces help from a different angle than dumps:

What you want to see | The right tool
What remains | dump / gcdump
Who is referencing it | dump + gcroot
When heavy allocations happened | trace
When GCs occurred | counters / trace
Whether pause time is the problem | counters / trace

In a leak investigation, it is efficient to first use dumps to see “what is remaining,” and then, if needed, traces to see “where it is being created.”

17. Expose Verification Metrics in Code

Serious diagnostics should be done with external tools, but having simple diagnostic logging in the application helps.

For example, exposing GC information via an admin endpoint or periodic log:

public static class GcDiagnostics
{
    public static object Snapshot()
    {
        var info = GC.GetGCMemoryInfo();

        return new
        {
            TotalMemory = GC.GetTotalMemory(forceFullCollection: false),
            HeapSizeBytes = info.HeapSizeBytes,
            FragmentedBytes = info.FragmentedBytes,
            MemoryLoadBytes = info.MemoryLoadBytes,
            HighMemoryLoadThresholdBytes = info.HighMemoryLoadThresholdBytes,
            Gen0Collections = GC.CollectionCount(0),
            Gen1Collections = GC.CollectionCount(1),
            Gen2Collections = GC.CollectionCount(2)
        };
    }
}

This information alone cannot determine a leak. But during an incident, it makes the following judgments easier:

  • Is the Gen 2 collection count spiking?
  • Is HeapSizeBytes growing?
  • Is FragmentedBytes growing?
  • Is the gap between TotalMemory and process memory (Working Set / RSS, measured separately) large?
  • Did the trend change after a deployment?

If you put this in application logs, watch the volume. Heavy diagnostics at high frequency become a load in themselves.
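
The snapshot above does not include process memory, so the gap between TotalMemory and process memory has to be measured alongside it. A minimal sketch of one way to do that, assuming .NET 5 or later and using hypothetical names (MemoryGapSample, MemoryGap):

```csharp
using System;
using System.Diagnostics;

// Managed heap usage next to the process-level numbers the OS reports.
public sealed record MemoryGapSample(long ManagedBytes, long WorkingSetBytes, long PrivateBytes)
{
    // A large or growing gap suggests memory outside the managed heap
    // (native allocations, runtime overhead, or pages the GC has not released).
    public long WorkingSetGapBytes => WorkingSetBytes - ManagedBytes;
}

public static class MemoryGap
{
    public static MemoryGapSample Capture()
    {
        // Process objects hold OS handles, so dispose them after reading.
        using var process = Process.GetCurrentProcess();
        return new MemoryGapSample(
            GC.GetTotalMemory(forceFullCollection: false),
            process.WorkingSet64,
            process.PrivateMemorySize64);
    }
}
```

The absolute size of the gap matters less than its trend. A gap that is stable after warm-up is normal; a gap that keeps widening while ManagedBytes stays flat points away from the GC heap.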

18. The Bar for Declaring “Memory Leak”

At the end of the investigation, you should be able to explain it in this form:

Symptom:
  After running /api/report/export 100 times, the GC Heap stays 300MB higher even after load stops.

Observation:
  In dotnet-counters, Gen 2 heap size grew in proportion to the operation count.
  Not just Working Set — the GC Heap was growing too.

Comparison:
  Comparing before.dmp and after.dmp, MyApp.Models.ReportResult grew by 12,000 instances.

References:
  gcroot showed references from MyApp.Services.ReportCache._items.

Cause:
  ReportCache was a singleton keyed on user ID + current time, with no removal, expiration, or limit.

Fix:
  Replaced with MemoryCache, configured a size limit and expiration.
  Added the cache entry count as a metric.

If you can explain this far, the report is no longer just “memory is growing” — it connects reproduction conditions, observations, the growing type, the references, the cause, and the fix.
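
For reference, the fix in the example report could look roughly like the following. This is a sketch under assumptions, not the actual ReportCache code: BoundedReportCache is a hypothetical name, the 1,000-entry limit and 10-minute expiration are illustrative values, and it depends on the Microsoft.Extensions.Caching.Memory package (available by default in ASP.NET Core projects).

```csharp
#nullable enable
using System;
using Microsoft.Extensions.Caching.Memory;

public sealed class BoundedReportCache : IDisposable
{
    // With SizeLimit set, every entry must declare a size.
    // Here each entry counts as 1, so the limit is an entry count.
    private readonly MemoryCache _cache =
        new MemoryCache(new MemoryCacheOptions { SizeLimit = 1_000 });

    public void Put(string key, byte[] report)
    {
        var options = new MemoryCacheEntryOptions()
            .SetSize(1)
            .SetAbsoluteExpiration(TimeSpan.FromMinutes(10));

        _cache.Set(key, report, options);
    }

    // Returns null when the entry is missing, expired, or was never admitted.
    public byte[]? Get(string key) => _cache.Get<byte[]>(key);

    // Expose this value as a metric so growth is visible without a dump.
    public int EntryCount => _cache.Count;

    public void Dispose() => _cache.Dispose();
}
```

A limit and an expiration only help if the keys are reusable. A key that includes the current time, as in the example cause, produces a new entry on every call, so the key design needs fixing together with the cache.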

19. Caveats During the Investigation

19.1 Investigate with Release builds

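Debug builds can behave differently from production. Without optimizations, the JIT keeps local variables alive until the end of the method, so objects can appear to be retained in Debug when they would already be collectable in Release.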

For production-equivalent investigation, verify with a Release build, near-production settings, and similar data volumes.

19.2 Don’t judge from right after startup

Right after startup, memory grows due to all sorts of initialization.

Take a post-warm-up baseline, and look at growth from there.

19.3 Don’t convict based on a single dump

The types at the top of the heap are not necessarily the culprits.

System.String and System.Byte[] look large in most applications.

What matters is whether they grew over time, and who holds them.

19.4 Dumps contain sensitive data

Memory dumps may contain requests, credentials, connection strings, personal information, and business data.

Establish rules for where dumps are stored, whether they may be taken off-site, who they may be shared with, and when they are deleted.

19.5 In containers, dump capture is itself a risk

Under tight container memory limits, the extra memory and page-ins caused by dump capture can get the container OOM-killed.

Before capturing in a production container, try it in staging and check limits, disk space, permissions, and the PID namespace.

19.6 There are leaks outside the GC heap too

Just because the application is .NET does not mean everything shows up on the GC heap.

In problems like the following, process memory can grow while the GC Heap stays stable:

  • Native libraries
  • P/Invoke
  • COM
  • Image processing
  • Compression libraries
  • Cryptography
  • DB drivers
  • Sockets
  • Marshal.AllocHGlobal
  • NativeMemory.Alloc
  • Excessive threads

In this case, dotnet-dump’s dumpheap is not enough. You also need OS-side diagnostics, metrics from the external libraries, and handle counts, thread counts, and native memory usage.
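
A quick way to see this effect for yourself is to allocate native memory and compare the managed and process-level numbers. This is a demo sketch with a hypothetical name (NativeGrowthDemo); what PrivateMemorySize64 reports differs between operating systems, so treat the process-side delta as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class NativeGrowthDemo
{
    // Allocates `megabytes` MB with Marshal.AllocHGlobal, measures the effect,
    // then frees the memory again.
    public static (long ManagedDeltaBytes, long PrivateDeltaBytes) Run(int megabytes)
    {
        const int BlockSize = 1024 * 1024;
        var blocks = new IntPtr[megabytes]; // allocated before measuring, so it is not counted

        long managedBefore = GC.GetTotalMemory(forceFullCollection: true);
        long privateBefore = ReadPrivateBytes();
        try
        {
            for (int i = 0; i < blocks.Length; i++)
            {
                blocks[i] = Marshal.AllocHGlobal(BlockSize);

                // Touch every 4 KB so the pages are actually committed.
                for (int offset = 0; offset < BlockSize; offset += 4096)
                {
                    Marshal.WriteByte(blocks[i], offset, 1);
                }
            }

            long managedAfter = GC.GetTotalMemory(forceFullCollection: true);
            long privateAfter = ReadPrivateBytes();
            return (managedAfter - managedBefore, privateAfter - privateBefore);
        }
        finally
        {
            foreach (var block in blocks)
            {
                if (block != IntPtr.Zero) Marshal.FreeHGlobal(block);
            }
        }
    }

    private static long ReadPrivateBytes()
    {
        using var process = Process.GetCurrentProcess();
        return process.PrivateMemorySize64;
    }
}
```

The managed delta should stay close to zero however much native memory is allocated, which is exactly why dumpheap shows nothing unusual in this kind of leak.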

20. Verifying After the Fix

Once you fix the suspected leak, re-measure with the same procedure.

Before the fix:
  After 100 runs, Gen 2 grew +300MB
  ReportResult grew +12,000 instances

After the fix:
  After 100 runs, Gen 2 stays stable within +20MB
  ReportResult returns to baseline after load stops
  Cache entry count stable at the 1,000-entry limit

When verifying a fix, always compare under identical conditions:

  • Same data volume
  • Same number of runs
  • Same load duration
  • Same warm-up
  • Same sampling interval
  • Same tools

A memory investigation is not convincing if the before/after comparison is weak.
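
Alongside the before/after measurements, a WeakReference check can confirm in a unit test that a specific object becomes collectable once the fix releases it. The sketch below shows the general pattern with hypothetical names (RetentionCheck, CreateTracked); it is not the code from the article's sample repository.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class RetentionCheck
{
    // Stands in for a static cache or event list that keeps objects reachable.
    private static readonly List<object> Retained = new List<object>();

    // NoInlining keeps the strong local reference confined to this method's frame,
    // so it cannot keep the object alive in the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateTracked(bool retain)
    {
        var payload = new byte[1024];
        if (retain) Retained.Add(payload);
        return new WeakReference(payload);
    }

    // Simulates the fix: the long-lived holder lets go of its entries.
    public static void Release() => Retained.Clear();

    // True if the object is still reachable after a full blocking collection.
    public static bool SurvivesFullGc(WeakReference weak)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return weak.IsAlive;
    }
}
```

A test built this way asserts both directions: the object survives while the holder references it, and it is gone after the holder releases it. Run such tests in Release configuration, since Debug builds can extend object lifetimes.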

21. Conclusion

When memory is growing in .NET, don’t immediately declare a leak — triage in this order:

  1. Don’t judge from Working Set / RSS alone
  2. Look at GC Heap, Gen 2, LOH, and GC counts with dotnet-counters
  3. Compare under load, after load stops, and over time
  4. Look at the growing types with dotnet-gcdump or dotnet-dump
  5. Look at the references with gcroot
  6. Check statics, caches, events, Timers, DI lifetimes, async contexts
  7. If the GC Heap is stable, also suspect native memory and OS-side issues

The difference between “just not GC’d yet” and “leaking” is, in the end, decided by references.

If an object that is no longer needed has no references, it gets collected when a GC that covers its generation runs. If something that should be unnecessary is still being referenced, the GC cannot collect it.

In other words, the goal of the investigation is:

What is growing?
Does it survive every GC?
Who is referencing it?
Is that reference required by the design?

Once you know this much, you can stop being whipsawed by memory graphs and translate the problem into concrete fixes in the code.

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
