What Is Roslyn? Reading, Fixing, and Generating C# Code from the Compiler's Point of View
· Go Komura · .NET, C#, Roslyn, Analyzer, Source Generator, Compiler, Static Analysis, Code Generation, Legacy Asset Reuse
1. What to Understand First
Situations where you want to process C# source code come up more often than you might expect. Typical tasks include the following.
Forbid a particular way of using an API
Mechanically find outdated coding patterns
Collect a list of methods and classes
Investigate dependencies across an entire project
Generate boilerplate code at compile time
Warn at build time when your in-house library is misused
Carry out large-scale replacements or migrations safely
In these situations, it is tempting to simply open the *.cs files and process them with string searches or regular expressions.
But C# code is more than a string of characters. The following two statements look similar, yet they call entirely different methods.
Console.WriteLine("Hello");
MyCompany.Logging.Console.WriteLine("Hello");
With a using alias, the name Console can also refer to a different type.
using Console = MyCompany.Logging.Console;
Console.WriteLine("Hello");
Viewed as strings, all of these look like Console.WriteLine.
But from the compiler's point of view, whether a call is System.Console.WriteLine or a method on some other type cannot be determined without name resolution.
This is where Roslyn comes in. Roslyn is a platform that exposes the information held by the C# and Visual Basic compilers as APIs that applications and tools can use.
Put simply, Roslyn lets you treat C# code in the following ways.
Read it as syntax, not as strings
Read it as meaning, not as appearance
Read it as projects and solutions, not as single files
Produce warnings, fixes, and generated code based on what you read
This article walks through the overall picture of Roslyn: Syntax Trees, SemanticModel, Workspaces, Analyzers, Source Generators, and where each fits in real-world work.
All the code in this article is published on GitHub as a complete buildable, runnable sample set (a library that works with Syntax Trees / SemanticModel, an Analyzer that warns on DateTime.Now, a Source Generator, a demo that analyzes an entire solution, and unit tests that check for false positives and missed detections).
roslyn-dotnet-compiler-platform - komurasoft-blog-samples (GitHub)
2. What Is Roslyn?
Roslyn is officially called the .NET Compiler Platform. It is the compiler implementation for C# and Visual Basic, and at the same time a set of APIs for building code analysis tools.
Traditionally, compilers tended to be treated as black boxes like this.
Source code goes in
The compiler processes it
A DLL / EXE comes out
Developers normally had no access to the information created inside the compiler.
In reality, though, a compiler does not simply translate text into machine code or IL. During compilation it builds up information like this.
This string of text is a class declaration
This identifier is a local variable
This method call refers to this method on this type
The return type of this expression is string
This code is a syntax error
This reference points to a type in assembly A
This using directive is never actually used
Roslyn makes this kind of information available to developers. That is why Roslyn is not just a compiler — it is a platform for understanding code.
3. What Can You Do with Roslyn?
With Roslyn, you can mainly do the following.
Parse C# / VB syntax
Perform semantic analysis of types and methods
Obtain compilation information
Analyze entire projects and solutions
Build custom Analyzers
Build Code Fixes
Build Source Generators
Build refactoring tools
Generate code
Transform code
In more practical terms, the use cases look like this.
Turn use of a forbidden API into a build warning
List every location that uses an outdated API
Check naming rules for async methods
Detect missed handling of IDisposable
Guide developers toward the correct use of your in-house framework
Generate DTO and mapping code at compile time
Generate boilerplate code from configuration files or attributes
Assist with migration analysis from .NET Framework to .NET
What makes Roslyn important is that you can write code that processes C# source on the same foundation as the compiler itself.
If you try to read C# with regular expressions or a home-grown parser, you hit the limits quickly. For example, handling these elements correctly is not easy.
using aliases
Extension methods
partial classes
partial methods
global usings
Nullable annotations
Generic types
Overload resolution
Conditional compilation
Preprocessor directives
Rewriting code while preserving comments and whitespace
Roslyn provides APIs for handling all of these in line with the C# language specification.
4. Roslyn Separates “Syntax” from “Semantics”
When learning Roslyn, the first distinction to internalize is this one.
Syntax: how the code is written
Semantics: what the code refers to
For example, look at this code.
Console.WriteLine(message);
Viewed as syntax, it has this shape.
Expression statement
  Invocation expression
    Member access expression
      Identifier Console
      Identifier WriteLine
    Argument message
But this alone tells you nothing about meaning. Which type Console is, which overload WriteLine resolves to, and what the type of message is cannot be determined from syntax alone.
To see the meaning, you need all of this information.
The state of using directives
Referenced assemblies
Type definitions within the same project
References to other projects
Type inference
Overload resolution
Language version
Nullable context
In Roslyn, this distinction is reflected in the APIs themselves.
Syntax Tree : represents the syntax of the code
SemanticModel : represents what the syntax means
Compilation : represents all the information needed to compile
Workspace : handles solutions, projects, and documents
Once you grasp this separation, Roslyn becomes much easier to navigate.
5. What Is a Syntax Tree?
A Syntax Tree is a tree that represents the syntactic structure of source code. Suppose you have code like this.
class User
{
    public string Name { get; set; }
    public void Rename(string name)
    {
        Name = name;
    }
}
From Roslyn’s point of view, this code has roughly the following structure.
CompilationUnit
  ClassDeclaration: User
    PropertyDeclaration: Name
    MethodDeclaration: Rename
      Parameter: name
      Block
        ExpressionStatement
          AssignmentExpression
A Syntax Tree is not text split into lines. It is a structure organized into C# syntax elements: classes, methods, properties, expressions, statements, arguments, operators, and so on.
Let’s look at a simple example.
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var source = """
    class User
    {
        public string Name { get; set; }
        public void Rename(string name)
        {
            Name = name;
        }
    }
    """;

// Parsing needs only the text; no references or compilation are involved.
var tree = CSharpSyntaxTree.ParseText(source);
var root = tree.GetCompilationUnitRoot();

// Select method declarations by node type rather than by matching text.
var methods = root
    .DescendantNodes()
    .OfType<MethodDeclarationSyntax>();

foreach (var method in methods)
{
    Console.WriteLine(method.Identifier.Text);
}
This code finds the method declarations in the source and prints their names. For this input it prints Rename.
The important point is that we are not searching for the string void — we are looking for “method declarations” as C# syntax.
6. Nodes, Tokens, and Trivia
When working with Syntax Trees, three terms come up constantly.
SyntaxNode
SyntaxToken
SyntaxTrivia
SyntaxNode
A SyntaxNode is a syntactic unit. Examples include the following.
Class declarations
Method declarations
Property declarations
if statements
for statements
Assignment expressions
Invocation expressions
Lambda expressions
Anything that can have further child elements in C# syntax is a Node.
SyntaxToken
A SyntaxToken is the smallest unit that makes up syntax. Examples include the following.
The class keyword
The public keyword
The identifier User
The identifier Rename
{ and }
; and ,
String literals
Numeric literals
Tokens are the leaf elements of the syntax tree.
SyntaxTrivia
SyntaxTrivia is information that is not directly relevant to ordinary semantic analysis but is needed to reproduce the source code exactly. Examples include the following.
Whitespace
Line breaks
Comments
Preprocessor directives
Because of trivia, Roslyn can handle source code with high fidelity, including comments and whitespace.
Trivia is critically important when doing code formatting, refactoring, or mechanical rewrites.
If all you wanted was an AST, it might seem fine to throw comments away. But in real-world code transformations, not breaking comments and line breaks matters a great deal.
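The relationship between tokens and trivia can be checked with a few lines of code. The following is a minimal sketch, assuming a console project that references the Microsoft.CodeAnalysis.CSharp package; the sample source text is made up for illustration. It lists each token, shows the comment attached to a token as trailing trivia, and confirms that the full text of the tree matches the input.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Illustrative input: one comment, some whitespace, one field.
var source = "class User // domain model\n{\n    int id;\n}\n";
var root = CSharpSyntaxTree.ParseText(source).GetRoot();

// Tokens are the leaves of the tree; trivia hangs off tokens as leading or trailing trivia.
foreach (var token in root.DescendantTokens())
{
    Console.WriteLine($"token {token.Kind(),-20} '{token.Text}'");

    var comments = token.TrailingTrivia
        .Where(t => t.IsKind(SyntaxKind.SingleLineCommentTrivia));
    foreach (var comment in comments)
    {
        Console.WriteLine($"  trailing comment: {comment}");
    }
}

// ToFullString() includes all trivia, so the text can be compared with the input.
bool roundTrips = root.ToFullString() == source;
Console.WriteLine($"Round trip identical: {roundTrips}");
```

Because whitespace and comments travel with the tokens, a rewrite that replaces one node can leave the rest of the file's formatting untouched.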
7. Syntax Trees Are Immutable
Roslyn’s Syntax Trees are immutable. That is, instead of mutating a syntax tree you obtained, you create a new tree with the changes applied.
For example, even when you want to rename a method, you do not modify the existing MethodDeclarationSyntax in place. Instead, you create a new node with the change applied.
var newMethod = oldMethod.WithIdentifier(
    SyntaxFactory.Identifier("NewName"));
Immutability brings several benefits.
Easy to use across multiple threads
Snapshots of code being edited in the IDE can be handled safely
Easy to produce diffs
Easy to compare before and after a change
It may feel a little cumbersome at first. But in a world where multiple processes — the IDE, the build, Analyzers, Source Generators — reference the same code simultaneously, immutability is a major advantage.
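To make the "new tree, old tree untouched" behavior concrete, here is a small sketch, again assuming a console project that references Microsoft.CodeAnalysis.CSharp. The one-line class is an invented sample. It renames a method and then prints both the original root and the new root.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var oldRoot = CSharpSyntaxTree.ParseText("class User { void Rename() { } }").GetRoot();
var oldMethod = oldRoot.DescendantNodes().OfType<MethodDeclarationSyntax>().Single();

// Build a new identifier token, carrying over the old token's trivia so spacing is preserved.
var newIdentifier = SyntaxFactory.Identifier(
    oldMethod.Identifier.LeadingTrivia,
    "NewName",
    oldMethod.Identifier.TrailingTrivia);

// WithIdentifier returns a new node; ReplaceNode returns a new root.
var newMethod = oldMethod.WithIdentifier(newIdentifier);
var newRoot = oldRoot.ReplaceNode(oldMethod, newMethod);

Console.WriteLine(oldRoot.ToFullString()); // still contains Rename
Console.WriteLine(newRoot.ToFullString()); // contains NewName
```

Both roots remain usable after the change, which is what lets an IDE keep analyzing one snapshot while another is being produced.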
8. What Is the SemanticModel?
A Syntax Tree only tells you what the code looks like. To know what it means, you use the SemanticModel.
For example, consider this code.
Console.WriteLine("Hello");
From the Syntax Tree, you can tell that there is an identifier Console and an identifier WriteLine. But you cannot tell whether they refer to System.Console.WriteLine(string?) or to a method on some other type.
With the SemanticModel, you can find out which symbol a syntax node resolved to.
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var source = """
    using System;
    class Program
    {
        static void Main()
        {
            Console.WriteLine("Hello");
        }
    }
    """;

var tree = CSharpSyntaxTree.ParseText(source);

// A Compilation needs references so that names such as Console can be resolved.
// Depending on the runtime, more references (for example System.Runtime.dll) may be
// required before binding succeeds; if it fails, Symbol is null.
var compilation = CSharpCompilation.Create(
    assemblyName: "Sample",
    syntaxTrees: new[] { tree },
    references: new[]
    {
        MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
        MetadataReference.CreateFromFile(typeof(Console).Assembly.Location)
    });

var semanticModel = compilation.GetSemanticModel(tree);
var root = tree.GetCompilationUnitRoot();

var invocation = root
    .DescendantNodes()
    .OfType<InvocationExpressionSyntax>()
    .First();

// Ask the compiler which symbol this invocation resolved to.
var symbolInfo = semanticModel.GetSymbolInfo(invocation);
var method = (IMethodSymbol?)symbolInfo.Symbol;

Console.WriteLine(method?.ContainingType.ToDisplayString());
Console.WriteLine(method?.Name);
With this, you can find out which method Console.WriteLine actually resolved to.
Being able to use the compiler’s name-resolution results, not just syntax, is Roslyn’s great strength.
9. What Is a Symbol?
In Roslyn, types, methods, properties, fields, parameters, local variables, and so on are treated as Symbols.
Here are the main interfaces.
INamedTypeSymbol : classes, structs, interfaces, etc.
IMethodSymbol : methods
IPropertySymbol : properties
IFieldSymbol : fields
IParameterSymbol : parameters
ILocalSymbol : local variables
INamespaceSymbol : namespaces
A Symbol represents the meaning resolved by the compiler, not the appearance in the source code.
For example, these two snippets look different.
System.Console.WriteLine("Hello");
using System;
Console.WriteLine("Hello");
But if both refer to the same System.Console.WriteLine, Roslyn’s semantic analysis treats them as the same method symbol.
This property makes the following kinds of checks possible.
Is this call really the API our company forbids?
Does this type implement a particular interface?
Is this method async?
Is this return value nullable?
Is this attribute actually applied?
Does this class inherit from a particular base class?
You get analysis based on the compiler’s judgment, not mere string searching.
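The claim that differently written calls resolve to the same symbol can be checked directly. The sketch below assumes a console project on .NET 5 or later that references Microsoft.CodeAnalysis.CSharp. Building the reference list from the runtime's TRUSTED_PLATFORM_ASSEMBLIES data is an assumption of this example, not something the article's sample does.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var source = """
    using System;
    class Program
    {
        static void Main()
        {
            System.Console.WriteLine("Hello");
            Console.WriteLine("Hello");
        }
    }
    """;

// Assumption: modern .NET, where the runtime exposes the list of platform assemblies.
var references = ((string)AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES")!)
    .Split(Path.PathSeparator)
    .Select(path => MetadataReference.CreateFromFile(path));

var tree = CSharpSyntaxTree.ParseText(source);
var compilation = CSharpCompilation.Create("SymbolDemo", new[] { tree }, references);
var model = compilation.GetSemanticModel(tree);

// Resolve both invocations to symbols.
var symbols = tree.GetRoot()
    .DescendantNodes()
    .OfType<InvocationExpressionSyntax>()
    .Select(invocation => model.GetSymbolInfo(invocation).Symbol)
    .ToList();

// Symbols are compared with SymbolEqualityComparer, not by their source spelling.
bool same = symbols[0] is not null
    && SymbolEqualityComparer.Default.Equals(symbols[0], symbols[1]);

Console.WriteLine(symbols[0]?.ToDisplayString());
Console.WriteLine($"Same symbol: {same}");
```

The null check matters: if binding fails for both calls, comparing two null symbols would otherwise report them as equal.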
10. What Is a Compilation?
A Compilation bundles together everything needed to compile a C# or Visual Basic program.
Concretely, it holds information like this.
The set of SyntaxTrees
Referenced assemblies
Compilation options
Language version
Preprocessor symbols (such as DEBUG)
Information about types and members
Diagnostic information
If you only need to read a single file as syntax, a SyntaxTree is enough. But to do type resolution or reference resolution, you need a Compilation.
For example, you need a Compilation when you want to do any of the following.
Find out which type's method this method call refers to
Find out whether this class implements IDisposable
Find out which attribute type this attribute actually is
Find out the return type of this expression
Obtain compilation errors and warnings
None of these can be determined from syntax alone. You have to take project references and compilation options into account.
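As an illustration of the last item, the following sketch compares what a SyntaxTree alone can report with what a Compilation can report. It assumes a console project on modern .NET that references Microsoft.CodeAnalysis.CSharp; the sample code and the assembly name are invented for the example.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// The code below is syntactically valid but contains a type mismatch.
var tree = CSharpSyntaxTree.ParseText("""
    class Program
    {
        static int Answer() => "forty-two";
    }
    """);

var compilation = CSharpCompilation.Create(
    "DiagnosticsDemo",
    new[] { tree },
    new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
    new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

// Syntax-only diagnostics: the parser has nothing to complain about.
Console.WriteLine($"Parse diagnostics: {tree.GetDiagnostics().Count()}");

// Compilation diagnostics include binding and type errors.
var errors = compilation.GetDiagnostics()
    .Where(diagnostic => diagnostic.Severity == DiagnosticSeverity.Error)
    .ToList();

foreach (var error in errors)
{
    Console.WriteLine($"{error.Id}: {error.GetMessage()}");
}
```

The parser accepts the text, while the compilation reports a conversion error, because only the compilation knows what int and string are.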
11. What Is a Workspace?
When you want to work with whole solutions or projects rather than single files, you use a Workspace.
A Workspace deals with these units.
Solution
Project
Document
For example, loading every project from a solution and analyzing every document looks like this.
using System;
using System.Linq;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis.MSBuild;

// Locate an installed MSBuild / .NET SDK before creating the workspace.
MSBuildLocator.RegisterDefaults();

using var workspace = MSBuildWorkspace.Create();
var solution = await workspace.OpenSolutionAsync("Sample.sln");

foreach (var project in solution.Projects)
{
    Console.WriteLine(project.Name);
    foreach (var document in project.Documents)
    {
        var root = await document.GetSyntaxRootAsync();
        Console.WriteLine($"  {document.Name}: {root?.DescendantNodes().Count()} nodes");
    }
}
Tools like this are useful for surveying existing codebases and assisting migrations.
Typical uses include the following.
Export a list of calls to a particular API as CSV
List every location using an old namespace
Produce a list of public APIs
Investigate inter-project dependencies
Check a huge solution for coding-convention violations
Perform mechanical code transformations
Analyzers are a mechanism that runs integrated with the IDE and the build. Console tools built on Workspaces, by contrast, are well suited to investigations and bulk migrations.
The two are similar, but it pays to keep their roles separate.
12. The Main Ways to Use Roslyn
Broadly speaking, there are four ways to use Roslyn.
1. Use it as a library
2. Build an Analyzer
3. Build a Code Fix
4. Build a Source Generator
Each serves a different purpose.
Using It as a Library
You call the Roslyn APIs from your own console apps or internal tools.
Suitable uses include the following.
Codebase investigation
Bulk conversion
Metrics collection
Migration support
Report generation
In this form, you run the tool whenever you like. Since it does not need to run while someone is typing in the IDE, somewhat heavy processing is more acceptable.
Building an Analyzer
An Analyzer is a mechanism that analyzes code and produces warnings or errors.
For example, you can build rules like these.
Use DateTimeOffset.UtcNow instead of DateTime.Now
Async method names must end with Async
Do not call a library's initialization APIs in the wrong order
Do not use a particular namespace in new code
Forbid the use of Task.Result / Wait
Analyzers can run in Visual Studio and at build time. They let you detect violations of team conventions and library usage rules mechanically, without relying on a reviewer’s memory.
Building a Code Fix
A Code Fix is a mechanism that proposes corrections for problems an Analyzer has found.
Think of the fixes you can apply from the light bulb icon in Visual Studio.
For example, suppose an Analyzer detects this code.
DateTime.Now
A Code Fix can offer a correction like this.
DateTimeOffset.UtcNow
The strength of Code Fixes is that they automate not just “raising a warning” but “the safe way to fix it.”
Building a Source Generator
A Source Generator is a mechanism that generates code at compile time and adds that code to the same compilation.
For example, it has uses like these.
Generate boilerplate code from classes marked with an attribute
Generate type-safe accessors from configuration files
Generate DTO mapping code
Generate serializer code
Generate enum-to-string conversion code
Generate routing or DI registration code
Processing that used to gather information via reflection at run time can sometimes be replaced with code generated at compile time. This can reduce startup cost and improve compatibility with AOT.
13. Analyzers Work as “Automated Code Review”
In practice, the easiest way to think about Analyzers is as “automated code review.”
In code reviews, the same comments often come up over and over.
Please don't use this API
This method name doesn't follow our conventions
This catch swallows the exception
This null check is unnecessary
This call has performance problems
If a human points these out every time, some of them can become Analyzers.
Rules especially well suited to Analyzers look like this.
Good and bad can be judged unambiguously
There are few exceptions
The fix policy is settled
The whole team wants to enforce it
It comes up frequently in reviews
It is acceptable to fail the build on it
Conversely, some things are not suited to Analyzers.
Judgment depends heavily on context
A design decision is required
There are too many exceptions
Opinions differ from person to person
So many warnings that nobody looks at them anymore
Analyzers are a powerful tool. Precisely because they are powerful, adding too many degrades the development experience. It is best to start with a small number of important rules.
14. Start with the Analyzers Included in the .NET SDK
Before writing your own Analyzer, the realistic first step is to look at the Analyzers included in the .NET SDK.
In .NET 5 and later projects, .NET code analysis is enabled by default.
Two families of diagnostic IDs come up frequently.
CAxxxx : code quality, reliability, performance, security, etc.
IDExxxx: code style, IDE assistance, etc.
Analyzer severities can be adjusted in .editorconfig.
[*.cs]
# Example: make unused usings a warning
dotnet_diagnostic.IDE0005.severity = warning
# Example: make CA2000 an error
dotnet_diagnostic.CA2000.severity = error
You may also enable or tighten analysis in the project file.
<PropertyGroup>
  <EnableNETAnalyzers>true</EnableNETAnalyzers>
  <AnalysisLevel>latest</AnalysisLevel>
  <TreatWarningsAsErrors>false</TreatWarningsAsErrors>
</PropertyGroup>
When you introduce analysis to an existing project, you may initially get a flood of warnings. In that case, rather than turning everything into errors at once, proceed in stages.
First get a picture of the total warning count
Adopt a policy of not adding new warnings in new code
Set only the important rules to warning
Set only the rules you truly want to enforce to error
Reduce existing violations on a planned schedule
Think of custom Analyzers as supplementing this foundation with rules specific to your company.
15. A Minimal Analyzer
An Analyzer finds particular syntax or symbols and reports Diagnostics.
As an example, consider an Analyzer that warns on the use of DateTime.Now.
Production code would need to handle type resolution and edge cases carefully, but the minimal sketch looks like this.
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class NoDateTimeNowAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor Rule = new(
        id: "CMP001",
        title: "Do not use DateTime.Now directly",
        messageFormat: "Instead of DateTime.Now, consider DateTimeOffset.UtcNow or another option appropriate to the use case",
        category: "Usage",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        // Skip generated code and allow callbacks to run concurrently.
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();

        context.RegisterSyntaxNodeAction(
            AnalyzeMemberAccess,
            SyntaxKind.SimpleMemberAccessExpression);
    }

    private static void AnalyzeMemberAccess(SyntaxNodeAnalysisContext context)
    {
        var memberAccess = (MemberAccessExpressionSyntax)context.Node;

        // Cheap syntactic filter first: only member accesses named "Now".
        if (memberAccess.Name.Identifier.Text != "Now")
        {
            return;
        }

        // Then confirm with the SemanticModel which symbol this actually is.
        var symbol = context.SemanticModel.GetSymbolInfo(memberAccess).Symbol;
        if (symbol is not IPropertySymbol propertySymbol)
        {
            return;
        }

        if (propertySymbol.Name == "Now" &&
            propertySymbol.ContainingType.ToDisplayString() == "System.DateTime")
        {
            var diagnostic = Diagnostic.Create(Rule, memberAccess.GetLocation());
            context.ReportDiagnostic(diagnostic);
        }
    }
}
The important point in this example is that it is not simply searching for the string DateTime.Now.
It uses the SemanticModel to confirm that the expression actually refers to System.DateTime.Now.
That makes it much less likely to flag unrelated code like this.
MyCompany.DateTime.Now
In Analyzers, this flow — narrow candidates by syntax, then confirm with semantic analysis — is the common pattern.
Find candidates quickly via Syntax
Determine precisely via SemanticModel
Report the location and message via a Diagnostic
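One way to try this three-step flow without an IDE is to run an analyzer against an in-memory compilation. The sketch below assumes a console project on modern .NET that references Microsoft.CodeAnalysis.CSharp. DateTimeNowSketchAnalyzer is a condensed, hypothetical variant of the analyzer above, and the sample source (including MyCompany.DateTime) is invented for the demonstration.

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

var tree = CSharpSyntaxTree.ParseText("""
    namespace MyCompany
    {
        static class DateTime { public static int Now => 0; }
    }
    class Sample
    {
        object A() => System.DateTime.Now;
        object B() => MyCompany.DateTime.Now;
    }
    """);

var compilation = CSharpCompilation.Create(
    "AnalyzerDemo",
    new[] { tree },
    new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
    new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

// Attach the analyzer to the compilation and collect only analyzer diagnostics.
var diagnostics = await compilation
    .WithAnalyzers(ImmutableArray.Create<DiagnosticAnalyzer>(new DateTimeNowSketchAnalyzer()))
    .GetAnalyzerDiagnosticsAsync();

foreach (var diagnostic in diagnostics)
{
    var line = diagnostic.Location.GetLineSpan().StartLinePosition.Line + 1;
    Console.WriteLine($"{diagnostic.Id} at line {line}");
}

// Hypothetical condensed analyzer used only for this demonstration.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class DateTimeNowSketchAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor Rule = new(
        "CMP001", "Do not use DateTime.Now directly", "Avoid DateTime.Now",
        "Usage", DiagnosticSeverity.Warning, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();
        context.RegisterSyntaxNodeAction(Analyze, SyntaxKind.SimpleMemberAccessExpression);
    }

    private static void Analyze(SyntaxNodeAnalysisContext context)
    {
        var access = (MemberAccessExpressionSyntax)context.Node;

        // Syntax narrows the candidates; the symbol decides.
        if (access.Name.Identifier.Text == "Now" &&
            context.SemanticModel.GetSymbolInfo(access).Symbol is IPropertySymbol property &&
            property.ContainingType.ToDisplayString() == "System.DateTime")
        {
            context.ReportDiagnostic(Diagnostic.Create(Rule, access.GetLocation()));
        }
    }
}
```

In this sketch only the System.DateTime.Now access in method A is expected to be reported. The MyCompany.DateTime.Now access has the same spelling but resolves to a different symbol.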
16. Code Fixes Distribute the “How to Fix It”
Analyzers find problems; Code Fixes propose how to fix them.
For example, when DateTime.Now is detected, you could offer fixes like these.
Replace it with DateTimeOffset.UtcNow
Replace it with an abstraction such as IClock.Now
Code Fixes need careful design, though. It is not always correct to replace DateTime.Now with DateTimeOffset.UtcNow. Displaying a local time and handling a timestamp for storage or comparison call for different types and different time-zone handling.
So Code Fixes are appropriate when conditions like these are met.
The meaning after the fix is unambiguous
Side effects are small
The fix can be applied safely as a mechanical transformation
A human can easily verify it
For example, this kind of fix pairs well with Code Fixes.
Replace an old API name with the new one
Add a missing using
Rename to match naming conventions
Add an attribute
Remove an unnecessary argument
On the other hand, for fixes that require design judgment, it is sometimes better to emit only a warning rather than an automatic fix.
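For reference, the overall shape of a Code Fix provider can be sketched as follows. This assumes a class library that references Microsoft.CodeAnalysis.CSharp.Workspaces. The class name UseDateTimeOffsetUtcNowCodeFix is hypothetical, and the diagnostic ID CMP001 is taken from the Analyzer example earlier. As discussed above, this particular replacement changes the expression's type and is not always the right fix, so treat it purely as an illustration of the API.

```csharp
using System.Collections.Immutable;
using System.Composition;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeFixes;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical provider: offers to replace a reported DateTime.Now access.
[ExportCodeFixProvider(LanguageNames.CSharp, Name = nameof(UseDateTimeOffsetUtcNowCodeFix)), Shared]
public sealed class UseDateTimeOffsetUtcNowCodeFix : CodeFixProvider
{
    // Must match the ID reported by the analyzer.
    public override ImmutableArray<string> FixableDiagnosticIds
        => ImmutableArray.Create("CMP001");

    public override FixAllProvider GetFixAllProvider()
        => WellKnownFixAllProviders.BatchFixer;

    public override async Task RegisterCodeFixesAsync(CodeFixContext context)
    {
        var root = await context.Document
            .GetSyntaxRootAsync(context.CancellationToken)
            .ConfigureAwait(false);
        if (root is null)
        {
            return;
        }

        // Find the member access the analyzer reported.
        var diagnostic = context.Diagnostics[0];
        var node = root.FindNode(diagnostic.Location.SourceSpan, getInnermostNodeForTie: true);
        if (node is not MemberAccessExpressionSyntax memberAccess)
        {
            return;
        }

        context.RegisterCodeFix(
            CodeAction.Create(
                title: "Use DateTimeOffset.UtcNow",
                createChangedDocument: _ => Task.FromResult(Replace(context.Document, root, memberAccess)),
                equivalenceKey: "UseDateTimeOffsetUtcNow"),
            diagnostic);
    }

    private static Document Replace(Document document, SyntaxNode root, MemberAccessExpressionSyntax oldNode)
    {
        // Build the replacement expression and keep the surrounding trivia.
        var newNode = SyntaxFactory
            .ParseExpression("System.DateTimeOffset.UtcNow")
            .WithTriviaFrom(oldNode);

        // Immutable update: a new root produces a new Document.
        return document.WithSyntaxRoot(root.ReplaceNode(oldNode, newNode));
    }
}
```

A production fix would likely check how the value is used before offering the change, or offer several alternatives and let the developer choose.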
17. Source Generators Are “Compile-Time Code Generation”
A Source Generator runs at compile time and adds the C# code it generates to the same compilation.
The flow looks like this.
Read the user's source code
Examine attributes and type definitions
Generate the necessary C# code
Add the generated code to the compilation
Here is a simple Source Generator example.
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

[Generator]
public sealed class BuildInfoGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Post-initialization output does not depend on the user's code.
        context.RegisterPostInitializationOutput(static ctx =>
        {
            var source = """
                namespace Generated;
                public static class BuildInfo
                {
                    public static string Tool => "Roslyn Source Generator";
                }
                """;
            ctx.AddSource(
                "BuildInfo.g.cs",
                SourceText.From(source, Encoding.UTF8));
        });
    }
}
A project that references this generator can use the type even though no source file was written for it.
Console.WriteLine(Generated.BuildInfo.Tool);
A Source Generator does not rewrite the user’s existing code. What it can do is generate additional source code and have it participate in the compilation.
So it helps to think of it this way.
Not something that transforms existing code
Something that looks at existing code and produces additional code
If you want to bulk-rewrite existing code, consider a Roslyn-based migration tool or a Code Fix instead of a Source Generator.
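The "additional code, not rewritten code" behavior can be observed by running a generator by hand with a generator driver. This is a minimal sketch, assuming a console project on modern .NET that references Microsoft.CodeAnalysis.CSharp. BuildInfoSketchGenerator is a shortened, hypothetical stand-in for the generator shown above.

```csharp
using System;
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Text;

// The "user" compilation: a single hand-written tree.
var compilation = CSharpCompilation.Create(
    "GeneratorDemo",
    new[] { CSharpSyntaxTree.ParseText("class Program { }") },
    new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
    new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

// Run the generator the way the compiler would, outside of a build.
GeneratorDriver driver = CSharpGeneratorDriver.Create(
    new BuildInfoSketchGenerator().AsSourceGenerator());
driver = driver.RunGeneratorsAndUpdateCompilation(
    compilation, out var updated, out var diagnostics);

// The original tree is still present; the generated tree was added next to it.
var generatedTrees = driver.GetRunResult().GeneratedTrees;
Console.WriteLine($"Trees before: {compilation.SyntaxTrees.Length}, after: {updated.SyntaxTrees.Count()}");
Console.WriteLine($"Generator diagnostics: {diagnostics.Length}");
foreach (var generated in generatedTrees)
{
    Console.WriteLine(generated.GetText());
}

// Hypothetical stand-in for the article's BuildInfoGenerator.
[Generator]
public sealed class BuildInfoSketchGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        context.RegisterPostInitializationOutput(static ctx => ctx.AddSource(
            "BuildInfo.g.cs",
            SourceText.From(
                "namespace Generated; public static class BuildInfo { public static string Tool => \"sketch\"; }",
                Encoding.UTF8)));
    }
}
```

The same driver-based approach is also a common starting point for unit tests that check a generator's output.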
18. How to Reference a Source Generator
When you reference a generator project from another project during development, the handling differs from an ordinary library reference.
That is because a generator is not a library referenced at run time — it is loaded as an Analyzer at compile time.
In a project reference, you specify it like this.
<ItemGroup>
  <ProjectReference Include="..\BuildInfoGenerator\BuildInfoGenerator.csproj"
                    OutputItemType="Analyzer"
                    ReferenceOutputAssembly="false" />
</ItemGroup>
Setting ReferenceOutputAssembly="false" prevents the generator DLL from being treated as a normal reference assembly.
When distributing as a NuGet package, you likewise lay it out so it is loaded as an Analyzer / Source Generator.
Source Generators are convenient, but their lifecycle differs from a normal library.
Normal library: used by the app at run time
Source Generator: used by the compiler at compile time
Keeping this difference in mind is important.
19. What Source Generators Are Good For
A Source Generator is not a license to generate everything.
It is suited to code like this.
Tedious and error-prone to write by hand
Mechanically determined by the input
Generated output is readable
Reduces run-time reflection
Improves compatibility with AOT and trimming
Improves type safety
Some examples.
Metadata for JSON serialization
DI registration code
Configuration value accessors
API clients
Enum conversion code
Type generation from SQL or CSV definitions
Helper code for INotifyPropertyChanged
Be aware, though, that if the generated code is too complex, it becomes hard to trace when problems occur.
When using Source Generators, keep these points in mind.
Make the generated code inspectable
Keep the names of generated code stable
Make generation deterministic
Make Diagnostics for error cases easy to understand
Avoid producing huge diffs from small input changes
Generated code should not look like magic. It is important to emit code that future maintainers can read.
20. What Source Generators Are Not Good For
Some kinds of processing are unsuitable for Source Generators.
Processing that depends on run-time state
Processing that requires network access
Processing that depends on the current value of an external service
Processing whose result changes on every run
Rewriting existing source
Huge whole-solution analysis
Generators run at compile time, so a slow generator degrades build times and the IDE experience.
Environment-dependent generators also cause problems like these.
Passes on the developer's PC but fails on CI
Passes on CI but fails on another OS
Results change depending on cache state
A network outage breaks the build
A Source Generator should be kept as pure as possible.
Input: source code, AdditionalFiles, AnalyzerConfigOptions
Output: generated C# code, Diagnostics
The clearer this relationship, the more stable the generator.
21. Code Investigation Tools Built on Roslyn
Analyzers and Source Generators are not the only uses for Roslyn. Using Roslyn from your own console tools is also effective in practice.
For example, requirements like these come up.
List every location that uses an old API
Count public classes per project
Export classes marked with a particular attribute to CSV
Investigate the namespaces a huge solution depends on
Identify Windows-dependent APIs before a .NET Framework migration
In cases like these, it is often easier to build a one-off or periodically run investigation tool than an Analyzer.
As an example, here is a simple sketch that enumerates the public classes in a solution.
using System;
using System.Linq;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;

MSBuildLocator.RegisterDefaults();

using var workspace = MSBuildWorkspace.Create();
// args[0] is the path to the solution file.
var solution = await workspace.OpenSolutionAsync(args[0]);

foreach (var project in solution.Projects)
{
    foreach (var document in project.Documents)
    {
        var root = await document.GetSyntaxRootAsync();
        if (root is null)
        {
            continue;
        }

        // Compare token kinds rather than token text.
        var classes = root.DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .Where(c => c.Modifiers.Any(m => m.IsKind(SyntaxKind.PublicKeyword)));

        foreach (var cls in classes)
        {
            Console.WriteLine($"{project.Name},{document.FilePath},{cls.Identifier.Text}");
        }
    }
}
This example looks only at syntax. If you wanted “public classes that inherit from a particular base class,” you would need the SemanticModel to examine the type’s inheritance hierarchy.
If names and shapes are enough, use Syntax
If you need types and resolved references, use SemanticModel
If you need to handle whole projects, use Workspace
This division of labor is the basic principle.
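The step from Syntax to SemanticModel mentioned above (public classes that inherit from a particular base class) might look like the following sketch. To stay self-contained it analyzes an in-memory compilation instead of a solution. It assumes a console project on modern .NET that references Microsoft.CodeAnalysis.CSharp, and the type names ControllerBase, ApiController, HomeController, and Helper are invented sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var tree = CSharpSyntaxTree.ParseText("""
    public abstract class ControllerBase { }
    public abstract class ApiController : ControllerBase { }
    public class HomeController : ApiController { }
    public class Helper { }
    """);

var compilation = CSharpCompilation.Create(
    "InheritanceDemo",
    new[] { tree },
    new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

var model = compilation.GetSemanticModel(tree);
var baseType = compilation.GetTypeByMetadataName("ControllerBase");
var derived = new List<string>();

foreach (var cls in tree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>())
{
    // Move from the declaration syntax to the declared type symbol.
    var symbol = model.GetDeclaredSymbol(cls);
    if (symbol is null || symbol.DeclaredAccessibility != Accessibility.Public)
    {
        continue;
    }

    // Walk up the inheritance chain, comparing symbols rather than names.
    for (var current = symbol.BaseType; current is not null; current = current.BaseType)
    {
        if (SymbolEqualityComparer.Default.Equals(current, baseType))
        {
            derived.Add(symbol.ToDisplayString());
            break;
        }
    }
}

Console.WriteLine(string.Join(Environment.NewLine, derived));
```

HomeController is found even though its declaration never mentions ControllerBase, which a purely syntactic check on the base list would miss. In a Workspace-based tool, the compilation would typically come from project.GetCompilationAsync() instead.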
22. How It Differs from Regular Expressions
Regular expressions are handy, but they are not suited to handling the meaning of C# code.
For example, consider this code.
// Console.WriteLine("debug");
A regex search for Console.WriteLine might pick up the text inside the comment.
There are also string literals like this.
var text = "Console.WriteLine";
Or the call may be split across lines.
Console
.WriteLine("Hello");
And aliases may be used.
using C = System.Console;
C.WriteLine("Hello");
Handling all of these correctly with regular expressions is hard.
With Roslyn, you can distinguish comments, string literals, syntactic method invocations, and the actually resolved method.
Of course, for a quick survey, grep or ripgrep is often enough. But if you will be making design decisions or applying automated fixes based on the results, Roslyn is the safer choice.
For a rough search, string matching is fine
For correct judgments as C#, use Roslyn
23. Using Roslyn to Survey Existing Assets
When migrating from .NET Framework to modern .NET, the first thing you need is an accurate picture of the current state. Roslyn is genuinely useful here.
Typical surveys include the following.
A list of System.Web dependencies
A list of code that assumes App.config / Web.config
Locations using Windows Forms / WPF specific APIs
Locations using Remoting / BinaryFormatter
Whether COM references exist
A list of P/Invokes
I/O processing that has not been made asynchronous
Locations using old cryptography APIs
A simple string search can produce candidates. But with Roslyn, you can produce lists based on results resolved as types and methods.
For example, searching for the string BinaryFormatter also picks up comments and documentation.
If you use Roslyn to look for uses of the type System.Runtime.Serialization.Formatters.Binary.BinaryFormatter, you get much more accurate candidates.
In a migration effort, you do not need to build a perfect Analyzer from day one. Even a console investigation tool that can emit a CSV like this already has value.
Project,File,Line,Symbol,Kind
Legacy.Web,Controllers/HomeController.cs,42,System.Web.HttpContext.Current,Property
Legacy.Core,Serialization/OldStore.cs,18,System.Runtime.Serialization.Formatters.Binary.BinaryFormatter,Type
With a list like this in hand, planning the migration becomes much easier.
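A reduced version of such a tool can be sketched as follows. It works on a single in-memory file rather than a solution, and it assumes a console project on modern .NET that references Microsoft.CodeAnalysis.CSharp. Legacy.OldFormatter is a hypothetical stand-in for a legacy type such as BinaryFormatter, and the CSV has fewer columns than the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

var tree = CSharpSyntaxTree.ParseText("""
    namespace Legacy
    {
        public class OldFormatter { }
    }
    class Store
    {
        // OldFormatter is mentioned here only in a comment
        string note = "OldFormatter";
        Legacy.OldFormatter formatter = new Legacy.OldFormatter();
    }
    """, path: "Store.cs");

var compilation = CSharpCompilation.Create(
    "SurveyDemo",
    new[] { tree },
    new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

var model = compilation.GetSemanticModel(tree);
var target = compilation.GetTypeByMetadataName("Legacy.OldFormatter");
var rows = new List<(int Line, string Symbol)>();

// Look at identifier names in code only; comments and string literals are not nodes.
foreach (var name in tree.GetRoot().DescendantNodes().OfType<IdentifierNameSyntax>())
{
    var symbol = model.GetSymbolInfo(name).Symbol;
    if (symbol is null || !SymbolEqualityComparer.Default.Equals(symbol, target))
    {
        continue;
    }

    var line = name.GetLocation().GetLineSpan().StartLinePosition.Line + 1;
    rows.Add((line, symbol.ToDisplayString()));
}

Console.WriteLine("File,Line,Symbol");
foreach (var row in rows)
{
    Console.WriteLine($"{tree.FilePath},{row.Line},{row.Symbol}");
}
```

The comment and the string literal both contain the text OldFormatter, yet neither produces a row, because only names that resolve to the target type symbol are reported.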
24. Roslyn for Library Authors
Roslyn helps not only application developers but library authors as well. Every library has a correct way to be used.
Typical rules include the following.
An initialization method must be called first
A particular attribute must be applied
Dispose must be called
A particular option setting is dangerous
A deprecated API should not be used in new code
If you convey these only through documentation, users will miss them.
If you bundle an Analyzer into the library’s NuGet package, you can raise warnings directly in the user’s code.
For example, given an in-house library Company.Messaging, you can detect misuse like this.
var client = new MessageClient();
client.Send(message); // Sending before Configure has been called
The Analyzer can emit a warning like this.
CMP1001: Call Configure before calling MessageClient.Send
A Code Fix can further offer fix candidates and sample code. This improves the experience of using the library.
Deliver what the documentation says, right in the user's editor
This idea is one of Roslyn’s biggest sources of value.
25. Things to Consider When Shipping an Analyzer via NuGet
Analyzers can be distributed as NuGet packages. But they need to be thought about separately from ordinary run-time libraries, because an Analyzer is not needed at application run time — it is used at build time and inside the IDE.
So package design involves questions like these.
Should the run-time library and the Analyzer ship in the same package?
Should the Analyzer be a separate package?
Should it emit warnings by default?
What severity should it have?
Should it be controllable via .editorconfig?
Will existing users suddenly see a flood of warnings?
For internal use, fairly strict rules may be acceptable.
If you distribute a public library, you must take care not to suddenly break users’ builds.
It is often easier to start at Info or Warning and let users raise it to Error on their side as needed.
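On the packaging side, a common arrangement is to place the Analyzer DLL under analyzers/dotnet/cs in the package instead of lib/, so that it is picked up by the compiler and the IDE but never becomes a run-time reference. The fragment below is a minimal sketch for the Analyzer project's csproj, assuming an SDK-style project. Treat it as a starting point and adapt it to your own package layout.

```xml
<PropertyGroup>
  <TargetFramework>netstandard2.0</TargetFramework>
  <!-- Keep the Analyzer DLL out of lib/, so it never becomes a run-time reference -->
  <IncludeBuildOutput>false</IncludeBuildOutput>
</PropertyGroup>
<ItemGroup>
  <!-- Pack the DLL where the compiler and the IDE look for Analyzers -->
  <None Include="$(OutputPath)\$(AssemblyName).dll" Pack="true" PackagePath="analyzers/dotnet/cs" Visible="false" />
</ItemGroup>
```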
26. Roslyn and IDE Features
In Visual Studio and other .NET development environments, many IDE features rest on Roslyn or on the same ideas.
For example, features like these.
IntelliSense
Go to Definition
Find All References
Rename
Extract Method
Quick Actions
Code style warnings
Unused using detection
None of these can be implemented with plain string search. In Rename, for example, you must not accidentally change a different symbol that happens to have the same name.
class User
{
    public string Name { get; set; }
}
class Product
{
    public string Name { get; set; }
}
When you want to rename User.Name, you must not also change Product.Name. This requires distinguishing them as symbols, not just as syntax.
The Roslyn APIs provide the foundation for bringing these IDE-grade capabilities into your own tools.
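As a small illustration of "distinguishing them as symbols", the sketch below binds two .Name accesses and compares the resulting symbols. The Report class and the SymbolIdentityDemo helper are hypothetical names added only for this example. A real rename would go through the Workspaces layer rather than code like this.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class SymbolIdentityDemo
{
    private const string Source = @"
class User { public string Name { get; set; } }
class Product { public string Name { get; set; } }
class Report
{
    static string Describe(User u, Product p) => u.Name + p.Name;
}";

    // Returns the containing type of each '.Name' access and whether both bind to the same symbol.
    public static (string First, string Second, bool SameSymbol) Compare()
    {
        var tree = CSharpSyntaxTree.ParseText(Source);
        var compilation = CSharpCompilation.Create(
            "Demo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        // Syntactically, both accesses look identical: "<something>.Name".
        var accesses = tree.GetRoot()
            .DescendantNodes()
            .OfType<MemberAccessExpressionSyntax>()
            .Where(access => access.Name.Identifier.ValueText == "Name")
            .ToArray();

        // Semantically, each one binds to a property of a different type.
        var first = model.GetSymbolInfo(accesses[0]).Symbol;
        var second = model.GetSymbolInfo(accesses[1]).Symbol;

        return (
            first?.ContainingType?.Name ?? "?",
            second?.ContainingType?.Name ?? "?",
            SymbolEqualityComparer.Default.Equals(first, second));
    }
}
```

The two accesses share the same text, but they resolve to properties of User and Product respectively, so the symbol comparison reports them as different. That is the information a rename needs in order to touch only one of them.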
27. Performance Considerations
Roslyn is powerful, but heavy processing inside it is still slow. Analyzers and Source Generators in particular may run while developers are typing or while a build is in progress, so their cost is paid again and again.
So pay attention to points like these.
Avoid unnecessary SemanticModel retrieval
Narrow candidates by Syntax before semantic analysis
Avoid file I/O
Do no network access
Avoid heavy reflection
Honor cancellation requests
Be aware of parallel execution
Do not bring whole-solution analysis into an Analyzer
In an Analyzer, register as narrow a set of targets as possible in Initialize.
A bad example.
Look at every SyntaxNode and then discriminate with piles of if statements inside
A better direction.
Register only the SyntaxKinds you need
First filter cheaply by name or shape
Only confirm with the SemanticModel when necessary
An Analyzer may live permanently inside a user’s development environment. So lightness is part of its quality, not just correctness.
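The "better direction" above can be sketched roughly as follows, using a DateTime.Now rule as the example. The diagnostic ID, title, message, and category are placeholders, and this is not necessarily how the Analyzer in the sample repository is written. The sketch only covers member-access syntax, so a use through using static would need an additional registration.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class DateTimeNowAnalyzer : DiagnosticAnalyzer
{
    // The ID, title, message, and category are placeholders for this sketch.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "CMP0001",
        title: "Avoid DateTime.Now",
        messageFormat: "DateTime.Now depends on the local time of the execution environment",
        category: "Reliability",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        // Skip generated code and allow the host to run callbacks in parallel.
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();

        // Register only the one SyntaxKind this rule cares about.
        context.RegisterSyntaxNodeAction(AnalyzeMemberAccess, SyntaxKind.SimpleMemberAccessExpression);
    }

    private static void AnalyzeMemberAccess(SyntaxNodeAnalysisContext context)
    {
        var access = (MemberAccessExpressionSyntax)context.Node;

        // Cheap syntactic filter first: most member accesses are not named "Now".
        if (access.Name.Identifier.ValueText != "Now")
        {
            return;
        }

        // Only now pay for semantic analysis, passing the cancellation token along.
        var symbol = context.SemanticModel.GetSymbolInfo(access, context.CancellationToken).Symbol;
        if (symbol is IPropertySymbol property &&
            property.ContainingType.SpecialType == SpecialType.System_DateTime)
        {
            context.ReportDiagnostic(Diagnostic.Create(Rule, access.Name.GetLocation()));
        }
    }
}
```

The order inside AnalyzeMemberAccess is the point: the name check costs almost nothing and discards most nodes, so the SemanticModel is consulted only for the few candidates that remain.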
28. Designing Diagnostics
The Diagnostics an Analyzer emits are not just about raising a warning. When developers see one, they need to be able to tell the following.
What is the problem?
Why is it a problem?
Where should it be fixed?
How should it be fixed?
Are there exceptions?
An example of a bad message.
CMP0001: This is forbidden
This tells you nothing about what is wrong.
An example of a better direction.
CMP0001: DateTime.Now depends on the local time of the execution environment. For timestamps used in storage or comparison, use DateTimeOffset.UtcNow or a time provider.
It is also worth designing your Diagnostic IDs.
CMP0001-CMP0999: common rules
CMP1000-CMP1999: rules for library A
CMP2000-CMP2999: migration support rules
If you can provide documentation pages, setting HelpLinkUri on the DiagnosticDescriptor is also effective.
Warnings are communication with developers. If the messages are sloppy, the rules themselves stop being trusted.
29. Designing Severities
Choose Analyzer severities carefully. The typical levels are these.
Hidden (written as silent in .editorconfig)
Info (written as suggestion in .editorconfig)
Warning
Error
In practice, it is usually better not to jump straight to Error. Especially with a large existing codebase, starting at Error stalls the rollout.
Realistically, staged adoption like this is easiest.
1. Introduce the rule at Warning
2. Make the warning count visible in CI
3. Stop adding new violations
4. Raise only the critical rules to Error
5. Make a plan to reduce existing violations
The goal of an Analyzer is not to torment developers, but to raise the quality of the codebase without undue friction.
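The staged adoption above is typically driven from .editorconfig rather than from the Analyzer's source. The fragment below is a minimal sketch. The rule IDs are examples taken from the ID ranges used in this article, not real rules.

```ini
# .editorconfig at the repository root (the rule IDs are examples)
[*.cs]
# Step 1: introduce the rule as a warning
dotnet_diagnostic.CMP0001.severity = warning
# Step 4: raise only the critical rules to error
dotnet_diagnostic.CMP2001.severity = error
# A rule that is too noisy for now can be turned down instead of removed
dotnet_diagnostic.CMP1001.severity = suggestion
```

Keeping these lines in the repository means the IDE and CI see the same severities, and each promotion to error shows up as a reviewable diff.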
30. Debugging Source Generators
Source Generators execute in a different place than a normal application, so debugging them has its quirks.
The basic investigation methods are these.
Look at the generated source
Emit Diagnostics
Write tests
Attach a debugger if necessary
In SDK-style projects, a setting that emits the generated files makes them easier to inspect.
<PropertyGroup>
<EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
<CompilerGeneratedFilesOutputPath>$(BaseIntermediateOutputPath)Generated</CompilerGeneratedFilesOutputPath>
</PropertyGroup>
This makes the generated .g.cs files easy to find.
$(BaseIntermediateOutputPath) normally points under obj/.
Be careful if you specify a path directly under the project, such as Generated. SDK-style projects include **/*.cs in compilation by default, so the already-emitted .g.cs files get pulled back in as ordinary sources on the next build, which can cause duplicate type and member errors.
If you really must output under the project root, explicitly exclude the files from compilation, e.g. <Compile Remove="Generated/**/*.cs" />.
Generator tests commonly take the form of comparing input code with the generated output.
Prepare the input source
Run the generator
Verify the generated source
Verify the expected Diagnostics
A Source Generator validated only by manual checks breaks quickly. The more complex the generation logic, the more important the tests.
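The four steps above can be sketched with CSharpGeneratorDriver alone, without any test framework. HelloGenerator is a deliberately trivial, hypothetical generator included only to keep the example self-contained. In a real test you would run your own generator and compare the values with expectations in your test framework of choice.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// A deliberately tiny generator so that the sketch is self-contained.
[Generator]
public sealed class HelloGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        context.RegisterPostInitializationOutput(ctx => ctx.AddSource(
            "Hello.g.cs",
            "namespace Generated { public static class Hello { public const string Text = \"hi\"; } }"));
    }
}

public static class GeneratorTestSketch
{
    public static (int GeneratedTreeCount, string GeneratedText, int GeneratorDiagnosticCount) Run()
    {
        // 1. Prepare the input source.
        var input = CSharpSyntaxTree.ParseText("class Consumer { string Text => Generated.Hello.Text; }");
        var compilation = CSharpCompilation.Create(
            "Demo",
            new[] { input },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // 2. Run the generator.
        GeneratorDriver driver = CSharpGeneratorDriver.Create(new HelloGenerator());
        driver = driver.RunGeneratorsAndUpdateCompilation(compilation, out _, out var diagnostics);

        // 3 and 4. Collect what a test would assert on: the generated source and the Diagnostics.
        var result = driver.GetRunResult();
        var generatedText = result.GeneratedTrees.Single().ToString();
        return (result.GeneratedTrees.Length, generatedText, diagnostics.Length);
    }
}
```

A test would then assert on the returned values, for instance that exactly one tree was generated and that its text contains the expected type. Comparing the full generated text against a stored expected file is a common next step once the output stabilizes.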
31. Testing with Roslyn
Analyzers and Source Generators should be grown with tests. For Analyzers in particular, both false positives and missed detections are problems.
Prepare test patterns like these.
Code that should be detected
Code that must not be detected
Code that uses using aliases
Code that uses fully qualified names
Code that uses a different type with a similar name
Code treated as generated code
Code with nullable enabled
For example, for an Analyzer that forbids System.DateTime.Now, you would verify cases like these.
// Case 1: should be detected (fully qualified name)
var x = System.DateTime.Now;
// Case 2: should also be detected when a using is present
using System;
var x = DateTime.Now;
// Case 3: must not be detected, because DateTime here is a different type
namespace MyCompany
{
    public static class DateTime
    {
        public static string Now => "now";
    }
    public static class Usage
    {
        public static string Get() => DateTime.Now; // binds to MyCompany.DateTime
    }
}
That last case is exactly the kind a string search gets wrong. A Roslyn Analyzer avoids it by confirming the target symbol through the SemanticModel.
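To show how these cases behave at the symbol level, here is a minimal sketch that counts accesses bound to System.DateTime.Now in a source string. It uses only the SemanticModel rather than a full Analyzer test harness. The DateTimeNowCases helper and its three inputs are hypothetical, written for this example.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class DateTimeNowCases
{
    public const string FullyQualified = "class C { object F() => System.DateTime.Now; }";
    public const string WithUsing = "using System; class C { object F() => DateTime.Now; }";
    public const string DifferentType = @"
namespace MyCompany
{
    public static class DateTime { public static string Now => ""now""; }
    public static class Usage { public static string Get() => DateTime.Now; }
}";

    // Counts member accesses that bind to the Now property of System.DateTime.
    public static int CountSystemDateTimeNow(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var compilation = CSharpCompilation.Create(
            "Cases",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        return tree.GetRoot()
            .DescendantNodes()
            .OfType<MemberAccessExpressionSyntax>()
            .Select(access => model.GetSymbolInfo(access).Symbol)
            .OfType<IPropertySymbol>()
            .Count(property => property.Name == "Now"
                && property.ContainingType.SpecialType == SpecialType.System_DateTime);
    }
}
```

CountSystemDateTimeNow(FullyQualified) and CountSystemDateTimeNow(WithUsing) should each return 1, while CountSystemDateTimeNow(DifferentType) should return 0. An input with a using alias for System.DateTime would be a natural fourth case to add.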
32. Can It Be Used with .NET Framework Projects?
Roslyn is not exclusive to modern .NET. But the caveats vary depending on how you intend to use it.
As an Investigation Tool
Building a Roslyn tool as a .NET 8 or .NET 10 console app and loading and analyzing a .NET Framework solution is a realistic option.
In this case, the tool itself runs on modern .NET while the analysis target can be .NET Framework code.
However, to load a solution with MSBuildWorkspace, you need an environment that can build the target projects: MSBuild, the SDK, reference assemblies, and NuGet restore.
In other words, Roslyn alone cannot read everything — resolving a real project’s configuration requires a build environment.
As an Analyzer
Analyzers run by being loaded into the compiler or the IDE.
Even if the target project is .NET Framework, you can use them as long as the environment’s compiler can load Analyzers.
That said, with old csproj formats, old Visual Studio, old MSBuild, or packages.config-based configurations, adoption and operation may not be as smooth as with modern SDK-style projects.
When introducing them into an existing .NET Framework project, check points like these first.
Visual Studio / MSBuild versions
Whether PackageReference can be used
Whether the same Analyzers run on CI
Whether warnings appear in the build log
Whether .editorconfig takes effect
As a Source Generator
A Source Generator is a mechanism the compiler loads at compile time.
So what matters is less the target project’s run-time framework than the support status of the compiler and SDK used to build.
While modern .NET SDK-style projects handle them easily, old .NET Framework projects require caution depending on the project format and build environment.
For existing .NET Framework assets, it is often safer to start with Roslyn-based investigation tools and Analyzers rather than wiring in Source Generators from the start.
33. Choosing Versions Carefully
Roslyn-related NuGet packages include the Microsoft.CodeAnalysis.* family.
The main ones are these.
Microsoft.CodeAnalysis.CSharp
Microsoft.CodeAnalysis.CSharp.Workspaces
Microsoft.CodeAnalysis.Workspaces.MSBuild
Microsoft.CodeAnalysis.Analyzers
Microsoft.CodeAnalysis.CSharp.CodeFix.Testing
Microsoft.CodeAnalysis.CSharp.SourceGenerators.Testing
The point to be careful about here is that Analyzers and Source Generators are loaded into the consumer’s compiler.
That is, if developers’ machines or the CI environment have an old SDK / Visual Studio, an Analyzer / Generator built against too-new Roslyn APIs may fail to run.
If it is for internal use and you can standardize the build environment, you can use relatively new APIs comfortably.
For externally distributed libraries, on the other hand, you need to choose the Microsoft.CodeAnalysis version you depend on conservatively, with the breadth of consumer environments in mind.
As a policy, think of it this way.
Internal only: align CI and dev environments, then use newer APIs
External distribution: choose conservatively for the consumers' SDK/VS range
Generators: design as Incremental Generators where possible
Analyzers: prioritize lightness that does not hurt the IDE experience
Roslyn sits close to the compiler, so it is sensitive to version differences.
34. Don’t Try to Do Everything with Roslyn
Roslyn is powerful, but it is not a tool that solves every problem. For example, these questions cannot be answered with Roslyn alone.
Which branch is taken at run time
What values arrive with production data
Methods invoked dynamically via reflection
The run-time registration results of a DI container
Processing that changes depending on configuration files
Values returned from external services
Roslyn is fundamentally a tool for source code and compilation information. To learn about run-time behavior, you need other means: tests, logs, tracing, profiling, dump analysis, and so on.
So it is best to think of Roslyn’s role this way.
Handle what can be known statically, with high precision
If you force Roslyn to answer questions that can only be answered dynamically, you end up with a complex, inaccurate system.
35. The Order of Adoption
If you are starting to use Roslyn at work, this order is recommended.
1. Get the existing .NET Analyzers and .editorconfig in order
2. Write a small investigation tool using Syntax Trees
3. Try type resolution with the SemanticModel
4. Read a solution with MSBuildWorkspace
5. Build a small team-specific Analyzer
6. Add a Code Fix if needed
7. Consider a Source Generator where boilerplate is heavy
You do not need to go straight to Source Generators. In most teams, Analyzers and investigation tools deliver value sooner.
Especially with large existing assets, a flow like this is realistic.
Understand the current state with an investigation tool
Turn the frequent problems into Analyzers
Turn only the safely fixable ones into Code Fixes
Turn repeatedly written boilerplate into Generators
Roslyn is a tool you can adopt incrementally.
36. A Small Sample: Listing Method Invocations
Finally, let’s look at Roslyn usage in a slightly more practical form. Here is a sketch that lists the method invocations in a solution.
using System;
using System.Linq;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;
// Locate an installed MSBuild before any workspace API is used.
MSBuildLocator.RegisterDefaults();
using var workspace = MSBuildWorkspace.Create();
// args[0] is the path to the .sln file to analyze.
var solution = await workspace.OpenSolutionAsync(args[0]);
foreach (var project in solution.Projects)
{
    // One Compilation per project; null when the project cannot produce one.
    var compilation = await project.GetCompilationAsync();
    if (compilation is null)
    {
        continue;
    }
    foreach (var document in project.Documents)
    {
        var tree = await document.GetSyntaxTreeAsync();
        if (tree is null)
        {
            continue;
        }
        var root = await tree.GetRootAsync();
        var semanticModel = compilation.GetSemanticModel(tree);
        // Syntax first: collect every invocation expression in the file.
        var invocations = root
            .DescendantNodes()
            .OfType<InvocationExpressionSyntax>();
        foreach (var invocation in invocations)
        {
            // Then semantics: skip invocations that do not bind to a single method.
            var symbol = semanticModel.GetSymbolInfo(invocation).Symbol as IMethodSymbol;
            if (symbol is null)
            {
                continue;
            }
            // Line positions are zero-based, so add 1 for display.
            var lineSpan = invocation.GetLocation().GetLineSpan();
            var line = lineSpan.StartLinePosition.Line + 1;
            // Emit one row: project, file, line, containing type, method name.
            Console.WriteLine(string.Join(",", new[]
            {
                project.Name,
                document.FilePath ?? document.Name,
                line.ToString(),
                symbol.ContainingType.ToDisplayString(),
                symbol.Name
            }));
        }
    }
}
Extend a tool like this a little and you can run surveys like these.
Extract only calls to a particular method
List locations using deprecated APIs
Report usage frequency per project
Build a list of APIs targeted for migration
Once you can read source code from the compiler’s point of view, surveying existing code becomes dramatically easier.
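As one example of such an extension, the sketch below aggregates call sites per method for a single Compilation, which is the building block for a per-project usage report. InvocationStats is a hypothetical helper name, and the "Type.Method" key format is an arbitrary choice for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class InvocationStats
{
    // Returns "ContainingType.Method" -> number of call sites, most frequent first.
    public static IReadOnlyList<KeyValuePair<string, int>> CountCalls(Compilation compilation)
    {
        var counts = new Dictionary<string, int>();

        foreach (var tree in compilation.SyntaxTrees)
        {
            var model = compilation.GetSemanticModel(tree);
            var invocations = tree.GetRoot().DescendantNodes().OfType<InvocationExpressionSyntax>();

            foreach (var invocation in invocations)
            {
                // Skip invocations that do not bind to a single method.
                if (model.GetSymbolInfo(invocation).Symbol is not IMethodSymbol method)
                {
                    continue;
                }

                var key = $"{method.ContainingType?.ToDisplayString()}.{method.Name}";
                counts[key] = counts.TryGetValue(key, out var current) ? current + 1 : 1;
            }
        }

        // Sort by frequency, then by name so that the order is stable.
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .ToList();
    }
}
```

Filtering the result by key prefix gives "only calls to a particular method", and running CountCalls once per project and tagging the rows with project.Name gives the per-project frequency report.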
37. Cautions When Rewriting Code with Roslyn
Roslyn can also be used to rewrite code via the syntax tree.
For example, you can rename a method, add an attribute, or add a using.
But code rewriting should be done carefully. The main cautions are these.
Verify that the meaning does not change
Do not break comments or whitespace
Do not let the diff grow too large
Keep formatting consistent
Do not do too many transformations at once
Keep the Git diff easy to review
Because Roslyn’s Syntax Trees preserve trivia, transformations that maintain comments and whitespace are possible. But if you construct nodes carelessly, the formatting of the generated code can fall apart.
When building a rewriting tool, this policy is the safe one.
First do detection only
Review the diff between before and after
Start with small transformations
Write tests for the transformation tool itself
Start in detection-only mode on CI
For large-scale mechanical transformations, Roslyn is powerful — but human review is still needed at the end.
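As a small-scale illustration of a transformation that respects both meaning and trivia, the sketch below renames one method by symbol and carries the identifier's trivia over to the new token. MethodRenameSketch is a hypothetical name. The sketch handles a single file and only the first overload, so it is a demonstration of the idea rather than a replacement for the rename support in the Workspaces layer.

```csharp
#nullable enable
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class MethodRenameSketch
{
    // Renames one method of one type and returns the rewritten source text.
    public static string Rename(string source, string typeMetadataName, string oldName, string newName)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var compilation = CSharpCompilation.Create(
            "Rewrite",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

        var target = compilation.GetTypeByMetadataName(typeMetadataName)?
            .GetMembers(oldName).OfType<IMethodSymbol>().FirstOrDefault();
        if (target is null)
        {
            return source; // Nothing to rename: leave the text untouched.
        }

        var rewriter = new Rewriter(compilation.GetSemanticModel(tree), target, newName);
        return rewriter.Visit(tree.GetRoot())!.ToFullString();
    }

    private sealed class Rewriter : CSharpSyntaxRewriter
    {
        private readonly SemanticModel _model;
        private readonly ISymbol _target;
        private readonly string _newName;

        public Rewriter(SemanticModel model, ISymbol target, string newName)
        {
            _model = model;
            _target = target;
            _newName = newName;
        }

        // Swap only the identifier text; leading and trailing trivia are carried over.
        private SyntaxToken Renamed(SyntaxToken old)
            => SyntaxFactory.Identifier(old.LeadingTrivia, _newName, old.TrailingTrivia);

        public override SyntaxNode? VisitIdentifierName(IdentifierNameSyntax node)
        {
            // Rewrite a reference only when it binds to the target symbol.
            var symbol = _model.GetSymbolInfo(node).Symbol;
            return SymbolEqualityComparer.Default.Equals(symbol, _target)
                ? node.WithIdentifier(Renamed(node.Identifier))
                : base.VisitIdentifierName(node);
        }

        public override SyntaxNode? VisitMethodDeclaration(MethodDeclarationSyntax node)
        {
            // Visit children first, but query the semantic model with the original node.
            var visited = (MethodDeclarationSyntax)base.VisitMethodDeclaration(node)!;
            var declared = _model.GetDeclaredSymbol(node);
            return SymbolEqualityComparer.Default.Equals(declared, _target)
                ? visited.WithIdentifier(Renamed(node.Identifier))
                : visited;
        }
    }
}
```

Because the decision is made per symbol, a method with the same name in another type is left alone, and because only the identifier token is replaced, the surrounding comments and whitespace survive. Running the result through a diff review, as recommended above, is still the final check.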
38. Roslyn and AI Coding Assistance
AI-driven code generation and review assistance have become commonplace in recent years, but Roslyn’s value has not diminished. AI is good at natural language and surrounding context; Roslyn, as a compiler, is good at precise syntactic and semantic information. The two are complementary rather than competing.
For example, a division of labor like this is conceivable.
Use Roslyn to extract the target locations precisely
Use AI to generate the fix policy and explanations
Use Roslyn to verify that the proposed fix compiles
Use an Analyzer to prevent recurrence
Rather than asking an AI to “fix all the old APIs in this codebase,” it can be safer to extract the target locations precisely with Roslyn first.
Then using AI for weighing fix strategies and assisting review is the more practical approach.
Let the compiler handle what the compiler can know. Let humans and AI focus on the judgments above that.
This division of responsibility is what matters.
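The "verify that the proposed fix compiles" step can be quite small. The sketch below compiles a candidate source text in memory and returns its errors. CompileCheck is a hypothetical helper, and it references only the core library. A real check would reuse the project's own references, for example by swapping the changed syntax tree into the existing Compilation.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class CompileCheck
{
    // Returns the compile errors for a candidate source text; an empty list means it compiles.
    public static IReadOnlyList<string> GetErrors(string candidateSource)
    {
        var tree = CSharpSyntaxTree.ParseText(candidateSource);
        var compilation = CSharpCompilation.Create(
            "Candidate",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // GetDiagnostics covers both syntax and semantic errors; warnings are ignored here.
        return compilation.GetDiagnostics()
            .Where(diagnostic => diagnostic.Severity == DiagnosticSeverity.Error)
            .Select(diagnostic => diagnostic.ToString())
            .ToList();
    }
}
```

A proposed fix that returns a non-empty list can be rejected or sent back with the error text before a human ever reviews it. Compiling only proves that the code is well-formed, not that it behaves correctly, so tests and review remain necessary.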
39. A Practical Checklist
Before using Roslyn, it is worth checking points like these.
Is the goal investigation, warnings, fixes, or generation?
Is syntax enough, or is semantic analysis needed?
Is a single file enough, or is the whole project needed?
Does it need to run inside the IDE, or is a one-off tool enough?
Is an impact on build time acceptable?
Will it run on CI?
Will it flood existing code with warnings?
What severity should the Analyzer have?
Can the Code Fix be applied safely?
Can the Source Generator's output be inspected?
Are the consumers' SDK / Visual Studio versions aligned?
When in doubt, divide it up like this.
Want to investigate -> a console tool built on Roslyn
Want it always enforced -> an Analyzer
The fix is settled -> a Code Fix
Want boilerplate generated -> a Source Generator
With this division, it becomes hard to pick the wrong place to apply Roslyn.
40. Summary
Roslyn opens up the C# and Visual Basic compilers as APIs that developers can use.
With Roslyn, you can handle source code not as mere strings, but in forms like these.
Read syntax as Syntax Trees
Read meaning via the SemanticModel
Handle the whole compilation as a Compilation
Handle solutions and projects via Workspaces
Emit warnings as an Analyzer
Propose corrections as a Code Fix
Generate code as a Source Generator
In practice it is especially useful in situations like these.
Surveying existing codebases
Supporting migration from .NET Framework to .NET
Automated checking of team conventions
Guiding library consumers
Generating boilerplate code
Quality assurance in the IDE and on CI
What matters most is not to treat Roslyn as some intimidating “advanced compiler technology.”
To start, reading a single file with CSharpSyntaxTree.ParseText and enumerating the method names is plenty. From there, expand to SemanticModel, Workspaces, Analyzers, and Source Generators.
If we had to put Roslyn in one sentence, it would be this.
It lets you treat C# code not as strings, but as the structure the compiler understood.
With this perspective, automating code review, migration, investigation, and generation becomes a step easier.
References
- The complete sample code for this article (library, Analyzer / Source Generator, demo, unit tests) https://github.com/gomurin0428/komurasoft-blog-samples/tree/main/roslyn-dotnet-compiler-platform
- dotnet/roslyn - GitHub
- The .NET Compiler Platform SDK - Microsoft Learn
- Work with syntax - Microsoft Learn
- Work with semantics - Microsoft Learn
- Work with a workspace - Microsoft Learn
- Overview of .NET source code analysis - Microsoft Learn
- Code analysis using .NET compiler platform analyzers - Microsoft Learn
- Get started with Roslyn analyzers - Microsoft Learn
- Introducing C# Source Generators - .NET Blog
- Source Generator Cookbook - GitHub
Author Profile
Go Komura
Representative of KomuraSoft LLC
Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.