A Practical Procedure for Identifying the Scheme Behind a Hash String

There are plenty of situations where you look at a string like 5f4dcc3b5aa765d61d8327deb882cf99 or $2b$12$... left in logs or a database and want to determine “what kind of hash is this?” In migrations of existing systems, investigations of authentication methods, log analysis, and integrations with third-party systems, it is not unusual to get stuck right here.

The dangerous move, though, is jumping to a conclusion based on length alone. Looking at a 64-character hex string and declaring “that’s SHA-256” is premature. SHA3-256, SHA-512/256, BLAKE2s-256, and BLAKE3’s default 32-byte output all have the same length. Conversely, storage formats that include a prefix and parameters, like $2b$ or $argon2id$, can be identified with quite high accuracy from the string alone.

In this article we use the word hash broadly, covering not only message digests like MD5 / SHA-2 / SHA-3 but also string representations used for password storage such as bcrypt / scrypt / Argon2 / PBKDF2. The content is organized based on the RFCs, NIST publications, Linux crypt(5), Apache, Django, Spring Security, and other official materials publicly available as of April 2026.

1. The Conclusion First

Here is the short version up front.

Storage formats with prefixes or separators are fairly easy to identify from the string alone. Examples: $argon2id$..., $2b$..., $5$..., $6$..., {SHA}..., pbkdf2_sha256$...
Plain hex strings or bare Base64 usually only get you as far as “narrowing the candidates.” Example: a 32-character hex string could be MD5, but it could also be MD4 or NTLM.
The character set is as important as the length. If you see + / =, it looks like RFC 4648 Base64; if it contains . and is $-delimited, it looks like the crypt(3) family — distinctions like these really work.
If you want 100% certainty, you need context. Whether it lives in /etc/shadow, .htpasswd, Django’s auth_user, or Spring Security changes the story.

In short, “schemes you can identify from the string alone” and “schemes where the string only gives you a candidate set” are two different things. Just keeping these separate changes how an investigation proceeds.

2. At-a-Glance Identification Tables

2.1 Formats nearly pinned down by a prefix or format marker

“Confidence” in the table has the following meaning.

Strong: nearly identifiable from the string alone
Medium: candidates narrow considerably, but watch for implementation differences
Weak: cannot be determined from length or appearance alone

Visual feature	First suspect	Confidence	Notes	Example
`$argon2id$...`	Argon2id	Strong	PHC string format. Often followed by `v=`, `m=`, `t=`, `p=`	`$argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4`
`$argon2i$...`	Argon2i	Strong	Same as above	`$argon2i$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$Kx1koF/7n8EytGJYTS5krh+ag+FlG5ksM4xOsjOSDvo`
`$argon2d$...`	Argon2d	Strong	Same as above	`$argon2d$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$HLIGA+T1bwK8akx3LGOco+Df+PvxX6cIXhycO7O7t6c`
`$2a$...` / `$2b$...` / `$2y$...`	bcrypt	Strong	2-digit cost + crypt-family alphabet	`$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO`
`$1$...`	md5crypt	Strong	The Unix-family MD5 password storage format	`$1$vA7mQ9xZ$Erz32JUFnZ9991KdU5.N3.`
`$5$...`	sha256crypt	Strong	Not plain SHA-256	`$5$rounds=5000$N3v8Kx2Lq9Rt$uOTla5GAHaRH2aHlUSjkrZUBCuFiahQZ36O/seB39r3`
`$6$...`	sha512crypt	Strong	Not plain SHA-512	`$6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1`
`$7$...`	scrypt (crypt family)	Strong	Seen in Linux `crypt(5)`-family implementations	`$7$CU..../....k2XAnEHBqQ1Ct2aMXFKNa/$y3Q0e/UlCHacIGWQshgvvz6UIbP.BCja.5BfVWP2Ml8`
`$y$...`	yescrypt	Strong	Seen on newer Linux systems	`$y$j9T$k2XAnEHBqQ1Ct2aMXFKNa/$OVYXzjlkiQpWT/F1CUE0JrvV4phLY8FB.ofDttnrSQ7`
`$apr1$...`	Apache APR1-MD5	Strong	Often seen in `.htpasswd`	`$apr1$vA7mQ9xZ$ZE64.ohiyK11sPZmtnJZQ.`
`{SHA}...`	Base64 representation of a SHA-1 digest	Strong	Often seen in Apache / LDAP contexts	`{SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE=`
`{SSHA}...`	salted SHA-1	Strong	LDAP family	`{SSHA}/OczD0GNNkOAUPbYhA3L9fjmcyBCbHVlTWVzYTQyIQ==`
`{MD5}...` / `{SMD5}...`	MD5 / salted MD5	Strong	LDAP family	`{MD5}X03MO1qnZdYdgyfeuILPmQ==` `{SMD5}fOn1rOv4ZH0OrO/KT9H0fEJsdWVNZXNhNDIh`
`pbkdf2_sha256$...`	PBKDF2-HMAC-SHA256	Medium–Strong	Django and others prepend the format name	`pbkdf2_sha256$600000$N3v8Kx2Lq9Rt$CLxGB+zTiV1IdOt2y4m9JpaAONzHuRTOd96xKQwRQAs`
`{bcrypt}$2b$...`	bcrypt	Strong	Wrapped in Spring Security’s `{id}` prefix	`{bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO`
`{pbkdf2}...` / `{scrypt}...`	Implementation-labeled schemes	Medium–Strong	Spring Security and similar; identify the wrapper format rather than the underlying algorithm	`{pbkdf2}sha256$600000$Qmx1ZU1lc2E0MiE$4eNuai1qNkgs1kXz3+tBUMzAexVsSUz9SrQKEhbk0Cw` `{scrypt}ln=14,r=8,p=1$Qmx1ZU1lc2E0MiE$xAgBRhXbMtHB1UHUR0br5bI+1XdXWKbwauiFv5VRQBY`

The point of this table is that formats whose first few characters carry meaning are the easiest to identify. Strings delimited by $...$ in particular are very likely Unix crypt(3) / MCF / PHC-family formats, so it is faster to look at the prefix before the length.
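As a rough illustration, the prefix column of the table can be turned into a lookup. This is a minimal sketch, not a complete detector: the names PREFIXES and guess_by_prefix are made up for this example, and the labels simply repeat the "First suspect" column.

```python
from typing import Optional

# (prefix, first suspect) pairs copied from the table in section 2.1.
PREFIXES = [
    ("$argon2id$", "Argon2id (PHC string)"),
    ("$argon2i$", "Argon2i (PHC string)"),
    ("$argon2d$", "Argon2d (PHC string)"),
    ("$apr1$", "Apache APR1-MD5"),
    ("$2a$", "bcrypt"),
    ("$2b$", "bcrypt"),
    ("$2y$", "bcrypt"),
    ("$1$", "md5crypt"),
    ("$5$", "sha256crypt"),
    ("$6$", "sha512crypt"),
    ("$7$", "scrypt (crypt family)"),
    ("$y$", "yescrypt"),
    ("{SSHA}", "salted SHA-1 (LDAP)"),
    ("{SHA}", "Base64 SHA-1 digest"),
    ("{SMD5}", "salted MD5 (LDAP)"),
    ("{MD5}", "Base64 MD5 digest"),
    ("pbkdf2_sha256$", "PBKDF2-HMAC-SHA256 (Django-style)"),
]


def guess_by_prefix(value: str) -> Optional[str]:
    """Return the first suspect for a known prefix, or None if nothing matches."""
    for prefix, label in PREFIXES:
        if value.startswith(prefix):
            return label
    return None


print(guess_by_prefix("$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO"))
print(guess_by_prefix("5f4dcc3b5aa765d61d8327deb882cf99"))
```

A None result only means "no known prefix"; a bare digest like the second input still has to go through the length and context checks.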

2.2 Narrowing candidates by length for plain hex / Base64

This table is for bare digest strings without a prefix. For representations containing :, -, or whitespace, first strip the separators and then count the length.

Raw byte length	Hex chars	Base64 chars (with / without `=`)	Main candidates	Example
4	8	8 / 6	Checksums such as CRC32	`cbf43926`
16	32	24 / 22	MD5, MD4, NTLM family	`5f4dcc3b5aa765d61d8327deb882cf99`
20	40	28 / 27	SHA-1, RIPEMD-160	`da39a3ee5e6b4b0d3255bfef95601890afd80709`
28	56	40 / 38	SHA-224, SHA-512/224, SHA3-224	`d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f`
32	64	44 / 43	SHA-256, SHA-512/256, SHA3-256, BLAKE2s-256, BLAKE3’s default 32-byte output	`e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855`
48	96	64 / 64	SHA-384, SHA3-384, BLAKE2b-384	`38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da274edebfe76f65fbd51ad2f14898b95b`
64	128	88 / 86	SHA-512, SHA3-512, BLAKE2b-512, Whirlpool	`cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e`

The key takeaway here is that a matching length does not uniquely determine the scheme. Hex strings of 32 / 64 / 128 characters in particular have many candidates, and a verdict based on length alone is often wrong.
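The length table can be sketched as a function that returns a candidate list rather than a single answer. The names CANDIDATES_BY_BYTES and candidates_for_hex are hypothetical, and the candidate sets are just the ones listed in the table, so treat the output as a starting point.

```python
import re
from typing import List

# Candidate sets copied from the table in section 2.2, keyed by raw byte length.
CANDIDATES_BY_BYTES = {
    4: ["CRC32 (checksum)"],
    16: ["MD5", "MD4", "NTLM"],
    20: ["SHA-1", "RIPEMD-160"],
    28: ["SHA-224", "SHA-512/224", "SHA3-224"],
    32: ["SHA-256", "SHA-512/256", "SHA3-256", "BLAKE2s-256", "BLAKE3 (default)"],
    48: ["SHA-384", "SHA3-384", "BLAKE2b-384"],
    64: ["SHA-512", "SHA3-512", "BLAKE2b-512", "Whirlpool"],
}


def candidates_for_hex(value: str) -> List[str]:
    """Strip ':', '-' and whitespace, then map the hex length to candidates."""
    cleaned = re.sub(r"[:\-\s]", "", value)
    if not re.fullmatch(r"[0-9a-fA-F]+", cleaned) or len(cleaned) % 2:
        return []  # not a hex string with a whole number of bytes
    return CANDIDATES_BY_BYTES.get(len(cleaned) // 2, [])


print(candidates_for_hex("5f4dcc3b5aa765d61d8327deb882cf99"))
print(candidates_for_hex("DA:39:A3:EE:5E:6B:4B:0D:32:55:BF:EF:95:60:18:90:AF:D8:07:09"))
```

Returning a list keeps the "candidate set, not verdict" distinction visible in the calling code.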

2.3 Classic examples that trip people up

What the string looks like	Common snap judgment	The right way to read it	Example
`5f4dcc3b5aa765d61d8327deb882cf99`	Definitely MD5	Looks like MD5, but could also be MD4 / NTLM family or an app-specific use of MD5	`8846f7eaee8fb117ad06bdd830b7586c`
64 hex like `2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824`	Definitely SHA-256	SHA-256 is a candidate, but SHA3-256 / SHA-512/256 / BLAKE2s-256 / BLAKE3 are also possible	`e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855`
`$6$rounds=5000$salt$hash`	A hex representation of SHA-512	No — it is a password hash string called sha512crypt	`$6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1`
`{SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE=`	Some kind of “SHA”	In Apache / LDAP contexts this usually means a Base64-encoded SHA-1 digest	`{SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE=`
`{bcrypt}$2b$12$...`	A proprietary scheme called `{bcrypt}`	bcrypt wrapped by Spring Security	`{bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO`

3. The Practical Identification Procedure

From here, let’s organize how to actually look at a string, step by step. The recommended order is prefix → separators → character set → length → context.

3.1 Look at the leading characters first

The first one to ten characters narrow things down considerably.

$argon2id$ / $argon2i$ / $argon2d$: Strongly suspect Argon2’s PHC string format. The components are easy to follow in the Example column of section 2.1.
$2a$ / $2b$ / $2y$: Strongly suspect bcrypt.
$1$ / $5$ / $6$ / $7$ / $y$: Suspect a Unix crypt(3)-family password hash.
{SHA} / {SSHA} / {MD5} / {SMD5}: Suspect LDAP / Apache-family representations.
{bcrypt} / {pbkdf2} / {scrypt}: Suspect an implementation-labeled storage format like Spring Security’s.

The trick here is to look not just at the underlying algorithm but at the storage format. For example, $6$ is not “a SHA-512 digest” — it is “a password hash string that uses SHA-512.” Mixing these up will skew the rest of the investigation.

3.2 Look at the number of separators

Next, look at separators such as $, :, {}, the comma, and =.

Multiple $ characters: Suspect a format that carries parameters, salt, and hash together. Argon2, bcrypt, sha256crypt, and sha512crypt are typical.
Starts with {name}: Suspect a wrapper that names the scheme explicitly, as in LDAP / Spring Security.
Shapes like algo:salt:hash or algo$iterations$salt$hash: Suspect a framework- or app-specific format. Django’s pbkdf2_sha256$iterations$salt$hash is the classic example.

The more separators a string has, the easier the scheme is to identify. Conversely, a single lump of bare hex or Base64 stays quite ambiguous.
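Splitting on the separators is often enough to make the structure visible. The sketch below uses two hypothetical helpers, split_wrapper and dollar_fields; it only separates the fields and does not validate them.

```python
import re
from typing import List, Optional, Tuple


def split_wrapper(value: str) -> Tuple[Optional[str], str]:
    """Split a leading {name} wrapper from the rest, e.g. '{bcrypt}$2b$...'."""
    match = re.match(r"^\{([^{}]+)\}(.*)$", value, re.DOTALL)
    if match:
        return match.group(1), match.group(2)
    return None, value


def dollar_fields(value: str) -> List[str]:
    """Split a $-delimited string into fields, dropping the empty lead field."""
    fields = value.split("$")
    return fields[1:] if value.startswith("$") else fields


label, inner = split_wrapper("{bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO")
print(label, dollar_fields(inner))
print(dollar_fields("$argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4"))
print(dollar_fields("pbkdf2_sha256$600000$N3v8Kx2Lq9Rt$CLxGB+zTiV1IdOt2y4m9JpaAONzHuRTOd96xKQwRQAs"))
```

Once the fields are separated, the scheme id, parameters, salt, and hash can each be checked on their own.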

3.3 Look at the character set

The character set is as important as the length.

Hex representation

If the string consists only of [0-9a-fA-F], suspect a hex representation first. In this case, character count ÷ 2 = raw byte length.

32 hex → 16 bytes
40 hex → 20 bytes
64 hex → 32 bytes
128 hex → 64 bytes

RFC 4648 Base64 / Base64url

If you see +, /, or =, suspect ordinary Base64 first. If you see - or _, suspect Base64url. Padding = may be omitted, so lengths come in “either is possible” pairs like 43 / 44 and 86 / 88.
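A small sketch of this check, using only the standard library. The function names classify_base64 and decoded_length are assumptions made for this example, and the classification looks at the alphabet only; a string that happens to use neither + / nor - _ is reported as "either".

```python
import base64
import re


def classify_base64(value: str) -> str:
    """Classify by alphabet only: 'base64', 'base64url', 'either', or 'neither'."""
    body = value.rstrip("=")
    if not body or not re.fullmatch(r"[A-Za-z0-9+/_-]+", body):
        return "neither"
    has_std = bool(re.search(r"[+/]", body))
    has_url = bool(re.search(r"[-_]", body))
    if has_std and has_url:
        return "neither"  # mixing both alphabets is not valid in either variant
    if has_std:
        return "base64"
    if has_url:
        return "base64url"
    return "either"


def decoded_length(value: str) -> int:
    """Restore omitted padding, decode, and return the raw byte length.

    Raises binascii.Error if the input cannot be valid Base64 of any length.
    """
    body = value.rstrip("=")
    padded = body + "=" * (-len(body) % 4)
    return len(base64.urlsafe_b64decode(padded.replace("+", "-").replace("/", "_")))


print(classify_base64("VBPuJHI7uixaa6LQGWx4s+5GKNE="), decoded_length("VBPuJHI7uixaa6LQGWx4s+5GKNE="))
```

The decoded byte length is what feeds the candidate table in section 2.2, regardless of whether the padding was present.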

crypt-family radix64

If . and / appear and the string is delimited with $...$, it is more natural to suspect a crypt-family alphabet than ordinary Base64. bcrypt, sha256crypt, sha512crypt, md5crypt, yescrypt, scrypt, and friends use this family of character sets.

This is unglamorous but very effective. If you read “there’s a . in it, so it’s broken Base64”, you will easily overlook bcrypt and the crypt(3) family.
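The alphabet check can be sketched as a regular expression over the characters ./0-9A-Za-z. The helper names below are hypothetical, and the check is a heuristic: a PHC-style string whose Base64 hash happens to contain no + would also pass, so the prefix still decides.

```python
import re

# Characters used by crypt-family encodings (the ordering differs between schemes).
CRYPT_ALPHABET = re.compile(r"[./0-9A-Za-z]+")


def looks_like_crypt_field(field: str) -> bool:
    """True if the field uses only the crypt-family characters ./0-9A-Za-z."""
    return bool(CRYPT_ALPHABET.fullmatch(field))


def looks_like_crypt_string(value: str) -> bool:
    """True for a $-delimited string whose final field fits the crypt alphabet."""
    if not value.startswith("$") or value.count("$") < 3:
        return False
    return looks_like_crypt_field(value.rsplit("$", 1)[1])


print(looks_like_crypt_string("$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO"))
print(looks_like_crypt_field("VBPuJHI7uixaa6LQGWx4s+5GKNE="))
```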

3.4 Count the length

After the character set, look at the length. The reasoning is simple.

For hex: raw byte length = character count / 2
For Base64: character count = 4 × ceil(raw byte length / 3) with padding; omitting the = padding makes it 0–2 characters shorter

At this stage you narrow the candidates. But it is safer not to make the leap of “64 hex, therefore confirmed SHA-256.”
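The length arithmetic can be checked with a few lines of integer math. This is a sketch with a made-up function name, expected_lengths; it reproduces the hex and Base64 columns of the table in section 2.2.

```python
from typing import Tuple


def expected_lengths(raw_bytes: int) -> Tuple[int, int, int]:
    """Return (hex chars, Base64 chars with padding, Base64 chars without)."""
    hex_chars = 2 * raw_bytes
    padded = 4 * ((raw_bytes + 2) // 3)   # 4 * ceil(n / 3)
    unpadded = (4 * raw_bytes + 2) // 3   # ceil(4n / 3)
    return hex_chars, padded, unpadded


for n in (16, 20, 32, 64):
    print(n, expected_lengths(n))
```

For 32 raw bytes this gives 64 hex characters and 44 / 43 Base64 characters, which is why both 43 and 44 should map to the same candidate set.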

3.5 Confirm with context

What clinches it in the end is context. This is where you approach 100%.

Found in /etc/shadow: Suspect Linux password hash formats such as $y$, $6$, $5$, $1$
Found in .htpasswd: Suspect Apache-family formats such as $apr1$, {SHA}, bcrypt
Found in Django settings or auth_user.password: Suspect Django formats such as pbkdf2_sha256$... or argon2$...
Found in a Spring Security authentication table: Suspect {id}-prefixed formats such as {bcrypt}... or {pbkdf2}...
32 hex found around SMB / AD integration: Seriously consider NTLM / MD4 family

In practice, looking at the product, framework, or configuration file the string came from is often faster than staring at the string itself.

4. Common Misidentifications

4.1 Hard-coding `64 hex = SHA-256`

This one is very common. SHA-256 is of course a strong candidate, but multiple schemes produce the same 32-byte output. SHA3-256, SHA-512/256, BLAKE2s-256, and BLAKE3’s default output are all the same length.

Length is material for building a candidate set, not material for a verdict.
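This is easy to see with Python's hashlib. The sketch covers only the 32-byte algorithms that hashlib always provides; SHA-512/256 and BLAKE3 are left out because their availability depends on the OpenSSL build or on a third-party package.

```python
import hashlib

data = b"hello"
digests = {
    "SHA-256": hashlib.sha256(data).hexdigest(),
    "SHA3-256": hashlib.sha3_256(data).hexdigest(),
    "BLAKE2s-256": hashlib.blake2s(data).hexdigest(),  # default digest size is 32 bytes
}

# Every line shows 64 hex characters, but the values are unrelated.
for name, hexdigest in digests.items():
    print(f"{name:12} {len(hexdigest)} hex  {hexdigest[:16]}...")
```

Nothing in the output itself says which algorithm produced which line, which is exactly the problem with a length-only verdict.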

4.2 Mistaking $6$ for plain SHA-512

$6$... is the sha512crypt prefix. It is not “a hex SHA-512 digest” — it is a password hash string that includes a salt and rounds.

Likewise:

$5$ is sha256crypt
$1$ is md5crypt

The moment a prefix is present, it is no longer “just a digest.”
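Parsing the string makes the difference concrete. The sketch below assumes the $id$[rounds=N$]salt$hash layout and the default of 5000 rounds used by sha256crypt / sha512crypt when no rounds field is present; ShaCrypt and parse_sha_crypt are names invented for this example.

```python
from typing import NamedTuple


class ShaCrypt(NamedTuple):
    scheme: str
    rounds: int
    salt: str
    digest: str


def parse_sha_crypt(value: str) -> ShaCrypt:
    """Parse '$5$...' / '$6$...' into its parts; raises ValueError otherwise."""
    fields = value.split("$")
    # fields[0] is the empty string before the first '$'.
    if len(fields) < 4 or fields[0] != "" or fields[1] not in ("5", "6"):
        raise ValueError("not a sha256crypt / sha512crypt string")
    scheme = "sha256crypt" if fields[1] == "5" else "sha512crypt"
    rest = fields[2:]
    rounds = 5000  # assumed default when the rounds= field is absent
    if rest[0].startswith("rounds="):
        rounds = int(rest[0][len("rounds="):])
        rest = rest[1:]
    if len(rest) != 2:
        raise ValueError("expected exactly a salt field and a digest field")
    return ShaCrypt(scheme, rounds, rest[0], rest[1])


parsed = parse_sha_crypt(
    "$6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1"
)
print(parsed.scheme, parsed.rounds, parsed.salt, len(parsed.digest))
```

The digest field here is 86 characters in the crypt alphabet, not 128 hex characters, which is another visible sign that this is not a plain SHA-512 digest.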

4.3 Reading `{SHA}` as “either SHA-256 or SHA-512”

In Apache or LDAP contexts, {SHA} does not vaguely mean “the SHA family.” In most cases it means a Base64-encoded SHA-1 digest. {SSHA} is salted SHA-1.

If you handle {SHA} loosely as “some kind of SHA” based on appearance, you will get verification code and migration logic wrong.
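For reference, a minimal sketch of producing and checking a {SHA} value under that reading (unsalted SHA-1, standard Base64). The function names are hypothetical, and UTF-8 encoding of the password is an assumption; the source system may use a different encoding.

```python
import base64
import hashlib
import hmac


def ldap_sha(password: str) -> str:
    """Build '{SHA}' + Base64(SHA-1(password)), assuming UTF-8 input."""
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return "{SHA}" + base64.b64encode(digest).decode("ascii")


def verify_ldap_sha(stored: str, password: str) -> bool:
    """Recompute the value and compare in constant time."""
    if not stored.startswith("{SHA}"):
        return False
    return hmac.compare_digest(stored.encode("utf-8"), ldap_sha(password).encode("utf-8"))


stored = ldap_sha("example-password")
print(stored, verify_ldap_sha(stored, "example-password"))
print(len(base64.b64decode("VBPuJHI7uixaa6LQGWx4s+5GKNE=")))  # raw digest length in bytes
```

The decoded payload is 20 bytes, which matches SHA-1 and rules out SHA-256 (32 bytes) and SHA-512 (64 bytes).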

4.4 Treating password hashes and content hashes as the same thing

They are both “hash strings,” but their purposes differ.

Digests for file integrity checks
Digests for API signing
Hash / KDF strings for password storage

These three look similar but are handled differently. Password hashes in particular often embed salt, rounds, memory cost, parallelism, and so on into the string, so an approach that simply compares raw digests cannot verify them.

4.5 Forgetting XOFs and variable-length digests

SHAKE128 / SHAKE256 are XOFs, so the output length can be chosen freely. BLAKE2 also allows a configurable digest length, and BLAKE3 has extendable output as well.

In other words, the inference “this length, therefore this scheme” fails whenever it assumes classic fixed-length digests.
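A short hashlib sketch shows how easily these produce "familiar" lengths. The input and the chosen output sizes are arbitrary; the point is only that the caller picks the length.

```python
import hashlib

data = b"example"

# SHAKE output length is chosen by the caller (in bytes).
shake_as_16 = hashlib.shake_128(data).hexdigest(16)   # 32 hex chars, same size as MD5
shake_as_32 = hashlib.shake_256(data).hexdigest(32)   # 64 hex chars, same size as SHA-256

# BLAKE2b accepts a digest_size from 1 to 64 bytes.
blake_as_20 = hashlib.blake2b(data, digest_size=20).hexdigest()  # 40 hex chars, same size as SHA-1

for label, value in [("shake_128/16", shake_as_16),
                     ("shake_256/32", shake_as_32),
                     ("blake2b/20", blake_as_20)]:
    print(f"{label:13} {len(value)} hex  {value}")
```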

5. The Verification Order When You Need 100% Certainty

Migrations and authentication integrations ultimately require certainty. In that case, checking in the following order keeps you out of trouble.

5.1 Identify the storage source

First, establish where the string came from.

Linux shadow?
Apache / Nginx basic auth?
LDAP?
Django / Spring Security?
A custom application’s DB?

The specification of the storage source is often stronger evidence than the string alone.

5.2 Look up the “storage format” in official documentation

Next, look up the storage format, not the algorithm name.

Django password format
Spring Security password storage format
crypt(5) sha512crypt format
Apache htpasswd password formats

Searching with format / storage / encoding as keywords makes these easy to find.

5.3 If you have a known plaintext, actually verify against the candidate schemes

If you have a test account or a known plaintext, the fastest route is to compute with the candidate schemes and compare. For password hashes, this means extracting the salt and rounds from the string and recomputing.
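For bare digests, the comparison can be sketched as a loop over candidate algorithms. The function name matching_schemes and the candidate list are assumptions for this example. It only tries the plain digest of the plaintext as given, so salted or re-encoded variants will not match, and MD4 / NTLM are omitted because hashlib support for MD4 depends on the OpenSSL build.

```python
import hashlib
from typing import List

# Algorithms that hashlib always exposes as module-level constructors.
CANDIDATES = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512",
              "sha3_224", "sha3_256", "sha3_384", "sha3_512",
              "blake2s", "blake2b"]


def matching_schemes(known_plaintext: bytes, observed_hex: str) -> List[str]:
    """Return the candidates whose plain digest of the plaintext equals the observed hex."""
    observed = observed_hex.strip().lower()
    matches = []
    for name in CANDIDATES:
        computed = getattr(hashlib, name)(known_plaintext).hexdigest()
        if computed == observed:
            matches.append(name)
    return matches


print(matching_schemes(b"password", "5f4dcc3b5aa765d61d8327deb882cf99"))
print(matching_schemes(b"hello", "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"))
```

An empty result does not rule anything out by itself; it may mean a salt, a different text encoding, or an algorithm outside the list.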

5.4 Check the implementation code or configuration

If the system under investigation is your own, looking at the code and configuration is ultimately the most reliable.

The library in use
The framework configuration
The options used at generation time
The output encoding (hex / Base64 / Base64url / crypt alphabet)

Looking here usually settles the matter.

5.5 For the future, store with a scheme label

If you are the one designing going forward, choosing a format that embeds the scheme into the string makes future migrations much easier.

Argon2’s PHC string format
Spring Security’s {id}encodedPassword
Django’s algo$iterations$salt$hash
Unix crypt(3)-family prefixed formats

Done this way, whoever looks at it later will rarely be confused. Conversely, a design that puts “just 64 hex” in the DB is unkind to your future self.
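As an illustration of a self-describing format, here is a sketch that writes and verifies a string in the algo$iterations$salt$hash shape using PBKDF2-HMAC-SHA256 from the standard library. The function names, the salt generation, and the default iteration count are assumptions for this example, and it does not claim byte-for-byte compatibility with any framework; in a real system the framework's own hasher is the safer choice.

```python
import base64
import hashlib
import hmac
import secrets


def encode_labeled(password: str, iterations: int = 600_000) -> str:
    """Return 'pbkdf2_sha256$<iterations>$<salt>$<base64 hash>'."""
    salt = secrets.token_urlsafe(12)  # URL-safe text, never contains '$'
    dk = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                             salt.encode("utf-8"), iterations)
    return f"pbkdf2_sha256${iterations}${salt}${base64.b64encode(dk).decode('ascii')}"


def verify_labeled(password: str, stored: str) -> bool:
    """Read the scheme, iterations, and salt back out of the string and recompute."""
    algo, iterations, salt, expected = stored.split("$", 3)
    if algo != "pbkdf2_sha256":
        raise ValueError(f"unsupported scheme label: {algo}")
    dk = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                             salt.encode("utf-8"), int(iterations))
    return hmac.compare_digest(base64.b64encode(dk), expected.encode("utf-8"))


# A low iteration count keeps the demo fast; it is not a recommended setting.
record = encode_labeled("correct horse", iterations=1000)
print(record)
print(verify_labeled("correct horse", record), verify_labeled("wrong", record))
```

Because the label and parameters travel with the hash, the verifier needs no out-of-band knowledge, and a later migration can branch on the first field.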

6. Summary

When identifying the scheme behind the string representation of a hash, it helps to look in this order.

Is there a prefix?
What are the separators?
What is the character set?
How many bytes does the length correspond to?
What is the context of the storage source?

The two most important points are:

Prefixed storage formats are fairly easy to pin down
Plain hex / Base64 often only gets you to a candidate set

So the practical judgment goes like this.

With $argon2id$..., $2b$..., $6$..., {SHA}..., pbkdf2_sha256$..., the string alone takes you quite far
With only 32 / 40 / 64 / 128 hex digits, think “narrow the candidates,” not “declare a verdict”
If you truly need certainty, go look at the source product, configuration, and implementation

Following this order speeds up an investigation considerably. Snap judgments based on length alone, on the other hand, quietly send you the long way around.

7. Services Related to This Theme

Technical Consulting & Design Review

Identifying the scheme of password hashes left in an existing DB, migrating an authentication platform, and investigating logs in mixed Windows / Web systems all require organizing not just the look of the strings but the storage source’s implementation and the migration policy. Looking at everything together, from scheme identification through migration design, makes accidents easier to avoid.

Bug Investigation & Root Cause Analysis

Investigations stuck on “we can’t make progress on verification because we don’t know what this string is” are not unusual. Pinning down where the scheme is decided — logs, configuration files, DB schema, or application implementation — speeds up root cause identification considerably.


Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
