A Practical Procedure for Identifying the Scheme Behind a Hash String

Tags: Hash, Security, Passwords, Legacy Asset Reuse, Technical Investigation

There are plenty of situations where you look at a string like 5f4dcc3b5aa765d61d8327deb882cf99 or $2b$12$... left in logs or a database and want to determine “what kind of hash is this?” In migrations of existing systems, investigations of authentication methods, log analysis, and integrations with third-party systems, it is not unusual to get stuck right here.

The dangerous move, though, is jumping to a conclusion based on length alone. Looking at a 64-character hex string and declaring “that’s SHA-256” is premature. SHA3-256, SHA-512/256, BLAKE2s-256, and BLAKE3’s default 32-byte output can all be the same length. Conversely, storage formats that include a prefix and parameters, like $2b$ or $argon2id$, can be identified with quite high accuracy from the string alone.

In this article we use the word hash broadly, covering not only message digests like MD5 / SHA-2 / SHA-3 but also string representations used for password storage such as bcrypt / scrypt / Argon2 / PBKDF2. The content is organized based on the RFCs, NIST publications, Linux crypt(5), Apache, Django, Spring Security, and other official materials publicly available as of April 2026.

Table of Contents

  1. The Conclusion First
  2. At-a-Glance Identification Tables
  3. The Practical Identification Procedure
  4. Common Misidentifications
  5. The Verification Order When You Need 100% Certainty
  6. Summary
  7. Services Related to This Theme
  8. References

1. The Conclusion First

Here is the short version up front.

  • Storage formats with prefixes or separators are fairly easy to identify from the string alone. Examples: $argon2id$..., $2b$..., $5$..., $6$..., {SHA}..., pbkdf2_sha256$...

  • Plain hex strings or bare Base64 usually only get you as far as “narrowing the candidates.” Example: a 32-character hex string could be MD5, but it could equally be MD4 or NTLM.

  • The character set is as important as the length. If you see + / =, it looks like RFC 4648 Base64; if it contains . and is $-delimited, it looks like the crypt(3) family. Distinctions like these hold up well in practice.

  • If you want 100% certainty, you need context. Whether it lives in /etc/shadow, .htpasswd, Django’s auth_user, or Spring Security changes the story.

In short, “schemes you can identify from the string alone” and “schemes where the string only gives you a candidate set” are two different things. Just keeping these separate changes how an investigation proceeds.

2. At-a-Glance Identification Tables

2.1 Formats nearly pinned down by a prefix or format marker

“Confidence” in the table is used in this sense.

  • Strong: nearly identifiable from the string alone
  • Medium: candidates narrow considerably, but watch for implementation differences
  • Weak: cannot be determined from length or appearance alone

| Visual feature | First suspect | Confidence | Notes | Example |
| --- | --- | --- | --- | --- |
| $argon2id$... | Argon2id | Strong | PHC string format. Often followed by v=, m=, t=, p= | $argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4 |
| $argon2i$... | Argon2i | Strong | Same as above | $argon2i$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$Kx1koF/7n8EytGJYTS5krh+ag+FlG5ksM4xOsjOSDvo |
| $argon2d$... | Argon2d | Strong | Same as above | $argon2d$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$HLIGA+T1bwK8akx3LGOco+Df+PvxX6cIXhycO7O7t6c |
| $2a$... / $2b$... / $2y$... | bcrypt | Strong | 2-digit cost + crypt-family alphabet | $2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO |
| $1$... | md5crypt | Strong | The Unix-family MD5 password storage format | $1$vA7mQ9xZ$Erz32JUFnZ9991KdU5.N3. |
| $5$... | sha256crypt | Strong | Not plain SHA-256 | $5$rounds=5000$N3v8Kx2Lq9Rt$uOTla5GAHaRH2aHlUSjkrZUBCuFiahQZ36O/seB39r3 |
| $6$... | sha512crypt | Strong | Not plain SHA-512 | $6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1 |
| $7$... | scrypt (crypt family) | Strong | Seen in Linux crypt(5)-family implementations | $7$CU..../....k2XAnEHBqQ1Ct2aMXFKNa/$y3Q0e/UlCHacIGWQshgvvz6UIbP.BCja.5BfVWP2Ml8 |
| $y$... | yescrypt | Strong | Seen on newer Linux systems | $y$j9T$k2XAnEHBqQ1Ct2aMXFKNa/$OVYXzjlkiQpWT/F1CUE0JrvV4phLY8FB.ofDttnrSQ7 |
| $apr1$... | Apache APR1-MD5 | Strong | Often seen in .htpasswd | $apr1$vA7mQ9xZ$ZE64.ohiyK11sPZmtnJZQ. |
| {SHA}... | Base64 representation of a SHA-1 digest | Strong | Often seen in Apache / LDAP contexts | {SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE= |
| {SSHA}... | salted SHA-1 | Strong | LDAP family | {SSHA}/OczD0GNNkOAUPbYhA3L9fjmcyBCbHVlTWVzYTQyIQ== |
| {MD5}... | MD5 | Strong | LDAP family | {MD5}X03MO1qnZdYdgyfeuILPmQ== |
| {SMD5}... | salted MD5 | Strong | LDAP family | {SMD5}fOn1rOv4ZH0OrO/KT9H0fEJsdWVNZXNhNDIh |
| pbkdf2_sha256$... | PBKDF2-HMAC-SHA256 | Medium–Strong | Django and others prepend the format name | pbkdf2_sha256$600000$N3v8Kx2Lq9Rt$CLxGB+zTiV1IdOt2y4m9JpaAONzHuRTOd96xKQwRQAs |
| {bcrypt}$2b$... | bcrypt | Strong | Wrapped in Spring Security’s {id} prefix | {bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO |
| {pbkdf2}... | Implementation-labeled scheme | Medium–Strong | Spring Security and similar; identify the wrapper format rather than the underlying algorithm | {pbkdf2}sha256$600000$Qmx1ZU1lc2E0MiE$4eNuai1qNkgs1kXz3+tBUMzAexVsSUz9SrQKEhbk0Cw |
| {scrypt}... | Implementation-labeled scheme | Medium–Strong | Spring Security and similar; identify the wrapper format rather than the underlying algorithm | {scrypt}ln=14,r=8,p=1$Qmx1ZU1lc2E0MiE$xAgBRhXbMtHB1UHUR0br5bI+1XdXWKbwauiFv5VRQBY |

The point of this table is that formats whose first few characters carry meaning are the easiest to identify. Strings delimited by $...$ in particular are very likely Unix crypt(3) / MCF / PHC-family formats, and it is faster to look at the prefix before the length.

2.2 Narrowing candidates by length for plain hex / Base64

This table is for bare digest strings without a prefix. For representations containing :, -, or whitespace, first strip the separators and then count the length.

| Raw byte length | Hex chars | Base64 chars (with = / without =) | Main candidates | Example |
| --- | --- | --- | --- | --- |
| 4 | 8 | 8 / 6 | Checksums such as CRC32 | cbf43926 |
| 16 | 32 | 24 / 22 | MD5, MD4, NTLM family | 5f4dcc3b5aa765d61d8327deb882cf99 |
| 20 | 40 | 28 / 27 | SHA-1, RIPEMD-160 | da39a3ee5e6b4b0d3255bfef95601890afd80709 |
| 28 | 56 | 40 / 38 | SHA-224, SHA-512/224, SHA3-224 | d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f |
| 32 | 64 | 44 / 43 | SHA-256, SHA-512/256, SHA3-256, BLAKE2s-256, BLAKE3’s default 32-byte output | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 |
| 48 | 96 | 64 / 64 | SHA-384, SHA3-384, BLAKE2b-384 | 38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da274edebfe76f65fbd51ad2f14898b95b |
| 64 | 128 | 88 / 86 | SHA-512, SHA3-512, BLAKE2b-512, Whirlpool | cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e |

The key takeaway here is that a matching length does not uniquely determine the scheme. Hex strings of 32 / 64 / 128 characters in particular have many candidates, and a verdict based on length alone is often wrong.
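The separator stripping mentioned before the table can be sketched in a few lines of Python. The function name strip_hex_separators is a hypothetical helper, not part of any library. It assumes separators should only be removed when the remainder is pure hex, because - is also a legitimate Base64url character.

```python
import re

HEX_ONLY = re.compile(r"[0-9a-fA-F]+")

def strip_hex_separators(text: str) -> str:
    """Remove ':', '-', and whitespace if what remains is pure hex.

    '-' is also a Base64url character, so when the cleaned text is not hex
    the input is returned with only surrounding whitespace trimmed.
    """
    cleaned = re.sub(r"[:\-\s]", "", text)
    if HEX_ONLY.fullmatch(cleaned):
        return cleaned.lower()
    return text.strip()

# A fingerprint-style rendering of the 16-byte example from the table.
fingerprint = "5F:4D:CC:3B:5A:A7:65:D6:1D:83:27:DE:B8:82:CF:99"
cleaned = strip_hex_separators(fingerprint)
print(cleaned, "->", len(cleaned), "hex chars,", len(cleaned) // 2, "bytes")
```

Counting the length only after this normalization keeps a colon-separated 16-byte value from being miscounted as a 47-character string.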

2.3 Classic examples that trip people up

| What the string looks like | Common snap judgment | The right way to read it | Example |
| --- | --- | --- | --- |
| 5f4dcc3b5aa765d61d8327deb882cf99 | Definitely MD5 | Looks like MD5, but could also be MD4 / NTLM family or an app-specific use of MD5 | 8846f7eaee8fb117ad06bdd830b7586c |
| 64 hex like 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 | Definitely SHA-256 | SHA-256 is a candidate, but SHA3-256 / SHA-512/256 / BLAKE2s-256 / BLAKE3 are also possible | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 |
| $6$rounds=5000$salt$hash | A hex representation of SHA-512 | No. It is a password hash string called sha512crypt | $6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1 |
| {SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE= | Some kind of “SHA” | In Apache / LDAP contexts this usually means a Base64-encoded SHA-1 digest | {SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE= |
| {bcrypt}$2b$12$... | A proprietary scheme called {bcrypt} | bcrypt wrapped by Spring Security | {bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO |

3. The Practical Identification Procedure

From here, let’s organize how to actually look at a string, step by step. The recommended order is prefix → separators → character set → length → context.

3.1 Look at the leading characters first

The first one to ten characters narrow things down considerably.

  • $argon2id$ / $argon2i$ / $argon2d$: strongly suspect Argon2’s PHC string format. The components are easy to follow in the Example column of section 2.1.

  • $2a$ / $2b$ / $2y$: strongly suspect bcrypt.

  • $1$ / $5$ / $6$ / $7$ / $y$: suspect a Unix crypt(3)-family password hash.

  • {SHA} / {SSHA} / {MD5} / {SMD5}: suspect LDAP / Apache-family representations.

  • {bcrypt} / {pbkdf2} / {scrypt}: suspect an implementation-labeled storage format like Spring Security’s.

The trick here is to look not just at the underlying algorithm but at the storage format. For example, $6$ is not “a SHA-512 digest” — it is “a password hash string that uses SHA-512.” Mixing these up will skew the rest of the investigation.
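The prefix check above maps naturally onto a lookup table. The following is a minimal sketch that only encodes the prefixes listed in section 2.1. The names PREFIXES and classify_prefix are hypothetical, and the labels deliberately name the storage format rather than the underlying algorithm.

```python
# (prefix, storage format) pairs taken from the table in section 2.1.
PREFIXES = [
    ("$argon2id$", "Argon2id (PHC string format)"),
    ("$argon2i$", "Argon2i (PHC string format)"),
    ("$argon2d$", "Argon2d (PHC string format)"),
    ("$2a$", "bcrypt"),
    ("$2b$", "bcrypt"),
    ("$2y$", "bcrypt"),
    ("$apr1$", "Apache APR1-MD5"),
    ("$1$", "md5crypt"),
    ("$5$", "sha256crypt"),
    ("$6$", "sha512crypt"),
    ("$7$", "scrypt (crypt family)"),
    ("$y$", "yescrypt"),
    ("{SHA}", "Base64 SHA-1 digest (LDAP / Apache)"),
    ("{SSHA}", "salted SHA-1 (LDAP)"),
    ("{MD5}", "Base64 MD5 digest (LDAP)"),
    ("{SMD5}", "salted MD5 (LDAP)"),
    ("{bcrypt}", "bcrypt in an {id} wrapper"),
    ("{pbkdf2}", "PBKDF2 in an {id} wrapper"),
    ("{scrypt}", "scrypt in an {id} wrapper"),
    ("pbkdf2_sha256$", "PBKDF2-HMAC-SHA256 (Django-style)"),
]

def classify_prefix(value: str):
    """Return the storage format suggested by the prefix, or None."""
    for prefix, name in PREFIXES:
        if value.startswith(prefix):
            return name
    return None

print(classify_prefix("$6$rounds=5000$N3v8Kx2Lq9Rt$..."))
print(classify_prefix("{bcrypt}$2b$12$..."))
print(classify_prefix("5f4dcc3b5aa765d61d8327deb882cf99"))  # no prefix
```

A result of None is not a failure. It simply means the string belongs to the "candidate set only" category and the later steps apply.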

3.2 Look at the number of separators

Next, look at separators such as $, :, {}, ,, and =.

  • Multiple $ characters: suspect a format that carries parameters, salt, and hash together. Argon2, bcrypt, sha256crypt, and sha512crypt are typical.

  • Starts with {name}: suspect a wrapper that names the scheme explicitly, as in LDAP / Spring Security.

  • Shapes like algo:salt:hash or algo$iterations$salt$hash: suspect a framework- or app-specific format. Django’s pbkdf2_sha256$iterations$salt$hash is the classic example.

The more separators a string has, the easier the scheme is to identify. Conversely, a single lump of bare hex or Base64 stays quite ambiguous.
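To see why separators help, it is enough to split on $ and look at the fields. The sketch below parses the Argon2id example from section 2.1. The helper names are hypothetical, and it assumes the optional v= field is present, which is not guaranteed for every PHC string.

```python
def split_dollar_fields(value: str) -> list:
    """Split a $-delimited string into fields, dropping the empty leading one."""
    fields = value.split("$")
    if fields and fields[0] == "":
        fields = fields[1:]
    return fields

def parse_argon2_phc(value: str) -> dict:
    """Parse an Argon2 PHC string that includes the version field."""
    fields = split_dollar_fields(value)
    if len(fields) != 5 or not fields[0].startswith("argon2"):
        raise ValueError("not an Argon2 PHC string with a version field")
    variant, version, params, salt_b64, hash_b64 = fields
    return {
        "variant": variant,
        "version": int(version.split("=", 1)[1]),
        # "m=65536,t=3,p=4" -> {"m": 65536, "t": 3, "p": 4}
        "params": {k: int(v) for k, v in (p.split("=", 1) for p in params.split(","))},
        "salt_b64": salt_b64,
        "hash_b64": hash_b64,
    }

example = ("$argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg"
           "$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4")
parsed = parse_argon2_phc(example)
print(parsed["variant"], parsed["version"], parsed["params"])

# The same split works on Django's shape: algo, iterations, salt, hash.
print(split_dollar_fields("pbkdf2_sha256$600000$N3v8Kx2Lq9Rt$CLxGB+zTiV1IdOt2y4m9JpaAONzHuRTOd96xKQwRQAs"))
```

Every field recovered this way is evidence the string itself provides, which is exactly what a bare hex digest lacks.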

3.3 Look at the character set

The character set is as important as the length.

Hex representation

If the string consists only of [0-9a-fA-F], suspect a hex representation first. In this case, character count ÷ 2 = raw byte length.

  • 32 hex → 16 bytes
  • 40 hex → 20 bytes
  • 64 hex → 32 bytes
  • 128 hex → 64 bytes

RFC 4648 Base64 / Base64url

If you see + / =, suspect ordinary Base64 first. If you see - _, suspect Base64url. Padding = may be omitted, so lengths come in “either is possible” pairs like 43 / 44 and 86 / 88.

crypt-family radix64

If . appears (usually alongside /, and without +) and the string is delimited with $...$, it is more natural to suspect a crypt-family alphabet than ordinary Base64. bcrypt, sha256crypt, sha512crypt, md5crypt, yescrypt, scrypt, and friends use this family of character sets.

This is unglamorous but very effective. If you read “there’s a . in it, so it’s broken Base64”, you will easily overlook bcrypt and the crypt(3) family.
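The three character-set checks in this section can be expressed as regular expressions. This is a rough heuristic sketch, not a validator: classify_charset is a hypothetical name, strings made only of letters and digits are inherently ambiguous, and for $-delimited strings only the last field is inspected.

```python
import re

HEX = re.compile(r"[0-9a-fA-F]+")
BASE64_STD = re.compile(r"[A-Za-z0-9+/]+={0,2}")   # RFC 4648 Base64
BASE64_URL = re.compile(r"[A-Za-z0-9_-]+={0,2}")   # RFC 4648 Base64url
CRYPT_FIELD = re.compile(r"[./0-9A-Za-z]+")        # crypt-family alphabet

def classify_charset(value: str) -> str:
    """Guess the encoding family from the characters alone."""
    if HEX.fullmatch(value):
        return "hex"
    if "$" in value:
        last_field = value.rsplit("$", 1)[1]
        if CRYPT_FIELD.fullmatch(last_field):
            return "$-delimited, crypt-family alphabet"
        return "$-delimited, other encoding"
    if BASE64_STD.fullmatch(value):
        return "base64"
    if BASE64_URL.fullmatch(value):
        return "base64url"
    return "unknown"

samples = [
    "5f4dcc3b5aa765d61d8327deb882cf99",
    "VBPuJHI7uixaa6LQGWx4s+5GKNE=",
    "$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO",
    "$argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4",
]
for s in samples:
    print(classify_charset(s), "<-", s[:24])
```

Note that the Argon2 example lands in "other encoding" because its last field contains +, which is consistent with the PHC format using Base64 rather than the crypt alphabet.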

3.4 Count the length

After the character set, look at the length. The reasoning is simple.

  • For hex: raw byte length = character count / 2
  • For Base64: character count = 4 × ceil(raw byte length / 3) with padding. Omitting the = padding makes it 0 to 2 characters shorter.

At this stage you narrow the candidates. But it is safer not to make the leap of “64 hex, therefore confirmed SHA-256.”
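The length arithmetic and the candidate lookup from section 2.2 can be combined as follows. This is a sketch with hypothetical function names; the candidate lists are copied from the table and are not exhaustive.

```python
import math

# Candidate schemes by raw digest length in bytes (from the table in 2.2).
CANDIDATES_BY_BYTES = {
    4: ["CRC32 and similar checksums"],
    16: ["MD5", "MD4", "NTLM"],
    20: ["SHA-1", "RIPEMD-160"],
    28: ["SHA-224", "SHA-512/224", "SHA3-224"],
    32: ["SHA-256", "SHA-512/256", "SHA3-256", "BLAKE2s-256", "BLAKE3 (default)"],
    48: ["SHA-384", "SHA3-384", "BLAKE2b-384"],
    64: ["SHA-512", "SHA3-512", "BLAKE2b-512", "Whirlpool"],
}

def raw_len_from_hex(text: str) -> int:
    """Two hex characters encode one byte."""
    if len(text) % 2:
        raise ValueError("odd number of hex characters")
    return len(text) // 2

def raw_len_from_base64(text: str) -> int:
    """Every 4 Base64 characters encode 3 bytes; padding is ignored."""
    return len(text.rstrip("=")) * 3 // 4

def base64_len(raw_len: int, padded: bool = True) -> int:
    """Expected Base64 character count for a given raw byte length."""
    full = 4 * math.ceil(raw_len / 3)
    return full if padded else full - (3 - raw_len % 3) % 3

def candidates(raw_len: int) -> list:
    return CANDIDATES_BY_BYTES.get(raw_len, [])

n = raw_len_from_hex("e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")
print(n, "bytes ->", candidates(n))
print("Base64 length for 32 bytes:", base64_len(32), "/", base64_len(32, padded=False))
```

The function returns a list on purpose: the output of this step is a candidate set, never a single answer.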

3.5 Confirm with context

What clinches it in the end is context. This is where you approach 100%.

  • Found in /etc/shadow: suspect Linux password hash formats such as $y$, $6$, $5$, $1$.

  • Found in .htpasswd: suspect Apache-family formats such as $apr1$, {SHA}, bcrypt.

  • Found in Django settings or auth_user.password: suspect Django formats such as pbkdf2_sha256$... or argon2$...

  • Found in a Spring Security authentication table: suspect {id}-prefixed formats such as {bcrypt}... or {pbkdf2}...

  • 32 hex found around SMB / AD integration: seriously consider the NTLM / MD4 family.

In practice, looking at the product, framework, or configuration file the string came from is often faster than staring at the string itself.

4. Common Misidentifications

4.1 Hard-coding 64 hex = SHA-256

This one is very common. SHA-256 is of course a strong candidate, but multiple schemes produce the same 32-byte output. SHA3-256, SHA-512/256, BLAKE2s-256, and BLAKE3’s default output are all the same length.

Length is material for building a candidate set, not material for a verdict.
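This is easy to confirm with Python's standard hashlib. The sketch below hashes the same input with three of the 32-byte schemes; SHA-512/256 and BLAKE3 are left out only because they are not guaranteed to be available in the standard library.

```python
import hashlib

data = b"hello"
digests = {
    "SHA-256": hashlib.sha256(data).hexdigest(),
    "SHA3-256": hashlib.sha3_256(data).hexdigest(),
    "BLAKE2s-256": hashlib.blake2s(data).hexdigest(),  # default digest size is 32 bytes
}

for name, value in digests.items():
    print(f"{name:12} {len(value)} chars  {value}")
```

All three lines are 64 hex characters long and none of them match, so nothing about the length tells them apart.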

4.2 Mistaking $6$ for plain SHA-512

$6$... is the sha512crypt prefix. It is not “a hex SHA-512 digest” — it is a password hash string that includes a salt and rounds.

Likewise:

  • $5$ is sha256crypt
  • $1$ is md5crypt

The moment a prefix is present, it is no longer “just a digest.”

4.3 Reading {SHA} as “either SHA-256 or SHA-512”

In Apache or LDAP contexts, {SHA} does not vaguely mean “the SHA family.” In most cases it means a Base64-encoded SHA-1 digest. {SSHA} is salted SHA-1.

If you handle {SHA} loosely as “some kind of SHA” based on appearance, you will get verification code and migration logic wrong.
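As a concrete illustration, here is a minimal sketch of what verification looks like for these two formats. It assumes the layout commonly used by LDAP servers: {SHA} is Base64 of the 20-byte SHA-1 digest, and {SSHA} is Base64 of the digest of password plus salt, followed by the salt itself. The function names are hypothetical.

```python
import base64
import hashlib
import hmac

def make_sha(password: bytes) -> str:
    return "{SHA}" + base64.b64encode(hashlib.sha1(password).digest()).decode("ascii")

def make_ssha(password: bytes, salt: bytes) -> str:
    digest = hashlib.sha1(password + salt).digest()
    return "{SSHA}" + base64.b64encode(digest + salt).decode("ascii")

def verify_ldap_sha(stored: str, password: bytes) -> bool:
    """Verify a {SHA} or {SSHA} string against a candidate password."""
    if stored.startswith("{SHA}"):
        expected = base64.b64decode(stored[len("{SHA}"):])
        actual = hashlib.sha1(password).digest()
    elif stored.startswith("{SSHA}"):
        raw = base64.b64decode(stored[len("{SSHA}"):])
        expected, salt = raw[:20], raw[20:]  # SHA-1 digests are 20 bytes
        actual = hashlib.sha1(password + salt).digest()
    else:
        raise ValueError("not a {SHA} or {SSHA} string")
    return hmac.compare_digest(expected, actual)

plain = make_sha(b"example-password")
salted = make_ssha(b"example-password", b"BlueMesa")
print(plain, verify_ldap_sha(plain, b"example-password"))
print(salted, verify_ldap_sha(salted, b"example-password"))
```

Verification code that assumed SHA-256 here would decode 20 bytes, compare them with a 32-byte digest, and reject every password.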

4.4 Treating password hashes and content hashes as the same thing

They are both “hash strings,” but their purposes differ.

  • Digests for file integrity checks
  • Digests for API signing
  • Hash / KDF strings for password storage

These three look similar but are handled differently. Password hashes in particular often embed salt, rounds, memory cost, parallelism, and so on into the string, so simply comparing raw digests does not work for them.
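The difference shows up as soon as the same password is stored twice. The sketch below builds strings in the Django-style pbkdf2_sha256$iterations$salt$hash shape using the standard library. It assumes the hash field is the Base64 of a PBKDF2-HMAC-SHA256 output, uses a deliberately low iteration count to keep the demo fast, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac

def make_pbkdf2_string(password: str, salt: str, iterations: int) -> str:
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), iterations)
    encoded = base64.b64encode(derived).decode("ascii")
    return f"pbkdf2_sha256${iterations}${salt}${encoded}"

def verify_pbkdf2_string(stored: str, password: str) -> bool:
    """Recompute with the salt and iterations embedded in the stored string."""
    algorithm, iterations, salt, _ = stored.split("$", 3)
    if algorithm != "pbkdf2_sha256":
        raise ValueError("unexpected algorithm label")
    candidate = make_pbkdf2_string(password, salt, int(iterations))
    return hmac.compare_digest(candidate, stored)

# Same password, different salts (demo iteration count, far too low for real use).
first = make_pbkdf2_string("example-password", "saltA", 1000)
second = make_pbkdf2_string("example-password", "saltB", 1000)
print(first)
print(second)
print(first == second, verify_pbkdf2_string(first, "example-password"),
      verify_pbkdf2_string(second, "example-password"))
```

The two stored strings differ even though the password is identical, so equality between stored values says nothing. Verification has to start from the parameters inside the string.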

4.5 Forgetting XOFs and variable-length digests

SHAKE128 / SHAKE256 are XOFs, so the output length can be chosen freely. BLAKE2 also allows a configurable digest length, and BLAKE3 has extendable output as well.

In other words, the inference “this length, therefore this scheme” fails whenever it leans too hard on the assumption of classic fixed-length digests.
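A short hashlib experiment makes the point. The output lengths below are chosen to imitate the familiar 16, 20, and 32-byte sizes; the input and the dictionary name are arbitrary.

```python
import hashlib

data = b"hello"
lookalikes = {
    "SHAKE128, 16 bytes (MD5-sized)": hashlib.shake_128(data).hexdigest(16),
    "BLAKE2b, digest_size=20 (SHA-1-sized)": hashlib.blake2b(data, digest_size=20).hexdigest(),
    "SHAKE256, 32 bytes (SHA-256-sized)": hashlib.shake_256(data).hexdigest(32),
}

for name, value in lookalikes.items():
    print(f"{len(value):3} hex chars  {name}: {value}")
```

The results are 32, 40, and 64 hex characters long, which is exactly what MD5, SHA-1, and SHA-256 would produce.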

5. The Verification Order When You Need 100% Certainty

Migrations and authentication integrations ultimately require certainty. In that case, checking in the following order keeps you out of trouble.

5.1 Identify the storage source

First, establish where the string came from.

  • Linux shadow?
  • Apache / Nginx basic auth?
  • LDAP?
  • Django / Spring Security?
  • A custom application’s DB?

The specification of the storage source is often stronger evidence than the string alone.

5.2 Look up the “storage format” in official documentation

Next, look up the storage format, not the algorithm name.

  • Django password format
  • Spring Security password storage format
  • crypt(5) sha512crypt format
  • Apache htpasswd password formats

Searching with format / storage / encoding as keywords makes these easy to find.

5.3 If you have a known plaintext, actually verify against the candidate schemes

If you have a test account or a known plaintext, the fastest route is to compute with the candidate schemes and compare. For password hashes, this means extracting the salt and rounds from the string and recomputing.
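For bare hex digests, the comparison can be automated over every algorithm the standard library guarantees. This is a sketch with a hypothetical function name. It assumes the plaintext was hashed as-is, so it will not match salted values or schemes that first re-encode the input (NTLM, for example, hashes the UTF-16LE form of the password).

```python
import hashlib

def matching_digests(known_plaintext: bytes, observed_hex: str) -> list:
    """Return the hashlib algorithm names whose digest equals observed_hex."""
    observed = observed_hex.lower()
    matches = []
    for name in sorted(hashlib.algorithms_guaranteed):
        try:
            h = hashlib.new(name, known_plaintext)
        except ValueError:
            continue  # algorithm disabled in this build (e.g. FIPS mode)
        if name.startswith("shake_"):
            # XOFs need an explicit length: ask for as many bytes as observed.
            digest = h.hexdigest(len(observed) // 2)
        else:
            digest = h.hexdigest()
        if digest == observed:
            matches.append(name)
    return matches

print(matching_digests(b"password", "5f4dcc3b5aa765d61d8327deb882cf99"))
print(matching_digests(b"hello", "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"))
```

An empty result is also informative: it suggests a salt, a different input encoding, or a scheme outside the standard library, which sends you back to the storage source.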

5.4 Check the implementation code or configuration

If the system under investigation is your own, looking at the code and configuration is ultimately the most reliable.

  • The library in use
  • The framework configuration
  • The options used at generation time
  • The output encoding (hex / Base64 / Base64url / crypt alphabet)

Looking here usually settles the matter.

5.5 For the future, store with a scheme label

If you are the one designing going forward, choosing a format that embeds the scheme into the string makes future migrations much easier.

  • Argon2’s PHC string format
  • Spring Security’s {id}encodedPassword
  • Django’s algo$iterations$salt$hash
  • Unix crypt(3)-family prefixed formats

Done this way, whoever looks at it later will rarely be confused. Conversely, a design that puts “just 64 hex” in the DB is unkind to your future self.
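The benefit is easiest to see in verification code. The sketch below dispatches on an {id} label in the style of {id}encodedPassword. The labels sha256-hex and pbkdf2-sha256 and their encodings are hypothetical, chosen for the example, and are not Spring Security's actual identifiers or formats.

```python
import hashlib
import hmac

def split_label(stored: str):
    """Split '{id}encoded' into (id, encoded); (None, stored) if unlabeled."""
    if stored.startswith("{"):
        end = stored.find("}")
        if end > 1:
            return stored[1:end], stored[end + 1:]
    return None, stored

def verify_sha256_hex(encoded: str, password: str) -> bool:
    # Hypothetical legacy format: unsalted SHA-256 as hex.
    return hmac.compare_digest(hashlib.sha256(password.encode()).hexdigest(), encoded)

def verify_pbkdf2_sha256(encoded: str, password: str) -> bool:
    # Hypothetical format: iterations$salt_hex$hash_hex
    iterations, salt_hex, hash_hex = encoded.split("$")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(derived.hex(), hash_hex)

VERIFIERS = {
    "sha256-hex": verify_sha256_hex,
    "pbkdf2-sha256": verify_pbkdf2_sha256,
}

def verify(stored: str, password: str) -> bool:
    label, encoded = split_label(stored)
    if label not in VERIFIERS:
        raise ValueError(f"unknown or missing scheme label: {label!r}")
    return VERIFIERS[label](encoded, password)

legacy = "{sha256-hex}" + hashlib.sha256(b"example-password").hexdigest()
salt = bytes.fromhex("00112233445566778899aabbccddeeff")
derived = hashlib.pbkdf2_hmac("sha256", b"example-password", salt, 1000)
current = "{pbkdf2-sha256}1000$" + salt.hex() + "$" + derived.hex()

print(verify(legacy, "example-password"), verify(current, "example-password"))
```

Because each row names its own scheme, legacy and current formats can coexist in one column during a migration, and an unlabeled string fails loudly instead of being guessed at.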

6. Summary

When identifying the scheme behind the string representation of a hash, it helps to look in this order.

  1. Is there a prefix?
  2. What are the separators?
  3. What is the character set?
  4. How many bytes does the length correspond to?
  5. What is the context of the storage source?

The two most important points are:

  • Prefixed storage formats are fairly easy to pin down
  • Plain hex / Base64 often only gets you to a candidate set

So the practical judgment goes like this.

  • With $argon2id$..., $2b$..., $6$..., {SHA}..., pbkdf2_sha256$..., the string alone takes you quite far
  • With only 32 / 40 / 64 / 128 hex digits, think “narrow the candidates,” not “declare a verdict”
  • If you truly need certainty, go look at the source product, configuration, and implementation

Following this order speeds up an investigation considerably. Snap judgments based on length alone, on the other hand, quietly send you the long way around.

7. Services Related to This Theme

Technical Consulting & Design Review

Identifying the scheme of password hashes left in an existing DB, migrating an authentication platform, and investigating logs in mixed Windows / Web systems all require organizing not just the look of the strings but the storage source’s implementation and the migration policy. Looking at everything together, from scheme identification through migration design, makes accidents easier to avoid.

Bug Investigation & Root Cause Analysis

Investigations stuck on “we can’t make progress on verification because we don’t know what this string is” are not unusual. Pinning down where the scheme is decided — logs, configuration files, DB schema, or application implementation — speeds up root cause identification considerably.

8. References

  1. RFC 1321 - The MD5 Message-Digest Algorithm
  2. NIST FIPS 180-4 - Secure Hash Standard (SHA-1, SHA-2, SHA-512/224, SHA-512/256)
  3. NIST FIPS 202 - SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions
  4. PHC string format specification
  5. Argon2 reference implementation
  6. RFC 7693 - The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)
  7. BLAKE3 C README - default output length and extendable output
  8. crypt(5) - prefixes and hashed passphrase formats
  9. Apache HTTP Server 2.4 - Password Formats
  10. slappasswd(8) - RFC 2307 schemes such as {SHA} and {SSHA}
  11. Django documentation - example of pbkdf2_sha256$...
  12. Spring Security - DelegatingPasswordEncoder storage format {id}encodedPassword

Author Profile

Go Komura

Representative of KomuraSoft LLC

Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.
