A Practical Procedure for Identifying the Scheme Behind a Hash String
· Go Komura · Hash, Security, Passwords, Legacy Asset Reuse, Technical Investigation
There are plenty of situations where you look at a string like 5f4dcc3b5aa765d61d8327deb882cf99 or $2b$12$... left in logs or a database and want to determine “what kind of hash is this?” In migrations of existing systems, investigations of authentication methods, log analysis, and integrations with third-party systems, it is not unusual to get stuck right here.
The dangerous move, though, is jumping to a conclusion based on length alone.
Looking at a 64-character hex string and declaring “that’s SHA-256” is premature. SHA3-256, SHA-512/256, BLAKE2s-256, and BLAKE3’s default 32-byte output can all be the same length. Conversely, storage formats that include a prefix and parameters, like $2b$ or $argon2id$, can be identified with quite high accuracy from the string alone.
In this article we use the word hash broadly, covering not only message digests like MD5 / SHA-2 / SHA-3 but also string representations used for password storage such as bcrypt / scrypt / Argon2 / PBKDF2.
The content is organized based on the RFCs, NIST publications, Linux crypt(5), Apache, Django, Spring Security, and other official materials publicly available as of April 2026.
Table of Contents
- The Conclusion First
- At-a-Glance Identification Tables
- The Practical Identification Procedure
- Common Misidentifications
- The Verification Order When You Need 100% Certainty
- Summary
- Services Related to This Theme
- References
1. The Conclusion First
Here is the short version up front.
- Storage formats with prefixes or separators are fairly easy to identify from the string alone. Examples: $argon2id$..., $2b$..., $5$..., $6$..., {SHA}..., pbkdf2_sha256$...
- Plain hex strings or bare Base64 usually only get you as far as “narrowing the candidates.” Example: 32 hex could be MD5, but could also come from MD4 / NTLM.
- The character set is as important as the length. If you see + / =, it looks like RFC 4648 Base64; if it contains . and is $-delimited, it looks like the crypt(3) family. Distinctions like these really work.
- If you want 100% certainty, you need context. Whether it lives in /etc/shadow, .htpasswd, Django’s auth_user, or Spring Security changes the story.
In short, “schemes you can identify from the string alone” and “schemes where the string only gives you a candidate set” are two different things. Just keeping these separate changes how an investigation proceeds.
2. At-a-Glance Identification Tables
2.1 Formats nearly pinned down by a prefix or format marker
“Confidence” in the table is used in this sense.
- Strong: nearly identifiable from the string alone
- Medium: candidates narrow considerably, but watch for implementation differences
- Weak: cannot be determined from length or appearance alone
| Visual feature | First suspect | Confidence | Notes | Example |
|---|---|---|---|---|
| $argon2id$... | Argon2id | Strong | PHC string format. Often followed by v=, m=, t=, p= | $argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4 |
| $argon2i$... | Argon2i | Strong | Same as above | $argon2i$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$Kx1koF/7n8EytGJYTS5krh+ag+FlG5ksM4xOsjOSDvo |
| $argon2d$... | Argon2d | Strong | Same as above | $argon2d$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg$HLIGA+T1bwK8akx3LGOco+Df+PvxX6cIXhycO7O7t6c |
| $2a$... / $2b$... / $2y$... | bcrypt | Strong | 2-digit cost + crypt-family alphabet | $2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO |
| $1$... | md5crypt | Strong | The Unix-family MD5 password storage format | $1$vA7mQ9xZ$Erz32JUFnZ9991KdU5.N3. |
| $5$... | sha256crypt | Strong | Not plain SHA-256 | $5$rounds=5000$N3v8Kx2Lq9Rt$uOTla5GAHaRH2aHlUSjkrZUBCuFiahQZ36O/seB39r3 |
| $6$... | sha512crypt | Strong | Not plain SHA-512 | $6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1 |
| $7$... | scrypt (crypt family) | Strong | Seen in Linux crypt(5)-family implementations | $7$CU..../....k2XAnEHBqQ1Ct2aMXFKNa/$y3Q0e/UlCHacIGWQshgvvz6UIbP.BCja.5BfVWP2Ml8 |
| $y$... | yescrypt | Strong | Seen on newer Linux systems | $y$j9T$k2XAnEHBqQ1Ct2aMXFKNa/$OVYXzjlkiQpWT/F1CUE0JrvV4phLY8FB.ofDttnrSQ7 |
| $apr1$... | Apache APR1-MD5 | Strong | Often seen in .htpasswd | $apr1$vA7mQ9xZ$ZE64.ohiyK11sPZmtnJZQ. |
| {SHA}... | Base64 representation of a SHA-1 digest | Strong | Often seen in Apache / LDAP contexts | {SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE= |
| {SSHA}... | salted SHA-1 | Strong | LDAP family | {SSHA}/OczD0GNNkOAUPbYhA3L9fjmcyBCbHVlTWVzYTQyIQ== |
| {MD5}... / {SMD5}... | MD5 / salted MD5 | Strong | LDAP family | {MD5}X03MO1qnZdYdgyfeuILPmQ== / {SMD5}fOn1rOv4ZH0OrO/KT9H0fEJsdWVNZXNhNDIh |
| pbkdf2_sha256$... | PBKDF2-HMAC-SHA256 | Medium–Strong | Django and others prepend the format name | pbkdf2_sha256$600000$N3v8Kx2Lq9Rt$CLxGB+zTiV1IdOt2y4m9JpaAONzHuRTOd96xKQwRQAs |
| {bcrypt}$2b$... | bcrypt | Strong | Wrapped in Spring Security’s {id} prefix | {bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO |
| {pbkdf2}... / {scrypt}... | Implementation-labeled schemes | Medium–Strong | Spring Security and similar; identify the wrapper format rather than the underlying algorithm | {pbkdf2}sha256$600000$Qmx1ZU1lc2E0MiE$4eNuai1qNkgs1kXz3+tBUMzAexVsSUz9SrQKEhbk0Cw / {scrypt}ln=14,r=8,p=1$Qmx1ZU1lc2E0MiE$xAgBRhXbMtHB1UHUR0br5bI+1XdXWKbwauiFv5VRQBY |
The point of this table is that formats where the first few characters carry meaning are strong.
Strings delimited by $...$ in particular are very likely Unix crypt(3) / MCF / PHC-family formats, and it is faster to look at the prefix before the length.
2.2 Narrowing candidates by length for plain hex / Base64
This table is for bare digest strings without a prefix.
For representations containing :, -, or whitespace, first strip the separators and then count the length.
| Raw byte length | Hex chars | Base64 chars (with / without =) | Main candidates | Example |
|---|---|---|---|---|
| 4 | 8 | 8 / 6 | Checksums such as CRC32 | cbf43926 |
| 16 | 32 | 24 / 22 | MD5, MD4, NTLM family | 5f4dcc3b5aa765d61d8327deb882cf99 |
| 20 | 40 | 28 / 27 | SHA-1, RIPEMD-160 | da39a3ee5e6b4b0d3255bfef95601890afd80709 |
| 28 | 56 | 40 / 38 | SHA-224, SHA-512/224, SHA3-224 | d14a028c2a3a2bc9476102bb288234c415a2b01f828ea62ac5b3e42f |
| 32 | 64 | 44 / 43 | SHA-256, SHA-512/256, SHA3-256, BLAKE2s-256, BLAKE3’s default 32-byte output | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 |
| 48 | 96 | 64 / 64 | SHA-384, SHA3-384, BLAKE2b-384 | 38b060a751ac96384cd9327eb1b1e36a21fdb71114be07434c0cc7bf63f6e1da274edebfe76f65fbd51ad2f14898b95b |
| 64 | 128 | 88 / 86 | SHA-512, SHA3-512, BLAKE2b-512, Whirlpool | cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e |
The key takeaway here is that a matching length does not uniquely determine the scheme. Hex strings of 32 / 64 / 128 characters in particular have many candidates, and a verdict based on length alone is often wrong.
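The hex and Base64 columns of the table follow directly from the raw byte length. The following is a minimal sketch that recomputes them; the function name encoded_lengths is made up for this illustration.

```python
def encoded_lengths(raw_bytes: int) -> dict:
    """Text lengths of a `raw_bytes`-byte digest in common encodings."""
    return {
        "hex": raw_bytes * 2,                       # two hex digits per byte
        "base64": 4 * ((raw_bytes + 2) // 3),       # RFC 4648 Base64 with '=' padding
        "base64_nopad": (raw_bytes * 8 + 5) // 6,   # same data with the padding dropped
    }


if __name__ == "__main__":
    # Recompute the rows of the table above.
    for size in (4, 16, 20, 28, 32, 48, 64):
        print(size, encoded_lengths(size))
```

Running the same arithmetic in reverse, from character count to byte length, is what turns an unknown string into a row of the table.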
2.3 Classic examples that trip people up
| What the string looks like | Common snap judgment | The right way to read it | Example |
|---|---|---|---|
| 5f4dcc3b5aa765d61d8327deb882cf99 | Definitely MD5 | Looks like MD5, but could also be MD4 / NTLM family or an app-specific use of MD5 | 8846f7eaee8fb117ad06bdd830b7586c |
| 64 hex like 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 | Definitely SHA-256 | SHA-256 is a candidate, but SHA3-256 / SHA-512/256 / BLAKE2s-256 / BLAKE3 are also possible | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 |
| $6$rounds=5000$salt$hash | A hex representation of SHA-512 | No; it is a password hash string called sha512crypt | $6$rounds=5000$N3v8Kx2Lq9Rt$6LUcSUAELX3aC/.60pTB.TFLTQi1mOGRCwKqNCqtRSaXjorxj01HJ9oNni97Kci1uDt7a/Kn4t3OS20Dw/.vi1 |
| {SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE= | Some kind of “SHA” | In Apache / LDAP contexts this usually means a Base64-encoded SHA-1 digest | {SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE= |
| {bcrypt}$2b$12$... | A proprietary scheme called {bcrypt} | bcrypt wrapped by Spring Security | {bcrypt}$2b$12$9YQ2u/e5Y/ArOnG.gJKxK.0makLATcYLP1q.Nsabzrw7XErYCfoYO |
3. The Practical Identification Procedure
From here, let’s organize how to actually look at a string, step by step. The recommended order is prefix → separators → character set → length → context.
3.1 Look at the leading characters first
The first one to ten characters narrow things down considerably.
- $argon2id$ / $argon2i$ / $argon2d$: strongly suspect Argon2’s PHC string format. The components are easy to follow in the Example column of section 2.1.
- $2a$ / $2b$ / $2y$: strongly suspect bcrypt.
- $1$ / $5$ / $6$ / $7$ / $y$: suspect a Unix crypt(3)-family password hash.
- {SHA} / {SSHA} / {MD5} / {SMD5}: suspect LDAP / Apache-family representations.
- {bcrypt} / {pbkdf2} / {scrypt}: suspect an implementation-labeled storage format like Spring Security’s.
The trick here is to look not just at the underlying algorithm but at the storage format.
For example, $6$ is not “a SHA-512 digest” — it is “a password hash string that uses SHA-512.” Mixing these up will skew the rest of the investigation.
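The prefix check is easy to mechanize. The sketch below maps the prefixes from the table in section 2.1 to storage-format names; the table contents come from this article, while the names PREFIXES and guess_storage_format are hypothetical and the list is not exhaustive.

```python
import re

# Prefix -> storage format, as listed in section 2.1 (not exhaustive).
PREFIXES = [
    ("$argon2id$", "Argon2id (PHC string)"),
    ("$argon2i$", "Argon2i (PHC string)"),
    ("$argon2d$", "Argon2d (PHC string)"),
    ("$2a$", "bcrypt"), ("$2b$", "bcrypt"), ("$2y$", "bcrypt"),
    ("$1$", "md5crypt"),
    ("$5$", "sha256crypt"),
    ("$6$", "sha512crypt"),
    ("$7$", "scrypt (crypt family)"),
    ("$y$", "yescrypt"),
    ("$apr1$", "Apache APR1-MD5"),
    ("{SHA}", "Base64 SHA-1 digest (LDAP / Apache)"),
    ("{SSHA}", "salted SHA-1 (LDAP)"),
    ("{MD5}", "Base64 MD5 digest (LDAP)"),
    ("{SMD5}", "salted MD5 (LDAP)"),
    ("pbkdf2_sha256$", "PBKDF2-HMAC-SHA256 (Django-style)"),
]

# Spring Security-style "{id}encodedPassword" wrappers named in this article.
WRAPPER = re.compile(r"^\{(bcrypt|pbkdf2|scrypt)\}")


def guess_storage_format(value: str):
    """Return a storage-format guess based on the leading characters, or None."""
    wrapped = WRAPPER.match(value)
    if wrapped:
        return f"{wrapped.group(1)} in a Spring Security-style {{id}} wrapper"
    for prefix, name in PREFIXES:
        if value.startswith(prefix):
            return name
    return None  # no recognizable prefix: fall back to character set and length
```

A None result is not a failure. It simply means the string carries no format marker and the later steps (separators, character set, length, context) have to do the work.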
3.2 Look at the number of separators
Next, look at separators such as $, :, {}, ,, and =.
- Multiple $ characters: suspect a format that carries parameters, salt, and hash together. Argon2, bcrypt, sha256crypt, and sha512crypt are typical.
- Starts with {name}: suspect a wrapper that names the scheme explicitly, as in LDAP / Spring Security.
- Shapes like algo:salt:hash or algo$iterations$salt$hash: suspect a framework- or app-specific format. Django’s pbkdf2_sha256$iterations$salt$hash is the classic example.
The more separators a string has, the easier the scheme is to identify. Conversely, a single lump of bare hex or Base64 stays quite ambiguous.
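To see how much a $-delimited string gives away, it helps to split it into fields. The following is a rough sketch for PHC-style strings and for crypt strings with an optional rounds= field; parse_phc is a hypothetical helper, and bcrypt is deliberately not handled because it packs cost, salt, and hash differently.

```python
def parse_phc(value: str) -> dict:
    """Split a string shaped like $id[$v=N][$k=v,...]$salt$hash into its fields."""
    parts = value.split("$")
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not a $-delimited hash string")

    fields = {"id": parts[1], "version": None, "params": {}, "salt": None, "hash": None}
    rest = parts[2:]

    # Optional version field, e.g. "v=19" in Argon2 strings.
    if rest and rest[0].startswith("v="):
        fields["version"] = rest.pop(0)[2:]
    # Optional parameter field, e.g. "m=65536,t=3,p=4" or "rounds=5000".
    if rest and "=" in rest[0]:
        fields["params"] = dict(item.split("=", 1) for item in rest.pop(0).split(","))
    if rest:
        fields["salt"] = rest.pop(0)
    if rest:
        fields["hash"] = rest.pop(0)
    return fields


if __name__ == "__main__":
    example = ("$argon2id$v=19$m=65536,t=3,p=4$MDEyMzQ1Njc4OWFiY2RlZg"
               "$uKZLaN6muIyoyIYr5waqw3y+zaDbe9aLSPj6Ln/rbz4")
    print(parse_phc(example))
```

Once the fields are separated, the parameters themselves (memory cost, rounds, and so on) become additional evidence for which scheme produced the string.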
3.3 Look at the character set
The character set is as important as the length.
Hex representation
If the string consists only of [0-9a-fA-F], suspect a hex representation first.
In this case, character count ÷ 2 = raw byte length.
- 32 hex → 16 bytes
- 40 hex → 20 bytes
- 64 hex → 32 bytes
- 128 hex → 64 bytes
RFC 4648 Base64 / Base64url
If you see + / =, suspect ordinary Base64 first.
If you see - _, suspect Base64url.
Padding = may be omitted, so lengths come in “either is possible” pairs like 43 / 44 and 86 / 88.
crypt-family radix64
If . and / appear and the string is delimited with $...$, it is more natural to suspect a crypt-family alphabet than ordinary Base64.
bcrypt, sha256crypt, sha512crypt, md5crypt, yescrypt, scrypt, and friends use this family of character sets.
This is unglamorous but very effective.
If you read “there’s a . in it, so it’s broken Base64”, you will easily overlook bcrypt and the crypt(3) family.
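The character-set check can be expressed as a handful of regular expressions. This is a sketch under the assumption that you pass in a single field (for example the part after the last $), not the whole delimited string; matching_alphabets is a made-up name. Because the alphabets overlap, it returns every alphabet that fits rather than a single answer.

```python
import re

# Alphabets discussed above, from narrowest to broadest.
ALPHABETS = {
    "hex": r"[0-9a-fA-F]+",
    "base64": r"[A-Za-z0-9+/]+={0,2}",        # RFC 4648 Base64, optional padding
    "base64url": r"[A-Za-z0-9_-]+={0,2}",     # RFC 4648 Base64url, optional padding
    "crypt radix64": r"[A-Za-z0-9./]+",       # character set used by crypt-family hashes
}


def matching_alphabets(field: str) -> list:
    """Return the names of all alphabets that cover every character of `field`."""
    return [name for name, pattern in ALPHABETS.items() if re.fullmatch(pattern, field)]


if __name__ == "__main__":
    for sample in ("5f4dcc3b5aa765d61d8327deb882cf99",
                   "VBPuJHI7uixaa6LQGWx4s+5GKNE=",
                   "Erz32JUFnZ9991KdU5.N3."):
        print(sample, "->", matching_alphabets(sample))
```

A hex string matches all four alphabets, which is expected: the useful signal is the narrowest match, and the presence of characters such as +, =, -, _, or . that rule alphabets out.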
3.4 Count the length
After the character set, look at the length. The reasoning is simple.
- For hex: raw byte length = character count / 2
- For Base64: character count = 4 × ceil(raw byte length / 3) when padded. Note that omitting the = padding makes it 0–2 characters shorter.
At this stage you narrow the candidates. But it is safer not to make the leap of “64 hex, therefore confirmed SHA-256.”
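As an illustration of “narrow, don’t conclude,” the sketch below strips separators from a hex string, derives the raw byte length, and returns the candidate list from the table in section 2.2. The names CANDIDATES_BY_BYTES, raw_length_from_hex, and candidates_for_hex are hypothetical.

```python
import re

# Candidate schemes per raw byte length, copied from the table in section 2.2.
CANDIDATES_BY_BYTES = {
    4: ["CRC32 and similar checksums"],
    16: ["MD5", "MD4", "NTLM family"],
    20: ["SHA-1", "RIPEMD-160"],
    28: ["SHA-224", "SHA-512/224", "SHA3-224"],
    32: ["SHA-256", "SHA-512/256", "SHA3-256", "BLAKE2s-256", "BLAKE3 (default output)"],
    48: ["SHA-384", "SHA3-384", "BLAKE2b-384"],
    64: ["SHA-512", "SHA3-512", "BLAKE2b-512", "Whirlpool"],
}


def raw_length_from_hex(text: str):
    """Strip ':', '-', and whitespace, then return the byte length (None if not hex)."""
    cleaned = re.sub(r"[:\-\s]", "", text)
    if not re.fullmatch(r"(?:[0-9a-fA-F]{2})+", cleaned):
        return None
    return len(cleaned) // 2


def candidates_for_hex(text: str) -> list:
    """Return a candidate set, never a verdict."""
    return CANDIDATES_BY_BYTES.get(raw_length_from_hex(text), [])


if __name__ == "__main__":
    print(candidates_for_hex("5f:4d:cc:3b:5a:a7:65:d6:1d:83:27:de:b8:82:cf:99"))
```

The return type is a list on purpose: the caller is forced to deal with several candidates instead of receiving a single misleading name.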
3.5 Confirm with context
What clinches it in the end is context. This is where you approach 100%.
- Found in /etc/shadow: suspect Linux password hash formats such as $y$, $6$, $5$, $1$.
- Found in .htpasswd: suspect Apache-family formats such as $apr1$, {SHA}, bcrypt.
- Found in Django settings or auth_user.password: suspect Django formats such as pbkdf2_sha256$... or argon2$...
- Found in a Spring Security authentication table: suspect {id}-prefixed formats such as {bcrypt}... or {pbkdf2}...
- 32 hex found around SMB / AD integration: seriously consider the NTLM / MD4 family.
In practice, looking at the product, framework, or configuration file the string came from is often faster than staring at the string itself.
4. Common Misidentifications
4.1 Hard-coding 64 hex = SHA-256
This one is very common. SHA-256 is of course a strong candidate, but multiple schemes produce the same 32-byte output. SHA3-256, SHA-512/256, BLAKE2s-256, and BLAKE3’s default output are all the same length.
Length is material for building a candidate set, not material for a verdict.
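A quick way to convince yourself is to hash the same input with several 32-byte algorithms. This sketch uses only algorithms that Python’s hashlib always provides; SHA-512/256 and BLAKE3 are left out because their availability depends on the OpenSSL build or a third-party package.

```python
import hashlib

data = b"hello"

# Three different algorithms, all with a 32-byte (64 hex character) output.
digests = {
    "SHA-256": hashlib.sha256(data).hexdigest(),
    "SHA3-256": hashlib.sha3_256(data).hexdigest(),
    "BLAKE2s-256": hashlib.blake2s(data).hexdigest(),
}

for name, value in digests.items():
    print(f"{name:12} {len(value)} hex chars  {value}")
```

All three lines report 64 hex characters, yet the values differ completely. Nothing in the output itself says which algorithm produced it.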
4.2 Mistaking $6$ for plain SHA-512
$6$... is the sha512crypt prefix.
It is not “a hex SHA-512 digest” — it is a password hash string that includes a salt and rounds.
Likewise:
- $5$ is sha256crypt
- $1$ is md5crypt
The moment a prefix is present, it is no longer “just a digest.”
4.3 Reading {SHA} as “either SHA-256 or SHA-512”
In Apache or LDAP contexts, {SHA} does not vaguely mean “the SHA family.”
In most cases it means a Base64-encoded SHA-1 digest. {SSHA} is salted SHA-1.
If you handle {SHA} loosely as “some kind of SHA” based on appearance, you will get verification code and migration logic wrong.
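The following sketch shows what treating {SHA} and {SSHA} precisely looks like in verification code. It assumes the common LDAP convention that {SSHA} stores SHA-1(password + salt) followed by the salt, all Base64-encoded; check_ldap_sha is a hypothetical helper, not an API of any LDAP library.

```python
import base64
import hashlib
import hmac

SHA1_LEN = 20  # SHA-1 digests are always 20 bytes


def check_ldap_sha(stored: str, password: bytes) -> bool:
    """Verify a {SHA} or {SSHA} value against a candidate password."""
    if stored.startswith("{SSHA}"):
        raw = base64.b64decode(stored[len("{SSHA}"):])
        digest, salt = raw[:SHA1_LEN], raw[SHA1_LEN:]   # salt is appended to the digest
    elif stored.startswith("{SHA}"):
        raw = base64.b64decode(stored[len("{SHA}"):])
        if len(raw) != SHA1_LEN:
            return False                                # not a bare SHA-1 digest
        digest, salt = raw, b""
    else:
        raise ValueError("expected a {SHA} or {SSHA} prefix")

    expected = hashlib.sha1(password + salt).digest()
    return hmac.compare_digest(expected, digest)
```

Decoding the Base64 part and checking that it is exactly 20 bytes (or 20 bytes plus a salt) is also a cheap sanity check that the value really is SHA-1 based.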
4.4 Treating password hashes and content hashes as the same thing
They are both “hash strings,” but their purposes differ.
- Digests for file integrity checks
- Digests for API signing
- Hash / KDF strings for password storage
These three look similar but are handled differently. Password hashes in particular often embed salt, rounds, memory cost, parallelism, and so on into the string, so the “compare raw digests” mindset will not see through them.
4.5 Forgetting XOFs and variable-length digests
SHAKE128 / SHAKE256 are XOFs, so the output length can be chosen freely. BLAKE2 also allows a configurable digest length, and BLAKE3 has extendable output as well.
In other words, the inference “this length, therefore this scheme” fails whenever it silently assumes classic fixed-length digests.
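A short demonstration with Python’s hashlib shows how variable-length outputs can imitate familiar lengths. The chosen lengths are arbitrary and only meant to collide with the MD5, SHA-1, and SHA-256 sizes.

```python
import hashlib

data = b"hello"

# SHAKE is an XOF: the caller picks the output length at digest time.
looks_like_md5 = hashlib.shake_128(data).hexdigest(16)      # 16 bytes -> 32 hex chars
looks_like_sha256 = hashlib.shake_256(data).hexdigest(32)   # 32 bytes -> 64 hex chars

# BLAKE2b accepts any digest size from 1 to 64 bytes.
looks_like_sha1 = hashlib.blake2b(data, digest_size=20).hexdigest()  # 20 bytes -> 40 hex chars

for label, value in [("SHAKE128/16", looks_like_md5),
                     ("BLAKE2b/20", looks_like_sha1),
                     ("SHAKE256/32", looks_like_sha256)]:
    print(f"{label:12} {len(value):3} hex chars  {value}")
```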
5. The Verification Order When You Need 100% Certainty
Migrations and authentication integrations ultimately require certainty. In that case, checking in the following order keeps you out of trouble.
5.1 Identify the storage source
First, establish where the string came from.
- Linux shadow?
- Apache / Nginx basic auth?
- LDAP?
- Django / Spring Security?
- A custom application’s DB?
The specification of the storage source is often stronger evidence than the string alone.
5.2 Look up the “storage format” in official documentation
Next, look up the storage format, not the algorithm name.
- Django password format
- Spring Security password storage format
- crypt(5) sha512crypt format
- Apache htpasswd password formats
Searching with format / storage / encoding as keywords makes these easy to find.
5.3 If you have a known plaintext, actually verify against the candidate schemes
If you have a test account or a known plaintext, the fastest route is to compute with the candidate schemes and compare. For password hashes, this means extracting the salt and rounds from the string and recomputing.
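For bare, unsalted hex digests, the comparison can be automated. This is a minimal sketch assuming the stored value is a plain digest of the plaintext bytes with no salt or key; CANDIDATES and matching_algorithms are made-up names, and the list only covers algorithms that hashlib always ships. MD4 / NTLM is omitted because many builds no longer provide MD4, and NTLM additionally hashes the UTF-16LE encoding of the password.

```python
import hashlib

# Algorithms guaranteed to exist in Python's hashlib (SHAKE is excluded because
# it needs an explicit output length).
CANDIDATES = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512",
              "sha3_224", "sha3_256", "sha3_384", "sha3_512",
              "blake2s", "blake2b")


def matching_algorithms(known_plaintext: bytes, observed_hex: str) -> list:
    """Return every candidate whose digest of the plaintext equals the observed value."""
    target = observed_hex.strip().lower()
    matches = []
    for name in CANDIDATES:
        if hashlib.new(name, known_plaintext).hexdigest() == target:
            matches.append(name)
    return matches


if __name__ == "__main__":
    print(matching_algorithms(b"password", "5f4dcc3b5aa765d61d8327deb882cf99"))
```

An empty result does not prove the value is not a hash of that plaintext. It may be salted, keyed, iterated, truncated, or computed over a different text encoding, which is exactly when you fall back to the storage format and the implementation.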
5.4 Check the implementation code or configuration
If the system under investigation is your own, looking at the code and configuration is ultimately the most reliable.
- The library in use
- The framework configuration
- The options used at generation time
- The output encoding (hex / Base64 / Base64url / crypt alphabet)
Looking here usually settles the matter.
5.5 For the future, store with a scheme label
If you are the one designing going forward, choosing a format that embeds the scheme into the string makes future migrations much easier.
- Argon2’s PHC string format
- Spring Security’s {id}encodedPassword
- Django’s algo$iterations$salt$hash
- Unix crypt(3)-family prefixed formats
Done this way, whoever looks at it later will rarely be confused. Conversely, a design that puts “just 64 hex” in the DB is unkind to your future self.
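As an illustration of a self-describing format, here is a sketch that stores PBKDF2-HMAC-SHA256 output in the algo$iterations$salt$hash layout. It is modeled on that layout only and is not Django’s implementation; encode_password and verify_password are hypothetical names, and the default iteration count is simply the value that appears in the examples in section 2.1.

```python
import base64
import hashlib
import hmac
import os

ALGO = "pbkdf2_sha256"


def encode_password(password: str, iterations: int = 600_000, salt: str = None) -> str:
    """Return 'pbkdf2_sha256$<iterations>$<salt>$<base64 hash>'."""
    salt = salt or os.urandom(16).hex()   # hex salt: never contains the '$' separator
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), iterations)
    return f"{ALGO}${iterations}${salt}${base64.b64encode(derived).decode()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute using the parameters embedded in the stored string."""
    algo, iterations, salt, _hash = stored.split("$", 3)
    if algo != ALGO:
        raise ValueError(f"unsupported scheme label: {algo}")
    recomputed = encode_password(password, int(iterations), salt)
    return hmac.compare_digest(recomputed, stored)


if __name__ == "__main__":
    stored = encode_password("correct horse")
    print(stored)
    print(verify_password("correct horse", stored))
```

Because the label, iteration count, and salt travel with the hash, a later reader can tell what the string is, and the iteration count can be raised for new passwords without breaking verification of old ones.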
6. Summary
When identifying the scheme behind the string representation of a hash, it helps to look in this order.
- Is there a prefix?
- What are the separators?
- What is the character set?
- How many bytes does the length correspond to?
- What is the context of the storage source?
The two most important points are:
- Prefixed storage formats are fairly easy to pin down
- Plain hex / Base64 often only gets you to a candidate set
So the practical judgment goes like this.
- With $argon2id$..., $2b$..., $6$..., {SHA}..., pbkdf2_sha256$..., the string alone takes you quite far
- With only 32 / 40 / 64 / 128 hex digits, think “narrow the candidates,” not “declare a verdict”
- If you truly need certainty, go look at the source product, configuration, and implementation
Following this order speeds up an investigation considerably. Snap judgments based on length alone, on the other hand, quietly send you the long way around.
7. Services Related to This Theme
Technical Consulting & Design Review
Identifying the scheme of password hashes left in an existing DB, migrating an authentication platform, and investigating logs in mixed Windows / Web systems all require organizing not just the look of the strings but the storage source’s implementation and the migration policy. Looking at everything together, from scheme identification through migration design, makes accidents easier to avoid.
Bug Investigation & Root Cause Analysis
Investigations stuck on “we can’t make progress on verification because we don’t know what this string is” are not unusual. Pinning down where the scheme is decided — logs, configuration files, DB schema, or application implementation — speeds up root cause identification considerably.
8. References
- RFC 1321 - The MD5 Message-Digest Algorithm
- NIST FIPS 180-4 - Secure Hash Standard (SHA-1, SHA-2, SHA-512/224, SHA-512/256)
- NIST FIPS 202 - SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions
- PHC string format specification
- Argon2 reference implementation
- RFC 7693 - The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC)
- BLAKE3 C README - default output length and extendable output
- crypt(5) - prefixes and hashed passphrase formats
- Apache HTTP Server 2.4 - Password Formats
- slappasswd(8) - RFC 2307 schemes such as {SHA} and {SSHA}
- Django documentation - example of pbkdf2_sha256$...
- Spring Security - DelegatingPasswordEncoder storage format {id}encodedPassword