’Tis but thy name that is my enemy
―Juliet
Given some data, what is its URL? There are many candidate solutions to this question, and several decently funded companies sell their preferred solution.
This document does not provide a new solution. It merely illuminates an existing process that has been, as with the mythical NIMH, "Right before your eyes, and beyond your wildest dreams". This process results in URIs starting with ni:///mh; that may also be used as a checksum.
URIs beginning with ni:///mh; are URIs for naming information that are based on RFC6920 Naming Things with Hashes and Multihash.
RFC6920 Naming Things with Hashes defines 'named information' URIs with the ni:// URI Scheme.
The following ni URI is generated from the text "Hello World!" (12 characters without the quotes) using the sha-256 algorithm, and is shown without an authority field:
ni:///sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk
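The URI above can be reproduced with nothing but Node.js built-ins. The following is a minimal sketch, assuming a recent Node.js release that supports the 'base64url' Buffer encoding; niSha256 is a hypothetical helper name introduced here for illustration and is not part of RFC6920 or any library.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: build an ni URI with an empty authority field
// from the SHA-256 digest of a UTF-8 string.
function niSha256(text) {
  const digest = createHash('sha256').update(text, 'utf8').digest();
  // The 'base64url' encoding uses '-' and '_' and omits '=' padding.
  return `ni:///sha-256;${digest.toString('base64url')}`;
}

console.log(niSha256('Hello World!'));
// ni:///sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk
```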
RFC6920 defines a Named Information Hash Algorithm Registry at IANA with 63 entries, most of which are unassigned. There are registered IDs for SHA-2 and SHA-3, but not yet for other hash algorithms. Because the registry is limited to 64 entries, it is expected that the process governing the registry may be slow to add new hash functions in order to avoid filling the registry faster than necessary.
Multihash defines a registry of hash algorithms, each with an assigned Hash Name String and Hash ID number that uniquely identifies it within the multihash table. Multihash ID numbers are encoded to binary using the LEB128 variable-length binary encoding instead of a 6-bit (64-entry) fixed-length binary integer like that used in ni URIs.
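To make the variable-length encoding concrete, here is a minimal sketch of unsigned LEB128 encoding. The function name encodeVarint is an assumption made for this example. Codes below 128, such as 0x12 and 0x1e, fit in a single byte, while larger values spill into continuation bytes.

```javascript
// Hypothetical helper: encode a non-negative integer as unsigned LEB128.
// Each output byte carries 7 bits of the value, least significant group
// first; the high bit is set on every byte except the last.
function encodeVarint(value) {
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError('value must be a non-negative integer');
  }
  const bytes = [];
  let rest = value;
  do {
    let byte = rest % 128;        // low 7 bits
    rest = Math.floor(rest / 128);
    if (rest > 0) byte |= 0x80;   // continuation bit: more bytes follow
    bytes.push(byte);
  } while (rest > 0);
  return Uint8Array.from(bytes);
}

console.log(encodeVarint(0x12)); // Uint8Array [ 18 ]
console.log(encodeVarint(0x1e)); // Uint8Array [ 30 ]
console.log(encodeVarint(300));  // Uint8Array [ 172, 2 ]
```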
Multihash Appendix D.3 registers the "mh" hash algorithm in the Named Information Hash Algorithm Registry at IANA with the ID 49 and the Hash Name String mh.
NIMHs are RFC 6920 ni URIs that start with ni:///mh;
A NIMH-BLAKE3 is an RFC 6920 ni URI that starts with ni:///mh;Hi and whose multihash uses the 0x1e (decimal 30) code that is assigned to blake3.
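The fixed two-character prefix follows from the multihash layout: the first byte is the hash code and the second is the digest length, and base64 maps the first 12 bits of any input to its first two characters. The sketch below illustrates this, assuming a 32-byte digest and a code below 128 so that each value is a single byte; multihashPrefix is a hypothetical name used only here.

```javascript
// Hypothetical helper: first two base64url characters of a multihash
// whose code and digest length each fit in one byte.
function multihashPrefix(code, digestLength) {
  // A trailing zero byte completes the 3-byte base64 group. The first two
  // characters depend only on the first 12 bits, so it does not affect them.
  return Buffer.from([code, digestLength, 0]).toString('base64url').slice(0, 2);
}

console.log(multihashPrefix(0x1e, 32)); // Hi  (blake3, 32-byte digest)
console.log(multihashPrefix(0x12, 32)); // Ei  (sha2-256, 32-byte digest)
```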
Let's create the NIMH-BLAKE3 for the data "Hello World!" (12 characters without the quotes).
The base64url-encoded BLAKE3 hash value is XKeBWty0hOmhNsEe_mnB1TAXbVSbXRjQOOtSgLSzRww
⚡ echo -n 'Hello World!' | b3sum | xxd -r -p | base64 | tr '/+' '_-' | tr -d '='
XKeBWty0hOmhNsEe_mnB1TAXbVSbXRjQOOtSgLSzRww
The base64url-encoded BLAKE3 multihash is HiBcp4Fa3LSE6aE2wR7-acHVMBdtVJtdGNA461KAtLNHDA.
⚡ multihashBlake3B64() {
  node --input-type=module -e "$(cat <<-EOF
    import * as hasher from "multiformats/hashes/hasher"
    import { blake3 } from '@noble/hashes/blake3';
    // Wrap the noble blake3 function as a multiformats hasher.
    const blake3Hasher = hasher.from({
      // As per multiformats table
      // https://github.com/multiformats/multicodec/blob/master/table.csv#L9
      name: 'blake3',
      code: 0x1e,
      encode: (input) => Uint8Array.from(blake3(input)),
    })
    const text = process.argv[1]
    if (typeof text === "undefined") {
      throw new Error('provide input text as first argument')
    }
    const digest = await blake3Hasher.digest(new TextEncoder().encode(text))
    // Print the full multihash (code, length, digest) as standard base64.
    console.log(Buffer.from(digest.bytes).toString('base64'))
EOF
  )" "$1"
}
⚡ multihashBlake3B64 'Hello World!' | tr '/+' '_-' | tr -d '='
HiBcp4Fa3LSE6aE2wR7-acHVMBdtVJtdGNA461KAtLNHDA
The NIMH-BLAKE3 is ni:///mh;HiBcp4Fa3LSE6aE2wR7-acHVMBdtVJtdGNA461KAtLNHDA
The URL-namespaced UUIDv5 is 7026bc85-c20a-58da-b6b6-ea3f7b969d87.
⚡ urlUuidv5() {
  node --input-type=module -e "$(cat <<-EOF
    import * as uuidm from 'uuid'
    const url = process.argv[1]
    if (typeof url === 'undefined') {
      throw new Error('provide input url as first argument')
    }
    // Name-based (SHA-1) UUID in the URL namespace.
    const uuid = uuidm.v5(url, uuidm.v5.URL)
    console.log(uuid)
EOF
  )" "$1"
}
⚡ urlUuidv5 'ni:///mh;HiBcp4Fa3LSE6aE2wR7-acHVMBdtVJtdGNA461KAtLNHDA'
7026bc85-c20a-58da-b6b6-ea3f7b969d87
A NIMH-SHA2-256 is an RFC 6920 ni URI that starts with ni:///mh;Ei and whose multihash uses the 0x12 (decimal 18) code that is assigned to sha2-256.
Unlike NIMH-BLAKE3, as of 2024-07-04, NIMH-SHA2-256 uses FIPS-approved crypto.
Let's create the NIMH-SHA2-256 for the data "Hello World!" (12 characters without the quotes).
The base64url-encoded SHA2-256 hash value is f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk.
⚡ echo -n 'Hello World!' | sha256sum | xxd -r -p | base64 | tr '/+' '_-' | tr -d '='
f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk
The base64url-encoded SHA2-256 multihash is EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ.
⚡ multihashSha256B64() {
  node --input-type=module -e "$(cat <<-EOF
    import * as hasher from "multiformats/hashes/hasher"
    import * as crypto from "node:crypto"
    // Wrap the built-in SHA-256 as a multiformats hasher.
    const sha256 = hasher.from({
      // As per multiformats table
      // https://github.com/multiformats/multicodec/blob/master/table.csv#L9
      name: 'sha2-256',
      code: 0x12,
      encode: (input) => new Uint8Array(crypto.createHash('sha256').update(input).digest())
    })
    const text = process.argv[1]
    if (typeof text === "undefined") {
      throw new Error('provide input text as first argument')
    }
    const digest = await sha256.digest(new TextEncoder().encode(text))
    // Print the full multihash (code, length, digest) as standard base64.
    console.log(Buffer.from(digest.bytes).toString('base64'))
EOF
  )" "$1"
}
⚡ multihashSha256B64 'Hello World!' | tr '/+' '_-' | tr -d '='
EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ
The NIMH-SHA2-256 is ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ
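For readers who want to see the byte layout without the multiformats dependency, the same NIMH-SHA2-256 can be assembled by hand. This is a sketch under the assumption that both the code (0x12) and the digest length (32) fit in single-byte varints, which holds for sha2-256; nimhSha256 is a hypothetical helper name.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: build a NIMH-SHA2-256 from a UTF-8 string.
function nimhSha256(text) {
  const digest = createHash('sha256').update(text, 'utf8').digest();
  // Multihash layout: varint(code) || varint(digest length) || digest.
  // Both 0x12 and 32 are below 128, so each varint is a single byte.
  const multihash = Buffer.concat([Buffer.from([0x12, digest.length]), digest]);
  return `ni:///mh;${multihash.toString('base64url')}`;
}

console.log(nimhSha256('Hello World!'));
// ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ
```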
The URL-namespaced UUIDv5 is 6b4c18fc-7563-52c9-bf2d-26bf4d11bcbc.
⚡ urlUuidv5() {
  node --input-type=module -e "$(cat <<-EOF
    import * as uuidm from 'uuid'
    const url = process.argv[1]
    if (typeof url === 'undefined') {
      throw new Error('provide input url as first argument')
    }
    // Name-based (SHA-1) UUID in the URL namespace.
    const uuid = uuidm.v5(url, uuidm.v5.URL)
    console.log(uuid)
EOF
  )" "$1"
}
⚡ urlUuidv5 'ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ'
6b4c18fc-7563-52c9-bf2d-26bf4d11bcbc
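What the uuid library does for a version 5 UUID can be sketched with Node.js built-ins alone: hash the 16 namespace bytes followed by the name with SHA-1, keep the first 16 bytes, and stamp in the version and variant bits. The code below is an illustrative reimplementation under those assumptions, not the library's own code; uuidV5 and NAMESPACE_URL are names chosen for this example, and the namespace value is the URL namespace ID from RFC 4122.

```javascript
const { createHash } = require('node:crypto');

// URL namespace ID from RFC 4122, Appendix C.
const NAMESPACE_URL = '6ba7b811-9dad-11d1-80b4-00c04fd430c8';

// Hypothetical helper: name-based UUID (version 5) for a UTF-8 name.
function uuidV5(name, namespace = NAMESPACE_URL) {
  const namespaceBytes = Buffer.from(namespace.replace(/-/g, ''), 'hex');
  const hash = createHash('sha1').update(namespaceBytes).update(name, 'utf8').digest();
  const bytes = hash.subarray(0, 16);  // SHA-1 yields 20 bytes; a UUID needs 16
  bytes[6] = (bytes[6] & 0x0f) | 0x50; // set the version nibble to 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // set the RFC 4122 variant bits
  const hex = bytes.toString('hex');
  return [
    hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16),
    hex.slice(16, 20), hex.slice(20),
  ].join('-');
}

console.log(uuidV5('ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ'));
```

Because the result is derived only from the namespace and the URI string, the same NIMH always maps to the same UUID.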
Typed NIMHs are NIMHs with a ct Content Type Query String Attribute.
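As an illustration, a typed NIMH for plain text can be formed by appending ?ct=text/plain to an existing NIMH, following the ct attribute syntax from RFC 6920. The helpers below are a hedged sketch: typedNimh and contentTypeOf are hypothetical names, and treating '&' as the separator between query attributes is an assumption of this example.

```javascript
// Hypothetical helper: attach a media type to a NIMH that has no query yet.
function typedNimh(nimh, contentType) {
  if (nimh.includes('?')) throw new Error('NIMH already has a query string');
  return `${nimh}?ct=${contentType}`;
}

// Hypothetical helper: read the ct attribute back, or null if absent.
// Assumes query attributes are separated by '&'.
function contentTypeOf(uri) {
  const queryStart = uri.indexOf('?');
  if (queryStart === -1) return null;
  for (const pair of uri.slice(queryStart + 1).split('&')) {
    if (pair.startsWith('ct=')) return decodeURIComponent(pair.slice(3));
  }
  return null;
}

const typed = typedNimh(
  'ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ',
  'text/plain'
);
console.log(typed);
// ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ?ct=text/plain
console.log(contentTypeOf(typed)); // text/plain
```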
NIMH-SHA2-256 is a valid RFC6920 ni URI as long as no specification other than Multihash Appendix D.3 is allowed to register the mh Hash Name String in the Named Information Hash Algorithm Registry at IANA.
NIMH-SHA2-256 is no less secure as a cryptographic name than a raw SHA2-256 hash.
Wrapping a hash in a NIMH provides the benefits of self-describing data, cryptographic agility, URI Syntax conformance, an affordance for extensions using IETF Media Types that composes well with other IETF RFCs, and the ability to share the existing IANA Media Type Registry for application-specific extensions.
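The self-describing property is what lets a NIMH double as a checksum: the URI itself says which algorithm to run. The following is a minimal verification sketch, assuming single-byte varints for the code and length, and supporting only sha2-256 because Node.js has no built-in BLAKE3. The names verifyNimh and SUPPORTED_HASHES are hypothetical.

```javascript
const { createHash, timingSafeEqual } = require('node:crypto');

// Multihash codes this sketch understands, mapped to Node.js hash names.
const SUPPORTED_HASHES = { 0x12: 'sha256' };

// Hypothetical helper: check that data matches the digest named by a NIMH.
function verifyNimh(nimh, data) {
  const prefix = 'ni:///mh;';
  if (!nimh.startsWith(prefix)) throw new Error('not a NIMH');
  // Drop any query string (for example a ct attribute) before decoding.
  const encoded = nimh.slice(prefix.length).split('?')[0];
  const bytes = Buffer.from(encoded, 'base64url');
  const code = bytes[0];           // assumes a single-byte varint code
  const length = bytes[1];         // assumes a single-byte varint length
  const expected = bytes.subarray(2);
  const algorithm = SUPPORTED_HASHES[code];
  if (!algorithm) throw new Error(`unsupported multihash code ${code}`);
  if (expected.length !== length) return false;
  const actual = createHash(algorithm).update(data).digest();
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

const nimh = 'ni:///mh;EiB_g7Flf_H8U7ktwYFIodZd_C1LH6PWdyhK3dIAEm2QaQ';
console.log(verifyNimh(nimh, 'Hello World!')); // true
console.log(verifyNimh(nimh, 'Hello world!')); // false
```

Supporting another algorithm would mean adding its code to the table, which is one way to picture the cryptographic agility mentioned above.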