Modern AI systems operate on something far more abstract than words, bytes, or tokens. They operate on forms.
Inside contemporary models we find:
This is not semantics. This is morphology.
Yet classical information theory still lacks a universal unit capable of describing the shape of information, the way a bit describes its quantity.
We propose such a unit: the subit-morph.
A subit-morph is a morphological fingerprint of an informational flow, derived from the frequency distribution of the 64 possible 6‑bit configurations observed in that flow.
Any data stream → bitstring → sliding window of 6 bits:
Code
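# bits: a string of '0'/'1' characters; one window per start position i
for i in range(len(bits) - 5):
    s_i = bits[i : i+6]
    k_i = int(s_i, 2)  # k_i ∈ {0..63}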
Frequency distribution:
Code
This yields a 64-dimensional morphological profile.
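The two steps above can be sketched end to end as follows. This is a minimal sketch under stated assumptions: bytes are expanded most-significant-bit first, the window advances one bit at a time, and the names `to_bitstring` and `subit_profile` are illustrative rather than part of any published implementation.

```python
from collections import Counter


def to_bitstring(data: bytes) -> str:
    """Concatenate the 8-bit binary form of every byte (MSB first, an assumption)."""
    return "".join(f"{byte:08b}" for byte in data)


def subit_profile(data: bytes, width: int = 6) -> list[float]:
    """Return the normalized histogram of all `width`-bit windows (stride 1, assumed)."""
    bits = to_bitstring(data)
    n_windows = len(bits) - width + 1
    if n_windows <= 0:
        raise ValueError("need at least `width` bits of input")
    # k = integer value of each window, as in k_i = int(s_i, 2)
    counts = Counter(int(bits[i:i + width], 2) for i in range(n_windows))
    # Counter returns 0 for configurations that never occur
    return [counts[k] / n_windows for k in range(2 ** width)]


profile = subit_profile(b"To be, or not to be")
```

The result is a list of 64 non-negative values that sum to 1, indexed by the configuration value k.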
2. The geometry of a morph
A subit-morph is not just a vector. It is a geometric object.
2.1. Flat morphology (8×8 grid)
Code
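One simple way to lay the 64 frequencies out on an 8×8 grid is row-major order, so that the high three bits of k select the row and the low three bits select the column. That bit-to-axis assignment is an assumption made for illustration, as is the helper name `to_grid`.

```python
def to_grid(profile: list[float]) -> list[list[float]]:
    """Arrange 64 values as 8 rows of 8 (assumed: row = high 3 bits, column = low 3 bits)."""
    if len(profile) != 64:
        raise ValueError("expected a 64-value profile")
    return [profile[row * 8:(row + 1) * 8] for row in range(8)]


# Toy profile with all mass on configuration k = 0b101011.
toy = [0.0] * 64
toy[0b101011] = 1.0
grid = to_grid(toy)  # the mass lands at row 0b101, column 0b011
```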
2.2. Volumetric morphology (4×4×4 cube)
Code
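For the volumetric form, a plausible mapping splits k into three consecutive 2-bit groups and uses each group as one coordinate in {0..3}. The sketch below assumes that mapping; the axis order and the name `to_cube` are hypothetical.

```python
def to_cube(profile: list[float]) -> list[list[list[float]]]:
    """Arrange 64 values as a 4x4x4 cube (assumed: k = 16*x + 4*y + z)."""
    if len(profile) != 64:
        raise ValueError("expected a 64-value profile")
    return [
        [[profile[16 * x + 4 * y + z] for z in range(4)] for y in range(4)]
        for x in range(4)
    ]


# Toy profile with all mass on k = 0b101011, i.e. bit groups 10, 10, 11.
toy = [0.0] * 64
toy[0b101011] = 1.0
cube = to_cube(toy)  # the mass lands at x = 2, y = 2, z = 3
```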
This representation reveals:
3.1. Entropy E — degree of randomness
Code
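A natural candidate for E is the Shannon entropy of the 64-bin profile, divided by its maximum of 6 bits so that E lies in [0, 1]. The normalization and the name `normalized_entropy` are assumptions of this sketch.

```python
import math


def normalized_entropy(profile: list[float]) -> float:
    """Shannon entropy in bits, scaled by log2(number of bins) (assumed normalization)."""
    # Empty bins contribute nothing, by the convention 0 * log(1/0) = 0.
    h = sum(p * math.log2(1.0 / p) for p in profile if p > 0)
    return h / math.log2(len(profile))


uniform = [1 / 64] * 64            # every configuration equally likely
single = [1.0] + [0.0] * 63        # one configuration only
e_uniform = normalized_entropy(uniform)
e_single = normalized_entropy(single)
```

Under this reading, a uniform profile sits at the top of the scale and a profile concentrated on one configuration sits at the bottom.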
3.2. Anisotropy A — directional structure
Code
Let λ1 ≥ λ2 ≥ λ3 be the eigenvalues of C.
Code
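One possible concrete reading, offered as an assumption: take C to be the 3×3 covariance matrix of the cube coordinates (x, y, z), weighted by the profile, and measure anisotropy as the normalized eigenvalue spread (λ1 − λ3) / (λ1 + λ2 + λ3). The sketch below uses the standard closed-form (trigonometric) solution for the eigenvalues of a symmetric 3×3 matrix so that it needs only the standard library; all function names are hypothetical.

```python
import math


def cube_coords(k: int) -> tuple[int, int, int]:
    """Assumed layout: three 2-bit groups of k give (x, y, z) in {0..3}."""
    return (k >> 4) & 3, (k >> 2) & 3, k & 3


def weighted_covariance(profile: list[float]) -> list[list[float]]:
    """3x3 covariance of cube coordinates, weighted by the profile."""
    points = [cube_coords(k) for k in range(64)]
    mean = [sum(p * pt[a] for p, pt in zip(profile, points)) for a in range(3)]
    return [
        [sum(p * (pt[a] - mean[a]) * (pt[b] - mean[b])
             for p, pt in zip(profile, points))
         for b in range(3)]
        for a in range(3)
    ]


def eigenvalues_sym3(c: list[list[float]]) -> tuple[float, float, float]:
    """Eigenvalues of a symmetric 3x3 matrix, largest first (trigonometric method)."""
    q = (c[0][0] + c[1][1] + c[2][2]) / 3
    off = c[0][1] ** 2 + c[0][2] ** 2 + c[1][2] ** 2
    p2 = sum((c[i][i] - q) ** 2 for i in range(3)) + 2 * off
    if p2 == 0:
        return q, q, q  # matrix is a multiple of the identity
    p = math.sqrt(p2 / 6)
    b = [[(c[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2))  # clamp against rounding error
    phi = math.acos(r) / 3
    lam1 = q + 2 * p * math.cos(phi)
    lam3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    lam2 = 3 * q - lam1 - lam3  # the trace equals the sum of eigenvalues
    return lam1, lam2, lam3


def anisotropy(profile: list[float]) -> float:
    """Assumed definition: (lam1 - lam3) / (lam1 + lam2 + lam3), or 0 if there is no spread."""
    lam1, lam2, lam3 = eigenvalues_sym3(weighted_covariance(profile))
    total = lam1 + lam2 + lam3
    return 0.0 if total <= 0 else (lam1 - lam3) / total


uniform = [1 / 64] * 64
line = [0.0] * 64
for k in (0, 16, 32, 48):  # mass spread along the x axis only
    line[k] = 0.25
a_uniform = anisotropy(uniform)
a_line = anisotropy(line)
```

With this definition a uniform cube has no preferred direction, while mass confined to a single axis is maximally directional. In practice a linear-algebra library routine for symmetric eigenvalues would replace `eigenvalues_sym3`.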
3.3. Morphological tension T — radial structure
Code
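The radial reading of T can be sketched as the profile-weighted mean distance from the center of the 4×4×4 cube, divided by the largest possible distance (that of a corner cell). Both this definition and its normalization are assumptions made for illustration, as are the names below.

```python
import math

CENTER = (1.5, 1.5, 1.5)                     # geometric center of the 4x4x4 cube
MAX_RADIUS = math.dist((0, 0, 0), CENTER)    # distance from the center to a corner cell


def cube_coords(k: int) -> tuple[int, int, int]:
    """Assumed layout: three 2-bit groups of k give (x, y, z) in {0..3}."""
    return (k >> 4) & 3, (k >> 2) & 3, k & 3


def tension(profile: list[float]) -> float:
    """Assumed definition: weighted mean radius, scaled so a corner cell gives 1."""
    mean_radius = sum(
        p * math.dist(cube_coords(k), CENTER) for k, p in enumerate(profile)
    )
    return mean_radius / MAX_RADIUS


corner = [0.0] * 64
corner[0] = 1.0          # all mass at cell (0, 0, 0)
inner = [0.0] * 64
inner[21] = 1.0          # all mass at cell (1, 1, 1), next to the center
t_corner = tension(corner)
t_inner = tension(inner)
```

Because no cell sits exactly at the center, this version of T ranges from 1/3 (mass on the eight innermost cells) to 1 (mass on the corners).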
Every informational flow receives a morphological address: the triple (E, A, T) of its entropy, anisotropy, and tension.
This is a point in the morphological space of information.
Two flows with the same address are morphological isomorphs, even if one is Hamlet’s monologue and the other is a soup recipe.
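Comparing two addresses can be sketched as a distance in (E, A, T) space. The Euclidean metric, the tolerance, and the names `MorphAddress`, `address_distance`, and `are_isomorphs` are assumptions of this sketch, and the numeric values are arbitrary placeholders, not measurements.

```python
import math
from typing import NamedTuple


class MorphAddress(NamedTuple):
    """A point (E, A, T) in morphological space."""
    entropy: float
    anisotropy: float
    tension: float


def address_distance(a: MorphAddress, b: MorphAddress) -> float:
    """Euclidean distance between two addresses (assumed metric)."""
    return math.dist(a, b)


def are_isomorphs(a: MorphAddress, b: MorphAddress, tol: float = 1e-3) -> bool:
    """Treat two flows as isomorphs when their addresses agree within `tol`."""
    return address_distance(a, b) <= tol


# Placeholder addresses for two hypothetical flows.
flow_a = MorphAddress(entropy=0.80, anisotropy=0.10, tension=0.60)
flow_b = MorphAddress(entropy=0.80, anisotropy=0.10, tension=0.60)
flow_c = MorphAddress(entropy=0.50, anisotropy=0.50, tension=0.60)
same = are_isomorphs(flow_a, flow_b)
different = are_isomorphs(flow_a, flow_c)
```

A tolerance is used instead of exact equality because addresses computed from finite streams are floating-point estimates.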
5. Existing approaches: where the “language of AI” stands today
Subit-morphs emerge at the intersection of several major research directions that attempt to describe the internal language of models.
5.1. Embeddings and latent spaces
Modern models represent concepts as vectors in high-dimensional spaces with:
But embeddings are:
unstable across architectures
Subit-morphs are model-independent: they are computed from the data stream alone, without reference to any architecture.
5.2. Mechanistic interpretability
This field attempts to:
decompose models into circuits
identify neurons tied to specific concepts
understand internal operations
It provides local insights but:
lacks a global morphological unit
cannot compare different models
does not operate on arbitrary data streams
Subit-morphs are global and universal.
5.3. Sparse autoencoders and “features”
Sparse autoencoders extract:
more interpretable internal units
stable internal structures
But:
features depend on the model
Subit-morphs are invariant to the choice of model.
5.4. Topological Data Analysis (TDA)
TDA describes shape using:
But TDA:
does not operate on streams
lacks compact coordinates
Subit-morphs are compact and stream-based.
5.5. Hashing and signatures
Hashes provide compact fingerprints but:
do not preserve structure
Subit-morphs are morphological, not cryptographic.
6. Why subit-morphs are the next logical step
All existing approaches describe the internal language of models, but none provide:
Subit-morphs offer:
a universal morphological unit
a universal space (E, A, T)
a universal method for comparing flows
We propose it as a language of form readable by both humans and models.
7. Experimental examples
natural language → mid-range values
model outputs → geometric patterns
Subit-morphs are:
a new way to see information
a bridge between human language and the internal language of models
a tool for interpretability
a foundation for a morphological internet
This is not a replacement for existing methods. It is a morphological superlayer that unifies them.