SUBIT-MORPHS: A UNIVERSAL LANGUAGE OF FORM FOR INFORMATIONAL FLOWS

Introduction: When information begins to have shape

Modern AI systems operate on something far more abstract than words, bytes, or tokens. They operate on forms.

Inside contemporary models we find:

  • latent spaces

  • directional features

  • topologies of meaning

  • geometric structures

This is not semantics. This is morphology.

Yet classical information theory still lacks a universal unit capable of describing the shape of information, the way a bit describes its quantity.

We propose such a unit: the subit-morph.

1. The Subit-Morph: an atom of form

A subit-morph is a morphological fingerprint of an informational flow, derived from how often each of the 64 possible 6‑bit configurations occurs in it.

Any data stream → bitstring → sliding window of 6 bits:

Code

# bits is a string of '0'/'1' characters; i runs over 0 .. len(bits) - 6
s_i = bits[i : i+6]
k_i = int(s_i, 2)     # k_i ∈ {0..63}

Frequency distribution: for each k from 0 to 63, p_k is the fraction of windows whose value is k.

This yields a 64-dimensional morphological profile.
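The windowing and counting steps can be sketched in Python as follows. This is a minimal sketch rather than a reference implementation: the names `to_bits` and `morph_profile` are hypothetical, and it assumes that bytes are expanded most significant bit first and that the window advances one bit at a time, neither of which the text specifies.

```python
from collections import Counter

def to_bits(data):
    """Render bytes as a string of '0'/'1' characters (assumed: MSB first)."""
    return "".join(f"{byte:08b}" for byte in data)

def morph_profile(bits, width=6):
    """Relative frequency of every width-bit value over all sliding windows."""
    n_windows = len(bits) - width + 1
    if n_windows <= 0:
        raise ValueError("need at least `width` bits")
    # Assumed stride of 1: one window per bit position.
    counts = Counter(int(bits[i:i + width], 2) for i in range(n_windows))
    return [counts[k] / n_windows for k in range(2 ** width)]

profile = morph_profile(to_bits(b"to be, or not to be"))
print(len(profile), round(sum(profile), 6))
```

The result is a list of 64 non-negative numbers that sum to 1, indexed by the integer value of the window.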

2. The geometry of a morph

A subit-morph is not just a vector. It is a geometric object.

2.1. Flat morphology (8×8 grid)

The 64 values of the profile are laid out as an 8×8 grid, one cell per 6‑bit configuration.
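A possible layout for the flat view is sketched below. The mapping from k to a cell is an assumption (row from the three high bits, column from the three low bits), and `to_grid` and `render` are hypothetical helper names used only for illustration.

```python
def to_grid(profile):
    """Arrange 64 values as 8 rows of 8 (assumed: row = k // 8, col = k % 8)."""
    if len(profile) != 64:
        raise ValueError("expected a 64-value profile")
    return [list(profile[row * 8:(row + 1) * 8]) for row in range(8)]

def render(grid):
    """Crude text heat map: heavier cells get denser glyphs."""
    shades = " .:*#"
    peak = max(max(row) for row in grid) or 1.0  # avoid dividing by zero
    lines = []
    for row in grid:
        glyphs = (shades[min(len(shades) - 1, int(v / peak * len(shades)))]
                  for v in row)
        lines.append("".join(glyphs))
    return "\n".join(lines)

demo_grid = to_grid([k / 2016 for k in range(64)])  # synthetic ramp, not real data
print(render(demo_grid))
```

Any other bijection between the 64 values and the 64 cells would give a different picture of the same profile, so the layout has to be fixed before two grids can be compared.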

2.2. Volumetric morphology (4×4×4 cube)

The same 64 values are arranged as a 4×4×4 cube, one cell per 6‑bit configuration.

This representation reveals:

  • cores

  • mantles

  • peripheries

  • symmetries

  • directional structures
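One way to build the cube and to see how much of the profile sits in its inner and outer cells is sketched here. The coordinate layout (two bits per axis, k = 16x + 4y + z) and the shell index are assumptions made for illustration. The text names cores, mantles, and peripheries but does not define their boundaries, so the four shells below are a hypothetical grouping.

```python
def cube_coords(k):
    """Split a 6-bit value into three 2-bit coordinates (x, y, z), each 0..3."""
    return (k >> 4) & 3, (k >> 2) & 3, k & 3

def to_cube(profile):
    """Nested 4x4x4 list with cube[x][y][z] = profile[k]."""
    cube = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for k, p in enumerate(profile):
        x, y, z = cube_coords(k)
        cube[x][y][z] = p
    return cube

def shell_index(k):
    """Number of coordinates on an outer face: 0 = innermost cells, 3 = corners."""
    return sum(c in (0, 3) for c in cube_coords(k))

def shell_masses(profile):
    """Total profile mass in each of the four shells, innermost first."""
    masses = [0.0] * 4
    for k, p in enumerate(profile):
        masses[shell_index(k)] += p
    return masses

uniform = [1 / 64] * 64
print(shell_masses(uniform))
```

For a uniform profile the masses simply reflect the cell counts of the shells (8, 24, 24, and 8 cells), so deviations from those proportions indicate mass concentrated toward the inside or the outside of the cube.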

3. Three fundamental metrics of form

3.1. Entropy E — degree of randomness

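E is the Shannon entropy of the profile, scaled to lie between 0 (a single configuration occurs) and 1 (all 64 configurations are equally frequent).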
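A minimal sketch of an entropy metric follows, assuming E is the Shannon entropy of the 64-value profile divided by its maximum, log2(64) = 6 bits. That normalization is an inference from the value E≈1 listed for π in Section 7, and `morph_entropy` is a hypothetical name.

```python
import math

def morph_entropy(profile):
    """Shannon entropy in bits, divided by log2(len(profile)) so 0 <= E <= 1."""
    bits = -sum(p * math.log2(p) for p in profile if p > 0)
    return bits / math.log2(len(profile))

uniform = [1 / 64] * 64            # every configuration equally frequent
single = [1.0] + [0.0] * 63        # only one configuration ever occurs
print(morph_entropy(uniform), morph_entropy(single))
```

Under this assumption a stream that uses only a handful of configurations scores well below 1, regardless of which configurations they are.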

3.2. Anisotropy A — directional structure

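Treat the profile as a distribution of mass over the cells of the 4×4×4 cube, and let C be the 3×3 matrix that describes its spread along the three axes.

Let λ1 ≥ λ2 ≥ λ3 be the eigenvalues of C.

A is computed from these eigenvalues. It is close to 0 when they are nearly equal, as for the π example in Section 7.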
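The sketch below shows one hypothetical way to obtain such eigenvalues and turn them into a number between 0 and 1. Everything specific in it is an assumption: C is taken to be the covariance matrix of the cell coordinates weighted by the profile, the cube layout is k = 16x + 4y + z, and A = (λ1 − λ3) / (λ1 + λ2 + λ3) is only one possible normalization. The eigenvalues come from the standard closed-form solution for symmetric 3×3 matrices, so no external library is needed.

```python
import math

def cube_coords(k):
    # Assumed layout: two bits per axis, k = 16*x + 4*y + z.
    return (k >> 4) & 3, (k >> 2) & 3, k & 3

def covariance(profile):
    """3x3 covariance of cell coordinates, weighting cell k by profile[k]."""
    pts = [cube_coords(k) for k in range(64)]
    mean = [sum(p * pt[a] for p, pt in zip(profile, pts)) for a in range(3)]
    return [[sum(p * (pt[a] - mean[a]) * (pt[b] - mean[b])
                 for p, pt in zip(profile, pts))
             for b in range(3)] for a in range(3)]

def eigenvalues_sym3(m):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (trigonometric form)."""
    q = (m[0][0] + m[1][1] + m[2][2]) / 3
    off = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    p2 = sum((m[i][i] - q) ** 2 for i in range(3)) + 2 * off
    if p2 < 1e-24:                      # matrix is (numerically) q * identity
        return q, q, q
    p = math.sqrt(p2 / 6)
    b = [[(m[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    phi = math.acos(max(-1.0, min(1.0, det / 2))) / 3
    l1 = q + 2 * p * math.cos(phi)
    l3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return l1, 3 * q - l1 - l3, l3

def anisotropy(profile):
    """Assumed definition: (l1 - l3) / (l1 + l2 + l3), or 0 if there is no spread."""
    l1, l2, l3 = eigenvalues_sym3(covariance(profile))
    total = l1 + l2 + l3
    return (l1 - l3) / total if total > 0 else 0.0

uniform = [1 / 64] * 64
line = [0.0] * 64
line[0] = line[48] = 0.5               # cells (0,0,0) and (3,0,0): one axis only
print(anisotropy(uniform), anisotropy(line))
```

With this definition a profile spread evenly through the cube gives a value near 0 and a profile confined to a single axis gives a value near 1.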

3.3. Morphological tension T — radial structure

T describes how the mass of the profile is distributed between the core of the cube and its periphery.
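No formula for T is given here, so the following is only a hypothetical stand-in that behaves the way the section describes: it is 0 for a uniform profile (consistent with T≈0 for π in Section 7) and grows as mass moves toward the center or toward the corners of the cube. The cube layout, the use of squared distance from the center, and the normalization are all assumptions.

```python
def cube_coords(k):
    # Assumed layout: two bits per axis, k = 16*x + 4*y + z.
    return (k >> 4) & 3, (k >> 2) & 3, k & 3

def radius_sq(k):
    """Squared distance of cell k from the cube center (1.5, 1.5, 1.5)."""
    return sum((c - 1.5) ** 2 for c in cube_coords(k))

# Mean squared radius when every cell carries equal mass (3.75).
UNIFORM_MEAN = sum(radius_sq(k) for k in range(64)) / 64
# Largest possible shift from that mean: all mass in the innermost cells
# (0.75) or all mass in the corners (6.75), i.e. 3.0 either way.
MAX_SHIFT = 3.0

def tension(profile):
    """Hypothetical T: normalized shift of the mean squared radius, in 0..1."""
    mean = sum(p * radius_sq(k) for k, p in enumerate(profile))
    return abs(mean - UNIFORM_MEAN) / MAX_SHIFT

uniform = [1 / 64] * 64
corner = [1.0] + [0.0] * 63            # all mass in cell (0, 0, 0)
print(tension(uniform), tension(corner))
```

A signed variant (dropping `abs`) would additionally distinguish inward from outward concentration. Which of the two matches the intended metric cannot be determined from the text.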

4. The SUBIT-address: coordinates of form

Every informational flow receives a morphological address:

address = (E, A, T)

This is a point in the morphological space of information.

Two flows with the same address are morphological isomorphs, even if one is Hamlet’s monologue and the other is a soup recipe.
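In practice two computed addresses will rarely be bit-for-bit equal, so a comparison needs some tolerance. The sketch below treats an address as a point (E, A, T) and compares two points by Euclidean distance. The class name, the choice of distance, the tolerance, and the numbers are all hypothetical placeholders, not measured values.

```python
import math
from typing import NamedTuple

class SubitAddress(NamedTuple):
    E: float  # entropy
    A: float  # anisotropy
    T: float  # morphological tension

def address_distance(a, b):
    """Euclidean distance between two addresses (an assumed choice of metric)."""
    return math.dist(a, b)

def same_address(a, b, tol=1e-3):
    """Treat two flows as isomorphs if their addresses agree within tol."""
    return address_distance(a, b) <= tol

# Placeholder numbers for illustration only.
flow_1 = SubitAddress(E=0.80, A=0.10, T=0.30)
flow_2 = SubitAddress(E=0.80, A=0.10, T=0.3004)
flow_3 = SubitAddress(E=0.95, A=0.02, T=0.05)
print(same_address(flow_1, flow_2), same_address(flow_1, flow_3))
```

Because three numbers summarize 64, many different profiles map to the same address. Equality of addresses is therefore a much weaker statement than equality of profiles.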

5. Existing approaches: where the “language of AI” stands today

Subit-morphs emerge at the intersection of several major research directions that attempt to describe the internal language of models.

5.1. Embeddings and latent spaces

Modern models represent concepts as vectors in high-dimensional spaces with:

  • clusters

  • directions

  • manifolds

But embeddings are:

  • model-dependent

  • unstable across architectures

  • not universal

Subit-morphs are model-independent.

5.2. Mechanistic interpretability

This field attempts to:

  • decompose models into circuits

  • identify neurons tied to specific concepts

  • understand internal operations

It provides local insights but:

  • lacks a global morphological unit

  • cannot compare different models

  • does not operate on arbitrary data streams

Subit-morphs are global and universal.

5.3. Sparse autoencoders and “features”

Sparse autoencoders extract:

  • cleaner features

  • more interpretable internal units

  • stable internal structures

But:

  • features depend on the model

  • no universal space

  • no invariant

Subit-morphs are invariant.

5.4. Topological Data Analysis (TDA)

TDA describes shape using:

  • homologies

  • persistence diagrams

  • topological invariants

But:

  • requires large datasets

  • does not operate on streams

  • lacks compact coordinates

Subit-morphs are compact and stream-based.

5.5. Hashing and signatures

Hashes provide compact fingerprints but:

  • do not preserve structure

  • cannot compare shapes

  • lack topological meaning

Subit-morphs are morphological, not cryptographic.
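The contrast with hashing can be illustrated with a small experiment: change one character of a text and compare how far the SHA-256 digest moves with how far the 64-value profile moves. This is a sketch under the same assumptions as before (bytes expanded most significant bit first, window stride of 1). `morph_profile` is a hypothetical helper and the L1 gap is just one convenient way to compare two profiles.

```python
import hashlib
from collections import Counter

def morph_profile(data, width=6):
    """Relative frequency of each width-bit value over all sliding windows."""
    bits = "".join(f"{byte:08b}" for byte in data)
    n = len(bits) - width + 1
    counts = Counter(int(bits[i:i + width], 2) for i in range(n))
    return [counts[k] / n for k in range(2 ** width)]

original = b"the quick brown fox jumps over the lazy dog. " * 8
edited = original.replace(b"quick", b"quack", 1)   # one byte differs

digest_a = hashlib.sha256(original).hexdigest()
digest_b = hashlib.sha256(edited).hexdigest()
matching_hex = sum(x == y for x, y in zip(digest_a, digest_b))

# Sum of absolute differences between the two profiles (0 = identical, 2 = disjoint).
l1_gap = sum(abs(p - q)
             for p, q in zip(morph_profile(original), morph_profile(edited)))

print(f"matching hex digits: {matching_hex} of {len(digest_a)}")
print(f"profile L1 gap: {l1_gap:.5f}")
```

A one-byte edit touches only the few windows that overlap it, so the profile gap is bounded by the fraction of affected windows, while the digest is designed to change unpredictably.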

6. Why subit-morphs are the next logical step

All existing approaches describe the internal language of models, but none provide:

  • a universal unit

  • a universal space

  • a universal invariant

Subit-morphs offer:

  • a universal morphological unit

  • a universal space (E, A, T)

  • a universal address

  • a universal method for comparing flows

This is the first language of form readable by both humans and models.

7. Experimental examples

  • π → E≈1, A≈0, T≈0

  • natural language → mid-range values

  • code → high anisotropy

  • DNA → high tension

  • model outputs → geometric patterns
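Checks of this kind can be set up along the following lines. The sketch does not reproduce the values listed above: it uses pseudo-random bytes as a stand-in for a structureless stream and a two-letter repetition as a highly regular one, and it computes only the entropy metric, assuming E is Shannon entropy divided by 6 bits. `entropy_of` is a hypothetical helper.

```python
import math
import random
from collections import Counter

def entropy_of(data, width=6):
    """Normalized Shannon entropy of the sliding-window distribution of data."""
    bits = "".join(f"{byte:08b}" for byte in data)
    n = len(bits) - width + 1
    counts = Counter(int(bits[i:i + width], 2) for i in range(n))
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    return h / width                     # log2(2 ** width) == width

rng = random.Random(0)                   # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(20000))
repetitive = b"ab" * 10000

e_noise = entropy_of(noise)
e_repetitive = entropy_of(repetitive)
print(f"pseudo-random bytes: E = {e_noise:.4f}")
print(f"repeated 'ab':       E = {e_repetitive:.4f}")
```

The repeated text has a bit pattern with period 16, so at most 16 distinct windows can occur and its entropy cannot exceed 4 of the 6 available bits.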

Conclusion

Subit-morphs are:

  • a new morphological unit

  • a new way to see information

  • a new language of form

  • a bridge between human language and the internal language of models

  • a tool for interpretability

  • a foundation for a morphological internet

This is not a replacement for existing methods. It is a morphological superlayer that unifies them.
