Ternary Batch Matrix Multiplication (GEMM)
One execution unit handles both ternary and conventional — switchable precision in a single core.
Explore the Vision
Discover this technology through five complementary perspectives — from technical architecture to partnership outcomes. Each layer reveals a different aspect of how this innovation creates value.
What It IS
Technical Vision
The architectural essence: what makes this technology work
A single processing element with two faces — one crystalline and ternary, one smooth and conventional. A configuration signal flips between modes mid-inference, running ternary layers at extreme efficiency and conventional layers at full precision. Bilingual silicon.
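The mode switch described above can be illustrated with a small software model. This is a minimal sketch, not the actual hardware design: the names `Mode`, `ProcessingElement`, `configure`, and `run_layer` are hypothetical, and the model assumes that ternary mode restricts weights to -1, 0, and +1 so each multiply-accumulate reduces to an add, a subtract, or a skip, while conventional mode performs an ordinary multiply-accumulate.

```python
from enum import Enum
from typing import Sequence


class Mode(Enum):
    """Hypothetical configuration signal for the processing element."""
    TERNARY = "ternary"
    CONVENTIONAL = "conventional"


class ProcessingElement:
    """Toy model of one execution unit with a switchable datapath."""

    def __init__(self) -> None:
        self.mode = Mode.CONVENTIONAL
        self.acc = 0.0

    def configure(self, mode: Mode) -> None:
        # Stands in for the configuration signal that flips the unit's mode.
        self.mode = mode

    def reset(self) -> None:
        self.acc = 0.0

    def mac(self, x: float, w: float) -> None:
        if self.mode is Mode.TERNARY:
            # Assumed ternary datapath: no multiplier, only add/subtract/skip.
            if w not in (-1, 0, 1):
                raise ValueError("ternary mode expects weights in {-1, 0, 1}")
            if w == 1:
                self.acc += x
            elif w == -1:
                self.acc -= x
        else:
            # Conventional datapath: full multiply-accumulate.
            self.acc += x * w


def run_layer(pe: ProcessingElement, mode: Mode,
              xs: Sequence[float], ws: Sequence[float]) -> float:
    """Compute one dot product on the element in the requested mode."""
    pe.configure(mode)
    pe.reset()
    for x, w in zip(xs, ws):
        pe.mac(x, w)
    return pe.acc
```

In this model the same `ProcessingElement` instance can serve a ternary layer and then a conventional layer back to back, which mirrors the idea of switching modes between layers during a single inference pass.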
Abstract
An optimized batch matrix multiplication (GEMM) kernel for ternary data on GPUs and NPUs, achieving 90% of theoretical peak throughput on modern silicon.
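To make the operation concrete, here is a reference sketch of batch matrix multiplication with ternary weights in plain Python. It is an illustration of the arithmetic only, not the optimized kernel: the function names `ternary_matmul` and `ternary_batch_matmul` are hypothetical, and the sketch assumes the weight matrices hold values in {-1, 0, +1} while activations are ordinary numbers.

```python
from typing import List

Matrix = List[List[float]]
TernaryMatrix = List[List[int]]


def ternary_matmul(a: Matrix, w: TernaryMatrix) -> Matrix:
    """Multiply a (m x k) by a ternary w (k x n) using only adds and subtracts."""
    m, k, n = len(a), len(w), len(w[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            x = a[i][p]
            for j in range(n):
                t = w[p][j]
                if t == 1:
                    out[i][j] += x
                elif t == -1:
                    out[i][j] -= x
                elif t != 0:
                    raise ValueError("weights must be -1, 0, or 1")
                # t == 0 contributes nothing, so the work is skipped.
    return out


def ternary_batch_matmul(batch_a: List[Matrix],
                         batch_w: List[TernaryMatrix]) -> List[Matrix]:
    """Apply ternary_matmul independently to each (a, w) pair in the batch."""
    if len(batch_a) != len(batch_w):
        raise ValueError("batch sizes do not match")
    return [ternary_matmul(a, w) for a, w in zip(batch_a, batch_w)]
```

Because every weight is -1, 0, or +1, the inner loop never multiplies. That property is what a hardware or GPU kernel could exploit; how the real kernel packs, tiles, and schedules the data is not described in the source and is not modeled here.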
Related Patents
From the silicon-blueprint visual family
Ternary NPU Compiler Optimization Passes
The ternary chip itself — a complete microarchitecture specification.
Ternary Model Compression via Knowledge Distillation
Memory redesigned from the ground up for three-valued data.
Hardware-Aware Ternary Network Architecture Search
Ternary data streams through dataflow hardware — bandwidth-optimal inference.
Ternary Tensor Decomposition and Factorization
The compiler knows the hardware — scheduling ternary execution across heterogeneous cores.