21 Patents

NPU Inference Core

Ternary neural network optimisation for binary Neural Processing Units (NPUs): the foundational stack that enables {-1, 0, +1} inference on existing NPU silicon without hardware modification.
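As a rough illustration of what {-1, 0, +1} inference involves, the sketch below quantises a weight vector to three values and evaluates a dot product using only additions and subtractions. It is not taken from any patent listed here. The function names, the symmetric threshold, and the split into two binary masks are all assumptions made for illustration; the two-mask form is simply one plausible way a ternary vector could be expressed in binary terms.

```python
def ternarize(weights, threshold):
    """Map each real weight to -1, 0, or +1 using a symmetric threshold (assumed scheme)."""
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def split_masks(ternary):
    """Split a ternary vector into two binary masks: where it is +1 and where it is -1."""
    pos = [1 if t == 1 else 0 for t in ternary]
    neg = [1 if t == -1 else 0 for t in ternary]
    return pos, neg


def ternary_dot(activations, ternary):
    """Dot product with ternary weights: add where +1, subtract where -1, skip zeros."""
    pos, neg = split_masks(ternary)
    plus = sum(a for a, p in zip(activations, pos) if p)
    minus = sum(a for a, n in zip(activations, neg) if n)
    return plus - minus


# Hypothetical example values, not from the source.
weights = [0.9, -0.05, -0.7, 0.02, 0.4]
ternary = ternarize(weights, threshold=0.1)   # [1, 0, -1, 0, 1]
result = ternary_dot([2, 5, 3, 7, 4], ternary)  # 2 + 4 - 3 = 3
```

No multiplications appear in `ternary_dot`, and positions with a zero weight contribute no work at all, which is the property the zero-skipping and sparsity entries in this list rely on.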

Ternary Neural Processing Unit Architecture for Binary NPU Optimization

Existing chips run ternary — no new silicon required.

Zero-Skip Gating for Ternary Neural Networks

Normalisation layers dissolve into the ternary fabric — no floating-point tax.

Ternary Weight Pruning and Sparsification

Weights and activations co-designed — the whole pipeline speaks three values.

Batch Normalization in Ternary Quantized Networks

Prune the tree, then ternarise what remains — 200× smaller models.

Mixed-Precision Ternary Inference Scheduling

The architecture searches itself — evolution finds the optimal ternary shape.

Cache-Aware Ternary Inference on NPU

Precision shifts on the fly — full power when needed, whisper-quiet when not.

Activation Function Approximation for Ternary Domains

Multiple chips think as one — ternary models spanning beyond a single die.

Ternary Post-Training Quantization

A master teaches a student in three values — knowledge distilled to its essence.

Ternary Convolution Kernel Optimization

The quantisation boundary learns where to draw itself.

Recurrent Neural Network Inference in Ternary Domain

Gradients compressed to three values — 8× less bandwidth across the training cluster.

Attention Mechanism Compression via Ternary Quantization

The transformer attention mechanism — rebuilt for three values.

Ternary NPU Compiler Optimization Passes

The ternary chip itself — a complete microarchitecture specification.

Ternary Sparse Tensor Operations

The nervous system mapped into silicon — biology's architecture in ternary.

Dynamic Precision Selection for Ternary Inference

Zero weights gate their own clocks — 70% of the chip sleeps while the rest thinks.

Ternary Batch Matrix Multiplication (GEMM)

One execution unit handles both ternary and conventional — switchable precision in a single core.

Ternary Model Compression via Knowledge Distillation

Memory redesigned from the ground up for three-valued data.

Hardware-Aware Ternary Network Architecture Search

Ternary data streams through dataflow hardware — bandwidth-optimal inference.

Ternary Tensor Decomposition and Factorization

The compiler knows the hardware — scheduling ternary execution across heterogeneous cores.

Ternary Graph Neural Networks

Today's commercial NPUs run ternary models through translation — no hardware changes.

Ternary Reinforcement Learning Agents

The scheduler sees the zeros and skips them — sparsity-aware execution on neural engines.

Diffusion Models in Ternary Domain

CPU and NPU collaborate — each layer runs where it fits best.