Diffusion Models in Ternary Domain
CPU and NPU collaborate — each layer runs where it fits best.
Explore the Vision
Discover this technology through five complementary perspectives — from technical architecture to partnership outcomes. Each layer reveals a different aspect of how this innovation creates value.
What It IS
Technical Vision
The architectural essence: what makes this technology work
A system-on-chip where the CPU and NPU pass inference layers back and forth like skilled craftspeople sharing a workpiece. Layers that suit ternary NPU acceleration run on the NPU; layers that need CPU flexibility stay on the CPU. Hybrid intelligence on a single die.
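The layer hand-off described above can be sketched as a simple placement pass. This is an illustrative sketch only, not the patented method: the `Layer` fields, the `NPU_OPS` set, and the rule "ternary weights plus a supported operator means NPU" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    op: str        # operator type, e.g. "conv", "linear", "softmax"
    ternary: bool  # True if the weights are quantized to {-1, 0, +1}


# Hypothetical set of operator types the NPU is assumed to accelerate.
NPU_OPS = {"conv", "linear"}


def assign_device(layer):
    """Send ternary layers with a supported operator to the NPU, all others to the CPU."""
    if layer.ternary and layer.op in NPU_OPS:
        return "npu"
    return "cpu"


def partition(layers):
    """Group consecutive layers that share a device so each CPU/NPU hand-off is explicit."""
    segments = []
    for layer in layers:
        device = assign_device(layer)
        if segments and segments[-1][0] == device:
            segments[-1][1].append(layer.name)
        else:
            segments.append((device, [layer.name]))
    return segments


# A made-up five-layer model, used only to exercise the placement rule.
model = [
    Layer("conv_in", "conv", ternary=False),
    Layer("res_block1", "conv", ternary=True),
    Layer("res_block2", "conv", ternary=True),
    Layer("attention_softmax", "softmax", ternary=False),
    Layer("proj_out", "linear", ternary=True),
]

plan = partition(model)
for device, names in plan:
    print(device, names)
```

Grouping consecutive layers by device makes the number of hand-offs visible; a real scheduler would presumably also weigh the cost of moving activations between the two units, which this sketch ignores.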
Abstract
This work adapts diffusion-based generative models to ternary quantization, in which each weight takes one of three values (-1, 0, or +1), enabling on-device image and content generation within a small memory footprint.
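To make the idea of ternary quantization concrete, here is a minimal sketch of one way to ternarize a weight vector. The source does not state which scheme it uses; the threshold rule below (0.7 times the mean absolute weight, with a single per-tensor scale) is a commonly used heuristic adopted here as an assumption, and the function names are hypothetical.

```python
def ternarize(weights, threshold_factor=0.7):
    """Map float weights to codes in {-1, 0, +1} plus one float scale.

    Assumed scheme: weights whose magnitude is at or below
    threshold_factor * mean(|w|) become 0; the rest keep only their sign.
    """
    if not weights:
        return [], 0.0
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_factor * mean_abs
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    # The scale is the mean magnitude of the weights that survived the threshold.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


def dequantize(codes, scale):
    """Reconstruct approximate float weights from ternary codes."""
    return [c * scale for c in codes]


weights = [0.9, -0.05, 0.4, -0.8, 0.02, -0.3]
codes, scale = ternarize(weights)
print(codes)                     # ternary codes, one per weight
print(dequantize(codes, scale))  # approximate reconstruction
```

Each code carries one of three values, so it can be stored in 2 bits instead of the 32 bits of a float32 weight, which is where the memory saving comes from. Applying this to a diffusion model would also need to address accuracy across denoising steps, which this sketch does not attempt.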
Related Patents
From the silicon-blueprint visual family
Ternary NPU Compiler Optimization Passes
The ternary chip itself — a complete microarchitecture specification.
Ternary Batch Matrix Multiplication (GEMM)
One execution unit handles both ternary and conventional — switchable precision in a single core.
Ternary Model Compression via Knowledge Distillation
Memory redesigned from the ground up for three-valued data.
Hardware-Aware Ternary Network Architecture Search
Ternary data streams through dataflow hardware — bandwidth-optimal inference.