P048 · Filed

Edge Language Model Inference with Ternary Quantization

Large language models running at the edge: sparse attention, compressed KV cache, local personality.

AU Application: 2023900048

Filing Date: 10 September 2023

Index Number: P048

Figures: 14

Batch / Category: Applications

What It IS

Technical Vision

The architectural essence — what makes this technology work

A large language model that runs entirely on a local device: ternary sparse attention gates 85% of computation, a compressed key-value cache fits in on-chip memory, and responses are personalised to the user. A personal language model that never phones home.
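The listing does not say how the attention gate decides what to skip. As a minimal sketch only, assuming "gating 85% of computation" means that each query attends to roughly the top 15% of key positions, a gated attention step could look like the following. The function name `gated_attention` and the parameter `keep_fraction` are hypothetical and do not come from the patent.

```python
import math
from typing import List, Tuple


def gated_attention(
    query: List[float],
    keys: List[List[float]],
    values: List[List[float]],
    keep_fraction: float = 0.15,  # assumption: keep ~15%, gate off ~85%
) -> Tuple[List[float], List[int]]:
    """Single-query attention that only aggregates the highest-scoring keys."""
    dim = len(query)
    # Scaled dot-product score for every key position.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys
    ]
    # Gate: keep the top-scoring positions, drop the rest.
    keep = max(1, math.ceil(keep_fraction * len(keys)))
    kept = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:keep]

    # Numerically stable softmax over the kept positions only.
    top = max(scores[i] for i in kept)
    exps = {i: math.exp(scores[i] - top) for i in kept}
    total = sum(exps.values())

    # Weighted sum of the kept value vectors.
    out = [0.0] * len(values[0])
    for i in kept:
        weight = exps[i] / total
        for j, v in enumerate(values[i]):
            out[j] += weight * v
    return out, kept


keys = [[float(i)] for i in range(10)]
values = [[float(i)] for i in range(10)]
output, kept_positions = gated_attention([1.0], keys, values)
print(kept_positions, output)
```

This sketch still scores every key before gating, so it only saves the softmax and value aggregation. A real gate would need a cheaper predictor to avoid the full score computation, and the listing does not describe one.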

Abstract

Deployment of large language models to edge devices via ternary quantization, enabling on-device LLM inference with latency under 500 ms.
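The abstract does not specify the quantization scheme. As an illustration only, the sketch below uses absmean scaling, one common way to map weights to {-1, 0, +1} with a single floating-point scale. That scheme is an assumption, not the patent's stated method, and the helper names `ternarize` and `ternary_dot` are hypothetical.

```python
from typing import List, Tuple


def ternarize(weights: List[float], eps: float = 1e-8) -> Tuple[List[int], float]:
    """Quantize weights to {-1, 0, +1} plus one scale (absmean scheme, assumed)."""
    # The scale is the mean absolute weight; eps guards against all-zero input.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def ternary_dot(codes: List[int], scale: float, activations: List[float]) -> float:
    """Dot product with ternary weights: add, subtract, or skip per element."""
    acc = 0.0
    for code, x in zip(codes, activations):
        if code == 1:
            acc += x
        elif code == -1:
            acc -= x
        # code == 0 contributes nothing, so that element is skipped.
    return acc * scale


codes, scale = ternarize([0.9, -0.05, -1.2, 0.3])
print(codes, scale)
print(ternary_dot(codes, scale, [1.0, 2.0, 3.0, 4.0]))
```

Because every weight is -1, 0, or +1, the inner loop needs no multiplications, and zero weights are skipped entirely. That is the usual reason ternary weights are attractive on constrained hardware.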

Visual Family: edge-bloom

Technology Domains

Vertical Applications (22)

Related Patents

From the edge-bloom visual family

Cybersecurity Threat Detection via Ternary Networks

Generative AI runs on edge — diffusion models in your pocket.

Vertical Applications

Sensor Fusion and Multi-Modal Inference at Edge

Documents read, classified, and attested entirely on-device — no cloud, tamper-evident.

Vertical Applications

Home Inference Devices and Privacy-Preserving Inference

The always-on home AI — three-state command routing with safety constraints and local attestation.

Vertical Applications

Ternary Inference for Medical Image Analysis

Privacy-preserving collaborative training across the planet — federated ternary learning.

Vertical Applications