📟 RUNTIME TELEMETRY
THROUGHPUT
0.0 TPS
Target: 30.0 TPS
PREFILL LATENCY
0 MS
Time to First Token
ACCEPTANCE RATE
-- %
Speculative Hits
📦 VRAM CACHE ALLOCATION
0 MB / 4096 MB (0%)
LEVEL 1 // PREFILL & DECODE

Route user prompt requests through a prefill engine to compute attention tables, then feed activations to a decode engine to satisfy the throughput target.

Target TPS: 30.0
Max VRAM: 4096 MB
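
As a rough illustration of the level goal, the sketch below models the two stages with simple per-token costs and checks the result against the 30.0 TPS target and the 4096 MB VRAM limit. The function name `estimate`, the class `PipelineEstimate`, and every per-token cost are hypothetical values chosen for the example; they are not taken from the simulator.

```python
from dataclasses import dataclass

TARGET_TPS = 30.0    # throughput target from the level brief
MAX_VRAM_MB = 4096   # VRAM limit from the level brief


@dataclass
class PipelineEstimate:
    ttft_ms: float   # prefill latency (time to first token)
    tps: float       # decode throughput in tokens per second
    vram_mb: float   # cache memory held once decoding finishes

    def meets_level_goals(self) -> bool:
        # Both constraints from the level brief must hold.
        return self.tps >= TARGET_TPS and self.vram_mb <= MAX_VRAM_MB


def estimate(prompt_tokens: int, new_tokens: int,
             prefill_ms_per_token: float,
             decode_ms_per_token: float,
             cache_mb_per_token: float) -> PipelineEstimate:
    """Hypothetical cost model: prefill handles the whole prompt once,
    then decode emits one token per step while the cache grows."""
    ttft_ms = prompt_tokens * prefill_ms_per_token
    tps = 1000.0 / decode_ms_per_token
    vram_mb = (prompt_tokens + new_tokens) * cache_mb_per_token
    return PipelineEstimate(ttft_ms, tps, vram_mb)


if __name__ == "__main__":
    result = estimate(prompt_tokens=512, new_tokens=256,
                      prefill_ms_per_token=0.5,
                      decode_ms_per_token=25.0,
                      cache_mb_per_token=0.5)
    print(result, result.meets_level_goals())
```

With these assumed costs, a decode step of 25 ms per token clears the target, while a 50 ms step would fall short regardless of how fast the prefill stage runs.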
ACTIVE SIMULATION GRID (12x8)
STATUS: CONFIGURING
🧱 COMPONENTS TOOLBOX
WIRE CONDUIT
CONDUIT

Connects inference engines and routes token data streams from inputs to outputs.