Search Results
2 results for “quantisation”
Infrastructure
Model Compression
Model compression is a set of techniques that reduce the size, memory footprint, and computational cost of machine learning models while preserving predictive accuracy, enabling deployment on resource-constrained hardware.
6 min readUpdated June 2026
Infrastructure
Quantisation
Quantisation is a model compression technique that reduces the numerical precision of a neural network's weights and activations from high-bit floating-point formats to lower-bit representations, decreasing memory usage and accelerating inference with minimal accuracy loss.
7 min readUpdated May 2026