
Kolmogorov-Arnold Networks (KAN)

Kolmogorov-Arnold Networks are a neural network architecture that places learnable activation functions on connections rather than fixed activations on nodes, with the aim of improving accuracy and interpretability.

Last updated July 2026 · Foundations

Kolmogorov-Arnold Networks (KANs) are a neural network architecture proposed in 2024 as an alternative to the multilayer perceptron (MLP). They are inspired by the Kolmogorov-Arnold representation theorem, which states that any continuous multivariate function on a bounded domain can be written as a finite composition of continuous univariate functions and addition. KANs use this idea to restructure where and how nonlinearity enters a network: the central innovation is to place learnable activation functions on the connections between neurons rather than fixed activation functions on the neurons themselves.
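For reference, the theorem is commonly written as below for a continuous function f on the unit cube [0,1]^n. The symbols Φ_q (outer functions) and φ_{q,p} (inner functions) are notation chosen for this sketch; all of them are continuous functions of a single variable.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\qquad \phi_{q,p} : [0,1] \to \mathbb{R}, \quad \Phi_q : \mathbb{R} \to \mathbb{R}.
```

Read as a network, this is a two-layer structure in which every nonlinearity sits on a single-input function and the only multivariate operation is addition. KANs keep that pattern but allow arbitrary widths and depths.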

The key architectural idea

In a conventional MLP, each neuron applies a fixed nonlinear activation function, such as ReLU or sigmoid, and the connections between neurons carry simple numeric weights that are learned during training. KANs invert this arrangement. They have no linear weights in the traditional sense; instead, every connection carries a learnable univariate function, and the neurons simply sum the incoming transformed signals. In the original formulation these edge functions are parametrised as splines, specifically basis splines (B-splines), whose coefficients are the parameters adjusted during training.
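The contrast can be sketched in a few lines of plain Python. This is a minimal illustration, not an implementation from the original paper: the names `mlp_layer`, `kan_layer` and `edges` are hypothetical, and the edge functions are hand-picked stand-ins for functions that would normally be learned.

```python
import math


def relu(z):
    return max(0.0, z)


def mlp_layer(x, weights, biases, activation=relu):
    """MLP: learnable scalar weights on edges, one fixed activation on each node."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def kan_layer(x, edge_functions):
    """KAN: a univariate function on every edge, plain summation on each node.

    edge_functions[j][i] maps input i to its contribution to output j.
    """
    return [
        sum(phi(xi) for phi, xi in zip(row, x))
        for row in edge_functions
    ]


# Hypothetical hand-picked edge functions standing in for learned ones.
edges = [
    [math.sin, lambda t: t * t],    # output 0 = sin(x0) + x1^2
    [lambda t: 2.0 * t, math.exp],  # output 1 = 2*x0 + exp(x1)
]

x = [0.0, 1.0]
print(mlp_layer(x, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))  # [0.0, 0.5]
print(kan_layer(x, edges))  # roughly [1.0, 2.718]
```

In the MLP the only trainable quantities are the numbers in `weights` and `biases`; in the KAN sketch the trainable quantities would be whatever parameters define each function in `edges`.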

A spline can flexibly represent a wide range of one-dimensional shapes, so learning the coefficients of the spline on an edge amounts to learning the shape of the activation on that edge. This gives the network fine-grained control over how each input dimension is transformed on its way through the model, rather than forcing every neuron to share one fixed nonlinearity.
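To make the role of the coefficients concrete, here is a minimal sketch of a single spline-parametrised edge function using the standard Cox-de Boor recursion for B-spline basis functions. The class name `SplineEdge`, the helper `make_knots`, the default range [-1, 1] and the uniform extended knot layout are assumptions made for this example, not details taken from the original paper.

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: value at t of the i-th B-spline basis of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if right_den != 0:
        right = (knots[i + k + 1] - t) / right_den * bspline_basis(i + 1, k - 1, t, knots)
    return left + right


def make_knots(lo, hi, grid, degree):
    """Uniform knots on [lo, hi], extended by `degree` intervals on each side."""
    h = (hi - lo) / grid
    return [lo + (j - degree) * h for j in range(grid + 2 * degree + 1)]


class SplineEdge:
    """Hypothetical edge function: a B-spline whose coefficients are the parameters."""

    def __init__(self, coeffs, lo=-1.0, hi=1.0, degree=3):
        self.coeffs = list(coeffs)
        self.degree = degree
        grid = len(self.coeffs) - degree  # number of grid intervals inside [lo, hi]
        self.knots = make_knots(lo, hi, grid, degree)

    def __call__(self, t):
        return sum(
            c * bspline_basis(i, self.degree, t, self.knots)
            for i, c in enumerate(self.coeffs)
        )


# Same basis, different coefficients, different shapes on [-1, 1):
flat = SplineEdge([1.0] * 8)  # all-ones coefficients give the constant function 1
ident = SplineEdge([-1.4, -1.0, -0.6, -0.2, 0.2, 0.6, 1.0, 1.4])  # reproduces t
print(flat(0.3), ident(0.3))
```

Training a spline-based KAN would adjust the entries of `coeffs` by gradient descent; the basis functions themselves stay fixed for a given grid.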

Claimed advantages

The proponents of KANs report two main benefits over MLPs of comparable capability.

The first is accuracy. On tasks such as data fitting and solving partial differential equations, the original paper reports much smaller KANs matching or exceeding the accuracy of substantially larger MLPs. This suggests a more parameter-efficient way to represent certain functions, particularly those with underlying mathematical or scientific structure.
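A rough way to see what "smaller" means is to count parameters. The sketch below assumes each KAN edge carries one spline with `grid + degree` coefficients and ignores any extra per-edge scaling terms. The function names and the example widths are hypothetical and are not results from any paper.

```python
def mlp_param_count(widths):
    """Weights plus biases for a fully connected MLP with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


def kan_spline_param_count(widths, grid=5, degree=3):
    """Spline coefficients only: one spline with grid + degree coefficients per edge."""
    return sum(n_in * n_out * (grid + degree) for n_in, n_out in zip(widths, widths[1:]))


# Illustrative shapes only.
print(mlp_param_count([2, 100, 100, 1]))  # 10501
print(kan_spline_param_count([2, 5, 1]))  # 120
```

Each KAN edge costs several parameters where an MLP edge costs one, so any parameter saving has to come from the KAN needing far fewer or narrower layers for the same accuracy.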

The second is interpretability. Because the learned functions live on individual edges, they can be visualised and sometimes read off as recognisable mathematical expressions. This makes it possible, in favourable cases, to extract a compact symbolic formula from a trained network, which is appealing in scientific settings where understanding the discovered relationship matters as much as predictive accuracy.
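One simple way to "read off" an expression from a learned edge function is to sample it and compare it against a small library of candidate formulas. The sketch below is a simplified stand-in for that idea, not the original authors' procedure: it fits only an output scale and offset by least squares, and the names `affine_fit_r2`, `best_symbolic_match`, `CANDIDATES` and `learned` are hypothetical.

```python
import math


def affine_fit_r2(xs, ys, f):
    """Least-squares fit of ys by a*f(x) + b; returns (r2, a, b)."""
    fs = [f(x) for x in xs]
    n = len(xs)
    mean_f = sum(fs) / n
    mean_y = sum(ys) / n
    sff = sum((v - mean_f) ** 2 for v in fs)
    sfy = sum((v - mean_f) * (y - mean_y) for v, y in zip(fs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sff == 0 or syy == 0:
        return 0.0, 0.0, mean_y
    a = sfy / sff
    b = mean_y - a * mean_f
    ss_res = sum((y - (a * v + b)) ** 2 for v, y in zip(fs, ys))
    return 1.0 - ss_res / syy, a, b


# Small, hypothetical library of candidate univariate formulas.
CANDIDATES = {
    "x": lambda t: t,
    "x^2": lambda t: t * t,
    "sin": math.sin,
    "exp": math.exp,
}


def best_symbolic_match(edge_fn, lo=-1.0, hi=1.0, samples=101):
    """Sample edge_fn on [lo, hi] and return the best-fitting candidate."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    ys = [edge_fn(x) for x in xs]
    scores = {name: affine_fit_r2(xs, ys, f) for name, f in CANDIDATES.items()}
    best = max(scores, key=lambda name: scores[name][0])
    return best, scores[best]


# Stand-in for a learned edge function.
learned = lambda t: 3.0 * math.sin(t) + 0.5
name, (r2, a, b) = best_symbolic_match(learned)
print(name, round(a, 6), round(b, 6))  # sin 3.0 0.5
```

If every edge in a small network matches some candidate closely, the matched formulas can be composed along the network's connections to give a closed-form expression for the whole model.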

| Property | MLP | KAN |
| --- | --- | --- |
| Activation location | On nodes | On edges |
| Activation type | Fixed | Learnable (splines) |
| Interpretability | Limited | Higher for structured data |
| Best-fit domains | General | Scientific and mathematical functions |

Limitations and status

KANs are a comparatively new and active research area rather than a settled replacement for MLPs. Spline-based edge functions can be slower to train and harder to scale to very large models than the highly optimised matrix operations that MLPs and transformers rely on. Their advantages are most pronounced on scientific and low-dimensional problems, and their usefulness for large-scale language or vision tasks remains under investigation. Researchers have proposed variants that replace B-splines with other basis functions, such as sinusoidal or polynomial forms, to improve efficiency, and the architecture continues to evolve.
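Swapping the basis leaves the overall recipe unchanged: an edge function is still a learnable linear combination of fixed one-dimensional basis functions. The sketch below shows a generic sine basis and a Chebyshev polynomial basis. It is illustrative only and does not reproduce the exact formulation of any published variant; `sine_basis`, `chebyshev_basis` and `make_edge` are hypothetical names.

```python
import math


def sine_basis(t, n):
    """Values sin(1*t), sin(2*t), ..., sin(n*t)."""
    return [math.sin((k + 1) * t) for k in range(n)]


def chebyshev_basis(t, n):
    """Values T_0(t), ..., T_{n-1}(t) via the three-term recurrence."""
    values = [1.0, t]
    while len(values) < n:
        values.append(2.0 * t * values[-1] - values[-2])
    return values[:n]


def make_edge(coeffs, basis):
    """Edge function: a linear combination of basis functions with learnable coeffs."""
    def phi(t):
        return sum(c * b for c, b in zip(coeffs, basis(t, len(coeffs))))
    return phi


sine_edge = make_edge([1.0, 0.0], sine_basis)            # equals sin(t)
cheb_edge = make_edge([0.0, 0.0, 1.0], chebyshev_basis)  # equals T_2(t) = 2t^2 - 1
print(sine_edge(0.3), cheb_edge(0.5))
```

Both bases are evaluated with a handful of elementwise operations rather than a recursive spline evaluation, which is one reason such variants are explored for efficiency.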

References

  1. Liu, Z., et al. (2024). KAN: Kolmogorov-Arnold Networks. arXiv:2404.19756.
  2. DataCamp. (2024). Kolmogorov-Arnold Networks (KANs): A Guide With Implementation. datacamp.com.
  3. Reinhart, E., et al. (2024). SineKAN: Kolmogorov-Arnold Networks using sinusoidal activation functions. Frontiers in Artificial Intelligence.