Gradient Boosting
A machine learning ensemble technique that builds predictive models sequentially, where each new model corrects the errors of its predecessors using gradient descent optimisation.
Gradient boosting is a supervised machine learning technique that combines many weak prediction models, usually shallow decision trees, into a single strong predictor. Each successive model is trained to correct the residual errors made by the ensemble built so far, with the corrections guided by the gradient of a chosen loss function. The method was introduced by statistician Jerome H. Friedman in a 1999 technical report, formally published in 2001, and has since become one of the most widely used algorithms for tabular data in both industry and competitive machine learning.
How gradient boosting works
A gradient boosting model is constructed in additive stages. The procedure begins with a simple constant prediction, such as the mean of the target variable. At each iteration, the algorithm computes the negative gradient of the loss function with respect to the current ensemble's predictions; this gradient acts as a pseudo-residual that captures where the model is currently wrong. A new weak learner — typically a regression tree of limited depth — is then fitted to those pseudo-residuals and added to the ensemble with a shrinkage factor known as the learning rate. The process is repeated for hundreds or thousands of iterations until further additions stop improving validation performance.
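The stages described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: it assumes squared-error loss, a single numeric feature, and a depth-1 tree (a "stump") as the weak learner, and the names `fit_stump`, `fit_gradient_boosting`, `predict`, and `mse` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None  # (squared error, threshold, left value, right value)
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # cannot split between identical feature values
        left = [residuals[i] for i in order[:k]]
        right = [residuals[i] for i in order[k:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, (lo + hi) / 2, left_mean, right_mean)
    if best is None:  # no valid split: fall back to the mean residual
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gradient_boosting(xs, ys, n_rounds=100, learning_rate=0.1):
    """Return (initial prediction, learning rate, stumps), built stage by stage."""
    init = sum(ys) / len(ys)  # constant starting prediction: the target mean
    preds = [init] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - F)^2 the negative gradient w.r.t. F is y - F.
        pseudo_residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, pseudo_residuals)
        stumps.append(stump)
        # Add the new learner, shrunk by the learning rate.
        preds = [
            p + learning_rate * predict_stump(stump, x)
            for p, x in zip(preds, xs)
        ]
    return init, learning_rate, stumps


def predict(model, x):
    init, learning_rate, stumps = model
    return init + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

Calling `fit_gradient_boosting(xs, ys, n_rounds=200)` on a small dataset and comparing `mse` against a model with fewer rounds shows the training error falling as stages are added. Production libraries replace the stump with deeper trees over many features and add regularisation, but the loop has the same shape.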
Because the procedure only requires the gradient of a differentiable loss, the same framework covers regression with squared-error or Huber loss, binary classification with logistic loss, multiclass classification with softmax (cross-entropy) loss, ranking with LambdaRank-style objectives, and survival analysis with the Cox partial likelihood.
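To make the role of the loss concrete, the sketch below computes the pseudo-residual (the negative gradient of the loss with respect to the current prediction) for three of the losses mentioned. It is illustrative only: the function names are hypothetical, the Huber threshold `delta` is an assumed parameter, and the logistic case assumes labels in {0, 1} with the model predicting a raw log-odds score.

```python
import math


def sigmoid(f):
    """Numerically stable logistic function."""
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)


def squared_error_pseudo_residual(y, f):
    """Loss 0.5 * (y - f)**2: the negative gradient is the plain residual."""
    return y - f


def huber_pseudo_residual(y, f, delta=1.0):
    """Huber loss: the residual if |residual| <= delta, else clipped to +/-delta."""
    r = y - f
    return r if abs(r) <= delta else math.copysign(delta, r)


def logistic_pseudo_residual(y, f):
    """Binary log loss with p = sigmoid(f): the negative gradient is y - p."""
    return y - sigmoid(f)
```

Swapping one of these functions into the boosting loop changes the problem being solved without changing the rest of the procedure. The Huber case shows why that loss is less sensitive to outliers: a large residual contributes at most `delta` to the target the next tree is fitted to.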
Popular implementations
Three open-source libraries dominate practical use. XGBoost, released in 2014, introduced regularised tree objectives, sparse-aware split finding, and efficient out-of-core training, and quickly became the default tool on platforms such as Kaggle. LightGBM, developed by Microsoft, uses histogram-based splitting and leaf-wise tree growth, often delivering faster training and lower memory use than XGBoost on large datasets. LightGBM 4.6.0, released in February 2025, added improved CUDA acceleration, Apple Silicon support, and better distributed training. CatBoost, from Yandex, handles categorical features natively using ordered target statistics and is known for strong out-of-the-box performance with minimal tuning.
| Library | First release | Tree growth | Categorical handling |
| --- | --- | --- | --- |
| XGBoost | 2014 | Level-wise (depth-wise) | One-hot or target encoding |
| LightGBM | 2016 | Leaf-wise (best-first) | Native via histograms |
| CatBoost | 2017 | Symmetric oblivious | Native ordered encoding |
Applications
Gradient boosting is a common default starting point for tabular prediction problems, including credit scoring, customer churn, click-through rate prediction, fraud detection, insurance pricing, demand forecasting, and clinical risk scoring. It often matches or outperforms deep neural networks on structured data, particularly on small and medium-sized datasets, and its predictions can be explained with feature-importance scores and SHAP values that regulated industries can audit.
Strengths and limitations
The technique handles mixed numerical and categorical inputs, captures non-linear interactions automatically, and is insensitive to monotonic transformations of individual features, because tree splits depend only on the ordering of feature values. It can, however, overfit small datasets if the number of boosting rounds or the tree depth is not controlled, and training is harder to parallelise than for random forests because each tree depends on the ensemble built before it. On unstructured data such as images, text, or audio, modern deep learning architectures generally remain superior.
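One common way to control the number of boosting rounds is early stopping on a held-out validation set. The helper below is a minimal sketch of that idea, assuming the validation loss has already been recorded after each round. The function name and the `patience` parameter are hypothetical and do not correspond to any particular library's API.

```python
def best_round_with_early_stopping(validation_losses, patience=10):
    """Return the 0-based index of the boosting round to keep.

    Scans per-round validation losses in order and stops once `patience`
    consecutive rounds have failed to improve on the best loss seen so far.
    """
    best_index = 0
    best_loss = float("inf")
    for i, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss = loss
            best_index = i
        elif i - best_index >= patience:
            break  # no improvement for `patience` rounds: stop scanning
    return best_index
```

A small `patience` stops training sooner but may miss a later improvement. A larger value spends more rounds before giving up.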
References
- Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics, 29(5), 1189–1232.
- Chen, T., and Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. KDD '16.
- Ke, G. et al. (2017). LightGBM: A Highly Efficient Gradient Boosting Decision Tree. NeurIPS.
- Microsoft (2025). LightGBM 4.6.0 release notes. github.com/microsoft/LightGBM.
- Bank Negara Malaysia (2023). Risk Management in Technology (RMiT) Policy Document. bnm.gov.my.