XGBoost & LightGBM - Gradient Boosting for Tabular Data

XGBoost (eXtreme Gradient Boosting) and LightGBM (Light Gradient Boosting Machine) are the de facto standard libraries for machine learning on tabular/structured data. They consistently win Kaggle competitions and are widely used in industry for their speed, accuracy, and robustness.

When to Use

  • Classification or regression on tabular data (CSVs, databases, spreadsheets).
  • Kaggle competitions or data science competitions on structured data.
  • Feature importance analysis and feature selection.
  • Handling missing values automatically (no need to impute).
  • Working with imbalanced datasets (built-in class weighting).
  • Need for fast training on large datasets (millions of rows).
  • Hyperparameter tuning with cross-validation.
  • Ranking tasks (learning-to-rank algorithms).
  • When you need interpretable feature importances.
  • Production ML systems requiring fast inference on tabular data.

Reference Documentation

XGBoost Official: https://xgboost.readthedocs.io/
XGBoost GitHub: https://github.com/dmlc/xgboost
LightGBM Official: https://lightgbm.readthedocs.io/
LightGBM GitHub: https://github.com/microsoft/LightGBM
Search patterns: xgboost.XGBClassifier, lightgbm.LGBMRegressor, xgboost.train, lightgbm.cv

Core Principles

Gradient Boosting Trees

Both libraries build an ensemble of decision trees sequentially, where each new tree corrects errors from previous trees. This creates highly accurate models that capture complex non-linear patterns.
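The mechanism can be sketched in a few lines with plain scikit-learn regression trees — an illustrative toy for squared-error loss on synthetic data, not what XGBoost or LightGBM actually run internally:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=500)

learning_rate = 0.1
pred = np.full_like(y, y.mean())   # start from a constant prediction
trees = []
for _ in range(100):
    residual = y - pred                          # errors of the ensemble so far
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    pred += learning_rate * tree.predict(X)      # each new tree corrects the rest
    trees.append(tree)

rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
```

For squared error the negative gradient is exactly the residual; the real libraries generalize this to arbitrary differentiable losses and add second-order information and regularization.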

Speed vs Accuracy Trade-offs

XGBoost: Slower but often slightly more accurate. Better for smaller datasets (<100k rows).
LightGBM: Faster, especially on large datasets (millions of rows). Uses histogram-based learning.

Regularization

Both include L1/L2 regularization (alpha, lambda parameters) to prevent overfitting. This is crucial when you have many features.

Handling Categorical Features

LightGBM has native categorical feature support. XGBoost traditionally requires encoding (label encoding or one-hot), though recent versions (>= 1.5) add experimental native support via enable_categorical=True.

Quick Reference

Installation

# Install both
pip install xgboost lightgbm

# GPU support: recent XGBoost Linux wheels already include CUDA support.
# LightGBM needs a GPU-enabled build (there is no lightgbm[gpu] pip extra), e.g.:
pip install lightgbm --config-settings=cmake.define.USE_GPU=ON

Standard Imports

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, mean_squared_error

# XGBoost
import xgboost as xgb
from xgboost import XGBClassifier, XGBRegressor

# LightGBM
import lightgbm as lgb
from lightgbm import LGBMClassifier, LGBMRegressor

Basic Pattern - Classification with XGBoost

from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# 1. Prepare data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 2. Create and train model
model = XGBClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=6,
    random_state=42
)
model.fit(X_train, y_train)

# 3. Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

Basic Pattern - Regression with LightGBM

from lightgbm import LGBMRegressor

# 1. Create model
model = LGBMRegressor(
    n_estimators=100,
    learning_rate=0.1,
    num_leaves=31,
    random_state=42
)

# 2. Train
model.fit(X_train, y_train)

# 3. Predict
y_pred = model.predict(X_test)
rmse = mean_squared_error(y_test, y_pred) ** 0.5  # squared=False was removed in scikit-learn 1.6
print(f"RMSE: {rmse:.4f}")

Critical Rules

✅ DO

  • Use Early Stopping - Always use early stopping with a validation set to prevent overfitting and save training time.
  • Start with Defaults - Both libraries have excellent default parameters. Start there before tuning.
  • Monitor Training - Use eval_set parameter to track validation metrics during training.
  • Handle Imbalance - For imbalanced classes, use scale_pos_weight (XGBoost) or class_weight (LightGBM).
  • Feature Engineering - Create interaction features, polynomial features, aggregations - boosting excels with rich feature sets.
  • Use Native API for Advanced Control - For complex tasks, use xgb.train() or lgb.train() instead of sklearn wrappers.
  • Save Models Properly - Use .save_model() and .load_model() methods, not pickle (more robust).
  • Check Feature Importance - Always examine feature importances to understand your model and detect data leakage.

❌ DON'T

  • Don't Ignore the Target Distribution - Trees need no feature scaling, but for regression a wide-range or heavily skewed target often benefits from a log-transform (invert with np.expm1 at prediction time).
  • Don't Ignore Tree Depth - max_depth (XGBoost) or num_leaves (LightGBM) are critical. Too deep = overfit.
  • Don't Use Default Learning Rate for Large Datasets - Reduce learning_rate to 0.01-0.05 for datasets >1M rows.
  • Don't Mix Up Parameters - XGBoost uses max_depth, LightGBM uses num_leaves. They're different!
  • Don't One-Hot Encode for LightGBM - Use categorical_feature parameter instead for better performance.
  • Don't Skip Cross-Validation - Always CV before trusting a single train/test split.

Anti-Patterns (NEVER)

# ❌ BAD: Training without validation set or early stopping
model = XGBClassifier(n_estimators=1000)
model.fit(X_train, y_train)  # Will likely overfit

# ✅ GOOD: Use early stopping with validation
model = XGBClassifier(n_estimators=1000, early_stopping_rounds=10)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    verbose=False
)

# ❌ BAD: One-hot encoding categorical features for LightGBM
X_encoded = pd.get_dummies(X)  # Creates many sparse columns
model = LGBMClassifier()
model.fit(X_encoded, y)

# ✅ GOOD: Use categorical_feature parameter
model = LGBMClassifier()
model.fit(
    X, y,
    categorical_feature=['category_col1', 'category_col2']
)

# ❌ BAD: Ignoring class imbalance
model = XGBClassifier()
model.fit(X_train, y_train)  # Majority class dominates

# ✅ GOOD: Handle imbalance with scale_pos_weight
scale_pos_weight = (y_train == 0).sum() / (y_train == 1).sum()
model = XGBClassifier(scale_pos_weight=scale_pos_weight)
model.fit(X_train, y_train)

XGBoost Fundamentals

Scikit-learn Style API

from xgboost import XGBClassifier
import numpy as np

# Binary classification
model = XGBClassifier(
    n_estimators=100,        # Number of trees
    max_depth=6,             # Maximum tree depth
    learning_rate=0.1,       # Step size shrinkage (eta)
    subsample=0.8,           # Row sampling ratio
    colsample_bytree=0.8,    # Column sampling ratio
    early_stopping_rounds=10,  # XGBoost >= 1.6: pass in the constructor, not fit()
    random_state=42
)

# Train with early stopping (monitored on eval_set)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    verbose=True
)

# Get best iteration
print(f"Best iteration: {model.best_iteration}")
print(f"Best score: {model.best_score}")

Native XGBoost API (More Control)

import xgboost as xgb

# 1. Create DMatrix (XGBoost's internal data structure)
dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)

# 2. Set parameters
params = {
    'objective': 'binary:logistic',  # or 'reg:squarederror' for regression
    'max_depth': 6,
    'eta': 0.1,                      # learning_rate
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'eval_metric': 'auc',
    'seed': 42
}

# 3. Train with cross-validation monitoring
evals = [(dtrain, 'train'), (dval, 'val')]
model = xgb.train(
    params,
    dtrain,
    num_boost_round=1000,
    evals=evals,
    early_stopping_rounds=10,
    verbose_eval=50
)

# 4. Predict
dtest = xgb.DMatrix(X_test)
y_pred_proba = model.predict(dtest)
y_pred = (y_pred_proba > 0.5).astype(int)

Cross-Validation

import xgboost as xgb

# Prepare data
dtrain = xgb.DMatrix(X_train, label=y_train)

# Parameters
params = {
    'objective': 'binary:logistic',
    'max_depth': 6,
    'eta': 0.1,
    'eval_metric': 'auc'
}

# Run CV
cv_results = xgb.cv(
    params,
    dtrain,
    num_boost_round=1000,
    nfold=5,
    stratified=True,
    early_stopping_rounds=10,
    seed=42,
    verbose_eval=50
)

# Best iteration
print(f"Best iteration: {cv_results.shape[0]}")
print(f"Best score: {cv_results['test-auc-mean'].max():.4f}")

LightGBM Fundamentals

Scikit-learn Style API

import lightgbm as lgb
from lightgbm import LGBMClassifier

# Binary classification
model = LGBMClassifier(
    n_estimators=100,
    num_leaves=31,           # LightGBM uses leaves, not depth
    learning_rate=0.1,
    colsample_bytree=0.8,    # alias: feature_fraction
    subsample=0.8,           # alias: bagging_fraction
    subsample_freq=5,        # alias: bagging_freq
    random_state=42
)

# Train with early stopping
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    eval_metric='auc',
    callbacks=[lgb.early_stopping(stopping_rounds=10)]
)

print(f"Best iteration: {model.best_iteration_}")
print(f"Best score: {model.best_score_}")

Native LightGBM API

import lightgbm as lgb

# 1. Create Dataset
train_data = lgb.Dataset(X_train, label=y_train)
val_data = lgb.Dataset(X_val, label=y_val, reference=train_data)

# 2. Parameters
params = {
    'objective': 'binary',
    'metric': 'auc',
    'num_leaves': 31,
    'learning_rate': 0.1,
    'feature_fraction': 0.8,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'verbose': -1
}

# 3. Train
model = lgb.train(
    params,
    train_data,
    num_boost_round=1000,
    valid_sets=[train_data, val_data],
    valid_names=['train', 'val'],
    callbacks=[lgb.early_stopping(stopping_rounds=10)]
)

# 4. Predict
y_pred_proba = model.predict(X_test)
y_pred = (y_pred_proba > 0.5).astype(int)

Categorical Features (LightGBM's Superpower)

import lightgbm as lgb
import pandas as pd

# Assume 'category' and 'group' are categorical columns
# DO NOT one-hot encode them!

# Method 1: Specify by name
model = lgb.LGBMClassifier()
model.fit(
    X_train, y_train,
    categorical_feature=['category', 'group']
)

# Method 2: Specify by index
model.fit(
    X_train, y_train,
    categorical_feature=[2, 5]  # Indices of categorical columns
)

# Method 3: Convert to category dtype (automatic detection)
X_train['category'] = X_train['category'].astype('category')
X_train['group'] = X_train['group'].astype('category')
model.fit(X_train, y_train)  # Automatically detects

Hyperparameter Tuning

Key Parameters to Tune

Learning Rate (learning_rate or eta)

  • Lower = more accurate but slower
  • Start: 0.1, then try 0.05, 0.01
  • Lower learning_rate requires more n_estimators

Tree Complexity

  • XGBoost: max_depth (3-10)
  • LightGBM: num_leaves (20-100)
  • Higher = more complex, risk of overfit

Sampling Ratios

  • subsample / bagging_fraction: 0.5-1.0
  • colsample_bytree / feature_fraction: 0.5-1.0
  • Lower values add regularization

Regularization

  • reg_alpha (L1): 0-10
  • reg_lambda (L2): 0-10
  • Higher values prevent overfit

Grid Search with Cross-Validation

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Define parameter grid
param_grid = {
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.05, 0.1],
    'n_estimators': [100, 200, 300],
    'subsample': [0.8, 1.0],
    'colsample_bytree': [0.8, 1.0]
}

# Grid search
model = XGBClassifier(random_state=42)
grid_search = GridSearchCV(
    model,
    param_grid,
    cv=5,
    scoring='roc_auc',
    n_jobs=-1,
    verbose=1
)
grid_search.fit(X_train, y_train)

# Best parameters
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_:.4f}")

# Use best model
best_model = grid_search.best_estimator_

Optuna for Advanced Tuning

import optuna
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

def objective(trial):
    """Optuna objective function."""
    params = {
        'max_depth': trial.suggest_int('max_depth', 3, 10),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'n_estimators': trial.suggest_int('n_estimators', 50, 300),
        'subsample': trial.suggest_float('subsample', 0.5, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0),
        'reg_alpha': trial.suggest_float('reg_alpha', 0.0, 10.0),
        'reg_lambda': trial.suggest_float('reg_lambda', 0.0, 10.0),
    }
    
    model = XGBClassifier(**params, random_state=42)
    score = cross_val_score(model, X_train, y_train, cv=5, scoring='roc_auc').mean()
    return score

# Run optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

print(f"Best value: {study.best_value:.4f}")
print(f"Best params: {study.best_params}")

Feature Importance and Interpretability

Feature Importance

import matplotlib.pyplot as plt
from xgboost import XGBClassifier

# Train model
model = XGBClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Get feature importance
importance = model.feature_importances_
feature_names = X_train.columns

# Sort by importance
indices = importance.argsort()[::-1]

# Plot
plt.figure(figsize=(10, 6))
plt.bar(range(len(importance)), importance[indices])
plt.xticks(range(len(importance)), feature_names[indices], rotation=45, ha='right')
plt.xlabel('Features')
plt.ylabel('Importance')
plt.title('Feature Importance')
plt.tight_layout()
plt.show()

# Print top 10
print("Top 10 features:")
for i in range(min(10, len(importance))):
    print(f"{feature_names[indices[i]]}: {importance[indices[i]]:.4f}")

SHAP Values (Advanced Interpretability)

import shap
from xgboost import XGBClassifier

# Train model
model = XGBClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Create SHAP explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Summary plot
shap.summary_plot(shap_values, X_test, plot_type="bar")

# Force plot for single prediction
shap.force_plot(
    explainer.expected_value,
    shap_values[0],
    X_test.iloc[0]
)

Practical Workflows

1. Kaggle-Style Competition Pipeline

import pandas as pd
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

def kaggle_pipeline(X, y, X_test):
    """Complete Kaggle competition pipeline."""
    
    # 1. Cross-validation setup
    n_folds = 5
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=42)
    
    # 2. Store predictions
    oof_predictions = np.zeros(len(X))
    test_predictions = np.zeros(len(X_test))
    
    # 3. Train on each fold
    for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
        print(f"\nFold {fold + 1}/{n_folds}")
        
        X_train, X_val = X.iloc[train_idx], X.iloc[val_idx]
        y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]
        
        # Train model
        model = XGBClassifier(
            n_estimators=1000,
            max_depth=6,
            learning_rate=0.05,
            subsample=0.8,
            colsample_bytree=0.8,
            early_stopping_rounds=50,  # constructor param in XGBoost >= 1.6
            random_state=42
        )
        
        model.fit(
            X_train, y_train,
            eval_set=[(X_val, y_val)],
            verbose=False
        )
        
        # Predict validation set
        oof_predictions[val_idx] = model.predict_proba(X_val)[:, 1]
        
        # Predict test set
        test_predictions += model.predict_proba(X_test)[:, 1] / n_folds
    
    # 4. Calculate OOF score
    oof_score = roc_auc_score(y, oof_predictions)
    print(f"\nOOF AUC: {oof_score:.4f}")
    
    return oof_predictions, test_predictions

# Usage
# oof_preds, test_preds = kaggle_pipeline(X_train, y_train, X_test)

2. Imbalanced Classification

from xgboost import XGBClassifier
from sklearn.utils.class_weight import compute_sample_weight

def train_imbalanced_classifier(X_train, y_train, X_val, y_val):
    """Handle imbalanced datasets."""
    
    # Calculate scale_pos_weight
    n_pos = (y_train == 1).sum()
    n_neg = (y_train == 0).sum()
    scale_pos_weight = n_neg / n_pos
    
    print(f"Class distribution: {n_neg} negative, {n_pos} positive")
    print(f"Scale pos weight: {scale_pos_weight:.2f}")
    
    # Method 1: scale_pos_weight parameter
    model = XGBClassifier(
        n_estimators=100,
        max_depth=5,
        scale_pos_weight=scale_pos_weight,
        early_stopping_rounds=10,  # constructor param in XGBoost >= 1.6
        random_state=42
    )
    
    model.fit(
        X_train, y_train,
        eval_set=[(X_val, y_val)],
        verbose=False
    )
    
    return model

# Alternative: per-sample weights instead of scale_pos_weight
sample_weights = compute_sample_weight('balanced', y_train)
model = XGBClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train, sample_weight=sample_weights)

3. Multi-Class Classification

import lightgbm as lgb
from lightgbm import LGBMClassifier
from sklearn.metrics import classification_report

def multiclass_pipeline(X_train, y_train, X_val, y_val):
    """Multi-class classification with LightGBM."""
    
    # Train model
    model = LGBMClassifier(
        n_estimators=200,
        num_leaves=31,
        learning_rate=0.05,
        random_state=42
    )
    
    model.fit(
        X_train, y_train,
        eval_set=[(X_val, y_val)],
        eval_metric='multi_logloss',
        callbacks=[lgb.early_stopping(stopping_rounds=20)]
    )
    
    # Predict
    y_pred = model.predict(X_val)
    y_pred_proba = model.predict_proba(X_val)
    
    # Evaluate
    print(classification_report(y_val, y_pred))
    
    return model, y_pred, y_pred_proba

# Usage
# model, preds, proba = multiclass_pipeline(X_train, y_train, X_val, y_val)

4. Time Series with Boosting

import pandas as pd
from xgboost import XGBRegressor

def time_series_features(df, target_col, date_col):
    """Create time-based features."""
    df = df.copy()
    df[date_col] = pd.to_datetime(df[date_col])
    
    # Time features
    df['year'] = df[date_col].dt.year
    df['month'] = df[date_col].dt.month
    df['day'] = df[date_col].dt.day
    df['dayofweek'] = df[date_col].dt.dayofweek
    df['quarter'] = df[date_col].dt.quarter
    
    # Lag features
    for lag in [1, 7, 30]:
        df[f'lag_{lag}'] = df[target_col].shift(lag)
    
    # Rolling statistics
    for window in [7, 30]:
        df[f'rolling_mean_{window}'] = df[target_col].rolling(window).mean()
        df[f'rolling_std_{window}'] = df[target_col].rolling(window).std()
    
    return df.dropna()

def train_time_series_model(df, target_col, feature_cols):
    """Train XGBoost on time series."""
    
    # Split by time (no shuffle!)
    split_idx = int(0.8 * len(df))
    train = df.iloc[:split_idx]
    test = df.iloc[split_idx:]
    
    X_train = train[feature_cols]
    y_train = train[target_col]
    X_test = test[feature_cols]
    y_test = test[target_col]
    
    # Train
    model = XGBRegressor(
        n_estimators=200,
        max_depth=5,
        learning_rate=0.05,
        early_stopping_rounds=20,  # constructor param in XGBoost >= 1.6
        random_state=42
    )
    
    model.fit(
        X_train, y_train,
        eval_set=[(X_test, y_test)],
        verbose=False
    )
    
    # Predict
    y_pred = model.predict(X_test)
    
    return model, y_pred

# Usage
# df = time_series_features(df, 'sales', 'date')
# model, predictions = train_time_series_model(df, 'sales', feature_cols)

5. Model Stacking (Ensemble)

from sklearn.model_selection import cross_val_predict
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from sklearn.linear_model import LogisticRegression

def create_stacked_model(X_train, y_train, X_test):
    """Stack XGBoost and LightGBM with meta-learner."""
    
    # Base models
    xgb_model = XGBClassifier(n_estimators=100, random_state=42)
    lgb_model = LGBMClassifier(n_estimators=100, random_state=42)
    
    # Generate meta-features via cross-validation
    xgb_train_preds = cross_val_predict(
        xgb_model, X_train, y_train, cv=5, method='predict_proba'
    )[:, 1]
    
    lgb_train_preds = cross_val_predict(
        lgb_model, X_train, y_train, cv=5, method='predict_proba'
    )[:, 1]
    
    # Train base models on full training set
    xgb_model.fit(X_train, y_train)
    lgb_model.fit(X_train, y_train)
    
    # Get test predictions from base models
    xgb_test_preds = xgb_model.predict_proba(X_test)[:, 1]
    lgb_test_preds = lgb_model.predict_proba(X_test)[:, 1]
    
    # Create meta-features
    meta_X_train = np.column_stack([xgb_train_preds, lgb_train_preds])
    meta_X_test = np.column_stack([xgb_test_preds, lgb_test_preds])
    
    # Train meta-model
    meta_model = LogisticRegression()
    meta_model.fit(meta_X_train, y_train)
    
    # Final predictions
    final_preds = meta_model.predict_proba(meta_X_test)[:, 1]
    
    return final_preds

# Usage
# stacked_predictions = create_stacked_model(X_train, y_train, X_test)

Performance Optimization

GPU Acceleration

# XGBoost with GPU
from xgboost import XGBClassifier

model = XGBClassifier(
    tree_method='hist',
    device='cuda',           # XGBoost >= 2.0 (older versions used tree_method='gpu_hist')
    n_estimators=100
)
model.fit(X_train, y_train)

# LightGBM with GPU
from lightgbm import LGBMClassifier

model = LGBMClassifier(
    device='gpu',
    gpu_platform_id=0,
    gpu_device_id=0,
    n_estimators=100
)
model.fit(X_train, y_train)

Memory Optimization

import lightgbm as lgb

# Use float32 instead of float64
X_train = X_train.astype('float32')

# For very large datasets, use LightGBM's Dataset
train_data = lgb.Dataset(
    X_train,
    label=y_train,
    free_raw_data=False  # Keep data in memory if you'll reuse it
)

# Use histogram-based approach (LightGBM is already optimized for this)
params = {
    'max_bin': 255,  # Reduce for less memory, increase for more accuracy
    'num_leaves': 31,
    'learning_rate': 0.05
}

Parallel Training

from xgboost import XGBClassifier

# Use all CPU cores
model = XGBClassifier(
    n_estimators=100,
    n_jobs=-1,  # Use all cores
    random_state=42
)
model.fit(X_train, y_train)

# Control number of threads
model = XGBClassifier(
    n_estimators=100,
    n_jobs=4,  # Use 4 cores
    random_state=42
)

Common Pitfalls and Solutions

The "Overfitting on Validation Set" Problem

When you tune hyperparameters based on validation performance, you're indirectly overfitting to the validation set.

# ❌ Problem: Tuning on same validation set repeatedly
# This leads to overly optimistic performance estimates

# ✅ Solution: Use nested cross-validation
from sklearn.model_selection import cross_val_score, GridSearchCV
from xgboost import XGBClassifier

# Outer loop: performance estimation
# Inner loop: hyperparameter tuning
param_grid = {'max_depth': [3, 5, 7], 'learning_rate': [0.01, 0.1]}
model = XGBClassifier()
grid_search = GridSearchCV(model, param_grid, cv=3)  # Inner CV
outer_scores = cross_val_score(grid_search, X, y, cv=5)  # Outer CV
print(f"Unbiased performance: {outer_scores.mean():.4f}")

The "Categorical Encoding" Dilemma

XGBoost has no mature native categorical handling (recent versions offer experimental support via enable_categorical=True), while LightGBM handles categorical features natively.

# For XGBoost: use label encoding, NOT one-hot
import pandas as pd
import lightgbm as lgb
from sklearn.preprocessing import LabelEncoder

# ❌ BAD for XGBoost: One-hot encoding creates too many sparse features
X_encoded = pd.get_dummies(X, columns=['category'])

# ✅ GOOD for XGBoost: Label encoding
le = LabelEncoder()
X['category_encoded'] = le.fit_transform(X['category'])

# ✅ BEST: Use LightGBM with native categorical support
model = lgb.LGBMClassifier()
model.fit(X, y, categorical_feature=['category'])

The "Learning Rate vs Trees" Trade-off

Lower learning rate needs more trees but gives better results.

# ❌ Problem: Too few trees with low learning rate
model = XGBClassifier(n_estimators=100, learning_rate=0.01)
# Model won't converge

# ✅ Solution: Use early stopping to find the optimal number
model = XGBClassifier(n_estimators=5000, learning_rate=0.01, early_stopping_rounds=50)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)]
)
# Stops when the validation score stops improving

The "max_depth vs num_leaves" Confusion

XGBoost uses max_depth, LightGBM uses num_leaves. They're related but different!

# XGBoost: max_depth controls tree depth
model_xgb = XGBClassifier(max_depth=6)  # Tree can have up to 2^6 = 64 leaves

# LightGBM: num_leaves controls number of leaves directly
model_lgb = LGBMClassifier(num_leaves=31)  # At most 31 leaves, at any depth

# ⚠️ Rule of thumb: keep num_leaves < 2^max_depth
# LightGBM grows trees leaf-wise (faster, often more accurate, easier to overfit)
# XGBoost grows trees level-wise by default (more conservative)

The "Data Leakage" Detection

Feature importance can reveal data leakage.

# ✅ Always check feature importance
model.fit(X_train, y_train)
importance = pd.DataFrame({
    'feature': X_train.columns,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)

print(importance.head(10))

# 🚨 Red flags for data leakage:
# 1. One feature has >>90% importance (suspicious)
# 2. ID columns have high importance (leakage!)
# 3. Target-derived features (leakage!)
# 4. Future information in time series (leakage!)

XGBoost and LightGBM have revolutionized machine learning on tabular data. Their combination of speed, accuracy, and interpretability makes them the go-to choice for structured data problems. Master these libraries, and you'll have a powerful tool for the vast majority of real-world ML tasks.
