LightGBM

Original author(s)	Guolin Ke^[1] / Microsoft Research
Developer(s)	Microsoft and LightGBM contributors^[2]
Initial release	2016; 8 years ago (2016)

Stable release	v4.3.0^[3] / January 15, 2024; 5 months ago (2024-01-15)

Repository	github.com/microsoft/LightGBM
Written in	C++, Python, R, C
Operating system	Windows, macOS, Linux
Type	Machine learning, gradient boosting framework
License	MIT License
Website	lightgbm.readthedocs.io

LightGBM, short for Light Gradient-Boosting Machine, is a free and open-source distributed gradient-boosting framework for machine learning, originally developed by Microsoft.^[4]^[5] It is based on decision tree algorithms and used for ranking, classification and other machine learning tasks. The development focus is on performance and scalability.

Overview

The LightGBM framework supports different algorithms including GBT, GBDT, GBRT, GBM, MART^[6]^[7] and RF.^[8] LightGBM has many of XGBoost's advantages, including sparse optimization, parallel training, multiple loss functions, regularization, bagging, and early stopping. A major difference between the two lies in the construction of trees. LightGBM does not grow a tree level-wise — row by row — as most other implementations do.^[9] Instead it grows trees leaf-wise. It chooses the leaf it believes will yield the largest decrease in loss.^{[citation needed]} Besides, LightGBM does not use the widely used sorted-based decision tree learning algorithm, which searches the best split point on sorted feature values,^[10] as XGBoost or other implementations do. Instead, LightGBM implements a highly optimized histogram-based decision tree learning algorithm, which yields great advantages on both efficiency and memory consumption.^[11] The LightGBM algorithm utilizes two novel techniques called Gradient-Based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) which allow the algorithm to run faster while maintaining a high level of accuracy.^[12]

LightGBM works on Linux, Windows, and macOS and supports C++, Python,^[13] R, and C#.^[14] The source code is licensed under MIT License and available on GitHub.^[15]

Gradient-based one-side sampling

Gradient-based one-side sampling (GOSS) is a method that leverages the fact that there is no native weight for data instance in GBDT. Since data instances with different gradients play different roles in the computation of information gain, the instances with larger gradients will contribute more to the information gain. So to retain the accuracy of the information, GOSS keeps the instances with large gradients and randomly drops the instances with small gradients.^[12]

Exclusive feature bundling

Exclusive feature bundling (EFB) is a near-lossless method to reduce the number of effective features. In a sparse feature space many features are nearly exclusive, implying they rarely take nonzero values simultaneously. One-hot encoded features are a perfect example of exclusive features. EFB bundles these features, reducing dimensionality to improve efficiency while maintaining a high level of accuracy. The bundle of exclusive features into a single feature is called an exclusive feature bundle.^[12]

References

External links

Microsoft free and open-source software (FOSS)

Overview

Software

Applications	3D Movie Maker Atom Conference XP Family.Show File Manager Open Live Writer Microsoft PowerToys Terminal Windows Calculator Windows Console Windows Package Manager WorldWide Telescope XML Notepad
Video games	Allegiance
Programming languages	Bosque C# Dafny F# F* GW-BASIC IronPython IronRuby Lean P Power Fx PowerShell Project Verona Q# Small Basic Online TypeScript Visual Basic
Frameworks, development tools	.NET .NET Framework .NET Gadgeteer .NET MAUI .NET Micro Framework AirSim ASP.NET ASP.NET AJAX ASP.NET Core ASP.NET MVC ASP.NET Razor ASP.NET Web Forms Avalonia Babylon.js BitFunnel Blazor C++/WinRT CCF ChakraCore CLR Profiler Dapr DeepSpeed DiskSpd Dryad Dynamic Language Runtime eBPF on Windows Electron Entity Framework Fluent Design System Fluid Framework Infer.NET LightGBM Managed Extensibility Framework Microsoft Automatic Graph Layout Microsoft C++ Standard Library Microsoft Cognitive Toolkit Microsoft Design Language Microsoft Detours Microsoft Enterprise Library Microsoft SEAL mimalloc Mixed Reality Toolkit ML.NET mod_mono Mono MonoDevelop MSBuild MsQuic Neural Network Intelligence npm NuGet OneFuzz Open Management Infrastructure Open Neural Network Exchange Open Service Mesh Open XML SDK Orleans Playwright ProcDump ProcMon Python Tools for Visual Studio R Tools for Visual Studio RecursiveExtractor Roslyn Sandcastle SignalR StyleCop SVNBridge T2 Temporal Prover Text Template Transformation Toolkit TLA+ Toolbox U-Prove vcpkg Virtual File System for Git Visual Studio Code Voldemort VoTT Vowpal Wabbit Windows App SDK Windows Communication Foundation Windows Driver Frameworks KMDF UMDF Windows Forms Windows Presentation Foundation Windows Template Library Windows UI Library WinJS WinObjC WiX XDP for Windows XSP xUnit.net Z3 Theorem Prover
Operating systems	MS-DOS (v1.25, v2.0 & v4.0) Barrelfish SONiC Azure Linux
Other	ChronoZoom Extensible Storage Engine FlexWiki FourQ Gollum Project Mu ReactiveX SILK TLAPS TPM 2.0 Reference Implementation WikiBhasha

Licenses

Forges

Overview

Gradient-based one-side sampling

Exclusive feature bundling

See also

References

Further reading

External links