stackgbm offers a minimalist implementation of model stacking (Wolpert, 1992) for gradient boosted tree models built by xgboost (Chen and Guestrin, 2016), lightgbm (Ke et al., 2017), and catboost (Prokhorenkova et al., 2018).

## Install

First, make sure to install two R packages that are not yet available from CRAN as of June 2020:

Then install stackgbm from GitHub:

    remotes::install_github("nanxstats/stackgbm")

## Design

stackgbm implements a classic two-layer stacking model. The first layer generates “features” produced by gradient boosted trees. The second layer is a logistic regression that uses these features as inputs. The code is derived from our 2nd place solution for a precisionFDA brain cancer machine learning challenge in 2020.
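To make the two-layer idea concrete, here is a minimal base R sketch. It is not the package's code. It assumes the first-layer features are out-of-fold predicted probabilities, which is a common way to do stacking, and it uses logistic regressions on different feature subsets as stand-ins for the xgboost, lightgbm, and catboost models so that it runs without extra dependencies. The data and all names (`subsets`, `fit_base`, `oof`, `meta`) are hypothetical.

```r
# Two-layer stacking sketch in base R (illustrative, not stackgbm's code).
# Assumption: glm() stand-ins replace the three boosted tree learners.
set.seed(42)

# Simulated binary classification data
n <- 300
x <- data.frame(matrix(rnorm(n * 6), nrow = n))
names(x) <- paste0("x", 1:6)
y <- rbinom(n, 1, plogis(x$x1 - x$x3 + 0.5 * x$x5))

# Hypothetical first-layer learners, each using its own feature subset
subsets <- list(m1 = c("x1", "x2"), m2 = c("x3", "x4"), m3 = c("x5", "x6"))

fit_base <- function(cols, x, y) {
  glm(y ~ ., data = cbind(x[, cols, drop = FALSE], y = y), family = binomial())
}

# Layer 1: out-of-fold predictions become the stacking "features"
n_folds <- 5
fold_id <- sample(rep(seq_len(n_folds), length.out = n))
oof <- matrix(
  NA_real_, nrow = n, ncol = length(subsets),
  dimnames = list(NULL, names(subsets))
)

for (k in seq_len(n_folds)) {
  test <- fold_id == k
  for (m in names(subsets)) {
    cols <- subsets[[m]]
    fit <- fit_base(cols, x[!test, , drop = FALSE], y[!test])
    oof[test, m] <- predict(
      fit, newdata = x[test, cols, drop = FALSE], type = "response"
    )
  }
}

# Layer 2: logistic regression on the first-layer features
meta <- glm(y ~ ., data = data.frame(oof, y = y), family = binomial())
stacked_prob <- predict(meta, type = "response")
```

Each observation's feature is predicted by a model that never saw it during fitting, so the second layer is not trained on overfitted first-layer outputs. To score new data under this scheme, one would refit the first-layer models on the full training set and pass their predictions to `meta`.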

To keep the package easy to understand, modify, and extend, we chose to build it with base R, without any special frameworks or dialects. We also expose only the most essential tunable parameters for the boosted tree models: learning rate, maximum depth of a tree, and number of iterations.
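The three libraries name these parameters differently. The sketch below shows one way to translate a single shared setting into each library's vocabulary. The `shared` list, the `param_names` lookup, and the `translate()` helper are hypothetical and are not part of stackgbm. The library-side names reflect our understanding of each library's parameters.

```r
# Hypothetical shared settings (names chosen for illustration only)
shared <- list(learning_rate = 0.05, max_depth = 4, n_iter = 200)

# Each library's own name for the same three settings.
# Note: in xgboost, nrounds is an argument of xgb.train() rather than
# an entry of its params list.
param_names <- list(
  xgboost = c(
    learning_rate = "eta", max_depth = "max_depth", n_iter = "nrounds"
  ),
  lightgbm = c(
    learning_rate = "learning_rate", max_depth = "max_depth",
    n_iter = "num_iterations"
  ),
  catboost = c(
    learning_rate = "learning_rate", max_depth = "depth",
    n_iter = "iterations"
  )
)

# Rename the shared settings using one library's lookup vector
translate <- function(shared, map) {
  setNames(shared[names(map)], unname(map))
}

per_library <- lapply(param_names, translate, shared = shared)
str(per_library$catboost)
```

Keeping the tunable surface to three settings means a lookup like this stays small enough to read at a glance.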