Srikanth Pagadala

Data Preparation for Gradient Boosting with XGBoost

04 Aug 2016

XGBoost is a popular implementation of Gradient Boosting because of its speed and performance.

Internally, XGBoost models represent all problems as a regression predictive modeling problem that only takes numerical values as input. If your data is in a different form, it must be prepared into the expected format.
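As a rough illustration of preparing non-numeric data, the sketch below turns string class labels into integers. It assumes scikit-learn's `LabelEncoder` is available; the label values are made up for the example and are not taken from any dataset in this post.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical string class labels, used only for illustration.
labels = ["setosa", "versicolor", "virginica", "setosa"]

# Fit the encoder on the labels and map each distinct string to an integer.
# Classes are sorted alphabetically before integers are assigned.
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# The mapping is reversible, so predictions can be turned back into strings.
decoded = encoder.inverse_transform(encoded)

print(encoded)
print(decoded)
```

The integer array is what would be passed to the model as the output variable, and `inverse_transform` recovers the original names from predicted integers.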

This post shows how to prepare your data for gradient boosting with the XGBoost library in Python.

By the end you will know:

  • How to encode string output variables for classification.
  • How to prepare categorical input variables using one hot encoding.
  • How to automatically handle missing data with XGBoost.
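The last two points can be sketched as follows, assuming NumPy and scikit-learn are installed. The color column, the numeric rows, and the "?" marker for missing values are all hypothetical. The sketch stops short of training a model: it only produces arrays in which categories are one hot encoded and missing entries are NaN, which is the value XGBoost treats as missing by default.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical input column (one feature, four rows).
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# One hot encode the column: one binary column per distinct category.
# The encoder returns a sparse matrix by default, so convert it to a
# dense array for inspection.
encoder = OneHotEncoder()
one_hot = encoder.fit_transform(colors).toarray()

# Hypothetical numeric inputs in which "?" marks a missing value.
raw_rows = [["1.5", "?"], ["?", "3.0"]]

# Replace the marker with NaN and convert everything else to float.
numeric = np.array(
    [[np.nan if value == "?" else float(value) for value in row]
     for row in raw_rows]
)

print(encoder.categories_)
print(one_hot)
print(numeric)
```

With several categorical columns, each would be encoded this way and the results concatenated column-wise with the numeric inputs before fitting.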

Source Code

Report
