Why does `categorical_feature` of lightgbm not work?

Bowen Peng

I want to use LightGBM to predict the tradeMoney of a house, but I run into trouble when I specify categorical_feature in lgb.Dataset.
The data types are as follows:

type(train)
pandas.core.frame.DataFrame

train.dtypes
area                  float64
rentType               object
houseFloor             object
totalFloor              int64
houseToward            object
houseDecoration        object
region                 object
plate                  object
buildYear               int64
saleSecHouseNum         int64
subwayStationNum        int64
busStationNum           int64
interSchoolNum          int64
schoolNum               int64
privateSchoolNum        int64
hospitalNum             int64
drugStoreNum            int64

And I use LightGBM to train it as follows:

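import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import KFold
from sklearn.metrics import r2_score

categorical_feats = ['rentType', 'houseFloor', 'houseToward', 'houseDecoration', 'region', 'plate']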
folds = KFold(n_splits=5, shuffle=True, random_state=2333)

oof_lgb = np.zeros(len(train))
predictions_lgb = np.zeros(len(test))
feature_importance_df = pd.DataFrame()

for fold_, (trn_idx, val_idx) in enumerate(folds.split(train.values, target.values)):
    print("fold {}".format(fold_))
    trn_data = lgb.Dataset(train.iloc[trn_idx], label=target.iloc[trn_idx], categorical_feature=categorical_feats)
    val_data = lgb.Dataset(train.iloc[val_idx], label=target.iloc[val_idx], categorical_feature=categorical_feats)

    num_round = 10000
    clf = lgb.train(params, trn_data, num_round, valid_sets = [trn_data, val_data], verbose_eval=500, early_stopping_rounds = 200)

    oof_lgb[val_idx] = clf.predict(train.iloc[val_idx], num_iteration=clf.best_iteration)

    predictions_lgb += clf.predict(test, num_iteration=clf.best_iteration) / folds.n_splits

print("CV Score: {:<8.5f}".format(r2_score(target, oof_lgb)))

But it still raises the following error, even though I have specified categorical_feature:

ValueError: DataFrame.dtypes for data must be int, float or bool. Did not expect the data types in fields rentType, houseFloor, houseToward, houseDecoration, region, plate

And here are the versions I am using:

LightGBM version: 2.2.3
Pandas version: 0.24.2
Python version: 3.6.8
|Anaconda, Inc.| (default, Feb 21 2019, 18:30:04) [MSC v.1916 64 bit (AMD64)]

Could anyone help me, please?

Mischa Lisovyi

The problem is that LightGBM only handles pandas columns whose dtype is category, not object. When it reads the DataFrame, it collects the list of all category columns and encodes them as integers. Nothing is done to object columns, so LightGBM complains when it finds that not all features have been converted to numbers.
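To see which columns would trip this check, you can list the ones that are neither numeric/bool nor category. This is only a rough sketch of the idea, not LightGBM's actual code; the toy DataFrame and its values are made up, and only the column names come from the question.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Toy frame: column names follow the question, values are invented.
df = pd.DataFrame({
    "area": [50.0, 72.5],
    "totalFloor": [6, 18],
    "rentType": ["whole", "shared"],                       # plain strings
    "plate": pd.Series(["BK1", "BK2"], dtype="category"),  # already category
})

# Keep columns that are neither numeric/bool nor pandas category.
bad_cols = [
    c for c in df.columns
    if not is_numeric_dtype(df[c]) and df[c].dtype.name != "category"
]
print(bad_cols)  # ['rentType']
```

Running the same list comprehension on train should return the six columns named in the error message.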

So the solution is to run the following before your CV loop:

for c in categorical_feats:
    train[c] = train[c].astype('category')
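Since the question also calls clf.predict(test), the test frame most likely needs the same conversion, otherwise the object columns there would hit the same dtype check. Below is a minimal sketch on made-up data (only the column names are from the question) that converts both frames and reuses the category levels from train, so that a given string gets the same integer code in both.

```python
import pandas as pd

# Toy data: column names come from the question, values are invented.
train = pd.DataFrame({
    "area": [50.0, 72.5, 30.0],
    "rentType": ["whole", "shared", "whole"],
    "region": ["RG1", "RG2", "RG1"],
})
test = pd.DataFrame({
    "area": [45.0, 60.0],
    "rentType": ["shared", "whole"],
    "region": ["RG2", "RG1"],
})

categorical_feats = ["rentType", "region"]

for c in categorical_feats:
    # Convert train first, then reuse its category levels for test so the
    # string-to-code mapping is identical in both frames.
    train[c] = train[c].astype("category")
    test[c] = pd.Categorical(test[c], categories=train[c].cat.categories)

print(train.dtypes)                          # rentType and region are now category
print(train["rentType"].cat.codes.tolist())  # [1, 0, 1]
print(test["region"].cat.codes.tolist())     # [1, 0]
```

The integer codes printed at the end are what a category column looks like once it is encoded as numbers; with the columns converted, categorical_feature=categorical_feats can be passed to lgb.Dataset as in the question.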
