Logistic Regression Model using Regularization (L1 / L2) Lasso and Ridge

chits

I am trying to build a model and run a grid search; the code is below. The raw data (credit card fraud data) is downloaded from this site: https://www.kaggle.com/mlg-ulb/creditcardfraud

The code starts at standardization, after the data has been read.

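import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.preprocessing import PowerTransformer, StandardScaler

# credit_card_fraud_df is the DataFrame read from the Kaggle CSV (loading code not shown)
standardization = StandardScaler()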
credit_card_fraud_df[['Amount']] = standardization.fit_transform(credit_card_fraud_df[['Amount']])
# Assigning feature variable to X
X = credit_card_fraud_df.drop(['Class'], axis=1)

# Assigning response variable to y
y = credit_card_fraud_df['Class']
# Splitting the data into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, random_state=100)
X_train.head()
power_transformer = PowerTransformer(copy=False)
power_transformer.fit(X_train)                       ## Fit the PT on training data
X_train_pt_df = power_transformer.transform(X_train)    ## Then apply on all data
X_test_pt_df = power_transformer.transform(X_test)
y_train_pt_df = y_train
y_test_pt_df = y_test
train_pt_df = pd.DataFrame(data=X_train_pt_df, columns=X_train.columns.tolist())
# set up cross validation scheme
folds = StratifiedKFold(n_splits = 5, shuffle = True, random_state = 4)

# specify range of hyperparameters
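# 5 values of C from 1e-3 to 1e3; the stray fourth argument (7) was only read as endpoint=True
params = {"C": np.logspace(-3, 3, 5), "penalty": ["l1", "l2"]}  # l1 = lasso, l2 = ridge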

## using Logistic regression for class imbalance
model = LogisticRegression(class_weight='balanced')
grid_search_cv = GridSearchCV(estimator = model, param_grid = params, 
                        scoring= 'roc_auc', 
                        cv = folds, 
                        return_train_score=True, verbose = 1)            
grid_search_cv.fit(X_train_pt_df, y_train_pt_df)
## reviewing the results
cv_results = pd.DataFrame(grid_search_cv.cv_results_)
cv_results

Sample Result:

                     row 0                            row 1
mean_fit_time        0.044332                         0.477965
std_fit_time         0.002040                         0.046651
mean_score_time      0.000000                         0.016745
std_score_time       0.000000                         0.003813
param_C              0.001                            0.001
param_penalty        l1                               l2
params               {'C': 0.001, 'penalty': 'l1'}    {'C': 0.001, 'penalty': 'l2'}
split0_test_score    NaN                              0.485714
split1_test_score    NaN                              0.428571
split2_test_score    NaN                              0.542857
split3_test_score    NaN                              0.485714
split4_test_score    NaN                              0.457143
mean_test_score      NaN                              0.480000
std_test_score       NaN                              0.037904
rank_test_score      6                                5

I do not have any null values in the input data. I do not understand why I am getting NaN values in these columns. Can anyone please help me?

Sergey Bushmanov

You have a problem with the default solver, which is set implicitly here:

model = LogisticRegression(class_weight='balanced')

as the following error message shows:

ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

Also, it is worth reading the docs before defining a param grid:

penalty: {‘l1’, ‘l2’, ‘elasticnet’, ‘none’}, default=’l2’ Used to specify the norm used in the penalization. The ‘newton-cg’, ‘sag’ and ‘lbfgs’ solvers support only l2 penalties. ‘elasticnet’ is only supported by the ‘saga’ solver. If ‘none’ (not supported by the liblinear solver), no regularization is applied.
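One way to respect these solver and penalty constraints inside a grid search is to pass param_grid as a list of dictionaries, so that each solver is only paired with penalties it supports. The following is a sketch, not the only valid layout: the synthetic data, the choice of lbfgs and liblinear, and max_iter=1000 are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, imbalanced stand-in for the real data (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=8, weights=[0.9, 0.1], random_state=0)

c_values = np.logspace(-3, 3, 5)
param_grid = [
    # lbfgs supports only the l2 penalty (or no penalty)
    {"solver": ["lbfgs"], "penalty": ["l2"], "C": c_values},
    # liblinear supports both l1 and l2
    {"solver": ["liblinear"], "penalty": ["l1", "l2"], "C": c_values},
]

folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=4)
search = GridSearchCV(
    LogisticRegression(class_weight="balanced", max_iter=1000),
    param_grid,
    scoring="roc_auc",
    cv=folds,
)
search.fit(X, y)

n_candidates = len(search.cv_results_["params"])
has_nan = bool(np.isnan(search.cv_results_["mean_test_score"]).any())
print(n_candidates, has_nan, search.best_params_)
```

Each dictionary is expanded separately, so no invalid solver and penalty pair is ever fitted and no candidate ends up with a NaN score.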

As soon as you switch to a solver that supports your desired grid, you're good to go:

## using Logistic regression for class imbalance
model = LogisticRegression(class_weight='balanced', solver='saga')
grid_search_cv = GridSearchCV(estimator = model, param_grid = params, 
                        scoring= 'roc_auc', 
                        cv = folds, 
                        return_train_score=True, verbose = 1)            
grid_search_cv.fit(X_train_pt_df, y_train_pt_df)
## reviewing the results
cv_results = pd.DataFrame(grid_search_cv.cv_results_)

Note also the ConvergenceWarning, which suggests you may need to increase the default max_iter or tol, switch to another solver, or rethink the desired param grid.
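As a rough illustration of the max_iter point, the sketch below fits the same saga model twice on standardized synthetic data (an assumption; the real data would be the power-transformed training set) and checks whether a ConvergenceWarning is raised. The helper fit_saga and the values 5 and 5000 are hypothetical choices for the demonstration, not recommended settings.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the real data (an assumption for illustration).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)
X = StandardScaler().fit_transform(X)  # saga converges faster on scaled features


def fit_saga(max_iter):
    """Fit an L1-penalized saga model; return (iterations used, ConvergenceWarning seen)."""
    model = LogisticRegression(
        class_weight="balanced", solver="saga", penalty="l1", C=1.0, max_iter=max_iter
    )
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X, y)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    return int(model.n_iter_[0]), warned


short_iters, short_warned = fit_saga(max_iter=5)      # deliberately too few iterations
long_iters, long_warned = fit_saga(max_iter=5000)     # generous iteration budget
print(short_iters, short_warned)
print(long_iters, long_warned)
```

Comparing n_iter_ with max_iter is a simple check: if the two are equal, the solver most likely stopped because it ran out of iterations, not because it converged.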
