ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Posted on debugcn in Dev

아니 루드

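I am working through the Loan Prediction practice problem and trying to fill in the missing values in the data. I got the data from here. I am following this tutorial to solve the problem.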

The full code I am using (file name model.py) and the data can be found on GitHub.

The DataFrame looks like this:

After the last line is executed (line 122 of model.py), I get the following output:

/home/user/.local/lib/python2.7/site-packages/numpy/lib/arraysetops.py:216: FutureWarning: numpy not_equal will not check object identity in the future. The comparison did not return the same result as suggested by the identity (`is`)) and will change.
  flag = np.concatenate(([True], aux[1:] != aux[:-1]))
/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
Traceback (most recent call last):
  File "model.py", line 123, in <module>
    classification_model(model, df,predictor_var,outcome_var)
  File "model.py", line 89, in classification_model
    model.fit(data[predictors],data[outcome])
  File "/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/logistic.py", line 1173, in fit
    order="C")
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 407, in check_array
    _assert_all_finite(array)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 58, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

I am getting this error because of the missing values. How do I fill these missing values?

The missing values for Self_Employed and LoanAmount are already filled. How do I fill the rest? Thank you for the help.

jezrael

You can use fillna:

df['Gender'] = df['Gender'].fillna('no data')
df['Married'] = df['Married'].fillna('no data')

Or, if you need to fill multiple columns with the same value:

cols = ['Gender','Married']
df[cols] = df[cols].fillna('no data')
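
To decide which columns belong in cols, it may help to count the missing values per column first. The following is only a minimal sketch: the column names mirror the question, but the sample values are made up for illustration and are not taken from the actual loan data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({'Gender': ['Male', np.nan, 'Female'],
                   'Married': [np.nan, 'Yes', 'No'],
                   'LoanAmount': [120.0, 66.0, 141.0]})

# Number of missing values in each column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Names of the columns that still contain at least one NaN.
cols = missing_counts[missing_counts > 0].index.tolist()

# Fill those columns with the same placeholder value.
df[cols] = df[cols].fillna('no data')
```

Building cols from the counts avoids typing the column names by hand, but it only makes sense when one placeholder suits every affected column.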

If you need to fill multiple columns with different values, you can pass a dict that maps column names to replacement values:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Gender':['m','f',np.nan],
                   'Married':[np.nan,'yes','no'],
                   'credit history':[1.,np.nan,0]})
print (df)
  Gender Married  credit history
0      m     NaN             1.0
1      f     yes             NaN
2    NaN      no             0.0

d = {'Gender':'no data', 'Married':'no data', 'credit history':0}
df = df.fillna(d)
print (df)
    Gender  Married  credit history
0        m  no data             1.0
1        f      yes             0.0
2  no data       no             0.0
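
Before calling model.fit again, it can be worth confirming that nothing is left for scikit-learn's input check to reject. The sketch below reuses the toy frame from this answer; the helper name has_only_finite_values is hypothetical and is not part of pandas or scikit-learn. Note that string columns filled with a placeholder such as 'no data' would presumably still need to be encoded as numbers before fitting.

```python
import numpy as np
import pandas as pd

def has_only_finite_values(frame):
    """Return True if frame has no NaN anywhere and no inf in numeric columns.

    Hypothetical helper for illustration; not part of pandas or scikit-learn.
    """
    # Any NaN (in numeric or object columns) fails the check.
    if frame.isnull().values.any():
        return False
    # Infinity can only occur in numeric columns, so check those separately.
    numeric = frame.select_dtypes(include=[np.number])
    return bool(np.isfinite(numeric.to_numpy(dtype='float64')).all())

df = pd.DataFrame({'Gender': ['m', 'f', np.nan],
                   'Married': [np.nan, 'yes', 'no'],
                   'credit history': [1., np.nan, 0]})
print(has_only_finite_values(df))      # False: NaN values are still present

filled = df.fillna({'Gender': 'no data', 'Married': 'no data', 'credit history': 0})
print(has_only_finite_values(filled))  # True: every NaN has been replaced
```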

Edited on 2021-06-19
