I'm trying to write a function to find the mean of an attribute, using only the values that fall within a specific class.
Below is my code:
`mean=0
total=0
count=0
for i in range(len(training_data)):
    if (training_data[i,334])==0:
        if training_data[i,2]!=None:
            total+=training_data[i,2]
            count+=1
mean=total/count`
However, my attribute has some null values in it. I am working with numpy, and the null values are coded as "NaN". In my code above, even though I explicitly require that the value is not equal to "None", which is Python's equivalent of null, my "total" variable continues to show up as 'nan'. I have tried many different equivalents to "None" and have not been able to get a value for the total variable other than 'nan'. Is there something obvious I'm missing? Thank you in advance!
With the power of numpy, your code can be trimmed to 2 lines:
idx = training_data[:,334] == 0
mean = np.nanmean(training_data[idx, 2])
Here `idx` is the boolean array which is True for the rows falling into the specific class, and `np.nanmean` calculates the mean value ignoring NaNs.
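As for why the original loop kept returning 'nan': NaN is a floating-point value, not None, so a comparison against None never filters it out, and once a NaN is added to a running total the total stays NaN. Below is a minimal sketch of both approaches on a small made-up array. The array `data` and its column layout (attribute in column 0, class label in column 1) are assumptions for illustration only; the original uses columns 2 and 334 of `training_data`.

```python
import numpy as np

# Hypothetical stand-in for training_data:
# column 0 holds the attribute, column 1 holds the class label.
data = np.array([
    [1.0, 0],
    [np.nan, 0],
    [3.0, 0],
    [10.0, 1],
])

# NaN is a float, not None, so this comparison is True and the
# NaN value would pass a "not equal to None" check.
nan_is_not_none = float(data[1, 0]) != None

# Explicit loop that skips NaN values using np.isnan.
total = 0.0
count = 0
for row in data:
    if row[1] == 0 and not np.isnan(row[0]):
        total += row[0]
        count += 1
loop_mean = total / count

# Vectorized equivalent: boolean mask for the class, then a NaN-aware mean.
idx = data[:, 1] == 0
vector_mean = np.nanmean(data[idx, 0])

print(loop_mean, vector_mean)
```

Both approaches average only the non-NaN values of the selected class. If you want to keep an explicit loop, `np.isnan` is the check to use in place of the comparison with None.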