I Want a Separate column which returns "Yes" if the column "ID" contains all numeric values and 'No' if it contains alphabets or alphanumeric values.
ID Result
3965 Yes
wyq8 No
RO_123 No
CMD_ No
2976 Yes
You can use pd.Series.str.isnumeric
here.
df['Result'] = np.where(df['ID'].str.isnumeric(), 'YES', 'NO')
ID Result
0 3965 YES
1 wyq8 NO
2 RO_123 NO
3 CMD_ NO
4 2976 YES
There's a caveat with using isnumeric
it doesn't identify float
numbers.
test = pd.Series(["9.0", "9"])
test.str.isnumeric()
0 False
1 True
dtype: bool
If you strictly mark YES
for int
then use isnumeric
else you can use pd.Series.str.fullmatch
(available from version 1.1.0) here.
df['Result'] = np.where(df['ID'].str.fullmatch(r"\d+|\d+\.\d+", 'YES', 'NO')
For version <1.1.0 you use re.fullmatch
numeric_pat = re.compile(r"\d+|\d+\.\d+")
def numeric(val):
match = numeric_pat.fullmatch(val)
if match: return 'YES'
else: return 'NO'
df['Result'] = df['ID'].apply(numeric)
Or we can use pd.to_numeric
with boolean masking using pd.Series.isna
m = pd.to_numeric(df['ID'], errors='coerce').isna()
df['Result'] = np.where(m, 'NO', 'YES')
With errors
parameter set to 'coerce'
values which cannot be turned into numeic value will set to Nan
.
test = pd.Series(['3965', 'wyq8', 'RO_123', 'CMD_', '2976'])
pd.to_numeric(test)
0 3965.0
1 NaN
2 NaN
3 NaN
4 2976.0
Name: ID, dtype: float64
Or you can build a custom function
def numeric(val):
try:
float(val) # Using just `float` would suffice as int can be
return 'YES' # converted to `float` so both `int`
# and `float` wouldnot raise any error
except ValueError:
return 'NO'
df['Result'] = df['ID'].apply(numeric)
Note: float
handles scientic notation too, float("1e6")
-> 1000000.0
.
test = pd.Series(['1e6', '1', 'a 10', '1E6'])
test.apply(numeric)
0 YES
1 YES
2 NO
3 YES
dtype: object
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments