I want to transform the following CSV file:
"Day" "Hour" "X1" "X2" "X3" "X4" "X5"
2015-01-01 00:00 1 2 3 4 5
.....
to the following:
"Day Hour" "X2" "X3" "X5"
"2015-01-01 00:00" 2 3 5
.....
I just need to combine two columns and keep a subset of the remaining columns. I've tried the following:
csv = pandas.read_csv('test.csv')
csv['Time'] = csv.Day + " " + csv.Hour
csv.set_index('Time')
I cannot figure out how to get these columns without creating a new DataFrame.
You can reassign the name (df here) to a new DataFrame:
df['Time'] = df.Day + " " + df.Hour
df = df.iloc[:, [-1]]  # keep only the last column, i.e. the new "Time" column
Once nothing else references the original df, it will be garbage collected.
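To keep X2, X3 and X5 alongside the combined column, as the question asks, the same reassignment works with a list of column names. This is a minimal sketch: the in-memory sample data is hypothetical and only mirrors the layout shown in the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's layout.
df = pd.DataFrame({
    "Day": ["2015-01-01", "2015-01-01"],
    "Hour": ["00:00", "01:00"],
    "X1": [1, 6], "X2": [2, 7], "X3": [3, 8], "X4": [4, 9], "X5": [5, 10],
})

# Combine the two string columns into one.
df["Time"] = df["Day"] + " " + df["Hour"]

# Rebind the name to a frame holding only the wanted columns, in this order.
df = df[["Time", "X2", "X3", "X5"]]
print(df)
```

Selecting with a list of names also fixes the column order, so Time ends up first.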
Or use the csv lib to read the file, transpose the rows with itertools.izip, and join the first two columns:
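import csv
import pandas as pd
try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip is already lazy

with open("foo.csv") as f:
    next(f)  # skip header
    r = csv.reader(f)
    zp = izip(*r)  # transpose rows into columns
    pairs = izip(next(zp), next(zp))  # pair up the first two columns (Day, Hour)
    df = pd.DataFrame(("{} {}".format(a, b) for a, b in pairs), columns=["Time"])
print(df)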
Output:
Time
0 2015-01-01 00:00
If you actually want to keep the other columns, just drop Day and Hour after creating the new column:
df['Time'] = df.Day + " " + df.Hour
df.drop(["Day","Hour"],axis=1,inplace=True)
print(df)
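Putting the pieces together for the exact output in the question, a sketch could look like the following. It assumes a comma-separated file (the in-memory `raw` text stands in for test.csv; pass `sep=` to `read_csv` if the real file uses another delimiter) and assumes the unwanted columns are X1 and X4, as in the question's example.

```python
import io
import pandas as pd

# Assumed comma-separated stand-in for test.csv.
raw = io.StringIO(
    "Day,Hour,X1,X2,X3,X4,X5\n"
    "2015-01-01,00:00,1,2,3,4,5\n"
    "2015-01-01,01:00,6,7,8,9,10\n"
)
df = pd.read_csv(raw)

# Build the combined column, then drop everything that is not wanted.
df["Day Hour"] = df["Day"] + " " + df["Hour"]
df = df.drop(columns=["Day", "Hour", "X1", "X4"])

# set_index returns a new frame, so the result has to be assigned.
df = df.set_index("Day Hour")

# With no path given, to_csv returns the CSV text as a string.
out = df.to_csv()
print(out)
```

Note that the `csv.set_index('Time')` call in the question has no visible effect for the same reason: without assignment (or `inplace=True`) the returned frame is discarded.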