Need to split variable length data in a pandas dataframe column into multiple columns

debugcn Published at Dev

Nexxsys

I have 2 column dataframe likes this:

ITEM        REFNUMS
1   00000299    0036701923024762922029229294652954429569295832...
2   00000655    NaN
24  00001791    00016027123076000158004563065131972
25  00001805    00016027123076000158004563065131972
26  00001813    00016027123076000158004563065131972
27  00001821    00016027123076000158004563065131972
28  00001937    0142530521316303164702509000510012201310027820...

I would like to split the REFNUMS columns into divisible parts and add onto the existing dataframe if possible as I need to retain the row index and matching ITEM #. The data in REFNUMS is a length divisible by 5, when not NaN, so for example Row 1 is = 78 sets of 5.

data_len = (data['REFNUMS'].str.len())/5

Then

0         NaN
1        78.0
2         NaN

Appreciate any suggestions on how to do this.

DSM

IIUC, you can use str.extractall to get the groups of 5 digits, clean up the columns, and then join:

In [168]: r = df.REFNUMS.str.extractall("(\d{1,5})").unstack()

In [169]: r.columns = r.columns.droplevel(0)

In [170]: df.join(r)
Out[170]: 
    ITEM                                            REFNUMS      0      1      2      3      4      5      6      7      8     9
1    299  0036701923024762922029229294652954429569295832...  00367  01923  02476  29220  29229  29465  29544  29569  29583     2
2    655                                                NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN   NaN
24  1791                00016027123076000158004563065131972  00016  02712  30760  00158  00456  30651  31972   None   None  None
25  1805                00016027123076000158004563065131972  00016  02712  30760  00158  00456  30651  31972   None   None  None
26  1813                00016027123076000158004563065131972  00016  02712  30760  00158  00456  30651  31972   None   None  None
27  1821                00016027123076000158004563065131972  00016  02712  30760  00158  00456  30651  31972   None   None  None
28  1937  0142530521316303164702509000510012201310027820...  01425  30521  31630  31647  02509  00051  00122  01310  02782     0

Collected from the Internet

Please contact [email protected] to delete if infringement.