Using Python to manipulate data from a CSV only applying it to the first result

user2822471

I have a CSV file, and I'm attempting to build a small Python script that will 'convert' it to CSV (basically to get the data into an acceptable format).

I'm hitting a bit of a roadblock, as I need to detect the first result in each 'block' of results. For example:

AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
EEFFGGHHII-2.4-5.6-7.5

The first part (preceding the first dash) has a variable length and is the only way to identify an individual listing in this particular database. I basically want to insert a flag in a separate column that identifies each cluster of rows sharing the same code.

There are several hundred thousand listings so I can't come up with a list to just search through.

Thanks for any help.

Mark Tolonen

If the data is already grouped as shown (rows with the same code are adjacent), itertools.groupby can iterate over it, grouping consecutive rows by a common key:

import csv
import itertools
import operator

data1 = '''\
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
EEFFGGHHII-2.4-5.6-7.5
'''

data2 = '''\
SHIRT-RED
SHIRT-BLUE
SHIRT-GREEN
SHOE-RED
SHOE-BLUE
'''

def setup():
    '''Generate some sample input files.'''
    with open('sample1.hsv','w') as f:
        f.write(data1)
    with open('sample2.hsv','w') as f:
        f.write(data2)

def process(infile,outfile):
    with open(infile,'r',newline='') as ifile, open(outfile,'w',newline='') as ofile:
        r = csv.reader(ifile,delimiter='-')
        w = csv.writer(ofile,delimiter=',')

        # key is the first column (offset 0)
        # group is an iterator over the lines that have the same key
        for key,group in itertools.groupby(r,operator.itemgetter(0)):
            # Add a final column to the row list.  1 for first item.
            w.writerow(next(group) + [1])
            # Remaining items in group get a zero value in new column.
            for other in group:
                w.writerow(other + [0])

if __name__ == '__main__':
    setup()
    process('sample1.hsv','sample1.csv')
    process('sample2.hsv','sample2.csv')
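
One caveat worth illustrating: itertools.groupby only groups consecutive items, so it starts a new group every time the key changes. The short sketch below uses a made-up list of keys (not taken from the question's data) to show what happens when the same key appears in two separate runs:

```python
import itertools

# Hypothetical keys: "A" appears in two separate runs.
keys = ["A", "A", "B", "A"]

# groupby starts a new group whenever the key changes, so the
# second run of "A" is reported as a separate group.
runs = [(key, len(list(group))) for key, group in itertools.groupby(keys)]
print(runs)  # [('A', 2), ('B', 1), ('A', 1)]
```

With input like this, process() would write a 1 flag for the second run of "A" as well, which is why the approach depends on the file being grouped by code.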

Results of running process() on the two sample files:

sample1.hsv

AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
AABBCCDD-1.2-2.4-2.6
EEFFGGHHII-2.4-5.6-7.5

sample1.csv

AABBCCDD,1.2,2.4,2.6,1
AABBCCDD,1.2,2.4,2.6,0
AABBCCDD,1.2,2.4,2.6,0
AABBCCDD,1.2,2.4,2.6,0
EEFFGGHHII,2.4,5.6,7.5,1

sample2.hsv

SHIRT-RED
SHIRT-BLUE
SHIRT-GREEN
SHOE-RED
SHOE-BLUE

sample2.csv

SHIRT,RED,1
SHIRT,BLUE,0
SHIRT,GREEN,0
SHOE,RED,1
SHOE,BLUE,0
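
If the rows are not guaranteed to be grouped, one possible alternative (a sketch only, not part of the answer above) is to remember the codes already seen in a set and flag a row only the first time its code appears. The function name flag_first_seen and the sample rows are hypothetical, and the sketch assumes the set of distinct codes fits in memory, which should usually hold for a few hundred thousand rows:

```python
import csv
import io

def flag_first_seen(lines):
    """Yield each row with 1 appended the first time its key appears, else 0."""
    seen = set()  # keys (first column) encountered so far
    for row in csv.reader(lines, delimiter='-'):
        if not row:
            continue  # skip blank lines, which csv.reader returns as []
        key = row[0]
        yield row + [0 if key in seen else 1]
        seen.add(key)

# Hypothetical ungrouped input: SHOE appears before and after SHIRT.
sample = io.StringIO("SHOE-RED\nSHIRT-RED\nSHOE-BLUE\n")
result = list(flag_first_seen(sample))
print(result)
# [['SHOE', 'RED', 1], ['SHIRT', 'RED', 1], ['SHOE', 'BLUE', 0]]
```

Set membership tests take roughly constant time, so this avoids searching through a list for every row. The trade-off is that it keeps every distinct code in memory, whereas the groupby version only holds one row at a time.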
