Reading a large amount of data stored in lines from a CSV file

Fugu_Fish

I need to read a lot of data (~10^6 data points) from a *.csv file.

  • the data is stored in lines
  • I don't know in advance how many lines there are or how many data points each line holds
  • the number of data points can differ from line to line

So the *.csv-file could look like this:

x Header

x1,x2

y Header

y1,y2,y3, ...

z Header

z1,z2

...

Right now I read in every line as a string and split it at every comma. This is what my code looks like:

index = 1;
headerLine = textscan(csvFileHandle,'%s',1,'Delimiter','\n');

while ~isempty(headerLine{1})

    dummy = textscan(csvFileHandle,'%s',1,'Delimiter','\n', ...
                'BufSize',2^31 - 1);
    rawData(index) = textscan(dummy{1}{1},'%f','Delimiter',',');
    headerLine = textscan(csvFileHandle,'%s',1,'Delimiter','\n');

    index = index + 1;
end

It works, but it's pretty slow. Most of the time (~95%) is spent splitting the string with textscan. I preallocated rawData with sample data, but that made next to no difference to the speed.

Is there a better way than mine to read in something like this?

If not, is there a faster way to split this string?

nkjt

First suggestion: to read a single line as a string when looping over a file, just use fgetl (returns a nice single string so no faffing with cell arrays).
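As a rough sketch of what the fgetl loop could look like, assuming the file strictly alternates header and data lines: the sample file contents and the variable names (fileName, dataLine, data) are made up for illustration, and the split itself is still the textscan call from the question.

```matlab
% Write a small sample file (hypothetical contents) so the sketch is runnable.
fileName = [tempname '.csv'];
fid = fopen(fileName, 'w');
fprintf(fid, 'x Header\n1,2\ny Header\n3,4,5\nz Header\n6,7\n');
fclose(fid);

% Read alternating header/data lines with fgetl.
fid = fopen(fileName, 'r');
data = {};
headerLine = fgetl(fid);              % fgetl returns -1 (not a char) at end of file
while ischar(headerLine)
    dataLine = fgetl(fid);            % plain char vector, no cell indexing needed
    if ~ischar(dataLine)
        break;                        % header without a following data line
    end
    parsed = textscan(dataLine, '%f', 'Delimiter', ',');
    data{end+1} = parsed{1};          %#ok<AGROW> number of lines is unknown
    headerLine = fgetl(fid);
end
fclose(fid);
delete(fileName);
```

Each element of data is then a column vector holding the numbers of one data line.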

Also, you might consider (if possible) reading everything in a single go rather than making repeated reads from the file:

output = textscan(fid, '%*s%s','Delimiter','\n');  % skips headers with *

If the file is so big that you can't do everything at once, try to read in blocks (e.g. tackle 1000 lines at a time, parsing data as you go).
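A minimal sketch of the block-wise idea, assuming the same alternating header/data layout: textscan accepts a repeat count as its third argument and continues from the current file position on the next call. The sample file, blockSize and the other variable names are only illustrative, and any per-line split could replace the inner textscan call.

```matlab
% Write a small sample file (hypothetical contents) so the sketch is runnable.
fileName = [tempname '.csv'];
fid = fopen(fileName, 'w');
fprintf(fid, 'x Header\n1,2\ny Header\n3,4,5\nz Header\n6,7\n');
fclose(fid);

blockSize = 2;      % header/data pairs per read; e.g. 1000 for a real file
data = {};
fid = fopen(fileName, 'r');
while ~feof(fid)
    % Read up to blockSize pairs: skip each header (%*s), keep each data line (%s).
    block = textscan(fid, '%*s%s', blockSize, 'Delimiter', '\n');
    lines = block{1};
    if isempty(lines)
        break;                            % nothing left to parse
    end
    parsed = cell(numel(lines), 1);       % preallocate for this block
    for n = 1:numel(lines)
        tmp = textscan(lines{n}, '%f', 'Delimiter', ',');
        parsed{n} = tmp{1};
    end
    data = [data; parsed];                %#ok<AGROW> grows once per block only
end
fclose(fid);
delete(fileName);
```

Growing data once per block rather than once per line keeps the reallocation cost small while still bounding how much of the file is held as text at any time.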

For converting the string, the options include str2num or strsplit + str2double, but the only thing I can think of that might be slightly quicker than textscan is sscanf. Since sscanf doesn't accept the delimiter as a separate input, put it in the format string. (True, the last value isn't followed by a comma, but sscanf can handle that.)

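lines = output{1};               % textscan returns a 1x1 cell holding the data lines
data = cell(numel(lines), 1);    % preallocate one entry per data line
for n = 1:numel(lines)
    data{n} = sscanf(lines{n}, '%f,');
end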

Tests on a limited batch of test data suggest sscanf is a bit quicker (but this might depend on machine, version, and data size).
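If you want to check this on your own setup, something along these lines should do. It is only a sketch: the synthetic test line and all variable names are assumptions, and it relies on timeit and strsplit, which are only available in reasonably recent MATLAB releases.

```matlab
% Synthetic test line (hypothetical data): 1e4 comma-separated numbers.
values = rand(1, 1e4);
testLine = sprintf('%.6f,', values);
testLine(end) = [];                       % drop the trailing comma

% The three ways of splitting mentioned above, wrapped for timeit.
viaTextscan = @() textscan(testLine, '%f', 'Delimiter', ',');
viaSscanf   = @() sscanf(testLine, '%f,');
viaStrsplit = @() str2double(strsplit(testLine, ','));

tTextscan = timeit(viaTextscan);
tSscanf   = timeit(viaSscanf);
tStrsplit = timeit(viaStrsplit);

fprintf('textscan: %.4g s, sscanf: %.4g s, strsplit+str2double: %.4g s\n', ...
        tTextscan, tSscanf, tStrsplit);
```

Vary the line length to match your real data, since the ranking may change with it.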
