Java - Read file and split into multiple files

Ankit Rustagi

I have a file that I would like to read in Java and split into n output files, where n comes from user input. Here is how I read the file:

int n = 4;
BufferedReader br = new BufferedReader(new FileReader("file.csv"));
try {
    String line = br.readLine();

    while (line != null) {
        line = br.readLine();
    }
} finally {
    br.close();
}

How do I split file.csv into n files?

Note: since the file has on the order of 100k entries, I can't load its contents into an array, split that, and then save the pieces into multiple files.

harsh

Since the source file can be very large, each split file can be large as well.

Example:

Source File Size: 5GB

Num Splits: 5

Destination File Size: 1GB each (5 files)

Reading such a large split in one go is not practical, even if that much memory is available. Instead, for each split we can read through a fixed-size byte array, which is feasible in terms of both performance and memory.

NumSplits: 10, MaxReadBytes: 8KB

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileSplitter {
    public static void main(String[] args) throws IOException {
        long numSplits = 10; // from user input, extract it from args
        int maxReadBufferSize = 8 * 1024; // 8KB

        try (RandomAccessFile raf = new RandomAccessFile("test.csv", "r")) {
            long sourceSize = raf.length();
            long bytesPerSplit = sourceSize / numSplits;
            long remainingBytes = sourceSize % numSplits;

            for (int destIx = 1; destIx <= numSplits; destIx++) {
                // The last split also takes the leftover bytes, so exactly numSplits files are written.
                long bytesToCopy = bytesPerSplit + (destIx == numSplits ? remainingBytes : 0);
                try (BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split." + destIx))) {
                    readWrite(raf, bw, bytesToCopy, maxReadBufferSize);
                }
            }
        }
    }

    // Copies numBytes from raf to bw, reading at most bufferSize bytes at a time.
    static void readWrite(RandomAccessFile raf, BufferedOutputStream bw, long numBytes, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        while (numBytes > 0) {
            int val = raf.read(buf, 0, (int) Math.min(buf.length, numBytes));
            if (val == -1) {
                break; // unexpected end of file
            }
            bw.write(buf, 0, val); // write only the bytes actually read
            numBytes -= val;
        }
    }
}
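
One caveat with the byte-based split above: a split boundary can fall in the middle of a CSV row, so one output file may end with half a record and the next begin with the other half. If rows need to stay intact, a line-based variant is one option. The following is only a sketch under some assumptions that are not in the question: the file is UTF-8 text, reading it twice is acceptable (once to count lines, once to copy them), and the names LineSplitter and splitByLines are made up for illustration. It never holds more than one line in memory.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineSplitter {

    // Hypothetical helper: splits source into files named <prefix>1 .. <prefix>n,
    // keeping every line intact. Returns the number of files actually written,
    // which is smaller than n only when the source has fewer than n lines.
    static int splitByLines(Path source, String prefix, int n) throws IOException {
        // First pass: count the lines without storing them.
        long totalLines = 0;
        try (BufferedReader br = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            while (br.readLine() != null) {
                totalLines++;
            }
        }
        long base = totalLines / n;   // every file gets at least this many lines
        long extra = totalLines % n;  // the first 'extra' files get one more line

        // Second pass: stream the lines into one output file at a time.
        int filesWritten = 0;
        try (BufferedReader br = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            for (int fileIx = 1; fileIx <= n; fileIx++) {
                long count = base + (fileIx <= extra ? 1 : 0);
                if (count == 0) {
                    break; // fewer lines than files: nothing left to write
                }
                Path target = Paths.get(prefix + fileIx);
                try (BufferedWriter bw = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
                    for (long i = 0; i < count; i++) {
                        String line = br.readLine();
                        if (line == null) {
                            break; // file shrank between the two passes
                        }
                        bw.write(line);
                        bw.newLine();
                    }
                }
                filesWritten++;
            }
        }
        return filesWritten;
    }

    public static void main(String[] args) throws IOException {
        int n = 4; // number of output files, as in the question
        Path source = Paths.get("file.csv");
        if (!Files.exists(source)) {
            System.err.println("file.csv not found");
            return;
        }
        int written = splitByLines(source, "split.", n);
        System.out.println("Wrote " + written + " files");
    }
}
```

The trade-off is a second read of the source in exchange for output files that differ by at most one line. Note that readLine() splits on every line break, so a quoted CSV field that itself contains a newline would still be cut; handling that would need a real CSV parser.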
