What is the fastest way to create a list of directories specified in a file?

Kaizer Sozay

I have a text file, "foo.txt", that specifies a directory in each line:

data/bar/foo
data/bar/foo/chum
data/bar/chum/foo
...

There could be millions of directories and subdirectories. What is the quickest way to create all the directories in bulk, using a terminal command?

By quickest, I mean quickest to create all the directories. Since there are millions of directories, there are many write operations.

I am using Ubuntu 12.04.

EDIT: Keep in mind, the list may not fit in memory, since there are MILLIONS of lines, each representing a directory.

EDIT: My file has 4.5 million lines, each representing a directory, composed of alphanumeric characters, the path separator "/", and possibly "../".

When I ran xargs -d '\n' mkdir -p < foo.txt, after a while it kept printing errors until I pressed Ctrl+C:

mkdir: cannot create directory `../myData/data/a/m/e/d': No space left on device

But running df -h gives the following output:

Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda        48G   20G   28G  42% /
devtmpfs        2.0G  4.0K  2.0G   1% /dev
none            401M  164K  401M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            2.0G     0  2.0G   0% /run/shm

free -m

             total       used       free     shared    buffers     cached
Mem:          4002       3743        258          0       2870         13
-/+ buffers/cache:        859       3143
Swap:          255         26        229

EDIT: df -i

Filesystem      Inodes   IUsed  IFree IUse% Mounted on
/dev/xvda      2872640 1878464 994176   66% /
devtmpfs        512053    1388 510665    1% /dev
none            512347     775 511572    1% /run
none            512347       1 512346    1% /run/lock
none            512347       1 512346    1% /run/shm

df -T

Filesystem     Type     1K-blocks     Used Available Use% Mounted on
/dev/xvda      ext4      49315312 11447636  37350680  24% /
devtmpfs       devtmpfs   2048212        4   2048208   1% /dev
none           tmpfs       409880      164    409716   1% /run
none           tmpfs         5120        0      5120   0% /run/lock
none           tmpfs      2049388        0   2049388   0% /run/shm

EDIT: I increased the number of inodes and reduced the depth of my directories, and it seemed to work. It took 2m16s this time round.
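For reference, the df -i output above shows 994176 free inodes on /, while the list has 4.5 million lines. Every directory needs its own inode, so mkdir can report "No space left on device" even though df -h still shows free blocks. A minimal sketch of a pre-flight check follows; check_inodes is a hypothetical helper name, and it assumes a df whose -i output puts the free-inode count in column 4 (as GNU df does).

```shell
# Hypothetical helper: compare the number of lines in a list file with the
# free inodes on the filesystem that holds the target directory.
check_inodes() {
    list=$1
    target=${2:-.}
    # One line per directory gives an upper bound on the inodes needed.
    needed=$(($(wc -l < "$list")))
    # -P keeps each filesystem on a single line; with -i, column 4 is IFree.
    free=$(df -Pi "$target" | awk 'NR==2 {print $4}')
    if [ "$needed" -gt "$free" ] 2>/dev/null; then
        echo "not enough inodes: need up to $needed, have $free"
        return 1
    fi
    echo "ok: need up to $needed, have $free"
}
```

Called as check_inodes foo.txt ., it prints a one-line verdict. The line count is only an upper bound, since duplicate lines and directories that already exist need no new inode.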

Stéphane Chazelas

With GNU xargs:

xargs -d '\n' mkdir -p -- < foo.txt

xargs will run as few mkdir commands as possible.
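To get a feel for the batching, here is a small sketch that counts how often xargs actually runs its command. It swaps mkdir for echo so nothing is created; count_invocations is a hypothetical helper name, and the exact count depends on the system's command-line size limit.

```shell
# Count how many times xargs invokes the command for N arguments.
# Each echo invocation prints exactly one line, so wc -l counts invocations.
# The arguments are plain numbers, so default xargs parsing is safe here.
count_invocations() {
    seq 1 "$1" | xargs echo | wc -l
}

runs=$(($(count_invocations 100000)))
echo "xargs ran echo $runs time(s) for 100000 arguments"
```

The number of runs is far smaller than the number of arguments, which is the point of using xargs instead of a per-line loop.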

With standard syntax:

(export LC_ALL=C
 sed 's/[[:blank:]"\'\'']/\\&/g' < foo.txt | xargs mkdir -p --)
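The sed step is needed because standard xargs splits its input on blanks and treats quotes and backslashes specially. A small sketch of its effect, using printf in place of mkdir so that nothing is created (escape_for_xargs is a hypothetical wrapper around the same sed expression):

```shell
# Escape blanks, double quotes, backslashes and single quotes so that
# standard xargs treats each input line as exactly one argument.
escape_for_xargs() {
    LC_ALL=C sed 's/[[:blank:]"\'\'']/\\&/g'
}

# Three names that plain xargs would otherwise split or reject.
printf '%s\n' 'my dir' "it's" 'say "hi"' |
    escape_for_xargs |
    xargs printf '<%s>\n'
```

Each input line should come out as a single bracketed argument, showing that the names reach the command intact.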

The inefficiency is that mkdir -p a/b/c will attempt mkdir("a"), and possibly stat("a") and chdir("a"), and then the same for "a/b", even if "a/b" already existed.

If your foo.txt has:

a
a/b
a/b/c

in that order, that is, if every path is preceded by a line for each of its parent components, then you can omit the -p and it will be significantly more efficient. Or alternatively:

perl -lne 'mkdir $_ or warn "$_: $!\n"' < foo.txt

This avoids running any external mkdir command at all.
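If the list is not already ordered parents-first, one possible way to get there is to expand each path into all of its ancestors and sort the result. This is only a sketch, not part of the commands above: expand_parents is a hypothetical helper, it skips printing "." and ".." components (they always exist), and it relies on byte-order sorting placing a prefix such as a/b before a/b/c. GNU sort spills to temporary files, so the list does not need to fit in memory.

```shell
# Print every path together with all of its ancestors, deduplicated and
# sorted in byte order so that each parent precedes its children.
expand_parents() {
    awk -F/ '{
        p = ""
        for (i = 1; i <= NF; i++) {
            p = (i == 1) ? $i : p "/" $i
            # Skip prefixes ending in an empty, "." or ".." component.
            if ($i != "" && $i != "." && $i != "..") print p
        }
    }' "$1" | LC_ALL=C sort -u
}

# Demo in a scratch directory with a list that omits the parent lines.
demo=$(mktemp -d)
printf '%s\n' 'data/bar/foo/chum' 'data/bar/chum/foo' > "$demo/foo.txt"
(
    cd "$demo" &&
    # The sample names have no blanks or quotes, so plain xargs parsing is
    # safe; for arbitrary names use: xargs -d '\n' mkdir --   (GNU xargs)
    expand_parents foo.txt | xargs mkdir --
)
```

Because every parent now appears before its children, mkdir can run without -p. Whether the extra awk and sort pass pays off for a given list is something to measure rather than assume.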
