Hadoop/Python: Loading a reference file to use in the mapper


n4cer500

I want to process CSV files in Python with Hadoop, but I need to reference another file that contains lookup information.

I read that I can use the -files command line option, which creates a symlink to the local file, but how do I reference this file from my Python mapper?

n4cer500

When I ran this job on Amazon EMR, I copied the file to S3 and referenced it directly with the -cacheFile option. The name after the # is the symlink that appears in the task's working directory:

bin/hadoop ... -cacheFile s3://my-bucket/files/cachefile.csv#reference

In the Python mapper I could then open the file through that symlink name:

with open("reference") as reference_file:
    references = reference_file.read().splitlines()
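To go one step further, here is a minimal sketch of how a streaming mapper might turn that file into a lookup table and apply it to each input line. The two-column key,value layout of the reference file, the function names load_reference, run_mapper and main, and the UNKNOWN placeholder are all assumptions made for illustration, not part of the original job. As far as I know the same open-by-name pattern applies with -files, where the symlink defaults to the file's base name.

```python
import csv
import sys


def load_reference(path="reference"):
    """Read the cached lookup file into a dict keyed by its first column.

    Assumption: each row of the reference file is a CSV pair "key,value".
    """
    lookup = {}
    with open(path, newline="") as reference_file:
        for row in csv.reader(reference_file):
            if len(row) >= 2:
                lookup[row[0]] = row[1]
    return lookup


def run_mapper(lookup, lines, out):
    """Write one tab-separated "key<TAB>looked-up value" line per input row."""
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank input lines
        key = row[0]
        # Assumption: keys missing from the lookup are tagged, not dropped.
        out.write(f"{key}\t{lookup.get(key, 'UNKNOWN')}\n")


def main():
    # "reference" is the symlink name given after the # in -cacheFile.
    run_mapper(load_reference("reference"), sys.stdin, sys.stdout)
```

In an actual mapper script you would call main() at the bottom of the file. Loading the file once before the loop keeps the lookup in memory for every input record, which only makes sense if the reference file is small enough to fit.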

Edited 2021-06-29
