Read Shapefile from Google Cloud Storage using Dataflow + Beam + Python

samuq

How can one read a Shapefile from Google Cloud Storage using Dataflow, Beam, and Python?
I've only found beam.io.ReadFromText, but the Python shapefile reader requires file-like objects, as in shp.Reader(shp=shp_file, dbf=dbf_file), or a path to a shapefile.
I'm using Python 2.7.

samuq

This is one way to do it: open each component file with GcsIO and pass the resulting file objects to the reader.

from apache_beam.io.gcp import gcsio
import shapefile as shp  # pyshp
from osgeo import osr  # GDAL Python bindings

# filenamePRJ, filenameSHP and filenameDBF are the gs:// paths of the
# .prj, .shp and .dbf files.
gcs = gcsio.GcsIO()

prj_file = gcs.open(
    filenamePRJ,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)

shp_file = gcs.open(
    filenameSHP,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)

dbf_file = gcs.open(
    filenameDBF,
    mode='r',
    read_buffer_size=1677721600,
    mime_type='application/octet-stream'
)

# The shapefile reader accepts the open GCS file objects directly.
sf = shp.Reader(shp=shp_file, dbf=dbf_file)

# Source projection comes from the .prj file (WKT); the target is WGS84.
euref = osr.SpatialReference()
euref.ImportFromWkt(str(prj_file.read()))
wgs84 = osr.SpatialReference()
wgs84.ImportFromEPSG(4326)
transformation = osr.CoordinateTransformation(euref, wgs84)
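
If the files are small enough to fit in worker memory, a possible variation is to copy each one into an in-memory buffer first, so the reader works on plain seekable io.BytesIO objects and the GCS handles can be closed right away. The sketch below is only an illustration: load_shapefile_parts, its open_fn parameter, and the bucket path in the usage note are hypothetical names, and it assumes the three files share one base path.

```python
import io


def load_shapefile_parts(open_fn, base_path, extensions=("shp", "dbf", "prj")):
    """Copy each shapefile component into an in-memory, seekable buffer.

    open_fn is any callable that takes a path and returns a binary
    file-like object (hypothetical parameter, e.g. GcsIO().open).
    """
    parts = {}
    for ext in extensions:
        handle = open_fn("%s.%s" % (base_path, ext))
        try:
            # Read the whole file so the handle can be closed immediately.
            parts[ext] = io.BytesIO(handle.read())
        finally:
            handle.close()
    return parts
```

With the imports from the answer above, it could be used as parts = load_shapefile_parts(gcsio.GcsIO().open, "gs://my-bucket/data/roads") followed by shp.Reader(shp=parts["shp"], dbf=parts["dbf"]), where the gs:// path is a placeholder. Taking the opener as a parameter also means the helper can be tried locally with any function that returns a binary file object.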
