My Google AI Platform / ML Engine training job doesn't seem to have access to the training file I put into a Google Cloud Storage bucket.
Google's AI Platform / ML Engine requires that you store training data files in one of their Cloud Storage buckets. Accessing the data locally from the CLI works fine. However, when I submit a training job (after making sure the data is in the appropriate location in my Cloud Storage bucket), I get an error that seems to be caused by the job having no access to the bucket's Link URL.
The error is from trying to read what looks to me like the contents of a web page that Google served up saying "Hey, you don't have access to this." I see this gaia.loginAutoRedirect.start(5000, and a URL with this flag at the end: noautologin=true.
I know permissions between AI Platform and Cloud Storage are a thing, but both are under the same project. The walkthroughs I'm using at the very least imply that no further action is required if both are under the same project.
I am assuming I need to use the Link URL provided in the bucket's Overview tab. I tried the Link for gsutil, but the Python code (from Google's CloudML Samples repo) was upset about using gs://.
I think Google's examples are proving insufficient since their example data is from a public URL rather than a private Cloud Storage bucket.
Ultimately, the error message I get is a Python error. But like I said, it is preceded by a bunch of gross INFO logs of HTML/CSS/JS from Google saying I don't have permission to get the file I'm trying to get. These logs only appear because I added a print statement to the util.py file, right before read_csv() on the train file. (So the Python parse error is due to trying to parse HTML as a CSV.)
...
INFO g("gaia.loginAutoRedirect.stop",function(){var b=n;b.b=!0;b.a&&(clearInterval(b.a),b.a=null)});
INFO gaia.loginAutoRedirect.start(5000,
INFO 'https:\x2F\x2Faccounts.google.com\x2FServiceLogin?continue=https%3A%2F%2Fstorage.cloud.google.com%2F<BUCKET_NAME>%2Fdata%2F%2Ftrain.csv\x26followup=https%3A%2F%2Fstorage.cloud.google.com%2F<BUCKET_NAME>%2Fdata%2F%2Ftrain.csv\x26service=cds\x26passive=1209600\x26noautologin=true',
ERROR Command '['python', '-m', u'trainer.task', u'--train-files', u'gs://<BUCKET_NAME>/data/train.csv', u'--eval-files', u'gs://<BUCKET_NAME>/data/test.csv', u'--batch-pct', u'0.2', u'--num-epochs', u'1000', u'--verbosity', u'DEBUG', '--job-dir', u'gs://<BUCKET_NAME>/predictor']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 137, in <module>
train_and_evaluate(args)
File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 80, in train_and_evaluate
train_x, train_y, eval_x, eval_y = util.load_data()
File "/root/.local/lib/python2.7/site-packages/trainer/util.py", line 168, in load_data
train_df = pd.read_csv(training_file_path, header=0, names=_CSV_COLUMNS, na_values='?')
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 678, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 446, in _read
data = parser.read(nrows)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1036, in read
ret = self._engine.read(nrows)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1848, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 891, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 945, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 932, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas/_libs/parsers.pyx", line 2112, in pandas._libs.parsers.raise_parser_error
ParserError: Error tokenizing data. C error: Expected 5 fields in line 205, saw 961
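Since the ParserError above comes from HTML being handed to read_csv(), a small guard could make that failure more obvious. This is only a sketch: the function name looks_like_html is hypothetical and is not part of the CloudML sample, and it only checks how the text starts.

```python
def looks_like_html(text):
    """Return True if `text` appears to be an HTML page rather than CSV data.

    Hypothetical helper for debugging: it only inspects the beginning of the
    text, so it is a heuristic and not a real content-type check.
    """
    head = text.lstrip()[:64].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


if __name__ == "__main__":
    # A login page served in place of the file would be flagged...
    print(looks_like_html("<!DOCTYPE html><html><head></head></html>"))
    # ...while ordinary CSV content would not.
    print(looks_like_html("age,workclass\n39,State-gov\n"))
```

Calling it on the first chunk of the downloaded text before read_csv() would let the job fail with a clearer message than a tokenizing error.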
To get the data, I'm more or less trying to mimic this: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/tf-keras/trainer/util.py
Various ways I have tried to address my bucket in my copy of util.py:
https://console.cloud.google.com/storage/browser/<BUCKET_NAME>/data (I think this was the "Link URL" back in May)
https://storage.cloud.google.com/<BUCKET_NAME>/data (this is the "Link URL" now, in July)
gs://<BUCKET_NAME>/data (this is the URI, which gives a different error about not liking gs as a URL type)
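All three address forms listed above point at the same bucket path; only the gs:// form is a storage URI rather than a browser page. As a rough illustration, the two browser-style URLs can be mapped to their gs:// equivalent like this. The helper name to_gs_uri is hypothetical, the sketch is Python 3 (the job above runs Python 2.7), and it only handles the two URL shapes shown in this question.

```python
from urllib.parse import urlparse

# Path prefix that precedes "<bucket>/<object>" for each browser-facing host.
_BROWSER_PREFIXES = {
    "storage.cloud.google.com": "",
    "console.cloud.google.com": "/storage/browser",
}


def to_gs_uri(address):
    """Convert a browser-style Cloud Storage URL into a gs:// URI.

    Hypothetical helper: gs:// inputs are returned unchanged, and anything
    other than the two URL shapes above raises ValueError.
    """
    parsed = urlparse(address)
    if parsed.scheme == "gs":
        return address
    prefix = _BROWSER_PREFIXES.get(parsed.netloc)
    if (parsed.scheme != "https" or prefix is None
            or not parsed.path.startswith(prefix + "/")):
        raise ValueError("Not a recognised Cloud Storage address: %r" % address)
    # Drop the host-specific prefix and the slash that follows it.
    return "gs://" + parsed.path[len(prefix) + 1:]


if __name__ == "__main__":
    print(to_gs_uri("https://storage.cloud.google.com/my-bucket/data/train.csv"))
    print(to_gs_uri("https://console.cloud.google.com/storage/browser/my-bucket/data"))
```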
Transferring the answer from a comment above:
It looks like the URL approach requires cookie-based authentication if the object is not public, which is why the job receives a login page instead of the file. Instead of using a URL, I would suggest using tf.gfile with a gs:// path, as is done in the Keras sample. If you need to download the file from GCS in a separate step, you can use the GCS client library.
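A minimal sketch of the tf.gfile suggestion, assuming TensorFlow is installed in the training environment. The helper name open_gcs_file and its opener parameter are hypothetical (the parameter exists only so the function can be tried without TensorFlow); the GFile class itself is TensorFlow's, exposed as tf.io.gfile.GFile in newer releases and tf.gfile.GFile in older 1.x ones.

```python
def _default_opener():
    """Return TensorFlow's GFile class, which can read gs:// paths."""
    import tensorflow as tf  # imported lazily so the module loads without TF
    try:
        return tf.io.gfile.GFile  # newer TensorFlow releases
    except AttributeError:
        return tf.gfile.GFile  # older TensorFlow 1.x releases


def open_gcs_file(path, opener=None):
    """Open a gs:// path for reading text and return the file object.

    Browser URLs are rejected up front, since fetching them without a login
    cookie yields an HTML sign-in page rather than the object.
    """
    if not path.startswith("gs://"):
        raise ValueError("Expected a gs:// path, got %r" % path)
    if opener is None:
        opener = _default_opener()
    return opener(path, "r")


# Possible use inside load_data() in util.py (names taken from the traceback):
#
#     with open_gcs_file(training_file_path) as f:
#         train_df = pd.read_csv(f, header=0, names=_CSV_COLUMNS, na_values='?')
```

Passing the open file object to read_csv() avoids relying on pandas to understand the gs:// scheme on its own.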