我可以使Flex模板作业开始处理数据的时间少于10分钟吗？

debugcn 发表于 Dev

杰米特

我正在使用terraform资源google_dataflow_flex_template_job部署数据流flex模板作业。

resource "google_dataflow_flex_template_job" "streaming_beam" {
  provider                = google-beta
  name                    = "streaming-beam"
  container_spec_gcs_path = module.streaming_beam_flex_template_file[0].fully_qualified_path
  parameters = {
    "input_subscription"    = google_pubsub_subscription.ratings[0].id
    "output_table"          = "${var.project}:beam_samples.streaming_beam_sql"
    "service_account_email" = data.terraform_remote_state.state.outputs.sa.email
    "network"               = google_compute_network.network.name
    "subnetwork"            = "regions/${google_compute_subnetwork.subnet.region}/subnetworks/${google_compute_subnetwork.subnet.name}"
  }
}

一切正常，但是在没有我要求的情况下，该作业似乎正在使用灵活的资源调度（flexRS）模式，我之所以这样说是因为该作业需要大约十分钟才能启动，并且在此期间状态为QUEUED，我认为这仅适用于flexRS职位。

对于生产场景，使用flexRS模式很好，但是我目前仍在开发数据流作业，这样做时，flexRS十分不便，因为无论我有多小，都要花大约10分钟才能看到我可能做出的任何更改的效果。

在启用FlexRS中说明

要启用FlexRS作业，请使用以下管道选项：--flexRSGoal = COST_OPTIMIZED，其中成本优化的目标意味着数据流服务选择任何可用的折扣资源，或--flexRSGoal = SPEED_OPTIMIZED，在其中进行优化以缩短执行时间。

然后，我发现以下语句：

要打开FlexRS，必须指定值COST_OPTIMIZED以允许Dataflow服务选择任何可用的折扣资源。

在指定管道的执行参数>设置其他云数据流管道选项

我的解释是，flexrs_goal = SPEED_OPTIMIZED将关闭flexRS模式。但是，我将google_dataflow_flex_template_job资源的定义更改为：

resource "google_dataflow_flex_template_job" "streaming_beam" {
  provider                = google-beta
  name                    = "streaming-beam"
  container_spec_gcs_path = module.streaming_beam_flex_template_file[0].fully_qualified_path
  parameters = {
    "input_subscription"    = google_pubsub_subscription.ratings[0].id
    "output_table"          = "${var.project}:beam_samples.streaming_beam_sql"
    "service_account_email" = data.terraform_remote_state.state.outputs.sa.email
    "network"               = google_compute_network.network.name
    "subnetwork"            = "regions/${google_compute_subnetwork.subnet.region}/subnetworks/${google_compute_subnetwork.subnet.name}"
    "flexrs_goal"           = "SPEED_OPTIMIZED" 
  }
}

（请注意）的添加，"flexrs_goal" = "SPEED_OPTIMIZED"但似乎没有任何区别。数据流UI确认我已设置SPEED_OPTIMIZED：

但该作业开始处理数据仍需要花费很长时间（9分46秒），并且在此期间一直处于状态= QUEUED：

2021-01-17 19:49:19.021 GMTStarting GCE instance, launcher-2021011711491611239867327455334861, to launch the template.
...
...
2021-01-17 19:59:05.381 GMTStarting 1 workers in europe-west1-d...
2021-01-17 19:59:12.256 GMTVM, launcher-2021011711491611239867327455334861, stopped.

I then tried explictly setting flexrs_goal=COST_OPTIMIZED just to see if it made any difference, but this only caused an error:

"The workflow could not be created. Causes: The workflow could not be created due to misconfiguration. The experimental feature flexible_resource_scheduling is not supported for streaming jobs. Contact Google Cloud Support for further help. "

This makes sense. My job is indeed a streaming job and the documentation does indeed state that flexRS is only for batch jobs.

This page explains how to enable Flexible Resource Scheduling (FlexRS) for autoscaled batch pipelines in Dataflow.

https://cloud.google.com/dataflow/docs/guides/flexrs

This doesn't solve my problem though. As I said above if I deploy with flexrs_goal=SPEED_OPTIMIZED then still state=QUEUED for almost ten minutes, yet as far as I know QUEUED is only applicable to flexRS jobs:

Therefore, after you submit a FlexRS job, your job displays an ID and a Status of Queued

https://cloud.google.com/dataflow/docs/guides/flexrs#delayed_scheduling

Hence I'm very confused:

Why is my job getting queued even though it is not a flexRS job?
Why does it take nearly ten minutes for my job to start processing any data?
How can I speed up the time it takes for my job to start processing data so that I can get quicker feedback during development/testing?

UPDATE, I dug a bit more into the logs to find out what was going on during those 9minutes 46 seconds. These two consecutive log messages are 7 minutes 23 seconds apart:

2021-01-17 19:51:03.381 GMT "INFO:apache_beam.runners.portability.stager:Executing command: ['/usr/local/bin/python', '-m', 'pip', 'download', '--dest', '/tmp/dataflow-requirements-cache', '-r', '/dataflow/template/requirements.txt', '--exists-action', 'i', '--no-binary', ':all:']"
2021-01-17 19:58:26.459 GMT "INFO:apache_beam.runners.portability.stager:Downloading source distribution of the SDK from PyPi"

Whatever is going on between those two log records is the main contributor to the long time spent in state=QUEUED. Anyone know what might be the cause?

sudo

如现有答案中所述，您需要apache-beam在您的requirements.txt中提取模块：

RUN pip install -U apache-beam==<version>
RUN pip install -U -r ./requirements.txt

本文收集自互联网，转载请注明来源。

如有侵权，请联系[email protected] 删除。

编辑于2021-04-6

我来说两句

0条评论

登录后参与评论

来自分类Dev

Related 相关文章

文章

我可以使Flex模板作业开始处理数据的时间少于10分钟吗？

我可以使Flex模板作业开始处理数据的时间少于10分钟吗？

Robocopy可以在少于一分钟的时间增量内监视文件吗？

我怎么可以在15分钟添加时间和处理溢出？

Grails-如果少于10分钟，则比较Groovy日期时间

Grails-比较Groovy日期时间（如果少于10分钟）

如何创建自定义验证器以拒绝持续时间少于10分钟的时间

24小时显示时间...如果不是30分钟过去了，我可以只显示小时吗？

@reboot 1分钟后如何开始Cron作业？

@reboot 1分钟后如何开始Cron作业？

在Unix下30分钟开始cron作业

新的Windows 10安装在会话开始前阶段需要约20分钟的时间

在特定时间（Windows Server 2008和批处理文件）之间的每5分钟cron作业

在mysql中的开始时间和结束时间之间获取30分钟的间隔数据

熊猫：如何将10分钟间隔的时间序列转换为数据帧？

以10分钟的间隔连接和绘制具有不同时间的数据

我可以使用模板处理迭代器吗？

我如何在工作时间凌晨45分钟的Google App引擎上执行Cron作业？

VSCODE中的Python处理时间超过30分钟

查找和删除播放时间少于3分钟的音频文件

如何在TFS查询编辑器中构建查询以显示故事处于活动状态的时间少于10分钟

在MongoDB中删除记录的时间早于10分钟？

总计时间到10分钟

从R？的时间向量中分别减去10分钟。

在SQL Server中将15分钟的时间序列转换为10分钟的时间序列

SparkR作业100分钟超时

我可以使用NA来处理数据吗？

iOS应用：20分钟后服务器超时验证。这正常吗，我该如何处理？

从开始和结束时间在PHP中分配20分钟的时间段

从开始和结束时间在PHP中分配20分钟的时间段

我可以从过去15分钟内以bash执行的历史记录中获取命令吗？