AWS RDS with Postgres: is the OOM killer configured?

Loc Ann

We are running a load test against an application that hits a Postgres database.

During the test, the error rate suddenly increases. After analysing the platform and application behaviour, we notice that:

  • CPU utilization of the Postgres RDS instance is at 100%
  • Freeable memory drops on that same instance

And in the Postgres logs, we see:

2018-08-21 08:19:48 UTC::@:[XXXXX]:LOG: server process (PID XXXX) was terminated by signal 9: Killed

After investigating and reading the documentation, it appears that one possibility is that the Linux OOM killer ran and killed the process.

But since we're on RDS, we cannot access the system logs (/var/log/messages) to confirm this.

So can somebody:

  • confirm that the OOM killer really runs on AWS RDS for Postgres?
  • give us a way to check this?
  • give us a way to compute the maximum memory used by Postgres based on the number of connections?

I didn't find the answer here:

Fabio Manzano

AWS maintains a page with best practices for their RDS service: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html

In terms of memory allocation, this is their recommendation:

An Amazon RDS performance best practice is to allocate enough RAM so that your working set resides almost completely in memory. To tell if your working set is almost all in memory, check the ReadIOPS metric (using Amazon CloudWatch) while the DB instance is under load. The value of ReadIOPS should be small and stable. If scaling up the DB instance class—to a class with more RAM—results in a dramatic drop in ReadIOPS, your working set was not almost completely in memory. Continue to scale up until ReadIOPS no longer drops dramatically after a scaling operation, or ReadIOPS is reduced to a very small amount. For information on monitoring a DB instance's metrics, see Viewing DB Instance Metrics.

This is also their recommendation for troubleshooting possible OS issues:

Amazon RDS provides metrics in real time for the operating system (OS) that your DB instance runs on. You can view the metrics for your DB instance using the console, or consume the Enhanced Monitoring JSON output from Amazon CloudWatch Logs in a monitoring system of your choice. For more information about Enhanced Monitoring, see Enhanced Monitoring

There are many good recommendations there, including query tuning.
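
On the question of estimating the maximum memory Postgres might use for a given number of connections: there is no exact formula, but a rough worst-case bound can be sketched from a few settings. The sketch below assumes that each backend may use work_mem once per sort or hash operation in its query (workMemNodesPerQuery is a guess you have to supply), plus temp_buffers, on top of shared_buffers and one maintenance_work_mem per autovacuum worker. The class and method names are illustrative and not part of any Postgres or AWS API, and the result ignores per-process overhead and the operating system itself.

```java
/** Rough worst-case memory estimate for a Postgres instance (illustrative only). */
final class PostgresMemoryEstimate {
    static final long MIB = 1024L * 1024L;

    /**
     * All sizes are in bytes. workMemNodesPerQuery is an assumption: how many
     * sort/hash operations a typical query runs at once, each of which may
     * use up to work_mem.
     */
    static long worstCaseBytes(long sharedBuffers,
                               long workMem,
                               int workMemNodesPerQuery,
                               long tempBuffers,
                               int maxConnections,
                               long maintenanceWorkMem,
                               int autovacuumMaxWorkers) {
        // Memory a single connection could claim in the assumed worst case.
        long perConnection = workMem * workMemNodesPerQuery + tempBuffers;
        long allConnections = perConnection * maxConnections;
        // Autovacuum workers fall back to maintenance_work_mem by default.
        long autovacuum = maintenanceWorkMem * autovacuumMaxWorkers;
        return sharedBuffers + allConnections + autovacuum;
    }

    public static void main(String[] args) {
        // Hypothetical settings: 4 GiB shared_buffers, 4 MiB work_mem,
        // 8 MiB temp_buffers, 200 connections, 64 MiB maintenance_work_mem.
        long estimate = worstCaseBytes(4096 * MIB, 4 * MIB, 2, 8 * MIB, 200, 64 * MIB, 3);
        System.out.println("Worst case: " + estimate / MIB + " MiB");
    }
}
```

The input values can be read from the instance with SHOW max_connections, SHOW work_mem and so on. If the estimate is well above the RAM of the instance class, running out of memory under a load test that opens many connections becomes plausible.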

Note that, as a last resort, you could switch to Aurora, which is compatible with PostgreSQL:

Aurora features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64TB per database instance. Aurora delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across three Availability Zones.

EDIT: regarding your specific issue with PostgreSQL, check this Stack Exchange thread. They had a long-lived connection with auto-commit set to false.

We had a long-lived connection with auto-commit set to false:

connection.setAutoCommit(false);

During that time we were doing a lot of small queries and a few queries with a cursor:

statement.setFetchSize(SOME_FETCH_SIZE);

In JDBC you create a connection object, and from that connection you create statements. When you execute a statement you get a result set.

Every one of these objects needs to be closed. If you close a statement, its result set is closed too, and if you close the connection, all of its statements and their result sets are closed.

We were used to short-lived queries, each with a connection of its own, so we never closed statements, assuming the connection would clean everything up once it was closed.

The problem came with this long transaction (~24 hours), which never closed the connection, so the statements were never closed either. Apparently, a statement object holds resources both on the server that runs the code and on the PostgreSQL database.

My best guess as to which resources are left in the database is that they are related to the cursor. The statements that used the cursor were never closed, so the result sets they returned were never closed either. This meant the database didn't free the relevant cursor resources, and since the cursor was over a huge table, it took a lot of RAM.
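
The cleanup described above can be sketched with try-with-resources, which closes each result set and statement as soon as its block ends, even while the connection and its transaction stay open. This is a minimal sketch, not the original poster's code: the class name CursorScan, the method countRows and its parameters are hypothetical. It assumes the caller has already called connection.setAutoCommit(false), which the PostgreSQL JDBC driver requires, together with a positive fetch size, before it fetches rows through a cursor.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Illustrative helper: run a cursor-based query and release its resources promptly. */
final class CursorScan {
    /**
     * Counts the rows returned by sql, fetching fetchSize rows at a time.
     * The connection is left open for the caller, but the statement and
     * result set are always closed before this method returns.
     */
    static long countRows(Connection connection, String sql, int fetchSize) throws SQLException {
        long rows = 0;
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setFetchSize(fetchSize);
            try (ResultSet resultSet = statement.executeQuery()) {
                while (resultSet.next()) {
                    rows++;
                }
            } // resultSet is closed here, even if next() throws
        } // statement is closed here, releasing its resources
        return rows;
    }
}
```

A caller holding a long-lived connection would invoke something like CursorScan.countRows(connection, "SELECT * FROM some_table", 1000) for each query, where some_table is a placeholder. Each call cleans up after itself instead of leaving statements open until the connection is eventually closed.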

Hope it helps!
