scheduler: how to tune cfq to favor interactive processes


arielf

Problem: scheduler doesn't seem to favor interactive processes:

The setup is a desktop system with automatic cron-scheduled backups from one disk (btrfs) to another (ext4). The backup process mounts the otherwise idle disk (/dev/sda<X>), backs up to it, and finally unmounts it.

Every time the backup process kicks in, the system becomes unusable. The scheduler seems to be failing to do its most basic job of favoring interactive processes over batch ones. While the backup processes run, there's a lot of IO going on and everything else freezes. The keyboard and mouse pointer stop responding. Echo when keys are pressed in any terminal/shell is delayed by several seconds.

As soon as the backup completes, interactive response goes back to normal.

More details on the setup and configs:

The backup process uses rsnapshot (which calls rsync and cp -al) and runs at a lower priority (the backup job is preceded by nice), like so:

nice /usr/bin/rsnapshot -VD -c /etc/my-rsnapshot.conf daily

Running the backup under nice doesn't seem to help. During backups, all interactive processes seem to be starved by the heavy CPU and IO of the rsync and cp processes.
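Note that nice only lowers CPU priority. With CFQ, I/O priority can be lowered separately with ionice; the idle class (-c 3) is served only when no other process is asking for the disk. A minimal sketch, assuming ionice from util-linux is installed (the wrapper name run_low_priority is made up for illustration, and whether this helps with buffered writes in this particular setup is untested):

```shell
#!/bin/sh
# Hypothetical wrapper: run a command at the lowest CPU priority and,
# when ionice is available, in the "idle" I/O scheduling class.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1; then
        # -c 3 selects the idle I/O class, which CFQ honors.
        nice -n 19 ionice -c 3 "$@"
    else
        # Fall back to CPU niceness only.
        nice -n 19 "$@"
    fi
}

# Intended use in the cron job, with the same rsnapshot arguments as above:
#   run_low_priority /usr/bin/rsnapshot -VD -c /etc/my-rsnapshot.conf daily
```

The noop and deadline schedulers ignore I/O classes, so this only has a chance of working while cfq is the active scheduler on the disks involved.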

This is an x86-64 Intel Core i7 system, which should be able to run 8 processes in parallel. Memory is 16GB, and some of it is free. Trimmed-down mount output (with the additional disk mounted) is:

/dev/sdb2 on / type btrfs (rw,relatime,subvol=@,thread_pool=4)
/dev/sdb3 on /home type btrfs (rw,relatime,subvol=@home,thread_pool=4)

/dev/sda2 on /media/idisk/root type ext4 (rw,relatime)
/dev/sda3 on /media/idisk/home type ext4 (rw,relatime)

none on /sys/fs/cgroup type tmpfs (rw)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuset,clone_children)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu,release_agent=/run/cgmanager/agents/cgm-release-agent.cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory,release_agent=/run/cgmanager/agents/cgm-release-agent.memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices,release_agent=/run/cgmanager/agents/cgm-release-agent.devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer,release_agent=/run/cgmanager/agents/cgm-release-agent.freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio,release_agent=/run/cgmanager/agents/cgm-release-agent.blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb,release_agent=/run/cgmanager/agents/cgm-release-agent.hugetlb)

This is on an up-to-date Ubuntu 14.04 LTS system. The I/O scheduler is set by default to Completely Fair Queuing (cfq):

# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
# cat /sys/block/sdb/queue/scheduler
noop deadline [cfq]
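In this sysfs output, the active scheduler is the name in square brackets. As a small sketch (the function name active_scheduler is an assumption, not an existing tool), the bracketed name can be extracted like this:

```shell
#!/bin/sh
# Print the active I/O scheduler: the name inside [brackets] on a line
# such as "noop deadline [cfq]", read from standard input.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage on a real system:
#   active_scheduler < /sys/block/sda/queue/scheduler
#   active_scheduler < /sys/block/sdb/queue/scheduler
```

For the two disks shown above, both invocations would print cfq.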

I was able to find one related question, "scheduler starves processes", which suggests using nice, but I'm already doing this.

Another related question with relevant information is: How do I change the noop scheduler

How can I make keyboard, mouse, and interactive-shells more responsive when the backup is running?

Thanks in advance.

arielf

Just a partial answer: since asking, I have done more research and experiments, which have solved my problem, and I see there are no other responses.

There are known issues/bugs in the Linux kernel schedulers as of early 2016.

The short summary is that under different circumstances, cores remain idle even though there are runnable processes in the process queue.

References:

A switch from btrfs to ext4 can alleviate these issues:

I personally switched back from btrfs to ext4. I/O performance has noticeably improved.

A switch to SSD can further improve I/O performance:

SSDs have dropped significantly in price and improved in reliability. A 2TB Samsung SSD (850 EVO) now costs a little over $600. Switching the system (root and home) to SSD makes the intensive backup activity completely unnoticeable: the system SSD stays snappy while heavy writing goes to a regular ext4-formatted disk on the same system.

Finally: with SSD, the benefit of complex schedulers in the kernel seems to be becoming questionable. I changed my default to noop with no noticeable degradation whatsoever in performance. In fact, with the noop scheduler, I see a reduction in system load, lower CPU frequency scaling numbers, and lower hardware temperatures.

$ cat /sys/block/sda/queue/scheduler
[noop] deadline cfq
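The scheduler can be changed at runtime by writing its name to the same sysfs file. A minimal sketch, assuming root privileges; the helper name set_scheduler and the SYSFS_BLOCK override are made up here so the path can be pointed elsewhere for a dry run:

```shell
#!/bin/sh
# Hypothetical helper: select an I/O scheduler for one block device.
# SYSFS_BLOCK is an illustrative override; it defaults to /sys/block.
set_scheduler() {
    dev=$1
    sched=$2
    file="${SYSFS_BLOCK:-/sys/block}/$dev/queue/scheduler"
    if [ ! -w "$file" ]; then
        echo "cannot write $file (not root, or no such device?)" >&2
        return 1
    fi
    # The kernel switches schedulers when a valid name is written here.
    printf '%s\n' "$sched" > "$file"
}

# Usage (as root):
#   set_scheduler sda noop
```

A change made this way does not survive a reboot, so it has to be reapplied at boot (or set through the kernel command line) to become the default.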

$ cat /proc/cpuinfo | grep  Hz
model name      : Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz
cpu MHz         : 836.308
model name      : Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz
cpu MHz         : 990.253
... similar low actual frequency scaling for all cores ...
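To compare frequency scaling between schedulers with a single number rather than by eyeballing each core, the "cpu MHz" lines can be averaged. This is only a sketch; the function name avg_cpu_mhz is an assumption:

```shell
#!/bin/sh
# Average the "cpu MHz" values of /proc/cpuinfo-style input on stdin.
avg_cpu_mhz() {
    awk -F: '/^cpu MHz/ { sum += $2; n++ }
             END { if (n) printf "%.1f\n", sum / n }'
}

# Usage:
#   avg_cpu_mhz < /proc/cpuinfo
```

For the two cores shown above (836.308 and 990.253 MHz), this prints 913.3.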
