rsyslog significantly slows down OpenStack services

Bug #1580200 reported by Ilya Shakhat
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Fuel for OpenStack
Fix Committed
High
Maksim Malchuk
6.1.x
Won't Fix
High
MOS Maintenance
7.0.x
Won't Fix
High
MOS Maintenance
8.0.x
Won't Fix
High
MOS Maintenance
Mitaka
Fix Released
High
Maksim Malchuk

Bug Description

Detailed bug description:
 It's observed that OpenStack API services reply with different delay on different controllers in the cluster. The difference is around 10 times. The investigation shows that the "slow" node becomes faster if rsyslog service is stopped. Please see detailed strace captures in attachment below.

Note: this bug maybe related to https://bugs.launchpad.net/fuel/+bug/1580197

Steps to reproduce:
  The issue appears at 200-node scale after Rally tests execution

Expected results:
  Services work fast

Actual result:
  Service slow 10x times when rsyslog is running

Reproducibility:
 2 out of 2 times

Workaround:
  Disable rsyslogd

Impact:
  Supportability

Description of the environment:
 Operation system: Ubuntu
 Versions of components: MOS 9
 Reference architecture: MOS
 Network model: Neutron 9
 Related projects installed: none

Revision history for this message
Ilya Shakhat (shakhat) wrote :
Download full text (9.5 KiB)

rsyslogd is ON (note that one of calls lasts for 1.5 seconds!):

13:55:30.540264 setsockopt(8, SOL_TCP, TCP_QUICKACK, [1], 4) = 0 <0.000031>
13:55:30.540419 recvfrom(8, "GET / HTTP/1.1\r\nUser-Agent: curl"..., 8192, 0, NULL, NULL) = 77 <0.000010>
13:55:30.540550 getsockname(8, {sa_family=AF_INET, sin_port=htons(9292), sin_addr=inet_addr("192.168.0.88")}, [16]) = 0 <0.000009>
13:55:30.540985 stat("/var/log/glance/api.log", {st_mode=S_IFREG|0644, st_size=2606291, ...}) = 0 <0.000016>
13:55:30.542807 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000017>
13:55:30.542885 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000014>
13:55:30.542987 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000015>
13:55:30.543091 write(3, "2016-05-10 13:55:30.540 7654 DEB"..., 228) = 228 <0.000020>
13:55:30.543186 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000011>
13:55:30.543228 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000009>
13:55:30.543274 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000013>
13:55:30.543344 sendto(7, "<151>May 10 13:55:30 glance-api:"..., 260, MSG_NOSIGNAL, NULL, 0) = 260 <1.587517>
13:55:32.131202 stat("/var/log/glance/api.log", {st_mode=S_IFREG|0644, st_size=2606519, ...}) = 0 <0.000015>
13:55:32.131641 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000018>
13:55:32.131721 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000013>
13:55:32.131791 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000012>
13:55:32.131885 write(3, "2016-05-10 13:55:32.131 7654 DEB"..., 199) = 199 <0.000019>
13:55:32.131979 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000017>
13:55:32.132066 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000017>
13:55:32.132139 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000013>
13:55:32.132218 sendto(7, "<151>May 10 13:55:32 glance-api:"..., 231, MSG_NOSIGNAL, NULL, 0) = 231 <0.000012>
13:55:32.132385 stat("/var/log/glance/api.log", {st_mode=S_IFREG|0644, st_size=2606718, ...}) = 0 <0.000010>
13:55:32.132463 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000012>
13:55:32.132515 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000012>
13:55:32.132564 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000010>
13:55:32.132628 write(3, "2016-05-10 13:55:32.132 7654 DEB"..., 222) = 222 <0.000017>
13:55:32.132728 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000012>
13:55:32.132772 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000009>
13:55:32.132812 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 <0.000009>
13:55:32.132863 sendto(7, "<151>May 10 13:55:32 glance-api:"..., 254, MSG_NOSIGNAL, NULL, 0) = 254 <0.000011>
13:55:32.133379 sendto(8, "HTTP/1.1 300 Multiple Choices\r\nC"..., 768, 0, NULL, 0) = 768 <0.000038>
13:55:32.133468 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ......

Read more...

Changed in fuel:
milestone: none → 9.0
Peter Zhurba (pzhurba)
Changed in fuel:
importance: Undecided → Medium
Dmitry Pyzhov (dpyzhov)
tags: added: area-library
Changed in fuel:
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
status: New → Confirmed
milestone: 9.0 → 10.0
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Maksim Malchuk (mmalchuk)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/315692

Changed in fuel:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/315746

Revision history for this message
Ilya Shakhat (shakhat) wrote :

Raising priority to High and targeting the bug to 9.0.

It's observed that when rsyslog service is running the performance degrades significantly, e.g. for keystone (apache log, the duration is in 8th column in microseconds):

rsyslog is running:
- - - [13/May/2016:12:56:13 +0000] "GET /v3 HTTP/1.1" 200 247 777912 "-" "curl/7.35.0"
- - - [13/May/2016:12:56:14 +0000] "GET /v3 HTTP/1.1" 200 247 318353 "-" "curl/7.35.0"
- - - [13/May/2016:12:56:15 +0000] "GET /v3 HTTP/1.1" 200 247 303048 "-" "curl/7.35.0"

rsyslog is stopped:
- - - [13/May/2016:12:56:20 +0000] "GET /v3 HTTP/1.1" 200 247 8799 "-" "curl/7.35.0"
- - - [13/May/2016:12:56:21 +0000] "GET /v3 HTTP/1.1" 200 247 7909 "-" "curl/7.35.0"
- - - [13/May/2016:12:56:22 +0000] "GET /v3 HTTP/1.1" 200 247 4765 "-" "curl/7.35.0"
- - - [13/May/2016:12:56:22 +0000] "GET /v3 HTTP/1.1" 200 247 4665 "-" "curl/7.35.0"

On every request the service (keystone) writes only 2 messages to log, one on INFO level and the other on DEBUG. However the difference in duration is in two degrees of ten.

Changed in fuel:
importance: Medium → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/315692
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=9967f88e0ad8c68150c05c86af946056f87cc93d
Submitter: Jenkins
Branch: master

commit 9967f88e0ad8c68150c05c86af946056f87cc93d
Author: Maksim Malchuk <email address hidden>
Date: Tue May 10 13:34:49 2016 +0300

    Use same rsyslog config files for remote logging

    The cloud-init should use the same filenames for the rsyslog remote
    logging as used in the fuel-library templates which would be
    overwritten during the deployment by the puppet. This is partially
    fix the problem with duplicate messages sent from slave nodes to the
    Fuel master node rsyslog.

    Change-Id: Ib95db5545889e8ae2b86dcf787c4adb7413d5698
    Partial-Bug: #1579081
    Partial-Bug: #1580200
    Signed-off-by: Maksim Malchuk <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/316299

tags: added: scale
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/mitaka)

Reviewed: https://review.openstack.org/316299
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=3538a7a9583fc2c0db9e4e39c7621e3bbefd2202
Submitter: Jenkins
Branch: stable/mitaka

commit 3538a7a9583fc2c0db9e4e39c7621e3bbefd2202
Author: Maksim Malchuk <email address hidden>
Date: Tue May 10 13:34:49 2016 +0300

    Use same rsyslog config files for remote logging

    The cloud-init should use the same filenames for the rsyslog remote
    logging as used in the fuel-library templates which would be
    overwritten during the deployment by the puppet. This is partially
    fix the problem with duplicate messages sent from slave nodes to the
    Fuel master node rsyslog.

    Change-Id: Ib95db5545889e8ae2b86dcf787c4adb7413d5698
    Partial-Bug: #1579081
    Partial-Bug: #1580200
    Signed-off-by: Maksim Malchuk <email address hidden>
    (cherry picked from commit 9967f88e0ad8c68150c05c86af946056f87cc93d)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/315746
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=50aa72b3d41c6c4be24bea03ea8ad2ba0e2fbb3e
Submitter: Jenkins
Branch: master

commit 50aa72b3d41c6c4be24bea03ea8ad2ba0e2fbb3e
Author: Maksim Malchuk <email address hidden>
Date: Thu May 12 22:12:36 2016 +0300

    Remove duplicate configurations of the rsyslog

    This commit removes the duplicate configuration settings configured
    by the incorrect usage of the rsyslog::client class for local and
    remote logging. Also, this commit adds some comments, notes and
    contains styling fixes.

    Depends-On: Ib95db5545889e8ae2b86dcf787c4adb7413d5698
    Change-Id: Iefef30379a62c8fab123e03c35c302de8e39a4a3
    Closes-Bug: #1579081
    Partial-Bug: #1580200
    Signed-off-by: Maksim Malchuk <email address hidden>

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/316528

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/316528
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=86cf7bdddb683a5cfb1bf0f95416d396fcd3ae17
Submitter: Jenkins
Branch: stable/mitaka

commit 86cf7bdddb683a5cfb1bf0f95416d396fcd3ae17
Author: Maksim Malchuk <email address hidden>
Date: Thu May 12 22:12:36 2016 +0300

    Remove duplicate configurations of the rsyslog

    This commit removes the duplicate configuration settings configured
    by the incorrect usage of the rsyslog::client class for local and
    remote logging. Also, this commit adds some comments, notes and
    contains styling fixes.

    Depends-On: Ib95db5545889e8ae2b86dcf787c4adb7413d5698
    Change-Id: Iefef30379a62c8fab123e03c35c302de8e39a4a3
    Closes-Bug: #1579081
    Partial-Bug: #1580200
    Signed-off-by: Maksim Malchuk <email address hidden>
    (cherry picked from commit 50aa72b3d41c6c4be24bea03ea8ad2ba0e2fbb3e)

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

Please try to reproduce the issue using the UDP logs shipper instead of the TCP (default for fuel)

Revision history for this message
Maksim Malchuk (mmalchuk) wrote :

Still waiting for a scale env to run performance tests.

Revision history for this message
Maksim Malchuk (mmalchuk) wrote :

Problem found on the scale lab, in the keystone configuration.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/319454

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/319458

Revision history for this message
Roman Rufanov (rrufanov) wrote :

adding older versions since this will affect all customers.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/319454
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=6a6d8e08f6084557331788fc46ac1ae4f08d705d
Submitter: Jenkins
Branch: master

commit 6a6d8e08f6084557331788fc46ac1ae4f08d705d
Author: Maksim Malchuk <email address hidden>
Date: Sat May 21 00:00:35 2016 +0300

    Remove duplicate logging for the keystone wsgi

    The keystone wsgi logging already moved to the separate apache logs.

    Change-Id: Ide84e3917d7ddacfd1a0e9dcb26b17dbb0c21f5d
    Partial-Bug: #1580200
    Signed-off-by: Maksim Malchuk <email address hidden>

Revision history for this message
Maksim Malchuk (mmalchuk) wrote :

It is to dangerous to backport these fixes.

Revision history for this message
Alexey Stupnikov (astupnikov) wrote :

As per [1] it is to dangerous to backport.
[1] https://bugs.launchpad.net/fuel/+bug/1580200/comments/18

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/319458
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=4f6d7b2ce06ade6255d0f1e648ddb4aab7bfad16
Submitter: Jenkins
Branch: stable/mitaka

commit 4f6d7b2ce06ade6255d0f1e648ddb4aab7bfad16
Author: Maksim Malchuk <email address hidden>
Date: Sat May 21 00:00:35 2016 +0300

    Remove duplicate logging for the keystone wsgi

    The keystone wsgi logging already moved to the separate apache logs.

    Change-Id: Ide84e3917d7ddacfd1a0e9dcb26b17dbb0c21f5d
    Partial-Bug: #1580200
    Signed-off-by: Maksim Malchuk <email address hidden>

Revision history for this message
Leontiy Istomin (listomin) wrote :

Applied the fix on top of Fuel 9.0-mos-389 - the issue has gone

Changed in fuel:
status: In Progress → Fix Committed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.