docs: add note about PYTHONMALLOC for accurate jemalloc memory tracking #17709

Merged: 6 commits merged into element-hq:develop on Oct 7, 2024


hensg
Contributor

@hensg hensg commented Sep 13, 2024

Added a note in the documentation suggesting that users may set PYTHONMALLOC=malloc when using jemalloc. This allows jemalloc to track memory usage more accurately by bypassing Python's internal small-object allocator (pymalloc), helping to ensure that cache_autotuning functions as expected.

This doc change aims to provide more clarity for users configuring jemalloc with Synapse.

Based on:

# Read the relevant global stats from jemalloc. Note that these may
# not be accurate if python is configured to use its internal small
# object allocator (which is on by default, disable by setting the
# env `PYTHONMALLOC=malloc`).
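
A minimal sketch of what this looks like when launching a Python process under jemalloc. The `build_synapse_env` helper and the example path are hypothetical and not part of Synapse; only the `PYTHONMALLOC` and `LD_PRELOAD` variable names come from this PR and the surrounding discussion.

```python
from typing import Dict, Mapping, Optional


def build_synapse_env(
    base_env: Mapping[str, str],
    jemalloc_path: Optional[str],
) -> Dict[str, str]:
    """Return a copy of base_env prepared for running Python under jemalloc.

    Hypothetical helper for illustration only; not part of Synapse.
    """
    env = dict(base_env)
    if jemalloc_path is not None:
        # Preload jemalloc so it takes over from the C library's malloc.
        env["LD_PRELOAD"] = jemalloc_path
        # Send Python's allocations through malloc (and therefore jemalloc)
        # rather than pymalloc, so jemalloc's stats also cover small objects.
        env["PYTHONMALLOC"] = "malloc"
    return env
```

The resulting mapping could be passed as `env=` to `subprocess.run` when starting the server. `PYTHONMALLOC` is only set when jemalloc is preloaded, since the note in this PR is specifically about jemalloc's statistics.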

@hensg hensg requested a review from a team as a code owner September 13, 2024 20:12
@CLAassistant

CLAassistant commented Sep 13, 2024

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot deployed to PR Documentation Preview September 13, 2024 23:14 Active
@daedric7

Added a note in the documentation suggesting that users may set PYTHONMALLOC=malloc when using jemalloc. This allows jemalloc to track memory usage more accurately by bypassing Python's internal small-object allocator (pymalloc), helping to ensure that cache_autotuning functions as expected.

This doc change aims to provide more clarity for users configuring jemalloc with Synapse.

I tried this setting with Docker Compose. Memory usage did go down, yet became much more erratic. On the following graph, the switching point is around 14h00.

image

@hensg
Contributor Author

hensg commented Sep 16, 2024

@daedric7 could you share the version without malloc/jemalloc and your configurations?

@daedric7

@daedric7 could you share the version without malloc/jemalloc and your configurations?

Almost 24h of memory usage before the change.

image

event_cache_size: 2K
caches:
  global_factor: 2
  expire_caches: true
  cache_entry_ttl: 5m
  sync_response_cache_duration: 60m
  cache_autotuning:
    max_cache_memory_usage: 1024M
    target_cache_memory_usage: 512M
    min_cache_ttl: 30m
docker compose ps | sort -h
NAME                             IMAGE                              COMMAND                  SERVICE                          CREATED       STATUS                 PORTS
av-client-worker-1               matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-client-worker-1               3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-client-worker-2               matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-client-worker-2               3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-client-worker-3               matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-client-worker-3               3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-client-worker-4               matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-client-worker-4               3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-edu-worker                    matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-edu-worker                    3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-event-persister-1             matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-event-persister-1             3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-event-persister-2             matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-event-persister-2             3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-event-persister-3             matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-event-persister-3             3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-event-persister-4             matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-event-persister-4             3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-request-worker-1   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-request-worker-1   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-request-worker-2   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-request-worker-2   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-request-worker-3   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-request-worker-3   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-request-worker-4   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-request-worker-4   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-sender-worker-1    matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-sender-worker-1    3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-sender-worker-2    matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-sender-worker-2    3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-sender-worker-3    matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-sender-worker-3    3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-federation-sender-worker-4    matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-federation-sender-worker-4    3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-inbound-federation-worker-1   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-inbound-federation-worker-1   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-inbound-federation-worker-2   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-inbound-federation-worker-2   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-inbound-federation-worker-3   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-inbound-federation-worker-3   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-inbound-federation-worker-4   matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-inbound-federation-worker-4   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-sliding-worker                matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-sliding-worker                3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-sync-worker-1                 matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-sync-worker-1                 3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-sync-worker-2                 matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-sync-worker-2                 3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-sync-worker-3                 matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-sync-worker-3                 3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
av-tasks-worker                  matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   av-tasks-worker                  3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp
synapse-matrix-synapse-1         matrixdotorg/synapse:v1.115.0rc2   "/data/patches/patch…"   matrix-synapse                   3 hours ago   Up 3 hours (healthy)   8008-8009/tcp, 8448/tcp

What other info can I provide?
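
For reference, a minimal sketch of how the `cache_autotuning` values above relate to each other, based on the description in the Synapse configuration manual: eviction starts once jemalloc-reported memory exceeds `max_cache_memory_usage` and continues until usage falls to `target_cache_memory_usage`, without evicting entries younger than `min_cache_ttl`. The function names below are hypothetical, the suffixes are assumed to be binary multiples, and this is not Synapse's implementation.

```python
# Assumed binary multiples for the size suffixes used in the config above.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a string such as '512M' into a number of bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value)


def eviction_needed(allocated: int, max_usage: int) -> bool:
    """True once reported memory exceeds the configured ceiling."""
    return allocated > max_usage


def bytes_to_free(allocated: int, target_usage: int) -> int:
    """How much memory must be released to get back down to the target."""
    return max(0, allocated - target_usage)


max_usage = parse_size("1024M")
target_usage = parse_size("512M")
reported = parse_size("1100M")  # made-up reading, for illustration only

if eviction_needed(reported, max_usage):
    print("would evict about", bytes_to_free(reported, target_usage), "bytes")
```

If jemalloc's reported figure misses memory held by pymalloc, the `reported` value here would be too low and the ceiling might never appear to be crossed, which is the concern this PR's note addresses.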

@hensg
Contributor Author

hensg commented Sep 16, 2024

Are you using jemalloc, @daedric7? For example: LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

@daedric7

Are you using jemalloc @daedric7 ? e.g.: LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2

As far as I was aware, jemalloc2 is always in use when running under Docker.

Is this not true?

I checked start.py inside my container:

    jemallocpath = "/usr/lib/%s-linux-gnu/libjemalloc.so.2" % (platform.machine(),)

    if os.path.isfile(jemallocpath):
        environ["LD_PRELOAD"] = jemallocpath
    else:
        log("Could not find %s, will not use" % (jemallocpath,))

Can you please confirm whether jemalloc2 is not used by default with Docker? Everything else points to the opposite:

###
### Stage 2: runtime
###

FROM docker.io/library/python:${PYTHON_VERSION}-slim-bookworm

LABEL org.opencontainers.image.url='https://matrix.org/docs/projects/server/synapse'
LABEL org.opencontainers.image.documentation='https://github.com/element-hq/synapse/blob/master/docker/README.md'
LABEL org.opencontainers.image.source='https://github.com/element-hq/synapse.git'
LABEL org.opencontainers.image.licenses='AGPL-3.0-or-later'

RUN \
  --mount=type=cache,target=/var/cache/apt,sharing=locked \
  --mount=type=cache,target=/var/lib/apt,sharing=locked \
  apt-get update -qq && apt-get install -yqq \
  curl \
  gosu \
  libjpeg62-turbo \
  libpq5 \
  libwebp7 \
  xmlsec1 \
  libjemalloc2 \
  libicu72 \
  libssl-dev \
  openssl \
  && rm -rf /var/lib/apt/lists/* 
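
One way to settle the question empirically is to check whether the running process actually has jemalloc mapped. The following is a minimal sketch, assuming a Linux system where `/proc/<pid>/maps` is readable; both function names are hypothetical and not part of Synapse.

```python
from pathlib import Path


def jemalloc_mapped(maps_text: str) -> bool:
    """Return True if any mapping in the given /proc/<pid>/maps text
    refers to a libjemalloc shared object."""
    return any("libjemalloc" in line for line in maps_text.splitlines())


def current_process_uses_jemalloc(maps_path: str = "/proc/self/maps") -> bool:
    """Check the memory maps of a process (by default, this one)."""
    try:
        return jemalloc_mapped(Path(maps_path).read_text())
    except OSError:
        # /proc is unavailable (non-Linux) or the file is not readable.
        return False
```

To inspect a worker that is already running, pass that process's maps file, for example `current_process_uses_jemalloc("/proc/1/maps")` inside the container, where the PID is an assumption about which process is the server.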

@github-actions github-actions bot deployed to PR Documentation Preview September 30, 2024 12:27 Active
Member

@erikjohnston erikjohnston left a comment

Thanks!

Review comment on changelog.d/17709.doc (outdated, resolved)
@erikjohnston erikjohnston enabled auto-merge (squash) October 7, 2024 08:28
@github-actions github-actions bot deployed to PR Documentation Preview October 7, 2024 08:29 Active
@erikjohnston erikjohnston merged commit beb7a95 into element-hq:develop Oct 7, 2024
33 checks passed
@hensg hensg deleted the docs-jemalloc branch October 10, 2024 10:47
yingziwu added a commit to yingziwu/synapse that referenced this pull request Oct 17, 2024
No significant changes since 1.117.0rc1.

- Add config option `redis.password_path`. ([\#17717](element-hq/synapse#17717))

- Fix a rare bug introduced in v1.29.0 where invalidating a user's access token from a worker could raise an error. ([\#17779](element-hq/synapse#17779))
- In the response to `GET /_matrix/client/versions`, set the `unstable_features` flag for [MSC4140](matrix-org/matrix-spec-proposals#4140) to `false` when server configuration disables support for delayed events. ([\#17780](element-hq/synapse#17780))
- Improve input validation and room membership checks in admin redaction API. ([\#17792](element-hq/synapse#17792))

- Clarify the docstring of `test_forget_when_not_left`. ([\#17628](element-hq/synapse#17628))
- Add documentation note about PYTHONMALLOC for accurate jemalloc memory tracking. Contributed by @hensg. ([\#17709](element-hq/synapse#17709))
- Remove spurious "TODO UPDATE ALL THIS" note in the Debian installation docs. ([\#17749](element-hq/synapse#17749))
- Explain how load balancing works for `federation_sender_instances`. ([\#17776](element-hq/synapse#17776))

- Minor performance increase for large accounts using sliding sync. ([\#17751](element-hq/synapse#17751))
- Increase performance of the notifier when there are many syncing users. ([\#17765](element-hq/synapse#17765), [\#17766](element-hq/synapse#17766))
- Fix performance of streams that don't change often. ([\#17767](element-hq/synapse#17767))
- Improve performance of sliding sync connections that do not ask for any rooms. ([\#17768](element-hq/synapse#17768))
- Reduce overhead of sliding sync E2EE loops. ([\#17771](element-hq/synapse#17771))
- Sliding sync minor performance speed up using new table. ([\#17787](element-hq/synapse#17787))
- Sliding sync minor performance improvement by omitting unchanged data from incremental responses. ([\#17788](element-hq/synapse#17788))
- Speed up sliding sync when there are many active subscriptions. ([\#17789](element-hq/synapse#17789))
- Add missing license headers on new source files. ([\#17799](element-hq/synapse#17799))

* Bump phonenumbers from 8.13.45 to 8.13.46. ([\#17773](element-hq/synapse#17773))
* Bump python-multipart from 0.0.10 to 0.0.12. ([\#17772](element-hq/synapse#17772))
* Bump regex from 1.10.6 to 1.11.0. ([\#17770](element-hq/synapse#17770))
* Bump ruff from 0.6.7 to 0.6.8. ([\#17774](element-hq/synapse#17774))