Reduce concurrency to 2 Gunicorn workers #178
Conversation
This could also be because of memory use when running the recipe files? In addition, I'd suggest increasing the dyno size on Heroku, too!
I hadn't thought about this. I presume by "recipe runs" you mean when we execute the recipe modules via pangeo-forge-runner?
As a first pass, I'm going to enable log-runtime-metrics to track load and memory for our current dyno: https://devcenter.heroku.com/articles/log-runtime-metrics.
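For reference, per the linked Dev Center article, the feature is enabled with the Heroku CLI and takes effect after a restart (the app name below is a placeholder):

```
heroku labs:enable log-runtime-metrics -a <app-name>
heroku restart -a <app-name>
```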
@andersy005 measuring seems like the right next step!
@yuvipanda, something is going on during the recipe run. Here's the memory profile after a reboot:

2022-11-02T18:30:10.845831+00:00 heroku[web.1]: source=web.1 dyno=heroku.247104119.54df4cd5-f10c-4baa-b412-32d8fa56c24d sample#memory_total=149.32MB sample#memory_rss=148.88MB sample#memory_cache=0.45MB sample#memory_swap=0.00MB sample#memory_pgpgin=69641pages sample#memory_pgpgout=31414pages sample#memory_quota=512.00MB

I then launch a test run for this recipe: pangeo-forge/staged-recipes#215. After kicking off the run:

2022-11-02T18:32:02.363030+00:00 app[web.1]: 2022-11-02 18:32:02,362 DEBUG - orchestrator - Running command: ['pangeo-forge-runner', 'bake', '--repo=https://github.com/norlandrhagen/staged-recipes', '--ref=8308f82cbdede7d8039a72e4137e5d16c800eb89', '--json', '--prune', '--Bake.recipe_id=NWM', '-f=/tmp/tmp985ps8od.json', '--feedstock-subdir=recipes/NWM']
2022-11-02T18:32:14.054996+00:00 heroku[web.1]: source=web.1 dyno=heroku.247104119.54df4cd5-f10c-4baa-b412-32d8fa56c24d sample#load_avg_1m=0.63
2022-11-02T18:32:14.188714+00:00 heroku[web.1]: source=web.1 dyno=heroku.247104119.54df4cd5-f10c-4baa-b412-32d8fa56c24d sample#memory_total=329.25MB sample#memory_rss=326.84MB sample#memory_cache=2.41MB sample#memory_swap=0.00MB sample#memory_pgpgin=122482pages sample#memory_pgpgout=38195pages sample#memory_quota=512.00MB
Notice how memory_total increased from 149.32MB right after the reboot to 329.25MB once the run started. A couple of minutes later the dyno exceeds its memory quota, starts swapping, and the Gunicorn worker is killed and restarted:

2022-11-02T18:34:53.563144+00:00 heroku[web.1]: source=web.1 dyno=heroku.247104119.54df4cd5-f10c-4baa-b412-32d8fa56c24d sample#memory_total=826.02MB sample#memory_rss=511.88MB sample#memory_cache=0.00MB sample#memory_swap=314.14MB sample#memory_pgpgin=255319pages sample#memory_pgpgout=124278pages sample#memory_quota=512.00MB
2022-11-02T18:34:53.720844+00:00 heroku[web.1]: Process running mem=826M(161.3%)
2022-11-02T18:34:53.926451+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2022-11-02T18:34:54.931260+00:00 app[web.1]: [2022-11-02 18:34:54 +0000] [57] [CRITICAL] WORKER TIMEOUT (pid:58)
2022-11-02T18:34:54.964405+00:00 app[web.1]: [2022-11-02 18:34:54 +0000] [57] [WARNING] Worker with pid 58 was terminated due to signal 6
2022-11-02T18:34:55.311602+00:00 app[web.1]: [2022-11-02 18:34:55 +0000] [122] [INFO] Booting worker with pid: 122
2022-11-02T18:34:57.219544+00:00 app[web.1]: [2022-11-02 18:34:57 +0000] [122] [INFO] Started server process [122]
2022-11-02T18:34:57.219620+00:00 app[web.1]: [2022-11-02 18:34:57 +0000] [122] [INFO] Waiting for application startup.
2022-11-02T18:34:57.220136+00:00 app[web.1]: [2022-11-02 18:34:57 +0000] [122] [INFO] Application startup complete.

My suspicion is that the expansion of the recipe meta-information by pangeo-forge-runner is the cause of this spike. I'm not sure whether the S3 crawling in pangeo-forge/staged-recipes#215 could also be a reason this particular recipe is running into these memory issues.
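One way to confirm that the spike comes from the runner subprocess rather than from the web app itself would be to sample the child process's memory while the bake command runs and line it up against Heroku's log-runtime-metrics samples. A minimal sketch, assuming psutil is available (it is not necessarily a dependency of this app) and reusing the command from the orchestrator log above; the polling loop is illustrative, not part of the orchestrator:

```python
import subprocess
import time

import psutil  # assumption: psutil is installed

# Command copied from the orchestrator DEBUG log above.
cmd = [
    "pangeo-forge-runner", "bake",
    "--repo=https://github.com/norlandrhagen/staged-recipes",
    "--ref=8308f82cbdede7d8039a72e4137e5d16c800eb89",
    "--json", "--prune",
    "--Bake.recipe_id=NWM",
    "-f=/tmp/tmp985ps8od.json",
    "--feedstock-subdir=recipes/NWM",
]

proc = subprocess.Popen(cmd)
child = psutil.Process(proc.pid)

# Sample the child's resident set size once per second until it exits,
# so any spike can be lined up against Heroku's metrics samples.
while proc.poll() is None:
    try:
        rss_mb = child.memory_info().rss / 1024**2
    except psutil.NoSuchProcess:
        break
    print(f"pangeo-forge-runner rss={rss_mb:.1f}MB")
    time.sleep(1)

print(f"pangeo-forge-runner exited with code {proc.returncode}")
```

Note the sketch only samples the direct child; any processes the runner spawns itself would need to be included via psutil.Process.children().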
This is an attempt at addressing the recent memory issues by capping the app at 2 Gunicorn workers.
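For context, the worker count is the kind of setting that typically lives in a Gunicorn config file or on the gunicorn command line. A minimal sketch, assuming an ASGI (FastAPI/Uvicorn) app, which matches the "Started server process / Application startup complete" lines in the logs above; the file name and the worker class are illustrative assumptions, not necessarily what this repo uses:

```python
# gunicorn.conf.py -- recent Gunicorn versions read this file from the
# working directory by default.

# Cap concurrency at 2 workers so the combined resident memory of the
# worker processes stays under the dyno's 512 MB quota.
workers = 2

# Assumption: the app is ASGI (FastAPI) and is served through Uvicorn's
# Gunicorn worker class.
worker_class = "uvicorn.workers.UvicornWorker"
```

The equivalent command-line form would be `gunicorn -w 2 -k uvicorn.workers.UvicornWorker app.main:app`, where the module path is a placeholder.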