Hello titilers,
I am writing about a possible issue related to a custom titiler deployment on AKS.
I have been using an older version of titiler successfully in AKS for over a year and never had any issues. About two weeks ago I upgraded to 0.17 (some small code changes were also made to the server).
I am deploying this custom server to an 8 CPU / 32 GB RAM machine in AKS, and it consistently gets stuck after running for an arbitrary amount of time (30 min). The cluster uses the default AKS nginx ingress load balancer. Here is the full definition in YAML:
apiVersion: v1
kind: Namespace
metadata:
  name: titiler
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: titiler
  namespace: titiler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: titiler
  template:
    metadata:
      labels:
        app: titiler
    spec:
      nodeSelector:
        type: "manual"
      containers:
        - name: titiler
          #image: ghcr.io/undp-data/cogserver:v0.0.3
          image: undpgeohub.azurecr.io/cogserver-debug
          imagePullPolicy: Always
          resources:
            limits:
              memory: "9G"
              cpu: "3000m"
          env:
          # - name: WEB_CONCURRENCY
          #   value: "1"
          # - name: MAX_WORKERS
          #   value: "1"
          # - name: WEB_CONCURRENCY
          #   value: "1"
          # - name: RIO_TILER_MAX_THREADS
          #   value: "1"
          # - name: API_CORS_ORIGIN
          #   value: "*"
---
apiVersion: v1
kind: Service
metadata:
  name: titiler
  namespace: titiler
  labels:
    app: titiler
spec:
  ports:
    - name: web
      port: 80
      targetPort: 80
  selector:
    app: titiler
  type: ClusterIP # LoadBalancer # NodePort ### load balancer will make the service accessible on the internet using an external ip but no https
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: titiler-ssl-tls-ingress
  namespace: titiler
  annotations:
    kubernetes.io/ingress.class: addon-http-application-routing
    cert-manager.io/cluster-issuer: zerossl
spec:
  tls:
    - hosts:
        - titiler.undpgeohub.org # update IP address here
      secretName: titiler-cert
  rules:
    - host: titiler.undpgeohub.org # update IP address here
      http:
        paths:
          - path: "/"
            pathType: Prefix
            backend:
              service:
                name: titiler
                port:
                  number: 80
And this is the Dockerfile:
FROM ghcr.io/osgeo/gdal:ubuntu-small-latest as base
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       libffi-dev python3-pip
RUN python3 -m pip install pipenv
WORKDIR /opt/server
RUN export PYTHON_VERSION="$(python3 --version | cut -d ' ' -f 2)" && pipenv --python ${PYTHON_VERSION}
RUN pipenv run pip install -U pip
#RUN pipenv run pip install uvicorn titiler asyncpg postgis --no-cache-dir --upgrade
COPY requirements.txt requirements.txt
RUN pipenv run pip install -r requirements.txt
COPY src/cogserver cogserver
ENV HOST=0.0.0.0
ENV PORT=80
ENV WEB_CONCURRENCY=1
ENV CPL_TMPDIR=/tmp
ENV GDAL_CACHEMAX=75%
ENV GDAL_INGESTED_BYTES_AT_OPEN=32768
ENV GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
ENV GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
ENV GDAL_HTTP_MULTIPLEX=YES
ENV GDAL_HTTP_VERSION=2
ENV PYTHONWARNINGS=ignore
ENV VSI_CACHE=FALSE
#ENV RIO_TILER_MAX_THREADS=2
#CMD pipenv run uvicorn cogserver:app --host ${HOST} --port ${PORT} --log-config cogserver/logconf.yaml
CMD pipenv run uvicorn cogserver:app --host ${HOST} --port ${PORT} --log-level trace
I set WEB_CONCURRENCY to 1 to force a single worker. The load balancer reports a 504 Gateway Timeout, and I can never find anything in the pod logs. It looks like the service just gets stuck, as if it were waiting on a thread or something, and becomes unresponsive.
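Since the pod logs stay empty while the service hangs, one way to see where the process is stuck is to have it print the Python stack of every thread on demand. The following is a minimal sketch using the standard-library `faulthandler` module; the function name `enable_stack_dumps` and the idea of calling it from the top of the `cogserver` module are assumptions for illustration, not part of titiler or of the deployed server.

```python
import faulthandler
import signal
import sys


def enable_stack_dumps(stream=sys.stderr, watchdog_seconds=None):
    """Install hooks that write the Python stack of every thread to `stream`.

    `stream` must be a real file object (it needs a file descriptor).
    """
    # Dump tracebacks on fatal errors such as a segfault in a C extension.
    faulthandler.enable(file=stream, all_threads=True)

    # On Unix, `kill -USR1 <pid>` then prints all thread stacks on demand,
    # even when the event loop itself is blocked.
    if hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=stream, all_threads=True)

    # Optional watchdog: dump the stacks every `watchdog_seconds` seconds,
    # so the last dump before a hang shows what each thread was doing.
    if watchdog_seconds:
        faulthandler.dump_traceback_later(
            watchdog_seconds, repeat=True, file=stream
        )
```

With this called once at import time, sending `SIGUSR1` to the uvicorn process from inside the pod (for example through `kubectl exec`) should write the stacks to stderr, where `kubectl logs` can pick them up. Because the Dockerfile uses the shell form of `CMD`, the uvicorn process is probably not PID 1, so its PID has to be looked up first.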
I tried forcing RIO_TILER_MAX_THREADS to 1 and various other options, but I just cannot get it running properly.
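When experimenting with settings such as these, it can help to confirm what the running process actually sees, because a misspelled variable name is silently ignored. Below is a small sketch that reports the effective values and flags names that look like typos of the expected ones. The helper names and the `EXPECTED_VARS` list are hypothetical; the list only collects variables that appear in this post and Dockerfile.

```python
import difflib
import os

# Variables taken from the post and the Dockerfile; extend as needed.
EXPECTED_VARS = (
    "WEB_CONCURRENCY",
    "RIO_TILER_MAX_THREADS",
    "GDAL_CACHEMAX",
    "GDAL_HTTP_MULTIPLEX",
    "GDAL_HTTP_VERSION",
    "VSI_CACHE",
)


def effective_settings(environ=None, names=EXPECTED_VARS):
    """Return the value each expected variable has, or None if it is unset."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in names}


def near_misses(environ=None, names=EXPECTED_VARS, cutoff=0.85):
    """Map set-but-unexpected variable names to the expected name they resemble."""
    environ = os.environ if environ is None else environ
    found = {}
    for key in environ:
        if key in names:
            continue  # exact matches are fine
        match = difflib.get_close_matches(key, names, n=1, cutoff=cutoff)
        if match:
            found[key] = match[0]
    return found
```

Logging the result of both functions once at startup would show whether the values set in the Dockerfile or the Deployment `env` section reach the server process.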
I should mention that I also run a dev server with an identical configuration, deployed in a different namespace (titiler-dev) on the same node.
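Having two deployments on the same node makes it possible to compare them and to separate an ingress problem from a stuck pod. The sketch below polls a set of URLs with a timeout and records the status and latency of each request. The function names `probe` and `watch` are made up for this example, and any health path used with them (such as `/healthz`) is an assumption about the custom server.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Issue one GET and return (status, seconds).

    `status` is the HTTP status code, or None when the request timed out
    or the connection failed.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # e.g. a 504 returned by the ingress
    except OSError:
        status = None  # timeout, refused connection, DNS failure, ...
    return status, time.monotonic() - start


def watch(urls, interval=30.0, rounds=None, timeout=10.0):
    """Probe every URL in `urls` (name -> url) each `interval` seconds."""
    done = 0
    while rounds is None or done < rounds:
        for name, url in urls.items():
            status, elapsed = probe(url, timeout)
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp} {name}: status={status} elapsed={elapsed:.2f}s", flush=True)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Run from a pod inside the cluster, `watch` could be given the public host (`https://titiler.undpgeohub.org/...`) alongside the in-cluster service address (by default `http://titiler.titiler.svc.cluster.local/...` for the Service defined above). If the service address stops answering at the same moment the ingress starts returning 504, the pod itself is stuck; if only the public host fails, the ingress is the more likely culprit.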
I did not create an issue because I believe this is related to my deployment.
SSL/TLS is managed by cert-manager (Let's Encrypt and ZeroSSL). I also did not find any issues in the cert-manager pod logs.
I will be grateful for any hints or ideas.