Loses connection to LDAP server after a while and needs to restart #18

Open

ephracis opened this issue Nov 19, 2018 · 26 comments
@ephracis
Contributor

After kube-ldap has been running for a while (a day or two, I think) it starts to time out on requests to /auth.

If I delete the pod and wait for a new one to be created, it works fine again. Nothing shows up in the log even though I have debug logging enabled.

@juckerf
Contributor

juckerf commented Nov 21, 2018

Which version of kube-ldap are you running?
A bug with similar symptoms was fixed in v1.2.1 (via #15; also see issue #13).

@ephracis
Contributor Author

I am running 1.2.1, but I had to make some minor adjustments (support LDAP without TLS, and handle that groups in our AD are not always lists but sometimes a single name). Other than that it is unmodified.
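
To be concrete about the group change, the normalization is roughly the following (a minimal sketch of the idea, not my actual patch; the names are just illustrative):

// AD returns memberOf as a plain string when the user is in exactly one
// group, and as an array when in several; normalize before mapping groups.
function normalizeGroups(memberOf) {
  if (!memberOf) {
    return [];
  }
  return Array.isArray(memberOf) ? memberOf : [memberOf];
}

// normalizeGroups('CN=admins,DC=example,DC=com')
//   => ['CN=admins,DC=example,DC=com']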

I have tried to tinker with the reconnect parameters, but I can't keep it stable for more than a couple of days.

@juckerf
Contributor

juckerf commented Nov 30, 2018

Hm, sounds strange.
On the dev branch (588a90b) I added logging to the LDAP connection. You could merge this into your custom version, and then we should at least have some debug output.

@ephracis
Contributor Author

OK, so I have built an image from the dev branch and deployed it in a lab cluster. I had to wait a day before the issue started, but now I get a timeout whenever I try to access /auth with proper LDAP credentials. Here are the logs:

$ kubectl -n kube-system logs kube-ldap-7dd8d9d849-cg49t | grep -v "GET /healthz"
2018-12-15T21:45:26.227Z - info: kube-ldap listening on https port 8081
2018-12-15T22:00:34.942Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:00:34 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=33583104, heapTotal=16072704, heapUsed=13206688, external=17612, loadavg=[0.01220703125, 0.1123046875, 0.22509765625], uptime=3136, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T22:03:35.037Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:03:35 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=33914880, heapTotal=16596992, heapUsed=13495992, external=17636, loadavg=[0.0830078125, 0.0791015625, 0.1953125], uptime=3316, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T22:06:35.125Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:06:35 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=33947648, heapTotal=16596992, heapUsed=13639120, external=17636, loadavg=[0.05419921875, 0.10546875, 0.1923828125], uptime=3496, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T22:08:45.327Z - info: ::ffff:172.16.13.1 - test.user00001 [15/Dec/2018:22:08:45 +0000] "GET /auth HTTP/1.1" 200 261 "-" "curl/7.29.0"
2018-12-15T22:08:48.701Z - info: ::ffff:172.16.87.0 - - [15/Dec/2018:22:08:48 +0000] "POST /token?timeout=30s HTTP/1.1" 200 242 "-" "Go-http-client/1.1"
2018-12-15T22:08:49.127Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2018-12-15T22:08:49.128Z - info: ::ffff:172.16.87.0 - - [15/Dec/2018:22:08:49 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"
2018-12-15T22:17:40.528Z - info: ::ffff:172.16.13.1 - test.user00001 [15/Dec/2018:22:17:40 +0000] "GET /auth HTTP/1.1" 200 261 "-" "curl/7.29.0"
2018-12-15T22:17:42.569Z - info: ::ffff:172.16.87.0 - - [15/Dec/2018:22:17:42 +0000] "POST /token?timeout=30s HTTP/1.1" 200 242 "-" "Go-http-client/1.1"
2018-12-15T22:17:43.124Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2018-12-15T22:17:43.125Z - info: ::ffff:172.16.87.0 - - [15/Dec/2018:22:17:43 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"
2018-12-15T22:33:35.998Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:33:35 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=35393536, heapTotal=18169856, heapUsed=15226240, external=25828, loadavg=[0.0029296875, 0.037109375, 0.09765625], uptime=5117, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T22:36:33.594Z - info: ::ffff:172.16.13.1 - test.user00001 [15/Dec/2018:22:36:33 +0000] "GET /auth HTTP/1.1" 200 261 "-" "curl/7.29.0"
2018-12-15T22:36:35.797Z - info: ::ffff:172.16.87.0 - - [15/Dec/2018:22:36:35 +0000] "POST /token?timeout=30s HTTP/1.1" 200 242 "-" "Go-http-client/1.1"
2018-12-15T22:36:36.126Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2018-12-15T22:36:36.128Z - info: ::ffff:172.16.87.0 - - [15/Dec/2018:22:36:36 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"
2018-12-15T22:51:36.578Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:51:36 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=40943616, heapTotal=19218432, heapUsed=16532488, external=34020, loadavg=[0.09423828125, 0.27197265625, 0.20166015625], uptime=6197, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T22:54:36.667Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:54:36 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=42024960, heapTotal=19218432, heapUsed=16747936, external=34020, loadavg=[0.02783203125, 0.1962890625, 0.193359375], uptime=6378, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T22:57:36.767Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 22:57:36 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=43376640, heapTotal=19742720, heapUsed=16988912, external=34020, loadavg=[0.0029296875, 0.107421875, 0.158203125], uptime=6558, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:00:36.860Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:00:36 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=44728320, heapTotal=19742720, heapUsed=17181928, external=34020, loadavg=[0.0029296875, 0.0595703125, 0.12939453125], uptime=6738, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:03:36.959Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:03:36 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=45977600, heapTotal=19742720, heapUsed=17408704, external=34020, loadavg=[0.26318359375, 0.1533203125, 0.15478515625], uptime=6918, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:06:37.054Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:06:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=47521792, heapTotal=20791296, heapUsed=17619456, external=34020, loadavg=[0.01318359375, 0.08447265625, 0.1279296875], uptime=7098, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:09:37.150Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:09:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=47726592, heapTotal=20791296, heapUsed=17820128, external=34020, loadavg=[0.01806640625, 0.0673828125, 0.12109375], uptime=7278, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:12:37.247Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:12:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=47685632, heapTotal=20791296, heapUsed=18046088, external=34020, loadavg=[0.013671875, 0.048828125, 0.10888671875], uptime=7458, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:15:37.343Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:15:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=47927296, heapTotal=20791296, heapUsed=18267256, external=34020, loadavg=[0.02392578125, 0.05078125, 0.10205078125], uptime=7638, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:18:37.437Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:18:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=48046080, heapTotal=20791296, heapUsed=18475680, external=34020, loadavg=[0.0458984375, 0.04345703125, 0.08984375], uptime=7818, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:21:37.600Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:21:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=48271360, heapTotal=21315584, heapUsed=18670112, external=34020, loadavg=[0.01513671875, 0.03759765625, 0.07763671875], uptime=7998, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:24:37.692Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:24:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=48287744, heapTotal=21839872, heapUsed=18880456, external=34020, loadavg=[0.0029296875, 0.02001953125, 0.06005859375], uptime=8179, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:27:37.790Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:27:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=48640000, heapTotal=21839872, heapUsed=19096240, external=34020, loadavg=[0.00927734375, 0.03125, 0.05322265625], uptime=8359, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:30:37.885Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:30:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=48795648, heapTotal=21839872, heapUsed=19297080, external=34020, loadavg=[0.05419921875, 0.0419921875, 0.0498046875], uptime=8539, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:33:37.982Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:33:37 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=48865280, heapTotal=22364160, heapUsed=19530992, external=34020, loadavg=[0.47119140625, 0.13671875, 0.08154296875], uptime=8719, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-15T23:36:38.079Z - error: uncaughtException: read ECONNRESET date=Sat Dec 15 2018 23:36:38 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=49549312, heapTotal=22364160, heapUsed=19770808, external=34020, loadavg=[0.07470703125, 0.12109375, 0.08544921875], uptime=8899, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-16T09:18:47.990Z - info: request timeout (client interrupt)
2018-12-16T09:18:47.992Z - info: ::ffff:172.16.13.1 - test.user00001 [16/Dec/2018:09:18:47 +0000] "GET /auth HTTP/1.1" 401 12 "-" "curl/7.29.0"

The only modification I have made is to disable the LDAPS requirement, since I do not have a proper CA for Active Directory in the lab.

@juckerf
Contributor

juckerf commented Dec 17, 2018

Please increase the log level to debug (set the environment variable LOGLEVEL in your deployment to debug), try to reproduce the issue again and then post the (more verbose) logs. Sorry, I forgot to mention that before.
Caution: because of the LDAP library used by kube-ldap, passwords get logged in cleartext at debug log level (ldapjs/node-ldapjs#469).

@ephracis
Contributor Author

I believe I have that.

$ kubectl -n kube-system get deploy kube-ldap -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubectl.kubernetes.io/last-applied-configuration: ... snip ...
  creationTimestamp: "2018-12-15T21:45:19Z"
  generation: 1
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    basalt.se/zone: Infra
    k8s-app: kube-ldap
    kubernetes.io/cluster-service: "true"
  name: kube-ldap
  namespace: kube-system
  resourceVersion: "493"
  selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/kube-ldap
  uid: b79241fd-00b2-11e9-92b9-fa163ead1760
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-ldap
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        addonmanager.kubernetes.io/mode: Reconcile
        k8s-app: kube-ldap
    spec:
      containers:
      - env:
        - name: LDAP_URI
          value: ldap://10.0.0.10:389
        - name: LDAP_BINDDN
          valueFrom:
            secretKeyRef:
              key: binddn
              name: kube-ldap-ldap-bind-credentials
        - name: LDAP_BINDPW
          valueFrom:
            secretKeyRef:
              key: bindpw
              name: kube-ldap-ldap-bind-credentials
        - name: LDAP_BASEDN
          value: ou=users,dc=example,dc=com
        - name: LDAP_FILTER
          value: (sAMAccountName=%s)
        - name: JWT_KEY
          valueFrom:
            secretKeyRef:
              key: key
              name: kube-ldap-jwt-key
        - name: JWT_TOKEN_LIFETIME
          value: "28800"
        - name: TLS_CERT_PATH
          value: /etc/ssl/kube-ldap/tls.crt
        - name: TLS_KEY_PATH
          value: /etc/ssl/kube-ldap/tls.key
        - name: MAPPING_USERNAME
          value: sAMAccountName
        - name: MAPPING_UID
          value: sAMAccountName
        - name: MAPPING_GROUPS
          value: memberOf
        - name: MAPPING_EXTRAFIELDS
        - name: LOGLEVEL
          value: debug
        - name: LDAP_TIMEOUT
          value: "60"
        - name: LDAP_RECONN_INIT_DELAY
          value: "10"
        - name: LDAP_RECONN_MAX_DELAY
          value: "60000"
        - name: LDAP_RECONN_FAIL_AFTER
          value: "1000"
        image: docker.basalt.se:5001/containers/kubernetes/kube_ldap:dev-1.2.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8081
            scheme: HTTPS
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: kube-ldap
        ports:
        - containerPort: 8081
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/ssl/kube-ldap
          name: kube-ldap-tls
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: kube-ldap-tls
        secret:
          defaultMode: 420
          secretName: kube-ldap-tls
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2018-12-15T21:45:24Z"
    lastUpdateTime: "2018-12-15T21:45:24Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2018-12-15T21:45:19Z"
    lastUpdateTime: "2018-12-15T21:45:24Z"
    message: ReplicaSet "kube-ldap-7dd8d9d849" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

@juckerf
Contributor

juckerf commented Dec 18, 2018

I believe I have that.

[...]

Yep, your deployment seems OK.
Are you sure you have pulled the latest changes from the dev branch before building the image? (latest commit was 0175274 on Nov 30th)

If everything is correct, something like the following should be logged by the pod when you try to authenticate:

2018-12-18T13:58:48.843Z - info: kube-ldap listening on http port 8080
2018-12-18T13:58:48.942Z - debug: connected after 1 attempt(s)
2018-12-18T13:58:49.035Z - debug: sending request {"messageID":1,"protocolOp":"ExtendedRequest","requestName":"1.3.6.1.4.1.1466.20037","requestValue":null,"controls":[]}
2018-12-18T13:58:49.038Z - debug: data event: <Buffer 30 0c 02 01 01 78 07 0a 01 00 04 00 04 00>
2018-12-18T13:58:49.040Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-18T13:58:49.041Z - debug: Parsing done: {"messageID":1,"protocolOp":"ExtendedResponse","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-18T13:58:49.041Z - debug: response received messageID=1, protocolOp=ExtendedResponse, status=0, matchedDN=, errorMessage=, referrals=[], responseName=undefined, responseValue=undefined, controls=[]
2018-12-18T13:58:49.841Z - info: ::ffff:192.168.100.186 - - [18/Dec/2018:13:58:49 +0000] "POST /token HTTP/1.1" 200 904 "-" "Go-http-client/2.0"

2018-12-18T13:59:02.890Z - info: ::ffff:10.51.6.1 - - [18/Dec/2018:13:59:02 +0000] "GET /healthz HTTP/1.1" 200 2 "-" "kube-probe/1.8"

2018-12-18T13:59:12.449Z - info: ::ffff:10.51.6.1 - - [18/Dec/2018:13:59:12 +0000] "GET /auth HTTP/1.1" 401 12 "-" "curl/7.47.0"

2018-12-18T13:59:12.889Z - info: ::ffff:10.51.6.1 - - [18/Dec/2018:13:59:12 +0000] "GET /healthz HTTP/1.1" 200 2 "-" "kube-probe/1.8"

2018-12-18T13:59:17.824Z - debug: sending request {"messageID":2,"protocolOp":"BindRequest","version":3,"name":"uid=bind,ou=people,dc=example,dc=com","authenticationType":"Simple","credentials":"secret","controls":[]}
2018-12-18T13:59:17.825Z - debug: data event: <Buffer 30 0c 02 01 02 61 07 0a 01 00 04 00 04 00>
2018-12-18T13:59:17.826Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-18T13:59:17.826Z - debug: Parsing done: {"messageID":2,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-18T13:59:17.826Z - debug: response received messageID=2, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-18T13:59:17.839Z - debug: sending request {"messageID":3,"protocolOp":"SearchRequest","baseObject":{"rdns":[{"attrs":{"dc":{"value":"example","name":"dc","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"com","name":"dc","order":0}},"spLead":0,"spTrail":0}],"_format":{}},"scope":"sub","derefAliases":0,"sizeLimit":0,"timeLimit":10,"typesOnly":false,"filter":"(&(uid=test)(memberOf=cn=k8s,ou=groups,dc=example,dc=com))","attributes":[],"controls":[]}
2018-12-18T13:59:17.842Z - debug: data event: <Buffer 30 0c 02 01 03 65 07 0a 01 00 04 00 04 00>
2018-12-18T13:59:17.842Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-18T13:59:17.842Z - debug: Parsing done: {"messageID":3,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-18T13:59:17.842Z - debug: response received messageID=3, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-18T13:59:17.843Z - info: no object found with filter [(&(uid=test)(memberOf=cn=k8s,ou=groups,dc=example,dc=com))]
2018-12-18T13:59:17.844Z - info: ::ffff:10.51.6.1 - test [18/Dec/2018:13:59:17 +0000] "GET /auth HTTP/1.1" 401 12 "-" "curl/7.47.0"

@ephracis
Contributor Author

ephracis commented Dec 19, 2018

Sorry, was running on master instead of dev.

@ephracis
Contributor Author

The lab has been running for a few days and now it times out when I try to access /auth with proper creds. The logs from that attempt appear at the end of the following output:

2018-12-19T19:33:40.997Z - info: kube-ldap listening on https port 8081
2018-12-19T19:33:41.015Z - debug: connected after 1 attempt(s)
2018-12-19T19:33:41.017Z - debug: Setting timeout to 60000
2018-12-19T19:33:41.017Z - debug: sending request {"messageID":1,"protocolOp":"ExtendedRequest","requestName":"1.3.6.1.4.1.1466.20037","requestValue":null,"controls":[]}
2018-12-19T19:33:41.037Z - debug: data event: <Buffer 30 84 00 00 00 7d 02 01 01 78 84 00 00 00 74 0a 01 34 04 00 04 55 30 30 30 30 30 30 30 30 3a 20 4c 64 61 70 45 72 72 3a 20 44 53 49 44 2d 30 43 30 39 ... >
2018-12-19T19:33:41.040Z - debug: parse: data=<Buffer 0a 01 34 04 00 04 55 30 30 30 30 30 30 30 30 3a 20 4c 64 61 70 45 72 72 3a 20 44 53 49 44 2d 30 43 30 39 31 32 36 39 2c 20 63 6f 6d 6d 65 6e 74 3a 20 ... >
2018-12-19T19:33:41.046Z - debug: Parsing done: {"messageID":1,"protocolOp":"ExtendedResponse","status":52,"matchedDN":"","errorMessage":"00000000: LdapErr: DSID-0C091269, comment: Error initializing SSL/TLS, data 0, v3839\u0000","referrals":[],"responseName":"1.3.6.1.4.1.1466.20037","controls":[]}
2018-12-19T19:33:41.048Z - debug: response received messageID=1, protocolOp=ExtendedResponse, status=52, matchedDN=, errorMessage=00000000: LdapErr: DSID-0C091269, comment: Error initializing SSL/TLS, data 0, v3839^@, referrals=[], responseName=1.3.6.1.4.1.1466.20037, responseValue=undefined, controls=[]
2018-12-19T19:36:37.147Z - debug: Setting timeout to 60000
2018-12-19T19:36:37.149Z - debug: sending request {"messageID":2,"protocolOp":"BindRequest","version":3,"name":"[email protected]","authenticationType":"Simple","credentials":"redacted","controls":[]}
2018-12-19T19:36:37.172Z - debug: data event: <Buffer 30 84 00 00 00 10 02 01 02 61 84 00 00 00 07 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.174Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.174Z - debug: Parsing done: {"messageID":2,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-19T19:36:37.175Z - debug: response received messageID=2, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-19T19:36:37.185Z - debug: Setting timeout to 60000
2018-12-19T19:36:37.186Z - debug: sending request {"messageID":3,"protocolOp":"SearchRequest","baseObject":{"rdns":[{"attrs":{"ou":{"value":"users","name":"ou","order":0}},"spLead":0,"spTrail":0},{"attrs":{"ou":{"value":"3","name":"OU","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"example","name":"dc","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"local","name":"dc","order":0}},"spLead":0,"spTrail":0}],"_format":{}},"scope":"sub","derefAliases":0,"sizeLimit":0,"timeLimit":10,"typesOnly":false,"filter":"(sAMAccountName=test.user00001)","attributes":[],"controls":[]}
2018-12-19T19:36:37.213Z - debug: data event: <Buffer 30 84 00 00 05 6d 02 01 03 64 84 00 00 05 64 04 38 43 4e 3d 54 65 73 74 20 55 73 65 72 20 28 30 30 30 30 31 29 2c 4f 55 3d 55 73 65 72 73 2c 4f 55 3d ... >
2018-12-19T19:36:37.215Z - debug: parse: data=<Buffer 04 38 43 4e 3d 54 65 73 74 20 55 73 65 72 20 28 30 30 30 30 31 29 2c 4f 55 3d 55 73 65 72 73 2c 4f 55 3d 33 2c 44 43 3d 74 72 65 6b 76 61 72 74 73 2c ... >
2018-12-19T19:36:37.217Z - debug: Parsing done: {"messageID":3,"protocolOp":"SearchEntry","objectName":"CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com","attributes":[{"type":"objectClass","vals":["top","person","organizationalPerson","user"]},{"type":"cn","vals":["Test User (00001)"]},{"type":"distinguishedName","vals":["CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com"]},{"type":"instanceType","vals":["4"]},{"type":"whenCreated","vals":["20180424104025.0Z"]},{"type":"whenChanged","vals":["20181108154100.0Z"]},{"type":"uSNCreated","vals":["37420"]},{"type":"memberOf","vals":["CN=role - audit & analytics,CN=Users,DC=example,DC=com"]},{"type":"uSNChanged","vals":["119390"]},{"type":"name","vals":["Test User (00001)"]},{"type":"objectGUID","vals":["�\u0001\u00072$��M��� D���"]},{"type":"userAccountControl","vals":["66048"]},{"type":"badPwdCount","vals":["0"]},{"type":"codePage","vals":["0"]},{"type":"countryCode","vals":["0"]},{"type":"badPasswordTime","vals":["131861652469554912"]},{"type":"lastLogoff","vals":["0"]},{"type":"lastLogon","vals":["131861652603461498"]},{"type":"pwdLastSet","vals":["131690400258569879"]},{"type":"primaryGroupID","vals":["513"]},{"type":"objectSid","vals":["\u0001\u0005\u0000\u0000\u0000\u0000\u0000\u0005\u0015\u0000\u0000\u0000���\u0002\"��\u0005�[�W\u0004\u0000\u0000"]},{"type":"accountExpires","vals":["9223372036854775807"]},{"type":"logonCount","vals":["0"]},{"type":"sAMAccountName","vals":["test.user00001"]},{"type":"sAMAccountType","vals":["805306368"]},{"type":"userPrincipalName","vals":["[email protected]"]},{"type":"objectCategory","vals":["CN=Person,CN=Schema,CN=Configuration,DC=example,DC=com"]},{"type":"dSCorePropagationData","vals":["16010101000000.0Z"]},{"type":"lastLogonTimestamp","vals":["131861652603461498"]}],"controls":[]}
2018-12-19T19:36:37.218Z - debug: response received messageID=3, protocolOp=SearchEntry, objectName=CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com, attributes=[type=objectClass, vals=[top, person, organizationalPerson, user], type=cn, vals=[Test User (00001)], type=distinguishedName, vals=[CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com], type=instanceType, vals=[4], type=whenCreated, vals=[20180424104025.0Z], type=whenChanged, vals=[20181108154100.0Z], type=uSNCreated, vals=[37420], type=memberOf, vals=[CN=role - audit & analytics,CN=Users,DC=example,DC=com], type=uSNChanged, vals=[119390], type=name, vals=[Test User (00001)], type=objectGUID, vals=[�^A^G2$��M��� D���], type=userAccountControl, vals=[66048], type=badPwdCount, vals=[0], type=codePage, vals=[0], type=countryCode, vals=[0], type=badPasswordTime, vals=[131861652469554912], type=lastLogoff, vals=[0], type=lastLogon, vals=[131861652603461498], type=pwdLastSet, vals=[131690400258569879], type=primaryGroupID, vals=[513], type=objectSid, vals=[^A^E^@^@^@^@^@^E^U^@^@^@���^B"��^E�[�W^D^@^@], type=accountExpires, vals=[9223372036854775807], type=logonCount, vals=[0], type=sAMAccountName, vals=[test.user00001], type=sAMAccountType, vals=[805306368], type=userPrincipalName, vals=[[email protected]], type=objectCategory, vals=[CN=Person,CN=Schema,CN=Configuration,DC=example,DC=com], type=dSCorePropagationData, vals=[16010101000000.0Z], type=lastLogonTimestamp, vals=[131861652603461498]], controls=[]
2018-12-19T19:36:37.222Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.223Z - debug: Parsing done: {"messageID":3,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-19T19:36:37.223Z - debug: response received messageID=3, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-19T19:36:37.224Z - debug: Setting timeout to 60000
2018-12-19T19:36:37.224Z - debug: sending request {"messageID":4,"protocolOp":"BindRequest","version":3,"name":"CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com","authenticationType":"Simple","credentials":"redacted","controls":[]}
2018-12-19T19:36:37.233Z - debug: data event: <Buffer 30 84 00 00 00 10 02 01 04 61 84 00 00 00 07 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.234Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.234Z - debug: Parsing done: {"messageID":4,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-19T19:36:37.234Z - debug: response received messageID=4, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-19T19:36:37.237Z - debug: Setting timeout to 60000
2018-12-19T19:36:37.237Z - debug: sending request {"messageID":5,"protocolOp":"BindRequest","version":3,"name":"[email protected]","authenticationType":"Simple","credentials":"redacted","controls":[]}
2018-12-19T19:36:37.241Z - debug: data event: <Buffer 30 84 00 00 00 10 02 01 05 61 84 00 00 00 07 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.242Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.242Z - debug: Parsing done: {"messageID":5,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-19T19:36:37.242Z - debug: response received messageID=5, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-19T19:36:37.243Z - debug: Setting timeout to 60000
2018-12-19T19:36:37.243Z - debug: sending request {"messageID":6,"protocolOp":"SearchRequest","baseObject":{"rdns":[{"attrs":{"ou":{"value":"users","name":"ou","order":0}},"spLead":0,"spTrail":0},{"attrs":{"ou":{"value":"3","name":"OU","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"example","name":"dc","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"local","name":"dc","order":0}},"spLead":0,"spTrail":0}],"_format":{}},"scope":"sub","derefAliases":0,"sizeLimit":0,"timeLimit":10,"typesOnly":false,"filter":"(sAMAccountName=test.user00001)","attributes":["sAMAccountName","sAMAccountName","memberOf"],"controls":[]}
2018-12-19T19:36:37.246Z - debug: data event: <Buffer 30 84 00 00 00 c7 02 01 06 64 84 00 00 00 be 04 38 43 4e 3d 54 65 73 74 20 55 73 65 72 20 28 30 30 30 30 31 29 2c 4f 55 3d 55 73 65 72 73 2c 4f 55 3d ... >
2018-12-19T19:36:37.246Z - debug: parse: data=<Buffer 04 38 43 4e 3d 54 65 73 74 20 55 73 65 72 20 28 30 30 30 30 31 29 2c 4f 55 3d 55 73 65 72 73 2c 4f 55 3d 33 2c 44 43 3d 74 72 65 6b 76 61 72 74 73 2c ... >
2018-12-19T19:36:37.247Z - debug: Parsing done: {"messageID":6,"protocolOp":"SearchEntry","objectName":"CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com","attributes":[{"type":"memberOf","vals":["CN=role - audit & analytics,CN=Users,DC=example,DC=com"]},{"type":"sAMAccountName","vals":["test.user00001"]}],"controls":[]}
2018-12-19T19:36:37.247Z - debug: response received messageID=6, protocolOp=SearchEntry, objectName=CN=Test User (00001),OU=Users,OU=Test,DC=example,DC=com, attributes=[type=memberOf, vals=[CN=role - audit & analytics,CN=Users,DC=example,DC=com], type=sAMAccountName, vals=[test.user00001]], controls=[]
2018-12-19T19:36:37.248Z - debug: parse: data=<Buffer 0a 01 00 04 00 04 00>
2018-12-19T19:36:37.248Z - debug: Parsing done: {"messageID":6,"protocolOp":"LDAPResult","status":0,"matchedDN":"","errorMessage":"","referrals":[],"controls":[]}
2018-12-19T19:36:37.248Z - debug: response received messageID=6, protocolOp=LDAPResult, status=0, matchedDN=, errorMessage=, referrals=[], controls=[]
2018-12-19T19:36:37.259Z - info: ::ffff:172.16.66.1 - test.user00001 [19/Dec/2018:19:36:37 +0000] "GET /auth HTTP/1.1" 200 261 "-" "curl/7.29.0"
2018-12-19T19:36:39.782Z - info: ::ffff:172.16.96.0 - - [19/Dec/2018:19:36:39 +0000] "POST /token?timeout=30s HTTP/1.1" 200 242 "-" "Go-http-client/1.1"
2018-12-19T19:36:40.093Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2018-12-19T19:36:40.096Z - info: ::ffff:172.16.96.0 - - [19/Dec/2018:19:36:40 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"
2018-12-19T19:52:30.547Z - debug: error event: Error
    at Socket.onSocketError (/srv/www/kube-ldap/node_modules/ldapjs/lib/client/client.js:1167:50)
    at emitOne (events.js:116:13)
    at Socket.emit (events.js:211:7)
    at emitErrorNT (internal/streams/destroy.js:64:8)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9) code=ECONNRESET, errno=ECONNRESET, syscall=read
2018-12-19T19:52:30.558Z - error: uncaughtException: read ECONNRESET date=Wed Dec 19 2018 19:52:30 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.10.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=34340864, heapTotal=17121280, heapUsed=13967328, external=25804, loadavg=[0.0068359375, 0.0517578125, 0.1640625], uptime=2120, trace=[column=11, file=util.js, function=_errnoException, line=1022, method=null, native=false, column=25, file=net.js, function=TCP.onread, line=628, method=onread, native=false], stack=[Error: read ECONNRESET,     at _errnoException (util.js:1022:11),     at TCP.onread (net.js:628:25)]
2018-12-19T19:52:30.560Z - debug: close event had_err=yes
2018-12-19T19:52:30.577Z - debug: connected after 1 attempt(s)

... last 12 lines repeated ...

2018-12-21T09:35:43.293Z - debug: close event had_err=yes
2018-12-21T09:35:43.309Z - debug: connected after 1 attempt(s)
2018-12-23T16:09:35.487Z - debug: Setting timeout to 60000
2018-12-23T16:09:35.490Z - debug: sending request {"messageID":1,"protocolOp":"BindRequest","version":3,"name":"[email protected]","authenticationType":"Simple","credentials":"redacted","controls":[]}
2018-12-23T16:10:35.517Z - debug: response received msg=undefined
2018-12-23T16:10:35.520Z - debug: Setting timeout to 60000
2018-12-23T16:10:35.522Z - debug: sending request {"messageID":2,"protocolOp":"SearchRequest","baseObject":{"rdns":[{"attrs":{"ou":{"value":"users","name":"ou","order":0}},"spLead":0,"spTrail":0},{"attrs":{"ou":{"value":"3","name":"OU","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"example","name":"dc","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"local","name":"dc","order":0}},"spLead":0,"spTrail":0}],"_format":{}},"scope":"sub","derefAliases":0,"sizeLimit":0,"timeLimit":10,"typesOnly":false,"filter":"(sAMAccountName=test.user00001)","attributes":[],"controls":[]}
2018-12-23T16:10:35.527Z - info: request timeout (client interrupt)
2018-12-23T16:10:35.531Z - info: ::ffff:172.16.66.1 - test.user00001 [23/Dec/2018:16:10:35 +0000] "GET /auth HTTP/1.1" 401 12 "-" "curl/7.29.0"
2018-12-23T16:11:35.521Z - debug: response received msg=undefined

@ephracis
Contributor Author

I just tried again and here are the logs for it:

2018-12-29T10:33:36.187Z - debug: Setting timeout to 60000
2018-12-29T10:33:36.189Z - debug: sending request {"messageID":1,"protocolOp":"BindRequest","version":3,"name":"[email protected]","authenticationType":"Simple","credentials":"redacted","controls":[]}
2018-12-29T10:34:36.197Z - debug: response received msg=undefined
2018-12-29T10:34:36.200Z - debug: Setting timeout to 60000
2018-12-29T10:34:36.201Z - debug: sending request {"messageID":2,"protocolOp":"SearchRequest","baseObject":{"rdns":[{"attrs":{"ou":{"value":"users","name":"ou","order":0}},"spLead":0,"spTrail":0},{"attrs":{"ou":{"value":"Example","name":"OU","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"example","name":"dc","order":0}},"spLead":0,"spTrail":0},{"attrs":{"dc":{"value":"com","name":"dc","order":0}},"spLead":0,"spTrail":0}],"_format":{}},"scope":"sub","derefAliases":0,"sizeLimit":0,"timeLimit":10,"typesOnly":false,"filter":"(sAMAccountName=test.user00001)","attributes":[],"controls":[]}
2018-12-29T10:34:36.202Z - info: request timeout (client interrupt)
2018-12-29T10:34:36.207Z - info: ::ffff:172.16.66.1 - test.user00001 [29/Dec/2018:10:34:36 +0000] "GET /auth HTTP/1.1" 401 12 "-" "curl/7.29.0"
2018-12-29T10:34:38.916Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2018-12-29T10:34:38.918Z - info: ::ffff:172.16.96.0 - - [29/Dec/2018:10:34:38 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"
2018-12-29T10:34:39.260Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2018-12-29T10:34:39.262Z - info: ::ffff:172.16.96.0 - - [29/Dec/2018:10:34:39 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"
2018-12-29T10:35:36.200Z - debug: response received msg=undefined

@ephracis
Contributor Author

ephracis commented Dec 29, 2018

If I kill the pod and have Kubernetes recreate it, then it starts working again (for a day or two). The LDAP server is a vanilla Active Directory on Windows Server 2016 with minimal configuration.

It does contain a very large directory, though, with over 5k users in the group.

@juckerf
Contributor

juckerf commented Jan 4, 2019

Hmm, the LDAP library should reconnect to the server, but someone mentioned that this might not work very well with Active Directory (ldapjs/node-ldapjs#498 (comment)). I'm going to have a look into this (probably by recreating the LDAP connection for every request).
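
Roughly what I have in mind (just a sketch with ldapjs to illustrate the per-request approach, not the actual kube-ldap code; URL and DN handling are placeholders):

const ldap = require('ldapjs');

// Create a fresh client per authentication request instead of keeping one
// long-lived connection that Active Directory may silently drop.
function bindOnce(url, bindDn, password) {
  return new Promise((resolve, reject) => {
    const client = ldap.createClient({url: url, connectTimeout: 5000});
    client.on('error', reject);   // don't let socket errors become uncaughtException
    client.bind(bindDn, password, (err) => {
      client.unbind();            // always tear the connection down again
      return err ? reject(err) : resolve(true);
    });
  });
}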

@var-kar

var-kar commented Jan 4, 2019

@juckerf In our company we have similar applications written on the Spring framework. We use this class: https://docs.spring.io/spring-ldap/docs/current/apidocs/org/springframework/ldap/core/support/LdapContextSource.html

The Java applications our team wrote with this class work fine against the same LDAP source with similar connection details.

@juckerf
Contributor

juckerf commented Jan 8, 2019

@ephracis
I made some improvements and tested reconnection with Active Directory (provoking a connection loss by restarting the AD service); however, I'm not sure if this will really apply to your case.
I released the changes with version 1.3.0 of kube-ldap. This version also includes a new configuration parameter to disable LDAP StartTLS (LDAP_STARTTLS), so you probably don't have to build your own image anymore.

@var-kar
Thanks for your input. However, I'm not sure what you're trying to say. Can you please be more specific about what this Java class has to do with kube-ldap?

@juckerf juckerf self-assigned this Jan 8, 2019
@ephracis
Contributor Author

ephracis commented Jan 8, 2019

Yes, I noticed that new parameter. Thanks a lot! It makes my life much easier.

I will upgrade and deploy a lab today with the latest version and see how it goes.

@var-kar

var-kar commented Jan 9, 2019

@juckerf ldapjs/node-ldapjs#498
Earlier you linked the above issue to this one.

Is there any solution in the core ldapjs library for this connection issue?

Thank you.

@ephracis
Contributor Author

I've been running 1.3.0 for 2 days now and am still not seeing any issues. I will let it run for a few more days, but the problem usually shows up within 3 days, most often after only 1 day; I've never seen it go more than 3 days without an issue.

So I am optimistic that 1.3.0 solves the problem, but I'll keep this issue open for a few more days until I am certain that it's solved.

@ephracis
Contributor Author

The problem is back.

So in the logs I usually see this every 3 minutes:

2019-01-10T21:05:12.435Z - debug: error event: Error
    at Socket.onSocketError (/srv/www/kube-ldap/node_modules/ldapjs/lib/client/client.js:1167:50)
    at emitOne (events.js:116:13)
    at Socket.emit (events.js:211:7)
    at emitErrorNT (internal/streams/destroy.js:66:8)
    at _combinedTickCallback (internal/process/next_tick.js:139:11)
    at process._tickCallback (internal/process/next_tick.js:181:9) errno=ECONNRESET, code=ECONNRESET, syscall=read
2019-01-10T21:05:12.436Z - error: uncaughtException: read ECONNRESET date=Thu Jan 10 2019 21:05:12 GMT+0000 (UTC), pid=1, uid=0, gid=0, cwd=/, execPath=/usr/local/bin/node, version=v8.15.0, argv=[/usr/local/bin/node, /srv/www/kube-ldap/build/index.js], rss=54902784, heapTotal=26034176, heapUsed=19457784, external=74980, loadavg=[0.0078125, 0.041015625, 0.09716796875], uptime=203986, trace=[column=25, file=net.js, function=TCP.onread, line=622, method=onread, native=false], stack=[Error: read ECONNRESET,     at TCP.onread (net.js:622:25)]
2019-01-10T21:05:12.437Z - debug: close event had_err=yes
2019-01-10T21:05:12.542Z - debug: connected after 1 attempt(s)

But last night that stopped, and now when I try to access /auth I get this:

2019-01-11T07:40:51.922Z - debug: sending request {"messageID":1,"protocolOp":"BindRequest","version":3,"name":"[email protected]","authenticationType":"Simple","credentials":"redacted","controls":[]}
2019-01-11T07:42:51.752Z - info: ::ffff:172.16.12.1 - test.user00001 [11/Jan/2019:07:42:51 +0000] "GET /auth HTTP/1.1" - - "-" "curl/7.29.0"
2019-01-11T07:42:54.486Z - info: Error while verifying token: [JsonWebTokenError] with message [jwt malformed]
2019-01-11T07:42:54.487Z - info: ::ffff:172.16.73.0 - - [11/Jan/2019:07:42:54 +0000] "POST /token?timeout=30s HTTP/1.1" 200 100 "-" "Go-http-client/1.1"

@juckerf
Contributor

juckerf commented Jan 14, 2019

@var-kar
Unfortunately I don't have a solution for the problem yet.

@ephracis
OK, is it possible to share the current LDAP policies (https://support.microsoft.com/en-us/help/315071/how-to-view-and-set-ldap-policy-in-active-directory-by-using-ntdsutil) of your lab AD? I guess the recurring ECONNRESET events are related to the MaxConnIdleTime policy and cause an automatic reconnect, but somehow the reconnect mechanism stops working after some time.
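
If it really is MaxConnIdleTime, one conceivable workaround (purely a sketch, not something kube-ldap does today; URL and interval are placeholders) would be to keep the connection warm with a cheap periodic root DSE read so AD never considers it idle:

const ldap = require('ldapjs');

const client = ldap.createClient({url: 'ldap://10.0.0.10:389'});  // placeholder URL
const KEEPALIVE_INTERVAL_MS = 5 * 60 * 1000;  // well below a 900s MaxConnIdleTime

setInterval(() => {
  // Base-scope search of the root DSE: cheap, and it resets the server's idle timer.
  client.search('', {scope: 'base', filter: '(objectClass=*)'}, (err, res) => {
    if (err) {
      return console.warn('keep-alive search failed: ' + err.message);
    }
    res.on('error', (e) => console.warn('keep-alive search failed: ' + e.message));
    res.on('end', () => {});  // nothing to do, the traffic itself is the point
  });
}, KEEPALIVE_INTERVAL_MS);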

@ephracis
Contributor Author

I don't think we have changed anything:

Policy                          Current(New)

MaxPoolThreads                  4
MaxPercentDirSyncRequests       0
MaxDatagramRecv                 4096
MaxReceiveBuffer                10485760
InitRecvTimeout                 120
MaxConnections                  5000
MaxConnIdleTime                 900
MaxPageSize                     1000
MaxBatchReturnMessages          0
MaxQueryDuration                120
MaxDirSyncDuration              0
MaxTempTableSize                10000
MaxResultSetSize                262144
MinResultSets                   0
MaxResultSetsPerConn            0
MaxNotificationPerConn          5
MaxValRange                     1500
MaxValRangeTransitive           0
ThreadMemoryLimit               0
SystemMemoryLimitPercent        0

@ephracis
Contributor Author

ephracis commented Feb 1, 2019

Seems like the best option is to switch from ldapjs to https://github.com/ldapts/ldapts since that one is better maintained.

How big of an undertaking would that be? Any plans on doing such a switch?

@juckerf
Contributor

juckerf commented Feb 5, 2019

Seems like the best option is to switch from ldapjs to https://github.com/ldapts/ldapts since that one is better maintained.

How big of an undertaking would that be? Any plans on doing such a switch?

I was already thinking about switching to ldapts, too. I was hoping that ldapjs maintenance would get picked up by someone, but the longer I think about it, the more I suspect we have no choice but to switch to ldapts. It shouldn't be a lot of work, but I'm very busy at the moment, so it could take another couple of weeks...
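
For context, the equivalent flow with ldapts is promise-based and easy to scope per request, which is why I don't expect the port to be big. A rough sketch (my own illustration with placeholder URL, DNs and filter, not the final implementation):

const { Client } = require('ldapts');

async function authenticate(username, password) {
  const client = new Client({url: 'ldap://10.0.0.10:389', connectTimeout: 5000});
  try {
    await client.bind('cn=bind,dc=example,dc=com', 'bind-password');
    const {searchEntries} = await client.search('ou=users,dc=example,dc=com', {
      scope: 'sub',
      filter: `(sAMAccountName=${username})`,  // real code should escape the filter value
      attributes: ['sAMAccountName', 'memberOf'],
    });
    if (searchEntries.length !== 1) {
      return false;
    }
    await client.bind(searchEntries[0].dn, password);  // verify the user's own password
    return true;
  } catch (err) {
    return false;
  } finally {
    await client.unbind();  // connection is always closed again
  }
}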

@juckerf
Contributor

juckerf commented May 31, 2019

@ephracis I finally found the time to switch to ldapts.
The changes are available on the dev branch and as the dev Docker image tag.

@juckerf
Contributor

juckerf commented Jul 16, 2019

There was a bug in v2.0.0 which led to very similar symptoms. The bug has been identified and fixed in v2.0.1. Chances are that this problem is also resolved now.

@jeinwag

jeinwag commented Oct 2, 2019

@juckerf Looks like I'm experiencing the same issue. I'm currently running 2.0.1. Authentication works fine, but after a while the pod becomes unresponsive and gets restarted until it goes into a CrashLoopBackOff state.
I already set LOGLEVEL to debug, but it doesn't show any more information in the logs than the default setting does.

@jeinwag

jeinwag commented Oct 23, 2019

@juckerf Never mind, it was a misconfiguration on my part. The scheme and port of the liveness probe had to be adjusted.

@juckerf juckerf removed their assignment Feb 1, 2023