
Cannot consume or produce data to topic on docker version 1.13.1 in kafka cluster with external apis #434

Closed
hemanshupaliwa7 opened this issue Dec 3, 2018 · 10 comments

Comments

@hemanshupaliwa7

hemanshupaliwa7 commented Dec 3, 2018

Clarification

This issue concerns a configuration that works with Docker Swarm on Docker 18.x.x but does not work with Docker 1.13.1.


I am able to successfully produce and consume data inside each Kafka container running on each node. However, when I try to produce or consume data externally through APIs, I am not able to do so.

I created the Kafka cluster with docker stack deploy using the following configuration in the YML file:

version: "3.1"
services:
  kafka:
    image: local-registry:5000/kafka
    deploy:
      replicas: 2
    ports:
      - "9094:9094"
    networks:
      - zoonet
    environment:
      HOSTNAME_COMMAND: "docker info | grep ^Name: | cut -d' ' -f 2"
      KAFKA_ZOOKEEPER_CONNECT: zk1:2181,zk2:2181,zk3:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://_{HOSTNAME_COMMAND}:9094
      KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9094
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
      BROKER_ID_COMMAND: "docker info | grep ^Name: | cut -d' ' -f 2"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
networks:
  zoonet:
    driver: overlay
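
For reference, the _{HOSTNAME_COMMAND} placeholder should be substituted with the output of that command when the container starts, so the generated broker config ends up advertising the host machine name. A quick way to check this from a node, assuming a wurstmeister-style image layout (the container filter and config path are assumptions):

# find the kafka task on this node and inspect its generated config
docker exec -it $(docker ps -q -f name=kafka) \
  grep advertised.listeners /opt/kafka/config/server.properties
# expected form, with docker5 being the node's machine name:
# advertised.listeners=INSIDE://:9092,OUTSIDE://docker5:9094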

Another strange behavior is that when I create a topic inside a kafka container with replication factor 1, it gets created with some random replication factor.

bash-4.4# ./kafka-topics.sh --create --zookeeper zk1:2181, zk2:2181, zk3:2181 --replication-factor 1 --partitions 1 --topic test1
Created topic "test1".
bash-4.4# ./kafka-topics.sh --describe --zookeeper zk1:2181, zk2:2181, zk3:2181 --topic test1
Topic:test1     PartitionCount:1        ReplicationFactor:1     Configs:
        Topic: test1    Partition: 0    Leader: 3       Replicas: 3     Isr: 3

I built the same setup on Docker version 18.x and everything works fine; I can produce and consume data with external APIs.

@hemanshupaliwa7 hemanshupaliwa7 changed the title Cannot consume or produce data to topic on docker version 1.13.1 in kafka cluster Cannot consume or produce data to topic on docker version 1.13.1 in kafka cluster with external apis Dec 3, 2018
@sscaling
Collaborator

sscaling commented Dec 3, 2018

It sounds like you have similar issues to #432.

If you haven't, I'd recommend reading the Connectivity Guide

Someone had a similar issue with the resolvable address of the Swarm cluster and opened a PR: #377. Perhaps using the FQDN will solve your situation.


Another strange behavior is that when I create a topic inside a kafka container with replication factor 1, it gets created with some random replication factor.

I think you are just misreading the output:

bash-4.4# ./kafka-topics.sh --describe --zookeeper zk1:2181, zk2:2181, zk3:2181 --topic test1
Topic:test1     PartitionCount:1        ReplicationFactor:1     Configs:
        Topic: test1    Partition: 0    Leader: 3       Replicas: 3     Isr: 3

It states that test1 has a PartitionCount of 1 and a ReplicationFactor of 1. The table below just lists how the partitions are allocated to their respective brokers.

Topic: test1 Partition: 0 Leader: 3 Replicas: 3 Isr: 3

This reads, "broker 3 is currently the leader for partition 0 of the test1 topic. It is only replicated on broker 3. Currently the in-sync replicas (ISRs) are broker 3."

If you create a topic with --replication-factor 1 --partitions 10, you will see a list of 10 partitions all distributed amongst the different brokers.
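
For example, a quick check (the topic name test10 is hypothetical; the Zookeeper connect string is the same as in your original command):

./kafka-topics.sh --create --zookeeper zk1:2181,zk2:2181,zk3:2181 --replication-factor 1 --partitions 10 --topic test10
./kafka-topics.sh --describe --zookeeper zk1:2181,zk2:2181,zk3:2181 --topic test10
# the describe output should list 10 partition rows, each with a single replica,
# spread across the available broker ids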

@hemanshupaliwa7
Author


Thanks for the reply. I misinterpreted the topic description. However, I am not sure about the FQDN, because I haven't done anything different on the same setup with Docker version 18.x.

@sscaling
Collaborator

sscaling commented Dec 3, 2018

What is the value of HOSTNAME_COMMAND if you look inside a running container?

What version of docker-compose are you using? There was a new container naming format introduced in 1.23.0, but it was reverted in 1.23.2 because of "addressability issues".

@hemanshupaliwa7
Author

hemanshupaliwa7 commented Dec 3, 2018


The value will be "docker info | grep ^Name: | cut -d' ' -f 2"; it is used in the shell script to derive the host machine name, and the advertised listener is then set to the host machine name in server.properties. I am using the docker stack deploy command to create the cluster.

@sscaling
Collaborator

sscaling commented Dec 3, 2018

I'm asking what is the value, not how it is generated.

Have you made sure it is returning the value you are expecting? (i.e. is it advertising a name that is resolvable by the client? Does it match the expected "host machine" name?) As previously stated, there were some updates to docker-compose that broke addressability. Checking the version is as simple as running docker-compose --version.

Have you performed basic network checks / can you access the ports on the derived hosts?

Have you checked in Zookeeper to confirm the correct configuration is stored?

Also, I don't see any pinning of brokers to a physical host, just a broker ID - so if you are not clearing state between broker restarts, there is a 50/50 chance that a broker starts with the incorrect config and claims to be responsible for offsets that it does not have on disk. That said, I don't think your BROKER_ID_COMMAND makes sense - unless your hostname is just a small number < 1000 - and even then, I'm not sure how clients will resolve that, as it's a numerical non-TLD.
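
A minimal sketch of a couple of those checks, assuming broker id 3 (the one shown in the describe output earlier) and that the docker CLI is available inside the container (it has to be for HOSTNAME_COMMAND to work at all):

# 1. what hostname is the container actually deriving?
docker exec -it $(docker ps -q -f name=kafka) \
  sh -c "docker info | grep ^Name: | cut -d' ' -f 2"
# 2. what did broker 3 register in Zookeeper?
./zookeeper-shell.sh zk1:2181 get /brokers/ids/3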

@hemanshupaliwa7
Author

hemanshupaliwa7 commented Dec 3, 2018


Yes, the value is as expected; docker5 here is my machine name:
advertised.listeners=INSIDE://:9092,OUTSIDE://docker5:9094
The Docker Compose version is "docker-compose version 1.18.0" and the Docker version is 1.13.1.

Yes, the ports are open and I can see they are in a LISTEN state using:

netstat -l | grep 9094
tcp6       0      0 [::]:9094               [::]:*                  LISTEN

Yes, Zookeeper seems to be working fine, as I can produce and consume data from inside the container on each node.

BROKER_ID_COMMAND is only used to assign a distinct broker.id on each node, extracting the numeric value from the host machine name.

With the same setup on Docker version 18.x and Docker Compose version 1.23.1, everything seems to be working. Do you think there can be an issue with the versions?

One difference I see between the two versions is in the container listing.
On Docker 18.x, the container is running as

0181645bcf17        local-registry:5000/aqua_kafka:latest   "start-kafka.sh"         4 days ago          Up 4 days           0.0.0.0:9094->9094/tcp   aqua_kafka.4.fro3za350wagc4vzbdzdl4atg

On Docker 1.13.1, the container is running as

4a25efa7750e        local-registry:5000/aqua_kafka@sha256:0a7ebcf570148db5820212b14858960310a1ba7aa2074cbe747b453696818a52   "start-kafka.sh"         2 days ago          Up 2 days                                    aqua_kafka.1.t64uui0dr6qyrbyp1hi06fucg

I don't see the port mapping when I run "docker ps" on the nodes with Docker 1.13.1.
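
One way to compare how the port is actually published in the two cases (the container and service names are taken from the docker ps output above; the inspect field path is only a sketch):

# host-level port mapping for the 1.13.1 task
docker port aqua_kafka.1.t64uui0dr6qyrbyp1hi06fucg
# ports as recorded on the swarm service itself
docker service inspect --format '{{json .Endpoint.Ports}}' aqua_kafka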

My compose file for Docker 18.x is:

version: '3.2'
services:
  kafka:
    image: local-registry:5000/aqua_kafka
    deploy:
      replicas: 4
    ports:
      - target: 9094
        published: 9094
        protocol: tcp
        mode: host
    environment:
      BROKER_ID_COMMAND: "docker info | grep ^Name: | cut -d' ' -f 2"
      HOSTNAME_COMMAND: "docker info | grep ^Name: | cut -d' ' -f 2"
      KAFKA_ZOOKEEPER_CONNECT: zk1:2181,zk2:2181,zk3:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: INSIDE://:9092,OUTSIDE://_{HOSTNAME_COMMAND}:9094
      KAFKA_LISTENERS: INSIDE://:9092,OUTSIDE://:9094
      KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

@hemanshupaliwa7
Author

I even upgraded docker-compose to version 1.23.1 and still have the same issue on the setup with Docker version 1.13.1.

@sscaling
Collaborator

sscaling commented Dec 3, 2018

BROKER_ID_COMMAND is only used to assign a distinct broker.id on each node, extracting the numeric value from the host machine name.

Not according to the docker-compose file you provided. HOSTNAME_COMMAND and BROKER_ID_COMMAND are the same command as far as I can see:

HOSTNAME_COMMAND: "docker info | grep ^Name: | cut -d' ' -f 2"
BROKER_ID_COMMAND: "docker info | grep ^Name: | cut -d' ' -f 2"

Given your hostname of docker5, both values would be docker5.

How does your client resolve docker5? DNS / static-host?
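
A quick client-side sanity check, assuming the client machine is where the external producer/consumer runs (kafkacat only if it happens to be installed; hostname and port are taken from the advertised listener above):

# does the client resolve the advertised hostname?
getent hosts docker5
# can it reach the advertised port?
nc -vz docker5 9094
# optional: ask the broker for its metadata via the external listener
kafkacat -L -b docker5:9094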

Do you think there can be an issue with the versions?

Possibly. If you are using the wurstmeister/kafka images, all current branches should have the same underlying base image and code for whichever Kafka version you are trying to use. There should not be any difference from the image perspective. It looks like you've built a custom image local-registry:5000/aqua_kafka so I can't offer any advice regarding that.

For reference the build server uses:

$ docker --version
Docker version 17.09.0-ce, build afdb6d4
$ docker-compose --version
docker-compose version 1.17.1, build 6d101fb

From what I can see, this issue has nothing to do with the kafka-docker image. This seems more of a docker / docker-swarm / docker-compose / configuration issue.

@hemanshupaliwa7
Author


My servers are hosted on Google Cloud in the same network, so all nodes are accessible within the network.

Yes, I also think there is no issue with the Kafka image. It has to be the Docker version, because the same image works fine on Docker 18.x.

Thanks for your help!!

@sscaling
Collaborator

sscaling commented Dec 3, 2018

OK. I will close the issue then.

@sscaling sscaling closed this as completed Dec 3, 2018