`spread cluster start` fails to start up cluster #63

hharnisc · 2016-05-24T14:35:17Z

It's unclear how I got in this state, but I'm not able to start up a localkube cluster.

I've tried stopping/starting the cluster, removing all images and containers, re-creating a docker machine, and even re-installing docker.

The container seems to restart continuously:

CONTAINER ID        IMAGE                            COMMAND             CREATED             STATUS                         PORTS               NAMES
5a1e299c3124        redspreadapps/localkube:latest   "start.sh"          15 minutes ago      Restarting (0) 2 minutes ago                       localkube

When I grab the container logs (`docker logs 5a1e299c3124`) I get the following:

0bb5c03101f0f473218733b67258b04c07176225413651703e62295686adc014
1ef0f1618a97621cf0cca908d428cf466d3dc4b5f8ac4c1112d8829bb31dc147
10.32.0.1
Starting LocalKube...
Starting etcd...
2016-05-24 14:27:53.477939 I | etcdserver: recovered store from snapshot at index 460046
2016-05-24 14:27:53.478088 I | etcdserver: name = kubeetcd
2016-05-24 14:27:53.478126 I | etcdserver: data dir = /var/localkube/data
2016-05-24 14:27:53.478152 I | etcdserver: member dir = /var/localkube/data/member
2016-05-24 14:27:53.478175 I | etcdserver: heartbeat = 100ms
2016-05-24 14:27:53.478197 I | etcdserver: election = 1000ms
2016-05-24 14:27:53.478218 I | etcdserver: snapshot count = 10000
2016-05-24 14:27:53.478245 I | etcdserver: advertise client URLs = http://localhost:2379
2016-05-24 14:27:53.478289 I | etcdserver: loaded cluster information from store: <nil>
2016-05-24 14:27:54.295145 C | etcdserver: read wal error (walpb: crc mismatch) and cannot be repaired
Plugin is not running.
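
For anyone who hits the same restart loop, the critical line in the log above is the etcd `read wal error (walpb: crc mismatch)` message. A minimal sketch for checking a container's logs for it follows; the function name `has_wal_corruption` is made up for illustration and is not part of spread or localkube.

```shell
# Hypothetical helper (not part of spread/localkube): reads log text on stdin
# and exits 0 if etcd reported the WAL CRC mismatch seen in this issue.
has_wal_corruption() {
  # -F matches the message as a fixed string, -q suppresses output.
  grep -qF 'read wal error (walpb: crc mismatch)'
}

# Usage, assuming the container is named "localkube" as in the listing above:
#   docker logs localkube 2>&1 | has_wal_corruption && echo "etcd WAL looks corrupt"
```

A match only says that etcd refused to read its write-ahead log; it does not say how the log got damaged.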


mfburnett · 2016-05-24T22:00:04Z

Hey @hharnisc, try stopping localkube and removing all containers with `spread cluster stop -r`, then restart with `spread cluster start`. Let me know if that fixes it.

hharnisc · 2016-05-25T00:44:14Z

@mfburnett still no luck

$ spread cluster stop -r
Stopping container '5a1e299c3124b361b895f1279f612f1174f7c5e2e9b5287a8ae077b12708f803'
Removing container '5a1e299c3124b361b895f1279f612f1174f7c5e2e9b5287a8ae077b12708f803'

then starting it

$ spread cluster start                                           
Creating localkube container...
Starting localkube container...

then checking the cluster

$ kubectl cluster-info
The connection to the server 192.168.99.100:8080 was refused - did you specify the right host or port?

hharnisc · 2016-05-25T01:58:21Z

Looking at that log, etcd is failing at startup with a WAL CRC mismatch. It is potentially blowing up here: https://github.com/coreos/etcd/blob/master/wal/wal.go#L271

hharnisc · 2016-05-25T01:59:50Z

@mfburnett @ethernetdan does localkube cache anything on the host filesystem?

hharnisc · 2016-05-25T02:54:51Z

`rm -rf ~/.localkube` seems to have gotten me unstuck. I wish I had thought to keep a copy of the data in there so you could use it to debug. If it happens again I'll be sure to include it.
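
Since the removed data would have been useful for debugging, a gentler variant is to move the directory aside rather than delete it. The sketch below assumes that renaming `~/.localkube` has the same effect as removing it (localkube starts with fresh state); the function name and the `.bak.<timestamp>` naming are illustrative only.

```shell
# Hypothetical helper: move the localkube state directory aside instead of
# deleting it, so the corrupted etcd data can still be attached to a bug report.
# Defaults to ~/.localkube (the path removed in this thread); pass another
# directory as the first argument to override.
backup_localkube_state() {
  dir="${1:-$HOME/.localkube}"
  if [ ! -d "$dir" ]; then
    echo "no such directory: $dir" >&2
    return 1
  fi
  backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
  # Print the backup location on success so it can be referenced later.
  mv -- "$dir" "$backup" && echo "$backup"
}

# Usage, with the cluster stopped first:
#   spread cluster stop -r
#   backup_localkube_state
#   spread cluster start
```

If the cluster comes back up, the printed backup directory is the thing to share with the maintainers.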

mfburnett · 2016-05-25T05:04:13Z

@hharnisc hm glad you got unstuck, thanks for documenting it!

ibmendoza · 2016-06-08T07:29:00Z

It also happened to me under Turnkey Linux 14.1, but fortunately the steps below worked.
Thanks @mfburnett

spread cluster stop -r

spread cluster start
