Scheduler silently fails on malformed ZK URLs #950

Open
PerilousApricot opened this issue Sep 4, 2018 · 0 comments
Describe the bug
In a couple of places [1] [2], the user is instructed to append a path (a ZK node?), /cook, to the ZK connection string. If the user does this, the scheduler for some reason never connects to the Mesos master.

[1] https://github.com/twosigma/Cook/blob/master/scheduler/docs/configuration.adoc
[2] https://github.com/twosigma/Cook/blob/master/scheduler/example-prod-config.edn#L15

To Reproduce
Download the latest Cook, build it, and manually set the :zookeeper {:connection} config option to a value with a trailing /cook. The scheduler begins some preparatory work, then seemingly hangs, only periodically writing heartbeat messages to the log. I can turn this failure mode on and off by adding or removing that suffix.
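
For illustration, the two variants described above might look like the following config fragment. This is only a sketch: the host names and ports are hypothetical placeholders, and the rest of the scheduler config is omitted.

```clojure
;; Hypothetical fragment of the scheduler config (EDN).
;; zk1.example.com / zk2.example.com are placeholder hosts, not real values.

;; Variant with the trailing /cook suffix (scheduler appears to hang):
{:zookeeper {:connection "zk1.example.com:2181,zk2.example.com:2181/cook"}}

;; Variant without the suffix (scheduler connects to the Mesos master):
;; {:zookeeper {:connection "zk1.example.com:2181,zk2.example.com:2181"}}
```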

Expected behavior
I'd expect an explicit crash in this case. I presume that the scheduler attempts to perform master election and fails because of the invalid ZK hostname. Since I never saw an error, and one of the final lines in the log is from Cook trying to find the Mesos scheduler, I tried debugging that interaction when the true failure was elsewhere.
