Avoid fatal error on MongoDB connection failures
Open, High, Public

Description

The Manticore service starts on boot, and may be started before a MongoDB database server is available.

Currently, Manticore requires MongoDB to be available and ready at startup, but the only thing the service can really depend on when it starts is the network.

If we made the Manticore service depend on mongod.service, this would lock in a single-host deployment and would not allow an external MongoDB service to be used.

We would rather have Manticore not error out fatally, but instead retry (with an interval between attempts, to prevent it from looping endlessly without pause).
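The retry behaviour asked for here can be sketched independently of any particular driver. The following is a minimal illustration in Python, not Manticore's actual code: `connect_with_retry`, its parameters, and the doubling of the wait between attempts are all assumptions made for the example. The `connect` argument stands in for whatever call opens the database connection.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    interval: float = 5.0,
    max_interval: float = 60.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
    warn: Callable[[str], None] = print,
) -> T:
    """Call connect() until it succeeds, pausing between attempts.

    All names here are hypothetical. With max_attempts=None the loop
    never gives up, which matches the behaviour requested in this task.
    """
    attempt = 0
    delay = interval
    while True:
        attempt += 1
        try:
            return connect()
        except OSError as exc:
            # A failed connection is reported as a warning, not a fatal error.
            warn(f"database connection attempt {attempt} failed: {exc}")
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay)
            # Double the wait (up to a cap) so a long outage does not
            # turn into a tight loop.
            delay = min(delay * 2, max_interval)
```

The `sleep` and `warn` parameters exist only so the loop can be exercised without real delays; a service would normally leave them at their defaults or pass its own logger.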

Details

Ticket Type
Task
vanmeeuwen assigned this task to Adityab. Feb 8 2016, 1:09 PM
vanmeeuwen added a subscriber: Adityab.

@Adityab, assigning to you for an initial assessment on whether this can reasonably be achieved technically.

Please also see https://kanarip.wordpress.com/2016/02/08/manticore-may-start-before-mongodb/ for additional options we have in delivery / packaging (basically most of them spelling "better guesswork", and not the sole part of the proper solution either way).

@Adityab it's been a while now, could you let us know what your assessment is?

Is this a feature of the backend driver that is not enabled, or a feature to be implemented on the side of the consumer of the provider? Or something else entirely?

Adityab added a comment. Edited Mar 7 2016, 11:47 AM

@vanmeeuwen I'm testing a 30-second timeout for attempting a MongoDB connection, after which Manticore would terminate gracefully. This is necessary anyway in an environment where one runs Manticore directly, without systemd.

That said, systemd should be made to wait for MongoDB to have started before Manticore.

In T981#15995, @Adityab wrote:

@vanmeeuwen I'm testing a 30-second timeout for attempting a MongoDB connection, after which Manticore would terminate gracefully. This is necessary anyway in an environment where one runs Manticore directly, without systemd.

I can already delay the start-up of Manticore; that doesn't resolve the problem.

Not only are these environments without systemd future legacy, there is also no reason for Manticore to terminate, gracefully or otherwise, if it is able to shrug off the failure and continue trying.

It may spit out a message or two (preferably with an indication of ECONNREFUSED vs. a timeout vs. whatever else).

What you're describing would make such a message a fatal error message; what I'm saying is that it must be a warning.
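To illustrate the distinction between ECONNREFUSED, a timeout, and other failures, here is a minimal Python sketch that classifies a connection error and logs it as a warning rather than exiting. It is an illustration only, not Manticore's code: `describe_failure`, `probe`, and the logger name are hypothetical, and a plain TCP connect stands in for the database driver.

```python
import errno
import logging
import socket

log = logging.getLogger("manticore.db")  # hypothetical logger name


def describe_failure(exc: BaseException) -> str:
    """Turn a connection error into a short, specific description."""
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused (ECONNREFUSED)"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "connection timed out"
    if isinstance(exc, socket.gaierror):
        return f"host name lookup failed: {exc}"
    if isinstance(exc, OSError) and exc.errno is not None:
        # Fall back to the symbolic errno name, e.g. EHOSTUNREACH.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"{name}: {exc.strerror}"
    return f"{type(exc).__name__}: {exc}"


def probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try a TCP connection; log a warning instead of failing fatally."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        log.warning("MongoDB at %s:%d unavailable: %s",
                    host, port, describe_failure(exc))
        return False
```

A caller could run `probe("localhost", 27017)` (27017 being MongoDB's default port) before each connection attempt and keep retrying while it returns `False`.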

That said, systemd should be made to wait for MongoDB to have started before Manticore.

Incorrect. Manticore may not require MongoDB to be provided on the same node, so there is no means to dictate or trust that MongoDB is available at Manticore startup.

pasik added a subscriber: pasik. Nov 25 2017, 2:30 PM