A Cyrus IMAP Discrete Murder topology requires backends to be configured to make reservations for, and activate, mail folders against MUPDATE -- configuring these options makes a backend participate in the murder.
When a backend receives a CREATE MAILBOX foo command, and the folder foo does not already exist in its local mailboxes.db, it attempts to make a reservation for the folder against the MUPDATE server.
The MUPDATE server checks whether the folder foo already exists elsewhere in the Cyrus IMAP Murder, and allows the reservation if it does not. The backend -- using this response -- performs the necessary local work (local mailboxes.db, spool filesystem, etc.). When everything is complete, the backend activates the folder in MUPDATE, and the new folder propagates to frontends.
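The reserve-then-activate sequence described above can be sketched as a small in-memory model. This is only an illustration: the classes ToyMupdate and ToyBackend, their methods, and the server names are hypothetical and do not correspond to the real MUPDATE protocol or to Cyrus IMAP code.

```python
class ToyMupdate:
    """In-memory stand-in for a MUPDATE server (hypothetical, not the real protocol)."""

    def __init__(self):
        # folder name -> (state, server); state is "reserved" or "active"
        self.entries = {}

    def reserve(self, folder, server):
        if folder in self.entries:
            return False  # folder is already known somewhere in the murder
        self.entries[folder] = ("reserved", server)
        return True

    def activate(self, folder, server):
        _state, owner = self.entries.get(folder, (None, None))
        if owner != server:
            return False  # only the server holding the reservation may activate
        self.entries[folder] = ("active", server)
        return True


class ToyBackend:
    """Stand-in for a backend that participates in the murder."""

    def __init__(self, name, mupdate):
        self.name = name
        self.mupdate = mupdate
        self.local_mailboxes = set()  # stands in for the local mailboxes.db

    def create(self, folder):
        if folder in self.local_mailboxes:
            return "exists locally"
        if not self.mupdate.reserve(folder, self.name):
            return "reservation denied"
        # Stands in for the local work (mailboxes.db entry, spool directory, ...)
        self.local_mailboxes.add(folder)
        self.mupdate.activate(folder, self.name)
        return "created"


mupdate = ToyMupdate()
backend1 = ToyBackend("backend1", mupdate)
backend2 = ToyBackend("backend2", mupdate)

result_first = backend1.create("foo")   # reservation granted, folder activated
result_second = backend2.create("foo")  # denied: foo exists elsewhere in the murder
```

In this model the second backend is refused because the folder is already registered for another server, which mirrors the "exists elsewhere" check.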
When a Cyrus IMAP backend (the master) is replicated to another backend (the replica), the master is the backend participating in the murder (making reservations and activating mailboxes) and the replica cannot also participate in the murder.
The replica cannot participate in the murder as well, because the following would occur:
- The master backend would make a reservation that succeeds,
- The master would communicate the new mailbox being created to the replica,
- The replica would attempt to create the same mail folder through a reservation in MUPDATE,
- MUPDATE would deny the reservation attempt of the replica, because the mailbox already exists "elsewhere in the murder", i.e. on the master backend.
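The sequence in this list can be replayed with a minimal simulation. The function name, the event strings, and the server labels "master" and "replica" are all hypothetical; the dictionary merely stands in for the MUPDATE database.

```python
def simulate_replicated_create(folder, replica_in_murder):
    """Return the events that occur when a master creates `folder` and replicates it."""
    mupdate = {}  # folder -> owning server; stands in for the MUPDATE database
    events = []

    def reserve(server):
        if folder in mupdate:
            events.append(f"{server}: reservation denied, held by {mupdate[folder]}")
            return False
        mupdate[folder] = server
        events.append(f"{server}: reservation granted")
        return True

    reserve("master")
    events.append("master: replicates new mailbox to replica")
    if replica_in_murder:
        # The replica is also murder-configured, so it tries to reserve as well.
        reserve("replica")
    else:
        events.append("replica: creates mailbox locally, no reservation")
    return events


for line in simulate_replicated_create("foo", replica_in_murder=True):
    print(line)
```

With replica_in_murder=True the last event is the denied reservation; with False the replica simply applies the change locally, which is the configuration the text describes as required.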
To allow failover from the master to the replica, at least the replica backend's cyrus-imapd service will need to be reconfigured and reloaded or restarted, in order for the replica to functionally participate in the Murder when it comes back.
In this scenario, both the master and the replica would, preferably, present themselves with the same server name (/etc/imapd.conf setting servername), so that the initialization of Cyrus IMAP, mainly uploading the list of local mailboxes to the mupdate server using ctl_mboxlist -m, does not require a disproportionate number of changes throughout the murder.
In other words, two backends may exist:
- backenda1 (original master, murder configured)
- backenda2 (original replica, murder NOT configured)
In the case of failover, the target scenario is as follows:
- backenda1 (original master, to become replica, murder NOT configured)
- backenda2 (original replica, to become master, murder configured)
Both systems would present themselves to the murder as backenda, so that frontends proxy connections to backenda -- this resolves to a service IP address attached to the current master, and during startup, the murder can acknowledge that the mailboxes being reported as "local" on "backenda" are indeed supposed to be "local" on "backenda".
The failover between the two backends, however, requires a cyrus-imapd service reload or restart -- a new configuration has to be activated, switching participation in the Murder on or off. This interferes, in an untraceable and time-sensitive way, with health checks (which are executed in rapid succession and determine whether a system is healthy enough not to trigger a failover to its peer), and it leaves both the proper starting situation and the intended target situation indeterminate.
These service restarts are needed solely to switch configuration files.
These configuration file switches are needed solely to ensure (as best one can) that only one participant in a replication pair is a participant in the murder.
Implementation Details
The MUPDATE server should be made to accept, and respond positively to, a second reservation and a second activation if that reservation or activation describes the same location (backend server name and partition) as the existing reservation or activation that would otherwise have blocked it.
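A minimal sketch of this proposed rule follows, assuming a location is modelled as a (server name, partition) pair. The class ToyMupdate, its methods, and the partition name "default" are hypothetical illustrations, not the actual MUPDATE implementation.

```python
class ToyMupdate:
    """Toy model of the proposed rule: repeat requests for the same location succeed."""

    def __init__(self):
        # folder -> (state, (server, partition)); state is "reserved" or "active"
        self.entries = {}

    def reserve(self, folder, server, partition):
        location = (server, partition)
        existing = self.entries.get(folder)
        if existing is None:
            self.entries[folder] = ("reserved", location)
            return True
        # Proposed behaviour: accept a second reservation only when it names
        # the same location; the stored state is left untouched.
        return existing[1] == location

    def activate(self, folder, server, partition):
        location = (server, partition)
        existing = self.entries.get(folder)
        if existing is None or existing[1] != location:
            return False
        # Activating an already active entry for the same location is a no-op.
        self.entries[folder] = ("active", location)
        return True


mupdate = ToyMupdate()

# Master and replica both present themselves as "backenda".
master_ok = (mupdate.reserve("foo", "backenda", "default")
             and mupdate.activate("foo", "backenda", "default"))
replica_ok = (mupdate.reserve("foo", "backenda", "default")
              and mupdate.activate("foo", "backenda", "default"))

# A different backend is still refused.
other_ok = mupdate.reserve("foo", "backendb", "default")
```

In this sketch a repeated reservation does not downgrade an already active entry back to "reserved"; whether the real server should behave that way is a design decision the text leaves open.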