Introducing pg_auto_failover (citusdata.com)
14 points by ibotty on May 30, 2019 | hide | past | favorite | 2 comments


Given that there is no fencing/STONITH mechanism, I’m curious how this avoids split-brain scenarios. Let’s say you have one primary and two standbys in synchronous replication, as recommended. The primary is partitioned from the other two by a network failure. Is one of the standbys auto-promoted, or not? If so, there can now be two running primaries and divergent timelines. If not, then you don’t have auto-failover. What am I missing?

Edit: After looking at the architecture page, it seems that the monitor would ask the failed node to kill itself? That is polite, but it wouldn’t work if the monitor is on the same side of the partition as the standby nodes and cannot reach the primary. Does this expect the primary to shut itself down if it cannot connect to the monitor? Such an approach could be problematic if the keeper process is hung but pg itself is still running. Would love clarification.
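
To make the question concrete, here is a rough sketch of the kind of lease-based self-fencing rule I have in mind. This is not taken from pg_auto_failover; every name here (Node, should_self_demote, may_promote_standby, lease_seconds, safety_margin) is hypothetical, and the timestamps are assumed to come from one shared clock purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a Postgres node as seen by its keeper process."""
    name: str
    is_primary: bool
    last_monitor_contact: float  # seconds; time of last successful monitor check-in


def should_self_demote(node: Node, now: float, lease_seconds: float) -> bool:
    """A primary gives up its role once its lease with the monitor has expired.

    Caveat: this only helps if the keeper is actually running. A hung keeper
    never evaluates this rule, so Postgres would keep accepting writes.
    """
    return node.is_primary and (now - node.last_monitor_contact) >= lease_seconds


def may_promote_standby(primary_last_contact: float, now: float,
                        lease_seconds: float, safety_margin: float) -> bool:
    """The monitor waits for the lease plus a margin before promoting a standby,
    so the old primary has had time to demote itself first."""
    return (now - primary_last_contact) >= lease_seconds + safety_margin


# The primary last reached the monitor at t=100 and is then partitioned away.
primary = Node(name="node-a", is_primary=True, last_monitor_contact=100.0)

# At t=105 the lease (10s) is still valid: no demotion, no promotion.
early_demote = should_self_demote(primary, now=105.0, lease_seconds=10.0)
early_promote = may_promote_standby(100.0, now=105.0, lease_seconds=10.0, safety_margin=5.0)

# At t=110 the primary must step down, but the monitor still waits.
late_demote = should_self_demote(primary, now=110.0, lease_seconds=10.0)
mid_promote = may_promote_standby(100.0, now=110.0, lease_seconds=10.0, safety_margin=5.0)

# At t=115 the margin has passed and a standby may be promoted.
late_promote = may_promote_standby(100.0, now=115.0, lease_seconds=10.0, safety_margin=5.0)
```

Under these assumptions the two roles never overlap, but only as long as the keeper on the old primary is alive to enforce the demotion, which is exactly the hung-keeper case I am asking about.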


Is this available on Citus Cloud or any hosted Postgres as of today?



