Abort session if failover/failback is ongoing.
author Tatsuo Ishii <ishii@sraoss.co.jp>
Tue, 2 Apr 2019 03:56:01 +0000 (12:56 +0900)
committer Tatsuo Ishii <ishii@sraoss.co.jp>
Tue, 2 Apr 2019 03:56:01 +0000 (12:56 +0900)
While failover/failback is ongoing, the MASTER node macros cannot be
used safely. Using them could cause a segfault because the connection
to the master node may be NULL or bogus.

Several reported problems are suspected to be caused by this (see bugs
481 and 482, for example).

The function behind the MASTER* macros, pool_virtual_master_db_node_id(),
now checks Req_info->switching, which is true while failover/failback
is ongoing. If it is true, the process emits a warning message and
exits. I know a small window still remains, but this should greatly
reduce the chance of accessing a bogus MASTER connection without
using any locking.
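
The check described above can be illustrated with a minimal,
self-contained sketch. This is not the pgpool-II code itself:
DemoReqInfo, demo_master_usable(), demo_master_node_id() and
DEMO_EXIT_AND_RESTART are hypothetical stand-ins for Req_info,
the MASTER* lookup and POOL_EXIT_AND_RESTART, and the exit code
value is arbitrary.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Arbitrary exit code for this sketch (assumption, not pgpool-II's value). */
#define DEMO_EXIT_AND_RESTART 2

/* Hypothetical stand-in for the shared request info (Req_info). */
typedef struct
{
    volatile bool switching;    /* true while failover/failback is ongoing */
    int           master_node_id;   /* node currently regarded as master */
} DemoReqInfo;

/* Returns true if the master node may be looked up right now. */
static bool
demo_master_usable(const DemoReqInfo *req)
{
    return !req->switching;
}

/*
 * Returns the master node id, or terminates the process if a
 * failover/failback is in progress.
 *
 * No lock is taken, so the flag can still flip right after the check;
 * the check narrows the window but does not close it.
 */
static int
demo_master_node_id(const DemoReqInfo *req)
{
    if (!demo_master_usable(req))
    {
        fprintf(stderr, "WARNING: failover/failback is in progress\n");
        exit(DEMO_EXIT_AND_RESTART);
    }
    return req->master_node_id;
}
```

With switching set to false, demo_master_node_id() simply returns the
stored node id. With switching set to true, it prints the warning and
terminates the process, so the caller never touches a connection that
may be NULL or bogus.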

src/context/pool_query_context.c

index b237a8ca44f5347b21b18b8bd404b35e8f985675..837fb01af3e91b935833da540bc15489b7c7cbaf 100644
@@ -318,6 +318,20 @@ pool_virtual_master_db_node_id(void)
                return REAL_MASTER_NODE_ID;
        }
 
+       /*
+        * Check whether failover is in progress. If so, just abort this session.
+        */
+       if (Req_info->switching)
+       {
+               POOL_SETMASK(&BlockSig);
+               ereport(WARNING,
+                               (errmsg("failover/failback is in progress"),
+                                               errdetail("executing failover or failback on backend"),
+                                errhint("In a moment you should be able to reconnect to the database")));
+               POOL_SETMASK(&UnBlockSig);
+               child_exit(POOL_EXIT_AND_RESTART);
+       }
+
        if (sc->in_progress && sc->query_context)
        {
                int                     node_id = sc->query_context->virtual_master_node_id;