Contributor

@tgross tgross commented Aug 18, 2016

For #44

Move snapshots out of the health check and into a periodic task. This keeps the health check on the primary from failing when a snapshot takes longer than the heartbeat TTL we send to Consul.

Although the Shippable test rig is still intermittently failing regardless, I managed to get a working test run out of this:

----------------------------------------------------------------------
MySQLStackTest.test_replication_and_failover
----------------------------------------------------------------------
elapsed  | task
1.311280 | docker-compose -f local-compose.yml -p my up -d
0.265898 | docker-compose -f local-compose.yml -p my ps
0.028845 | docker inspect my_consul_1
27.19103 | wait_for_service: mysql-primary 1
0.269763 | docker-compose -f local-compose.yml -p my ps -q mysql
0.082411 | docker exec 2a0ea061b7bf672b64d901f38f9baf ip -o addr
0.356589 | assert_consul_correctness:
1.347476 | docker-compose -f local-compose.yml -p my scale mysql=3
27.99492 | wait_for_service: mysql 2
0.417261 | docker-compose -f local-compose.yml -p my ps -q mysql
0.150771 | docker exec 2a0ea061b7bf672b64d901f38f9baf ip -o addr
0.070887 | docker exec 4ecfea62e4743a52f222d21f6565f3 ip -o addr
0.120668 | docker exec 09fef2bc35185473c11ffb9434f9d1 ip -o addr
0.763964 | assert_consul_correctness:
0.162879 | docker exec my_mysql_1 mysql -u dbuser -p7WLNYNE6VV -e CREATE TABLE tbl1 (field1 INT, demodb
0.103156 | docker exec my_mysql_1 mysql -u dbuser -p7WLNYNE6VV -e INSERT INTO tbl1 (field1, fiel demodb
0.074242 | docker exec my_mysql_1 mysql -u dbuser -p7WLNYNE6VV -e INSERT INTO tbl1 (field1, fiel demodb
0.119470 | docker exec 09fef2bc3518 mysql -u repluser -pMSUI2ZP3V2 -e SHOW SLAVE STATUS\G; demodb
0.055543 | docker exec 09fef2bc3518 mysql -u dbuser -p7WLNYNE6VV -e SELECT * FROM tbl1 WHERE `fiel demodb
0.054787 | docker exec 4ecfea62e474 mysql -u repluser -pMSUI2ZP3V2 -e SHOW SLAVE STATUS\G; demodb
0.101544 | docker exec 4ecfea62e474 mysql -u dbuser -p7WLNYNE6VV -e SELECT * FROM tbl1 WHERE `fiel demodb
4.677375 | docker stop my_mysql_1
38.23517 | wait_for_service: mysql-primary 1
0.287686 | docker-compose -f local-compose.yml -p my ps -q mysql
0.012383 | docker exec 2a0ea061b7bf672b64d901f38f9baf ip -o addr
0.116804 | docker exec 4ecfea62e4743a52f222d21f6565f3 ip -o addr
0.068318 | docker exec 09fef2bc35185473c11ffb9434f9d1 ip -o addr
0.489333 | assert_consul_correctness:
0.002522 | wait_for_service: mysql 1
0.285395 | docker-compose -f local-compose.yml -p my ps -q mysql
0.013100 | docker exec 2a0ea061b7bf672b64d901f38f9baf ip -o addr
0.118188 | docker exec 4ecfea62e4743a52f222d21f6565f3 ip -o addr
0.103041 | docker exec 09fef2bc35185473c11ffb9434f9d1 ip -o addr
0.523726 | assert_consul_correctness:
0.074795 | docker exec 09fef2bc3518 mysql -u dbuser -p7WLNYNE6VV -e INSERT INTO tbl1 (field1, fiel demodb
0.083281 | docker exec 4ecfea62e474 mysql -u repluser -pMSUI2ZP3V2 -e SHOW SLAVE STATUS\G; demodb
0.101003 | docker exec 4ecfea62e474 mysql -u dbuser -p7WLNYNE6VV -e SELECT * FROM tbl1 WHERE `fiel demodb
6.470444 | docker-compose -f local-compose.yml -p my stop
0.335032 | docker-compose -f local-compose.yml -p my rm -f
.
----------------------------------------------------------------------
Ran 1 test in 115.427s

OK

cc @misterbisson

{
"name": "snapshot",
"command": "python /usr/local/bin/manage.py snapshot_task",
"frequency": 5,
Contributor

Is this correct? Don't we need to provide the unit?

Contributor Author

@tgross tgross Aug 18, 2016

We're tolerant of fuzzy inputs. Our parsing for these defaults to seconds when we pass in an int (ref), so this would be 5 seconds. That said, since this is intended as an example, we should probably pass in "5s" here.
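To illustrate the "bare int means seconds" behavior, here is a rough Python sketch. It is not the real parser linked above; `parse_duration_seconds` and the set of unit suffixes it accepts are assumptions made up for this example.

```python
import re

# Hypothetical unit table for this sketch; the real parser may accept other suffixes.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration_seconds(value):
    """Return a duration in seconds, treating a bare number as seconds."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise ValueError("unrecognized duration: %r" % (value,))
    if isinstance(value, (int, float)):
        # Fuzzy input: no unit given, so assume seconds.
        return float(value)
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)\s*", str(value))
    if match is None:
        raise ValueError("unrecognized duration: %r" % (value,))
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]
```

Under these assumptions, `5` and `"5s"` both come out as five seconds, which is why the explicit string is the clearer choice for an example config.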

Contributor Author

Fixed.

Contributor

I was also concerned that a 5 second interval was awfully fast, but upon reading the code, we're only checking to see if we should back up every 5 seconds. Actual backup frequency is defined elsewhere.

Is it worth renaming this snapshot_check to clarify that?
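For anyone reading along, the difference between the 5 second check and the actual backup cadence can be sketched roughly as follows. This is not the code in `manage.py`; `snapshot_is_due`, `snapshot_check`, and the 24 hour interval are hypothetical names and values chosen only for illustration.

```python
import time

# Hypothetical value: the real backup interval is defined elsewhere in the project.
BACKUP_INTERVAL_SECONDS = 24 * 60 * 60


def snapshot_is_due(last_backup_at, now, interval=BACKUP_INTERVAL_SECONDS):
    """Return True if there is no backup yet or the last one is at least `interval` old."""
    if last_backup_at is None:
        return True
    return (now - last_backup_at) >= interval


def snapshot_check(last_backup_at, take_snapshot, now=None):
    """Run on every tick of the periodic task; snapshot only when one is due.

    Returns the timestamp of the most recent backup after this tick.
    """
    now = time.time() if now is None else now
    if snapshot_is_due(last_backup_at, now):
        take_snapshot()
        return now
    # Cheap path: nothing to do on this tick.
    return last_backup_at
```

The periodic task would call something like `snapshot_check` every 5 seconds, but `take_snapshot` only runs once the configured interval has elapsed, so most ticks return immediately.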

Contributor Author

Agreed and done. (Rebased these tweaks onto the original.)

@misterbisson
Contributor

🏡 🚶

@tgross tgross merged commit f963037 into autopilotpattern:master Aug 18, 2016