QCon London 2012: High availability at Heroku

7 mars 2012 Durée de lecture : 2 minutes

Cet article fait partie de la série QCon London 2012

Mark Mebranaghan
Track: High available systems

Architecture

Lots of load balancing.

Embrace crashing and enabling supervision (like erlang):

Message passing (json format). Handle different versions of messages.

Continuously running.

It’s a distributed system with granular failure.

Brokered queueing (producer/consumer pattern):

RabbitMQ used for a while
the broker node is a SPOF : publish to one, subscribe to all
several brokers : load balanced publish to one, all subscribers subscribed to all brokers

Read call-graph partial failure: graceful termination.

Write call-graph de-synchronization : write a ticket (to a local database) to delay write operations when not available.

Evolving socio-technical ecosystems. Most of the problems are:

Need for a very repeatable deploy:

incremental deploys: deploy to a few nodes and incresing deploy perimeter when confidence grows
incremental rollouts (features): feature flags for dev, beta users then all users
real time visibility: dashboards (60s visibility)
service level assertions : asserts in code, global level for a service. Assert good things too

Flow control and back pressure : some systems can’t absorb all load:

Avez-vous écrit une réponse à ce contenu ? Donnez-moi l'URL concernée :