Hello friends, are you enjoying this article series on microservice architecture? If so, you are ready for yet another article. After event-driven design, it is now time for circuit breakers! You may have come across circuit breakers in electronics, but how are they related to microservice architecture, and why are they needed? We have discussed that microservice architecture has many advantages such as loose coupling and reusability. But at some point the architecture gets complicated, because in most cases each user action invokes multiple services. When one or more of those services are unavailable, the failure can cascade to the services that depend on them. On top of this, retry logic in the service clients makes things worse by sending even more requests to a service that is already struggling. So who is the savior under such circumstances? Yes, you guessed it right, circuit breakers!
The circuit breaker is a design pattern that helps prevent cascading failures across multiple systems. It helps in building a fault-tolerant, resilient system that survives when services are unavailable or are experiencing high latency. The system being protected can be complicated, but the logic behind the circuit breaker is quite simple. The protected function call is wrapped in a circuit breaker object, which monitors failures. If the number of failures reaches a predefined threshold, the circuit breaker trips. This is the same logic used in real circuit breakers: if the current exceeds a safe level, the breaker trips. In our case, once the circuit breaker has tripped, all further calls to it return immediately with an error, and the protected call is not made at all. Usually, a monitoring alert is raised as well!
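To make the wrapping idea concrete, here is a minimal sketch in Python. The names SimpleCircuitBreaker, CircuitOpenError, and failure_threshold are made up for illustration and do not come from any library. The sketch also assumes that consecutive failures are what we count, which is only one of several reasonable policies.

```python
class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without running it."""


class SimpleCircuitBreaker:
    """Wraps a function and trips after too many consecutive failures.

    Hypothetical illustration: it never resets once tripped.
    """

    def __init__(self, func, failure_threshold=3):
        self.func = func
        self.failure_threshold = failure_threshold
        self.failure_count = 0
        self.tripped = False

    def call(self, *args, **kwargs):
        if self.tripped:
            # Fail fast: the protected function is not invoked at all.
            raise CircuitOpenError("circuit breaker is open")
        try:
            result = self.func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.tripped = True
            raise  # the caller still sees the original error
        self.failure_count = 0  # a success resets the consecutive count
        return result
```

With a threshold of 3, a supplier that always fails is invoked only three times. Every later call raises CircuitOpenError straight from the breaker, so the struggling supplier is left alone.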
A circuit breaker has three states: the Closed State, the Open State, and the Half-Open State. When the circuit breaker is in the Closed State, all calls go through to the supplier microservice, which responds without unusual latency. Let us discuss each state in detail.
When the supplier microservice experiences an issue such as slow responses, the circuit breaker starts receiving timeouts for its requests. It is like knocking on someone’s door and waiting for an answer. If there is no response or, in technical terms, if the number of timeouts reaches a predefined threshold, the circuit breaker trips. Until now, the circuit breaker was in the Closed State; once it trips, it is in the Open State. Calls are now returned with an error by the circuit breaker itself, which does not even forward the request to the supplier microservice. Without the trip, the situation could get worse, because callers would keep piling requests onto a struggling service. With the trip, the supplier microservice gets a chance to shed load and recover. To recap, as long as everything is working without issues, the circuit breaker stays in the Closed State.
Something is wrong and there is no response, which is why the circuit breaker is in the Open State. But the issue will not last forever, right? Otherwise there would be little point in having the system at all. So the circuit breaker needs to know whether the supplier microservice has recovered. This is done with a monitoring and feedback mechanism: a trial call is made to the supplier microservice periodically to check its status. Trial calls take place in the Half-Open State. So after the circuit breaker has been in the Open State for a set wait period, it moves to the Half-Open State. If the supplier microservice doesn’t reply to the trial call within the timeout, the circuit breaker switches back from the Half-Open State to the Open State. If the supplier microservice does respond, the circuit breaker goes back to the Closed State.
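The three states described so far can be sketched as a small Python class. This is an illustration under a few assumptions, not a reference implementation. The names CircuitBreaker, reset_timeout, and clock are hypothetical. Instead of a background probe, the first real call that arrives after the wait period is used as the trial call. The clock is injectable so the behavior can be checked without actually waiting.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitOpenError(Exception):
    """Raised while the breaker is open and the wait period has not elapsed."""


class CircuitBreaker:
    def __init__(self, func, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.func = func
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds to stay open before a trial
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = None

    def call(self, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Still open: do not forward the request to the supplier.
                raise CircuitOpenError("supplier is presumed unhealthy")
            # Wait period is over: let this call through as the trial call.
            self.state = State.HALF_OPEN
        try:
            result = self.func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        # Any success, including a successful trial, closes the circuit.
        self.failures = 0
        self.state = State.CLOSED

    def _on_failure(self):
        self.failures += 1
        trial_failed = self.state is State.HALF_OPEN
        if trial_failed or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A failed trial call restarts the wait period, so the supplier is probed at most once per reset_timeout while it stays unhealthy.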
We have now discussed all three states. From the Half-Open State there are two possible transitions: to the Closed State or back to the Open State. The circuit breaker doesn’t remain in the Half-Open State for long; it is only meant for checking the status of the supplier microservice.
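If it helps, the allowed transitions can also be written down as a small lookup table. The event names below (threshold_reached, wait_elapsed, trial_success, trial_failure) are invented for this sketch; the article itself only names the states.

```python
# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    ("closed", "threshold_reached"): "open",
    ("open", "wait_elapsed"): "half-open",
    ("half-open", "trial_success"): "closed",
    ("half-open", "trial_failure"): "open",
}


def next_state(state, event):
    """Return the state after `event`; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Note that Half-Open is the only state with two ways out, and there is no direct path from Open to Closed: the breaker always has to pass a trial call first.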
The circuit breaker is quite valuable because it monitors the health of calls in near real time and changes state as soon as the failure threshold is crossed. This helps keep the rest of the system available. The state changes also help in finding faults, which leads to code improvement as well. For example, if a particular service isn’t available, why isn’t it available? Where did the problem occur? Such questions come up naturally. In short, the circuit breaker pattern helps in handling downtime, failing fast instead of waiting on timeouts, and reducing the load on struggling services. Now that we have discussed the basic concepts and states of the circuit breaker, it is time to discuss the implementation part.
Hystrix is a library designed to provide latency tolerance and fault tolerance. It isolates the points of access to remote systems, services, and third-party libraries. It stops cascading failures and enables resilience in complex distributed systems. Hystrix was open-sourced by Netflix in 2012 and became one of the most widely used libraries for implementing the circuit breaker pattern.
So how does Hystrix achieve the circuit breaker pattern? First, it wraps all calls to external systems in a HystrixCommand or HystrixObservableCommand object, which typically executes in a separate thread. Second, it times out calls that take longer than a predefined threshold. There is a default threshold, but you are free to customize it per command through Hystrix properties. Third, it maintains a small thread pool for each dependency. If this pool is full, requests for that dependency are immediately rejected instead of being queued. Fourth, it maintains a record of successes, failures, timeouts, and thread rejections. This helps in tracking performance and finding scope for improvement. Fifth, it trips the circuit breaker to stop all requests to a particular service for a period of time when the error rate for that service passes a threshold. Sixth, it performs fallback logic when a request fails, times out, or is rejected. It also monitors metrics and configuration changes in near real time. This is how Hystrix achieves the circuit breaker pattern.
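Hystrix itself is a Java library, and the snippet below does not use its API. It is only a rough Python sketch of four of the ideas above: running the call in a small dedicated thread pool, timing it out, rejecting work when the pool is busy, and falling back while counting outcomes. The class GuardedCommand and its parameters are hypothetical names chosen for this illustration, and the circuit-tripping step is left out to keep it short.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class GuardedCommand:
    """Hypothetical wrapper: timeout, bounded pool, fallback, and metrics."""

    def __init__(self, func, fallback, timeout=1.0, pool_size=2):
        self.func = func
        self.fallback = fallback
        self.timeout = timeout
        self.pool = ThreadPoolExecutor(max_workers=pool_size)
        # The executor queues work without limit, so a semaphore enforces
        # "reject immediately when every worker is busy".
        self.slots = threading.BoundedSemaphore(pool_size)
        self.metrics = Counter()
        self._lock = threading.Lock()

    def _record(self, outcome):
        with self._lock:
            self.metrics[outcome] += 1

    def _run(self, args):
        try:
            return self.func(*args)
        finally:
            self.slots.release()  # free the slot only when the work really ends

    def execute(self, *args):
        if not self.slots.acquire(blocking=False):
            self._record("rejected")
            return self.fallback(*args)
        future = self.pool.submit(self._run, args)
        try:
            result = future.result(timeout=self.timeout)
        except FutureTimeout:
            # The worker thread may still be running; only the caller gives up.
            self._record("timeout")
            return self.fallback(*args)
        except Exception:
            self._record("failure")
            return self.fallback(*args)
        self._record("success")
        return result
```

One design point worth noticing: a timed-out call keeps its pool slot until the underlying work actually finishes, so a slow dependency quickly fills its own small pool and starts getting rejected instead of tying up the caller's threads.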
So how do microservice architectures and Hystrix relate? Microservice deployments on cloud-native platforms like CloudFoundry and Kubernetes can leverage Hystrix to become more resilient. In addition, one can use the Hystrix Dashboard for real-time visualization of the collected metrics. Most microservice architectures are complex, but one can still get a clear picture of them with tools like Zoomdata, Striim, or TIBCO Live Datamart. In addition to monitoring live streaming data, these tools also help with continuous aggregations, rules, consolidation, and predictive analytics for automated or human-driven decision-making.
In this article, we discussed the importance of circuit breakers. The concept in itself isn’t difficult. We began by discussing the concept and the need for circuit breakers in microservice architecture. We then discussed the three states of a circuit breaker: closed, open, and half-open. That was the basic conceptual discussion.
In the next section, we discussed the implementation part and studied Hystrix, Netflix’s open-source implementation, and the importance of the library. We then moved on to integration, that is, Hystrix and the middleware around it. Overall, the most important thing to understand is that if you are doing circuit breakers the right way, they don’t need to be coded into individual microservices. They are a part of the infrastructure itself! Circuit breakers guard the system against transient failures and buggy code. In addition, they provide vital information about the health of the microservices ecosystem. So always try to implement and integrate circuit breakers! See you guys in the upcoming article!
Here is the link to the previous article of this series.