Retrying and recovery via Spring Boot using Spring Retry.

Sep 28, 2023

As per the official Spring documentation, this is why we use retry functionality:

To make processing more robust and less prone to failure, it sometimes helps to automatically retry a failed operation in case it might succeed on a subsequent attempt. Errors that are susceptible to intermittent failure are often transient in nature. Examples include remote calls to a web service that fails because of a network glitch or a DeadlockLoserDataAccessException in a database update.

Pre-requisites

Maven (to pull dependencies).
JDK 8+.
An IDE of your choice. (I prefer Spring Tool Suite, which comes with out-of-the-box features for Spring and Spring Boot ;) )

As part of the demo, we will run 2 minimal Spring Boot applications in a client-server architecture, where the Client has retry capabilities to connect to and pull data from the Server in case of intermittent failures. We will simulate failures by bringing the Server application down and back up while the Client's retry policy is in effect.

We will use Spring Initializr to generate two minimal Spring Boot applications with the Web dependency, one for the client and one for the server. We will add the remaining dependencies as we go.

Server

There is nothing special here: a simple controller that exposes an endpoint returning a String. We have changed the server port to 9001 (for example, with server.port=9001 in application.properties) to avoid conflicts.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
public class ServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(ServerApplication.class, args);
    }

}

@RestController
class ServerResource {

    @GetMapping(path = "/generate")
    public String generate() {
        return "Server Response";
    }
}

Client

The client simply consumes the service exposed by the Server and prints the response to the logs. The plain flow, without any retry, looks like this:

// Assumption: _rt is a RestTemplate field and LOGGER is the class logger.
LOGGER.info("Connecting to Server");
ResponseEntity<String> responseFromServer = _rt.exchange(new URI("http://localhost:9001/generate"),
                HttpMethod.GET, null, String.class);
LOGGER.info(String.format("Response received from Server: %s", responseFromServer.getBody()));

If a network glitch or another intermittent problem prevents the Client from connecting to the Server, the call throws an exception wrapped as org.springframework.web.client.RestClientException. In a real-world scenario, for such critical services we would want to retry before giving up, and only then return a wrapped error response or fetch the data from another source.

We will look at a brute-force approach first, and then at the out-of-the-box approach that Spring offers with Spring Retry.

Brute Force Approach

We will use the tried and tested brute-force approach: a counter and a retry with a delay whenever the client is not able to connect to the Server. In this case we retry if we catch org.springframework.web.client.RestClientException.

public void callServerBruteForce() throws InterruptedException, URISyntaxException {

        int maxRetry = 5;
        int retryAttempts = 0;

        while (retryAttempts < maxRetry) {
            LOGGER.info(String.format("Retry attempt: %d", retryAttempts + 1));
            try {
                ResponseEntity<String> responseFromServer = _rt.exchange(new URI("http://localhost:9001/generate"),
                        HttpMethod.GET, null, String.class);
                LOGGER.info(String.format("Response received from Server: %s", responseFromServer.getBody()));
                break;
            } catch (RestClientException e) {
                LOGGER.error(String.format("Unable to connect to Server: %s", e.getMessage()));
                retryAttempts++;
                Thread.sleep(2000);
            }
        }

        if (retryAttempts == maxRetry) { // Recover if all retries are exhausted.
            LOGGER.info("Do something else, as the client was not able to call Server.");
        }

    }

The code is self-explanatory: we make up to 5 attempts, 2 seconds apart, before we attempt to recover. When the server is down, the logs show the client attempting 5 times before triggering the recovery inside the if condition.

This works, but the code gets quite messy with the counter, the delay, and so on.

Spring-based Retry Approach

We need the following additional dependencies in the Client application (AOP is required for the declarative approach at runtime):

<dependency>
        <groupId>org.springframework.retry</groupId>
        <artifactId>spring-retry</artifactId>
</dependency>
<dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-aop</artifactId>
</dependency>
<dependency>
        <groupId>org.aspectj</groupId>
        <artifactId>aspectjweaver</artifactId>
</dependency>

There are 2 primary annotations in this declarative approach to achieve Retry-Recover functionality:

@Retryable
@Recover

As the names suggest, @Retryable sets up the retry policy configuration, and @Recover marks the method that recovers once the retries are exhausted.

The Spring Retry version of the brute-force approach above is cleaner and more declarative.
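A minimal sketch of what the declarative version could look like. The class name RetryableClient, the method names, and the fallback string are assumptions made for illustration; the URL, the exception type, the 5 attempts, and the 2-second delay are taken from the brute-force example. The sketch also assumes that @EnableRetry is present on a configuration class, since the annotations have no effect without it.

```java
import java.net.URI;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpMethod;
import org.springframework.http.ResponseEntity;
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClientException;
import org.springframework.web.client.RestTemplate;

// Assumption: a @Configuration class (for example the @SpringBootApplication class)
// is also annotated with @EnableRetry; without it the annotations below are ignored.
@Service
class RetryableClient {

    private static final Logger LOGGER = LoggerFactory.getLogger(RetryableClient.class);

    private final RestTemplate restTemplate = new RestTemplate();

    // Up to 5 attempts in total, waiting 2 seconds after each failed attempt.
    // Note: Spring Retry 2.x deprecates "value" in favor of "retryFor".
    @Retryable(value = RestClientException.class, maxAttempts = 5, backoff = @Backoff(delay = 2000))
    public String callServer() {
        LOGGER.info("Connecting to Server");
        ResponseEntity<String> response = restTemplate.exchange(
                URI.create("http://localhost:9001/generate"), HttpMethod.GET, null, String.class);
        LOGGER.info("Response received from Server: {}", response.getBody());
        return response.getBody();
    }

    // Invoked once all attempts have failed. The return type matches callServer(),
    // and the first parameter receives the exception thrown by the last attempt.
    @Recover
    public String recover(RestClientException e) {
        LOGGER.info("Do something else, as the client was not able to call Server.");
        return "Fallback Response";
    }
}
```

Because the retry logic is applied through a proxy, callServer() has to be invoked from another bean; a call from inside RetryableClient itself would bypass the retry.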

Now we keep the server down and let all retries get exhausted so that recovery is triggered. The logs confirm this. (NOTE: The delay is 2 seconds, but the logs show about 4 seconds between attempts ;) because the REST call itself takes some time before it throws RestClientException.)

To verify where the 4 seconds come from, we can throw an exception directly instead of calling the server; the retry attempts then happen 2 seconds apart:

Spring Retry Logs

Now each retry attempt comes 2 seconds after the previous one, as there is no real processing or I/O call. The idea is that the configured delay starts counting only after the executing method has thrown its exception.

Now we revert that change and bring the server up after the client has already made some calls, to show that once the server comes online, one of the retries succeeds.

As the first screenshot shows, the client made 2 attempts that failed because the server was down. As soon as the server came up (second screenshot), the client received a response from the server and stopped retrying.

@Retryable and @Recover can be used on any method; it doesn't have to be an HTTP call. One use case is a cache backed by a database: if the call to fetch data from the cache doesn't complete, resulting in a cache miss, you can recover via the database. So it's a very useful, clean, and declarative approach to retrying.

Bonus

The above shows how to use Retry and Recover. We will now cover 2 additional useful features of these annotations that may be required in a real-world scenario.

If you have multiple @Retryable and @Recover methods, you can use the recover attribute of @Retryable to point to the right recovery method. Its value should be the name of the method annotated with @Recover.
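A minimal sketch of the recover attribute, assuming Spring Retry 1.3 or later (where the attribute is available) and @EnableRetry on a configuration class. The class, the method names, and the fallback strings are hypothetical, and IllegalStateException only stands in for a real failure. Both recovery methods have the same signature, so the recover attribute is what ties each retryable method to its own fallback.

```java
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

// Hypothetical service with two retryable operations and two recovery methods.
@Service
class MultiRecoverService {

    // Falls back to recoverOrders(...) once its 3 attempts are exhausted.
    @Retryable(value = IllegalStateException.class, maxAttempts = 3,
            backoff = @Backoff(delay = 2000), recover = "recoverOrders")
    public String fetchOrders() {
        throw new IllegalStateException("orders backend unavailable"); // simulated failure
    }

    // Falls back to recoverUsers(...) once its 3 attempts are exhausted.
    @Retryable(value = IllegalStateException.class, maxAttempts = 3,
            backoff = @Backoff(delay = 2000), recover = "recoverUsers")
    public String fetchUsers() {
        throw new IllegalStateException("users backend unavailable"); // simulated failure
    }

    @Recover
    public String recoverOrders(IllegalStateException e) {
        return "orders-fallback";
    }

    @Recover
    public String recoverUsers(IllegalStateException e) {
        return "users-fallback";
    }
}
```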

Say you want to retry with an increasing delay rather than the fixed delay used above. For that we can use the attributes available on the @Backoff annotation.
We will write a simulation method that doubles the delay after every attempt. With a start delay of 2000 ms, a multiplier of 2, and 5 attempts, the waits before the retries are 2, 4, 8, and 16 seconds, so the attempts happen at roughly 0, 2, 6, 14, and 30 seconds after the first call. The logs confirm the doubling delays.
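A minimal sketch of such a simulation method, assuming @EnableRetry on a configuration class. The class and method names are hypothetical, and the method throws IllegalStateException on purpose so that every attempt fails and the timestamps make the growing delay visible.

```java
import java.time.LocalTime;

import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

// Hypothetical simulation bean: every attempt fails on purpose.
@Service
class ExponentialBackoffSimulation {

    // 5 attempts in total; the wait starts at 2000 ms and doubles after each failure.
    @Retryable(value = IllegalStateException.class, maxAttempts = 5,
            backoff = @Backoff(delay = 2000, multiplier = 2))
    public void simulate() {
        System.out.println("Attempt at " + LocalTime.now());
        throw new IllegalStateException("simulated failure");
    }

    // Runs after the fifth failed attempt.
    @Recover
    public void recover(IllegalStateException e) {
        System.out.println("Retries exhausted at " + LocalTime.now());
    }
}
```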

Sometimes you may want the delay to stop growing after a certain number of attempts. Here we can use maxDelay: even if the multiplied value for the next retry is larger than maxDelay, the retry waits only for the time specified in maxDelay.
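To make the arithmetic concrete, here is a small plain-Java model of the wait schedule. It is only an illustration of the multiply-then-cap rule, not Spring Retry's own implementation; the class BackoffSchedule and the 8000 ms cap are assumptions chosen for the example, which corresponds to a configuration such as @Backoff(delay = 2000, multiplier = 2, maxDelay = 8000) with 5 attempts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that models the waits between retry attempts.
class BackoffSchedule {

    // Returns the wait (in ms) before each retry: the delay is multiplied after
    // every failed attempt and capped at maxDelayMs.
    static List<Long> waits(long delayMs, double multiplier, long maxDelayMs, int maxAttempts) {
        List<Long> waits = new ArrayList<>();
        double current = delayMs;
        // maxAttempts includes the first call, so there are maxAttempts - 1 waits.
        for (int retry = 1; retry < maxAttempts; retry++) {
            waits.add(Math.min((long) current, maxDelayMs));
            current *= multiplier;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Start at 2000 ms, double each time, cap at 8000 ms, 5 attempts in total.
        System.out.println(waits(2000, 2.0, 8000, 5)); // prints [2000, 4000, 8000, 8000]
    }
}
```

Without the cap the last wait would be 16000 ms; with it, the schedule flattens out at 8000 ms from the third retry onward.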



Originally published at https://virendraoswal.com.