1. Overview

In this article, we’re going to show how to expand URLs using HttpClient.

A simple example is when the original URL has been shortened once – by a service such as bit.ly.

A more complex example is when the URL has been shortened multiple times, by different such services, and it takes multiple passes to get to the original full URL.

If you want to dig deeper and learn other cool things you can do with the HttpClient – head on over to the main HttpClient tutorial.

2. Expand the URL Once

Let’s start simple, by expanding a URL that has been passed through a URL shortening service only once.

The first thing we’ll need is an HTTP client that doesn’t automatically follow redirects:

CloseableHttpClient httpClient = 
  HttpClientBuilder.create().disableRedirectHandling().build();

This is necessary because we’ll need to manually intercept the redirect response and extract information out of it.

We start by sending a HEAD request to the shortened URL. The response we get back will typically be a 301 Moved Permanently, although some services answer with a 302 Found instead, so we handle both.

Then, we need to extract the Location header, which points to the next URL and, in this case, the final one:

private String expandSingleLevel(final String url) throws IOException {
    try {
        // HEAD is enough: we only need the status code and headers, not the body
        HttpHead request = new HttpHead(url);
        String expandedUrl = httpClient.execute(request, response -> {
            final int statusCode = response.getCode();
            // not a redirect: the URL is already fully expanded
            if (statusCode != 301 && statusCode != 302) {
                return url;
            }
            final Header[] headers = response.getHeaders(HttpHeaders.LOCATION);
            // we expect exactly one Location header on a redirect
            Preconditions.checkState(headers.length == 1);

            return headers[0].getValue();
        });
        return expandedUrl;
    } catch (final IllegalArgumentException uriEx) {
        // the URL could not be turned into a valid URI, so return it unchanged
        return url;
    }
}
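
One detail the method above leaves open is that a Location header may carry a relative reference (for example, /other) rather than an absolute URL. A minimal sketch of resolving it against the request URL with the standard java.net.URI class could look like this; the LocationResolver class and the resolveLocation method are hypothetical names, not part of the original code:

```java
import java.net.URI;

// Hypothetical helper, not part of the article's original code.
public class LocationResolver {

    // Resolves a Location header value against the URL that was requested.
    // An absolute Location is returned unchanged; a relative one is
    // resolved against the request URL.
    public static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        // absolute Location value
        System.out.println(resolveLocation(
          "http://bit.ly/3LScTri", "https://www.baeldung.com/rest-versioning"));
        // relative Location value
        System.out.println(resolveLocation("http://bit.ly/3LScTri", "/other"));
    }
}
```

With such a helper in place, expandSingleLevel could return resolveLocation(url, headers[0].getValue()) instead of the raw header value.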

Finally, a simple live test that un-shortens a bit.ly URL:

@Test
public final void givenShortenedOnce_whenUrlIsExpanded_thenCorrectResult() throws IOException {
    final String expectedResult = "https://www.baeldung.com/rest-versioning";
    final String actualResult = expandSingleLevel("http://bit.ly/3LScTri");
    assertThat(actualResult, equalTo(expectedResult));
}

3. Process Multiple URL Levels

The problem with short URLs is that they may be shortened multiple times, by altogether different services. Expanding such a URL takes multiple passes to get to the original URL.

We’re going to apply the expandSingleLevel primitive operation defined previously to simply iterate through all the intermediary URLs and get to the final target:

public String expand(String urlArg) throws IOException {
    String originalUrl = urlArg;
    String newUrl = expandSingleLevel(originalUrl);
    while (!originalUrl.equals(newUrl)) {
        originalUrl = newUrl;
        newUrl = expandSingleLevel(originalUrl);
    }
    return newUrl;
}

Now, with the new mechanism of expanding multiple levels of URLs, let’s define a test and put this to work:

@Test
public final void givenShortenedMultiple_whenUrlIsExpanded_thenCorrectResult() throws IOException {
    final String expectedResult = "https://www.baeldung.com/rest-versioning";
    final String actualResult = expand("http://t.co/e4rDDbnzmk");
    assertThat(actualResult, equalTo(expectedResult));
}

This time, the short URL http://t.co/e4rDDbnzmk, which is actually shortened twice (once via bit.ly and a second time via the t.co service), is correctly expanded to the original URL.

4. Detect Redirect Loops

Finally, some URLs cannot be expanded because they form a redirect loop. HttpClient would normally detect this type of problem, but since we turned off automatic redirect following, it no longer does.

The final step in the URL expansion mechanism is going to be detecting the redirect loops and failing fast in case such a loop occurs.

For this to be effective, we need some additional information out of the expandSingleLevel method we defined earlier. Namely, we also need to return the status code of the response along with the URL.

Since Java doesn’t support multiple return values, we’re going to wrap the information in an org.apache.commons.lang3.tuple.Pair object. The new signature of the method will now be:

public Pair<Integer, String> expandSingleLevelSafe(String url) throws IOException {
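
The body of this method is not shown here. As a rough sketch of the decision it has to make, the following standalone version takes the status code and the Location values as plain arguments and uses a JDK Map.Entry in place of Pair, so that it runs without extra dependencies. The SingleLevelDecision class and the statusAndUrl method are hypothetical names; in the real method, these values would come from the HttpClient response, as in expandSingleLevel:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical, dependency-free sketch; the article's method returns a commons-lang3 Pair.
public class SingleLevelDecision {

    // Returns (statusCode, nextUrl). For a 301/302 response, nextUrl is the
    // single Location value; for any other status, it is the original URL.
    public static Map.Entry<Integer, String> statusAndUrl(
      String url, int statusCode, List<String> locations) {
        boolean isRedirect = statusCode == 301 || statusCode == 302;
        if (!isRedirect) {
            return new SimpleImmutableEntry<>(statusCode, url);
        }
        // mirrors the Preconditions.checkState call in expandSingleLevel
        if (locations.size() != 1) {
            throw new IllegalStateException("Expected exactly one Location header");
        }
        return new SimpleImmutableEntry<>(statusCode, locations.get(0));
    }

    public static void main(String[] args) {
        System.out.println(statusAndUrl("http://bit.ly/3LScTri", 301,
          Arrays.asList("https://www.baeldung.com/rest-versioning")));
        System.out.println(statusAndUrl("https://www.baeldung.com/rest-versioning", 200,
          Collections.<String>emptyList()));
    }
}
```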

And finally, let’s include the redirect cycle detection in the main expand mechanism:

public String expandSafe(String urlArg) throws IOException {
    String originalUrl = urlArg;
    String newUrl = expandSingleLevelSafe(originalUrl).getRight();
    List<String> alreadyVisited = Lists.newArrayList(originalUrl, newUrl);
    while (!originalUrl.equals(newUrl)) {
        originalUrl = newUrl;
        Pair<Integer, String> statusAndUrl = expandSingleLevelSafe(originalUrl);
        newUrl = statusAndUrl.getRight();
        boolean isRedirect = statusAndUrl.getLeft() == 301 || statusAndUrl.getLeft() == 302;
        if (isRedirect && alreadyVisited.contains(newUrl)) {
            throw new IllegalStateException("Likely a redirect loop");
        }
        alreadyVisited.add(newUrl);
    }
    return newUrl;
}

And that’s it. The expandSafe mechanism is able to expand a URL that goes through an arbitrary number of URL shortening services, while correctly failing fast on redirect loops.
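
The visited-URL check can also be exercised without any network access. The following is a minimal sketch, assuming each expansion step is supplied as a function from a URL to the next URL (returning the same URL when there is no redirect). The RedirectChain class, the follow method, and the maxHops cap are illustrative assumptions, not part of the original code. It uses a HashSet so that the membership check stays cheap on long chains, and adds a hop limit as a second safeguard for chains that never repeat a URL:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; nextHop stands in for a single-level expansion step.
public class RedirectChain {

    // Follows nextHop until it returns its own input (no further redirect).
    // Throws if a URL repeats or if more than maxHops redirects are followed.
    public static String follow(String start, Function<String, String> nextHop, int maxHops) {
        Set<String> visited = new HashSet<>();
        visited.add(start);
        String current = start;
        for (int hops = 0; hops <= maxHops; hops++) {
            String next = nextHop.apply(current);
            if (next.equals(current)) {
                return current; // final target reached
            }
            if (!visited.add(next)) {
                throw new IllegalStateException("Likely a redirect loop at " + next);
            }
            current = next;
        }
        throw new IllegalStateException("Too many redirects");
    }

    public static void main(String[] args) {
        // in-memory redirect table used in place of real HTTP calls
        Map<String, String> redirects = new HashMap<>();
        redirects.put("http://t.co/e4rDDbnzmk", "http://bit.ly/3LScTri");
        redirects.put("http://bit.ly/3LScTri", "https://www.baeldung.com/rest-versioning");

        String target = follow("http://t.co/e4rDDbnzmk",
          url -> redirects.getOrDefault(url, url), 10);
        System.out.println(target);
    }
}
```

In the article's setup, the nextHop argument would be something like url -> expandSingleLevelSafe(url).getRight(), with the checked IOException handled inside the lambda.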

5. Conclusion

This tutorial discussed how to expand short URLs in Java using the Apache HttpClient.

We started with a simple use case with a URL that is only shortened once and then implemented a more generic mechanism, capable of handling multiple levels of redirects and detecting redirect loops in the process.

The implementation of these examples is available over on GitHub.

