Unshorten URLs with PHP and cURL

While working on a site that provides previews of URLs embedded in tweets using the awesome PhantomJS scriptable WebKit browser, I encountered difficulties whenever a URL shortener such as Bit.ly was used (as is almost always done when tweeting out a link to an interesting article or photo).

After a little experimenting, I discovered that cURL makes it super easy to un-shorten a URL that has been shortened, without depending on a third-party service.

Improving slightly on this function, here’s how I did it in PHP:

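function unshorten_url($url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => TRUE,  // the magic sauce: follow every redirect
        CURLOPT_RETURNTRANSFER => TRUE,  // return the body instead of echoing it
        CURLOPT_SSL_VERIFYHOST => 0,     // suppress certain SSL errors (integer option: 0 = off, 2 = verify)
        CURLOPT_SSL_VERIFYPEER => FALSE, // skips certificate checks; don't reuse for sensitive requests
    ));
    curl_exec($ch);
    // The last URL cURL requested after following redirects
    $url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    return $url;
}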

That’s all you have to do. This works even if multiple URL shorteners are used in sequence on the same link. If the very first request fails, you’ll just get the same link back that you started with; if a later hop fails, you’ll get the last URL that cURL managed to reach.
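Because the function above returns only a URL, the caller can't tell a successful lookup from a failed one. Here is a minimal sketch of a variant that also reports the redirect count and any cURL error. The name unshorten_url_checked, the array keys, and the timeout values are my own assumptions for illustration, not part of the original function.

```php
<?php
// Hypothetical variant of unshorten_url() that also reports what happened.
function unshorten_url_checked($url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5,   // assumed values; tune for your site
        CURLOPT_TIMEOUT        => 10,
    ));
    // With CURLOPT_RETURNTRANSFER set, curl_exec() returns false on failure.
    $ok = curl_exec($ch);
    return array(
        'url'       => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),   // last URL requested
        'redirects' => curl_getinfo($ch, CURLINFO_REDIRECT_COUNT),  // hops actually followed
        'error'     => ($ok === false) ? curl_error($ch) : null,    // null means no cURL error
    );
}
```

A caller could then fall back to the original link whenever 'error' is not null, instead of trusting a URL that may be only partway through the redirect chain.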

This seems to be a bit more robust than some of the other solutions floating around the ‘net, since it doesn’t try to parse the response headers or body (whose format could change without notice), and it will follow as many redirects as it needs to, up to the value of the CURLOPT_MAXREDIRS setting.
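If you'd rather not rely on whatever redirect limit your PHP and libcurl build uses by default, you can set CURLOPT_MAXREDIRS yourself. The sketch below assumes a cap of 5 hops and uses a hypothetical helper name, unshorten_url_capped; it also flags the case where cURL gave up because the chain was longer than the cap.

```php
<?php
// Hypothetical helper: follow at most $max_redirects hops.
function unshorten_url_capped($url, $max_redirects = 5) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_MAXREDIRS      => $max_redirects, // assumed cap; pick what suits you
        CURLOPT_CONNECTTIMEOUT => 5,              // assumed timeouts, in seconds
        CURLOPT_TIMEOUT        => 10,
    ));
    curl_exec($ch);
    // True when cURL stopped because the redirect chain exceeded the cap.
    $too_many = (curl_errno($ch) === CURLE_TOO_MANY_REDIRECTS);
    return array(
        'url'       => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'truncated' => $too_many,
    );
}
```

When 'truncated' is true, the returned URL is only as far as cURL got before stopping, so it may still be a shortened link.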

Enjoy!

7 thoughts on “Unshorten URLs with PHP and cURL”

  1. Tim

    Thanks a bunch for the function, Jon! This is exactly what I need. I am noticing, though, that it does not work with Dropbox’s db.tt links. I tried a couple of unshortener services on one and some did not work either, but unshorten.me does. Any thoughts?

  2. Jonathon Post author

    Can you give me an example link that it doesn’t work with? I’ll have to reverse-engineer their service a bit before I can recommend anything.

  3. Diogo Soares

    Hi, Jonathon!

    Is there any limit of requests?

    I made several requests for the same URLs, but sometimes it returns the same short URL (as in the event of an error) or another short URL (from another short URL server). Eventually it returns the original URL.

    I noticed that some users have reported inaccurate detection of the original URL. Have you noticed this strange behavior before? Is there any clue as to how to solve it?

  4. Diogo Soares

    I retrieved the HTTP status of the request, and sometimes the connection to the “t.co” server times out (my short URLs are from Twitter).

  5. Wayde Tse

    Thanks a lot but what does `CURLOPT_SSL_VERIFYHOST => FALSE` mean? The manual said that `CURLOPT_SSL_VERIFYHOST` should be an integer:

    1 to check the existence of a common name in the SSL peer certificate. 2 to check the existence of a common name and also verify that it matches the hostname provided. In production environments the value of this option should be kept at 2 (default value).

  6. Pingback: Unshorten t.cn and t.co URLs in WordPress | Half Yuan Party
