web | Breaking Code

April 4, 2016

How to clean up your Twitter account

Filed under: Privacy, Programming, Tools — Tags: LinkedIn, python, tool, Twitter, web, webapp — Mario Vilas @ 5:47 am

Recently I decided to get rid of all of my old tweets, specifically all of them from last year and before. I had no use for most of them and curating them would have been too much of a burden (over 66,000 tweets! so much procrastination!).

Now, there are a number of online applications to do that, but they have at least one of the following problems, fundamentally the last one:

They pull your Twitter posts from the API, which only allows you to read at most the latest 3,200 tweets, so removing the older ones becomes impossible.
Some of them get around this by asking you to upload your entire Twitter archive… ~~which contains a lot more than just your tweets (i.e. your private messages)~~. (EDIT: I’m being told this is no longer the case, now it just contains your public timeline)
I don’t trust them.

So naturally I rolled my own. The code is crude but it worked for me. It uses the Twitter archive zip file as well, but since it works locally you don’t need to trust a third party with your personal data. With this I managed to delete over 60,000 tweets in roughly a day and a half – it can’t be done much faster because of the API rate limiting, but then again, what’s the rush? 🙂
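To give a rough idea of the approach, here is a minimal sketch and not the actual tool. It assumes the archive zip contains a tweets.csv file with tweet_id and timestamp columns, and that timestamps start with the four-digit year; both are assumptions about the archive format. The delete_tweet callable is hypothetical and stands in for whatever authenticated API call you use.

```python
import csv
import io
import time
import zipfile


def old_tweet_ids(archive, cutoff_year):
    """Return the IDs of all tweets in the archive posted before cutoff_year.

    Assumes the archive holds a tweets.csv with 'tweet_id' and 'timestamp'
    columns, and that each timestamp begins with a four-digit year.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open("tweets.csv") as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            return [row["tweet_id"] for row in reader
                    if int(row["timestamp"][:4]) < cutoff_year]


def delete_all(tweet_ids, delete_tweet, pause=1.0):
    """Call delete_tweet (a hypothetical API wrapper) once per tweet ID.

    The pause between calls is a crude way to stay under the rate limit.
    """
    for tweet_id in tweet_ids:
        delete_tweet(tweet_id)
        time.sleep(pause)
```

Since everything runs on your own machine, the archive never leaves it; only the delete requests go out to the API.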


September 19, 2012

Cheating on XKCD

Filed under: Just for fun — Tags: LinkedIn, python, web — Mario Vilas @ 3:23 pm

In case you missed it, today’s XKCD comic titled Click and Drag is simply amazing! Go check it out first, spend a few hours lost in it, and come back only when you’re done having fun. I’ll wait here. 🙂

…

Ok, you’re back. Naturally you’ll want to cheat on it at some point, to make sure you didn’t miss out on any hidden easter eggs! So let’s take a look at the web page.

The easiest route is loading the comic on Google Chrome, or Chromium. Just right click on the image and select “inspect element”. This quickly reveals how the neat trick works.

Taking a peek under the hood…

The “world” is divided into tiles of fixed size, and at all times the page loads the tile you’re currently viewing and the surrounding ones, in order to seamlessly stitch them together when scrolling. The clickable area is a map and the coordinates are used to build the URL to the images, which always follows the same pattern (north, south, east and west coordinates). Trying out a few numbers reveals the “north” coordinate goes from 1 to 5, the “east” coordinate goes from 1 to 48 and the “west” coordinate goes from 1 to 33. Not all coordinates seem to work around the edges of the world (north 2 west 5 doesn’t work, for example) and I couldn’t get south to work with manual tries. I suppose a couple of empty images are used for those (one for black and one for white) but I didn’t confirm it.

The first thing I tried was just accessing the parent directory to see if directory indexing was enabled, but no such luck. Instead, I wrote this quick and dirty script in Python to download all images, using urllib to download them and shutil to write them to disk. Missing tiles are simply skipped.

This should be enough to check for easter eggs, but it’d be interesting if someone assembled a big image containing all the tiles. Let me know if you do! 🙂

Update 1: I originally missed the east coordinate, so the script was updated to try and bruteforce in all directions 1 to 10 north and south, and 1 to 50 east and west. This means a lot more HTTP requests, so I also added a pause between them as good netizens should.
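The brute-force described in Update 1 can be sketched roughly as follows. This is not the original script: the base URL and the file name pattern (number, direction letter, number, direction letter) are assumptions, since the post only says the URL is built from the coordinates, and the function names are made up for illustration.

```python
import itertools
import shutil
import time
import urllib.error
import urllib.request

# Assumed location and naming scheme of the tiles (not stated in the post).
BASE_URL = "http://imgs.xkcd.com/clickdrag/"


def tile_names(max_ns=10, max_ew=50):
    """Yield candidate file names like '1n1e.png' for every direction pair."""
    for ns, ew in itertools.product("ns", "ew"):
        for y in range(1, max_ns + 1):
            for x in range(1, max_ew + 1):
                yield "%d%s%d%s.png" % (y, ns, x, ew)


def download_tiles(names, pause=1.0):
    """Download each tile to the current directory, skipping missing ones."""
    for name in names:
        try:
            with urllib.request.urlopen(BASE_URL + name) as response, \
                    open(name, "wb") as out:
                shutil.copyfileobj(response, out)
        except urllib.error.HTTPError:
            pass  # the tile does not exist, move on
        time.sleep(pause)  # be a good netizen between requests
```

Calling download_tiles(tile_names()) would try 2,000 candidate names, which is why the pause matters.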

Update 2: This seems to be the complete list of valid image URLs.

Update 3: A commenter pointed out somebody did assemble the entire world image! Check it out here.

Update 4: @prigazzi on Twitter pointed out ~~this fully navigable map~~ as well, based on Google Maps. Check it out! It’s IMHO the best one yet. 🙂

Update 5: The previous link no longer works, but this works pretty much the same way: xkcd-map.rent-a-geek.de


June 29, 2010

Using Google Search from your Python code

Filed under: Tools, Web applications — Tags: Google, information gathering, LinkedIn, open source, python, recon, search, tool, web — Mario Vilas @ 6:31 pm

Hi everyone. Today I’ll be showing you a quick script I wrote to make Google searches from Python. There are previous projects doing the same thing -actually, doing it better-, namely Googolplex by Sebastian Wain and xgoogle by Peteris Krumins, but unfortunately they’re no longer working. Maybe the lack of complexity of this script will keep it working a little longer… 🙂

The interface is extremely simple: the module exports only one function, called search().

        # Get the first 20 hits for: "Breaking Code" WordPress blog
        from googlesearch import search
        for url in search('"Breaking Code" WordPress blog', stop=20):
            print(url)

You can control which one of the Google Search pages to use, which language to search in, how many results per page, which page to start searching from and when to stop, and how long to wait between queries – however the only mandatory argument is the query string, everything else has a default value.

        # Get the first 20 hits for "Mariposa botnet" in Google Spain
        from googlesearch import search
        for url in search('Mariposa botnet', tld='es', lang='es', stop=20):
            print(url)

A word of caution, though: if you wait too little between requests or make too many of them, Google may block your IP for a while. This is especially annoying when you’re behind a corporate proxy – I won’t be held responsible when your coworkers suddenly develop an urge to kill you! 😀

EDIT (Jan 2017): Wow, this little piece of code has grown a lot since its creation. Now it’s an installable package and has had contributions from many people. Thanks everyone! 🙂

Source code

Get the source code from GitHub: https://github.com/MarioVilas/googlesearch


January 11, 2010

Having fun with URL shorteners

Filed under: Tools, Web applications — Tags: LinkedIn, python, tool, web, webapp — Mario Vilas @ 2:14 am

I’ve taken some interest in URL shorteners recently. URL shorteners are the latest fad in web services – everybody wants to have their own, even if it’s not yet entirely clear how to profit from it, at least not for us mere mortals. They were born out of Twitter users’ need to compress their microblogging posts (some links are actually longer than 140 characters, believe it or not) and they’ve spread everywhere on the Internet. Major web sites like Facebook, Google and YouTube are into it and -some say- it’s one step closer to destroying the Web as we know it.

Of course, this can also be used for evil purposes. 🙂

So I made myself a Python module to toy with these things. It can be used as a command line tool as well as a module to be imported in projects of your own, and contains no non-standard dependencies. You can download it from here: shorturl.py.

For example, this is how you convert a long URL into a short one (the shortest possible choice is automatically selected):

    $ ./shorturl.py http://www.example.com/
    http://u.nu/63e

And its reverse, converting that short URL back into the long one:

    $ ./shorturl.py -l http://u.nu/63e
    http://www.example.com/

You can also choose a specific URL shortener service. Let’s try Twitter’s new favorite, bit.ly:

    $ ./shorturl.py http://www.example.com/ -u bit.ly
    http://bit.ly/3hDSUb

Now, the fun stuff 🙂 what would happen if we try to shorten, say, a javascript: link? We can find out with the -t switch (output edited for brevity):

    $ python shorturl.py -t "javascript:alert('Pwn3d');"
    Testing bit.ly:
        RuntimeError: No data returned by URL shortener API
    Testing cru.ms:
        RuntimeError: <h3>URL has wrong format! Forgot 'http://'?</h3>
    Testing easyuri.com:
        Short [1]: http://easyuri.com/657ae
        Long [0]: http://javascript:alert('Pwn3d');
    Testing tinyurl.com:
        Short [1]: http://tinyurl.com/yhabhth
        Long [0]: javascript:alert('Pwn3d');
    Testing xrl.us:
        Short [1]: http://xrl.us/bgsge5
        Long [0]: http://xrl.usjavascript:alert('Pwn3d');

Most services either report an error condition or simply don’t return anything at all. Two services return a garbage response (easyuri.com and xrl.us). But to my surprise TinyURL, the second largest URL shortener service, is vulnerable to this! True, this is probably not news to many of you, and blocking this kind of link doesn’t provide that much of an advantage (clicking on a link with an unknown target is pretty much game over anyway) but I still fail to see the reason to support it in the first place. I can’t think of any legitimate reason to shorten a javascript: link.
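For illustration, a shortener could refuse such links with a scheme check along these lines. This is only a sketch: is_shortenable and ALLOWED_SCHEMES are hypothetical names, and nothing here reflects how any of the services above actually validate their input.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist: only plain web links may be shortened.
ALLOWED_SCHEMES = {"http", "https"}


def is_shortenable(url):
    """Return True if url has a whitelisted scheme and a host part."""
    parts = urlsplit(url)
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)
```

A whitelist is used instead of blocking javascript: specifically, so other unexpected schemes are rejected too. Note that a check this simple would still accept the garbage produced by prepending http:// to the input, as easyuri.com did above.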

As a matter of fact, TinyURL seems to accept anything you send it, whether it looks like a valid URL or not. This has led some people to build a rogue filesystem called TinyDisk, world readable, yet very hard (if not impossible) to trace back to its creator. The download link seems to be down but the Slashdot article is still there. I also found this article explaining how such a filesystem would work.

Back on the matter, the way people react to URL shorteners makes them perfect for an attacker to hide the true target of his/her exploits, or to bypass spam filters, as has been done in the wild for quite some time now. There are some countermeasures. Many URL shorteners (including TinyURL) have a preview feature to let you examine the URL target before going there (for example, http://preview.tinyurl.com/yhabhth shows the sample javascript code we used above). The Echofon plugin for Firefox is a Twitter client that makes heavy use of URL shorteners, and among other things it automatically expands short URLs when the mouse hovers over them. The longurl.com service offers URL expansion as well, both from the web and as a Firefox plugin.

But… how well do they behave when you shorten the URL multiple times? First, let’s try shortening a link with TinyURL three successive times:

    $ ./shorturl.py -c 3 -u tinyurl.com http://www.example.com/
    http://tinyurl.com/ya3k2yy

The preview feature only seems to work for the first redirection (see for yourself). Now let’s try shortening our example URL three times, using random providers:

    $ ./shorturl.py http://www.example.com -v -c 3
    Service: thurly.net
    Service: ito.mx
    Service: cru.ms
    http://cru.ms/53c14

I found the Echofon plugin fails to expand this short URL. However the longurl.com service works like a charm – it even told me how many redirections I used! Click here to see it for yourself.

Example of Echofon expanding a short URL

Just for fun, how about an infinite recursion? First, we create a link to a shortened URL that doesn’t yet exist.

    $ python shorturl.py -v -c 5 http://tinyurl.com/thisisaninfiniteloop
    Service: ito.mx
    Service: migre.me
    Service: thurly.net
    Service: is.gd
    Service: xrl.us
    Service: onodot.com
    http://onodot.com/xeaw

Now let’s go to the TinyURL webpage and create the link manually. Voila! We have an infinite redirection. The longurl.com service seems to break now (see for yourself) but this is of no consequence – the link doesn’t really take you anywhere. Other apps like wget get stuck in an infinite loop however. This is probably of no real use to an attacker but I thought it was fun to try it out. 🙂
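A URL expander can protect itself against chains and loops like these by following redirects one hop at a time and remembering where it has been. The sketch below shows one way to do that with the standard library; it is not how shorturl.py, longurl.com or Echofon work, and the names expand, http_next_hop and NoRedirect are made up for this example.

```python
import urllib.error
import urllib.parse
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow redirects automatically."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # makes urllib raise HTTPError for the 3xx response


def http_next_hop(url, timeout=10):
    """Return the URL that url redirects to, or None if it doesn't redirect."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        opener.open(request, timeout=timeout)
    except urllib.error.HTTPError as err:
        location = err.headers.get("Location")
        if err.code in (301, 302, 303, 307, 308) and location:
            return urllib.parse.urljoin(url, location)
        raise
    return None


def expand(url, next_hop=http_next_hop, max_hops=10):
    """Follow a redirect chain and return (final_url, number_of_hops).

    Raises RuntimeError on a redirect loop or when max_hops is exceeded.
    """
    seen = {url}
    hops = 0
    while True:
        target = next_hop(url)
        if target is None:
            return url, hops
        if target in seen:
            raise RuntimeError("redirect loop at %s" % target)
        hops += 1
        if hops > max_hops:
            raise RuntimeError("too many redirects")
        seen.add(target)
        url = target
```

The next_hop argument is there so the loop logic can be tried without touching the network, for example by passing the get method of a dictionary that maps each URL to its target.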

As a final note: while writing this I came across this article describing even more risks inherent to URL shorteners: information disclosure. A dedicated attacker could dump randomly chosen shortened URLs to retrieve information like corporate intranet links, user IDs and even passwords. I liked the idea, so I whipped up a small script to test it out:

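    import random
    import string
    from shorturl import longurl, is_short_url

    # Candidate characters for a token: digits, then upper and lower case letters.
    characters = string.digits + string.ascii_uppercase + string.ascii_lowercase

    while True:
        # Build a random 6-character token and try to expand it.
        token = ''.join(random.choice(characters) for _ in range(6))
        try:
            url = longurl('http://bit.ly/%s' % token)
        except Exception:
            # The token could not be expanded: print it and try another one.
            print(token)
            continue
        # Only report targets that are not short URLs themselves.
        if not is_short_url(url):
            print("%s => %s" % (token, url))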

But the results of brute-forcing didn’t seem so great. Then again, one could also wonder how random these short URLs really are… maybe there’s a way to predict them, or at least reduce the number of URLs one has to try. One could also try a dictionary based approach for services like bit.ly, which give users the ability to pick custom short URLs. Update: Those TinyURL links weren’t so random after all 😉 here’s an interesting project on URL scraping.
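For a sense of scale, here is a back-of-the-envelope calculation of the token space the script above samples from. The alphabet size and token length come from that script; the number of valid links is a purely hypothetical input, since I have no idea how many of those tokens are actually in use.

```python
ALPHABET_SIZE = 62  # digits plus upper and lower case letters
TOKEN_LENGTH = 6    # token length used by the script


def keyspace(alphabet_size=ALPHABET_SIZE, token_length=TOKEN_LENGTH):
    """Number of distinct tokens of the given length."""
    return alphabet_size ** token_length


def expected_tries(valid_tokens, alphabet_size=ALPHABET_SIZE,
                   token_length=TOKEN_LENGTH):
    """Average number of uniformly random guesses needed per valid token."""
    return keyspace(alphabet_size, token_length) / valid_tokens
```

With 62 characters and 6 positions there are 56,800,235,584 possible tokens, so unless a large share of them is taken, most random guesses will miss.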