Breaking Code

April 4, 2016

How to clean up your Twitter account

Filed under: Privacy, Programming, Tools - Mario Vilas @ 5:47 am

Recently I decided to get rid of all of my old tweets, specifically everything from last year and before. I had no use for most of them, and curating them would have been too much of a burden (over 66,000 tweets! So much procrastination!).

Now, there are a number of online applications that do this, but each has at least one of the following problems, the last one being the most important:

  • They pull your Twitter posts from the API, which returns at most 200 tweets per request and only reaches back through roughly your 3,200 most recent tweets, so removing the older ones becomes impossible.
  • Some of them get around this by asking you to upload your entire Twitter archive… which contains a lot more than just your tweets (e.g. your private messages). (EDIT: I’ve been told this is no longer the case; the archive now contains only your public timeline.)
  • I don’t trust them.

So naturally I rolled my own. The code is crude but it worked for me. It uses the Twitter archive zip file as well, but since it runs locally you don’t need to trust a third party with your personal data. With it I managed to delete over 60,000 tweets in roughly a day and a half. It can’t be done much faster because of the API rate limiting, but then again, what’s the rush? 🙂
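For a rough sense of the pace, the figures above (about 60,000 deletions in roughly a day and a half) work out to a little over two seconds per tweet. The snippet below only does that arithmetic. The function names are made up for illustration, and the 36-hour figure is an assumed rounding of "a day and a half", not a measured value.

```python
def seconds_per_delete(total_tweets, hours):
    """Average wall-clock seconds spent per deleted tweet."""
    return hours * 3600.0 / total_tweets


def estimated_hours(total_tweets, seconds_each):
    """Rough running time, in hours, at a given average pace."""
    return total_tweets * seconds_each / 3600.0


# Assumption: "a day and a half" is taken as 36 hours.
pace = seconds_per_delete(60000, 36)
print("about %.2f seconds per tweet" % pace)

# At the same pace, a hypothetical archive of 10,000 tweets would need:
print("about %.1f hours" % estimated_hours(10000, pace))
```

The real pace depends on the rate limits in force when you run the script, so treat the result as an order-of-magnitude estimate.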

Now, to use it, follow these steps:

  1. Get your Twitter archive. You can do this by going to Settings -> Account -> Your Twitter archive. This will send you an email with a download link to a zip file.
  2. Get my script and place it in the same directory as the zip file you just downloaded.
  3. Go to https://apps.twitter.com/ and create a new application. This will get you the consumer key and the consumer secret; take note of those.
  4. Authorize your app to access your own account (you do it in the same place, right after creating your new app). Now you have the access token key and secret; take note of those too.
  5. Edit the script and add all those keys and secrets at the beginning of the file. Add the name of the zip file as well. While you’re at it, review the source code: you shouldn’t be running code you downloaded from some random blog without reading it first! 😉
  6. Run the script. This will take some time. Lock your PC. Get off the chair. Go out. Enjoy the real world. It’s the ultimate 3D experience, and it comes in HD!
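Before running step 6 for real, it may be worth counting what would be deleted. The sketch below mirrors the script's selection rule (everything dated before 2016 goes, the rest stays) but makes no API calls. `select_for_deletion`, its `cutoff_year` parameter, and the sample IDs are hypothetical; in practice you would pass in the dictionary returned by `parse_tweets_zipfile`.

```python
def select_for_deletion(tweet_ids, cutoff_year=2016):
    """Return the tweet IDs dated before cutoff_year, newest month first.

    tweet_ids maps "YYYY/MM" strings to lists of status IDs, the same
    shape that parse_tweets_zipfile() builds in the script.
    """
    selected = []
    for date in sorted(tweet_ids, reverse=True):
        year, _month = date.split("/")
        if int(year) < cutoff_year:
            selected.extend(tweet_ids[date])
    return selected


# Made-up IDs, purely for illustration.
sample = {
    "2016/03": [301, 302],
    "2015/12": [201],
    "2014/07": [101, 102],
}
doomed = select_for_deletion(sample)
print("%d tweets would be deleted" % len(doomed))
```

Changing the cutoff here is harmless; changing it in the script itself decides what is actually deleted, so double-check the year before you run it.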

Download

twitter_cleanup.py

Source

    #!/usr/bin/env python3

    import json
    import sys
    import zipfile

    import twitter  # third-party package: python-twitter

    # Insert your API secrets here.
    api = twitter.Api(
        consumer_key='YOUR_CONSUMER_KEY',
        consumer_secret='YOUR_CONSUMER_SECRET',
        access_token_key='YOUR_ACCESS_TOKEN_KEY',
        access_token_secret='YOUR_ACCESS_TOKEN_SECRET',
    )

    # Insert the name of the zip file with your Twitter archive here.
    twitter_archive = 'twitter.zip'

    # If you had to interrupt the script, just put the last
    # status ID here and it will resume from that point on.
    last = 0

    def read_fake_json(archive, filename):
        # The .js files in the archive are JSON prefixed by a JavaScript
        # assignment on the first line; keep only what follows the "=".
        data = archive.read(filename).decode('utf-8')
        first_line, data = data.split("\n", 1)
        first_line = first_line.split("=", 1)[1]
        data = first_line + "\n" + data
        return json.loads(data)

    def parse_tweets_zipfile(filename):
        # Returns a dict mapping "YYYY/MM" to the list of tweet IDs for that month.
        print("Parsing file: %s" % filename)
        tweet_ids = {}
        with zipfile.ZipFile(filename, 'r') as archive:
            tweet_index = read_fake_json(archive, 'data/js/tweet_index.js')
            for item in tweet_index:
                tweets_this_month = read_fake_json(archive, item['file_name'])
                assert len(tweets_this_month) == item['tweet_count']
                tweet_ids["%d/%02d" % (item['year'], item['month'])] = [x['id'] for x in tweets_this_month]
        return tweet_ids

    if __name__ == "__main__":
        begin = False
        tweet_ids = parse_tweets_zipfile(twitter_archive)
        for date in sorted(tweet_ids.keys(), reverse=True):
            year, month = date.split("/")
            if int(year) < 2016:
                print("Deleting tweets from: %s" % date)
                for tid in tweet_ids[date]:
                    # Skip everything until we reach the resume point, if any.
                    if begin or last == 0 or tid == last:
                        begin = True
                        error_counter = 0
                        while True:
                            try:
                                api.DestroyStatus(tid)
                                print("%d: DELETED" % tid)
                                break
                            except twitter.error.TwitterError as e:
                                # A structured error from the API means the
                                # request was rejected, so move on to the next
                                # tweet; anything else is retried.
                                try:
                                    message = e.message[0]['message']
                                    retry = False
                                except Exception:
                                    message = repr(e.message)
                                    retry = True
                                print("%d: ERROR   %s" % (tid, message))
                                error_counter += 1
                                if error_counter > 5:
                                    print("Too many errors, aborting!")
                                    sys.exit(1)
                                if not retry:
                                    break
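
The `read_fake_json` helper exists because the files inside the archive are not plain JSON: each one starts with a JavaScript assignment, and only the part after the first `=` is JSON. The self-contained sketch below builds a tiny stand-in archive in memory and parses it with the same first-line trick, using the standard `json` module. The variable names inside the sample files (`tweet_index`, `Grailbird.data.tweets_2015_12`) and all sample values are assumptions made for illustration; check your own archive for the exact contents.

```python
import io
import json
import zipfile


def read_fake_json(archive, filename):
    """Parse a .js file that is JSON prefixed by 'name =' on line one."""
    data = archive.read(filename).decode("utf-8")
    first_line, rest = data.split("\n", 1)
    first_line = first_line.split("=", 1)[1]
    return json.loads(first_line + "\n" + rest)


# Build a miniature stand-in for the archive (contents are made up).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "data/js/tweet_index.js",
        'var tweet_index = [\n'
        ' {"file_name": "data/js/tweets/2015_12.js",\n'
        '  "year": 2015, "month": 12, "tweet_count": 2}\n'
        ']\n',
    )
    archive.writestr(
        "data/js/tweets/2015_12.js",
        'Grailbird.data.tweets_2015_12 =\n'
        '[{"id": 1001, "text": "hello"}, {"id": 1002, "text": "world"}]\n',
    )

# Read it back: the index lists one file per month, each holding tweets.
with zipfile.ZipFile(buffer, "r") as archive:
    index = read_fake_json(archive, "data/js/tweet_index.js")
    for item in index:
        tweets = read_fake_json(archive, item["file_name"])
        ids = [tweet["id"] for tweet in tweets]
        print("%d/%02d: %r" % (item["year"], item["month"], ids))
```

Running something like this against a copy of your real archive, without the deletion loop, is a cheap way to confirm the file layout matches what the script expects.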