Recently I decided to get rid of all of my old tweets, specifically all of them from last year and before. I had no use for most of them and curating them would have been too much of a burden (over 66.000 tweets! so much procrastination!).
Now, there are a number of online applications to do that, but they have at least one of the following problems, fundamentally the last one:
- They pull your Twitter posts from the API, which only returns your most recent tweets (up to 200 per request and about 3200 in total), so removing the older ones becomes impossible.
- Some of them get around this by asking you to upload your entire Twitter archive… which contains a lot more than just your tweets (e.g. your private messages). (EDIT: I’m being told this is no longer the case, now it just contains your public timeline)
- I don’t trust them.
So naturally I rolled my own. The code is crude but it worked for me. It uses the Twitter archive zip file as well, but since it works locally you don’t need to trust a third party with your personal data. With this I managed to delete over 60.000 tweets in a day and a half, roughly – it can’t be done much faster because of the API rate limiting, but then again, what’s the rush? 🙂
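Sixty thousand deletions in about 36 hours works out to roughly one every two seconds. If you would rather stay under the rate limit than bump into it, one option is to space the calls out. This is a minimal Python 3 sketch and not something the script below does; the `paced` helper and the interval you pass to it are hypothetical, and the right interval depends on whatever limits the API enforces for your app.

```python
import time


def paced(items, min_interval, sleep=time.sleep, clock=time.monotonic):
    """Yield items one by one, at least min_interval seconds apart."""
    previous = None
    for item in items:
        if previous is not None:
            # Wait only for whatever is left of the interval.
            remaining = min_interval - (clock() - previous)
            if remaining > 0:
                sleep(remaining)
        previous = clock()
        yield item
```

Wrapping the list of tweet IDs, as in `for tid in paced(ids, 2.0): api.DestroyStatus(tid)`, would then issue at most one deletion every two seconds.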
Now, to use it, follow these steps:
- Get your Twitter archive. You can do this by going to Settings -> Account -> Your Twitter archive. This will send you an email with a download link to a zip file.
- Get my script and place it in the same directory as the zip file you just downloaded.
- Go to https://apps.twitter.com/ and create a new application. This will get you the consumer key and the consumer secret; take note of those.
- Authorize your app to access your own account (you do it in the same place, right after creating your new app). Now you have the access token key and secret; take note of those too.
- Edit the script and add all those tokens and secrets at the beginning of the file. Add the name of the zip file as well. While you’re at it, review the source code – you shouldn’t be running code you downloaded from some random blog without reading it first! 😉
- Run the script. This will take some time. Lock your PC. Get off the chair. Go out. Enjoy the real world. It’s the ultimate 3D experience, and it comes in HD!
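If you would rather not paste the secrets into the file, one alternative is to read them from environment variables and strip any surrounding whitespace, since a stray space copied along with a key is enough to make authentication fail. This is a minimal Python 3 sketch, not part of the script; the environment variable names and the `load_credentials` helper are made up for illustration.

```python
import os

# Hypothetical environment variable names; pick whatever suits you.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token_key": "TWITTER_ACCESS_TOKEN_KEY",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(environ=os.environ):
    """Return the four credentials with surrounding whitespace removed."""
    credentials = {}
    for name, variable in CREDENTIAL_VARS.items():
        value = environ.get(variable, "").strip()
        if not value:
            raise ValueError("Missing credential: %s" % variable)
        credentials[name] = value
    return credentials
```

The dictionary uses the same keyword names the script passes to `twitter.Api`, so it could be handed over as `twitter.Api(**load_credentials())`.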
Download
twitter_cleanup.py
Source
#!/usr/bin/env python
import anyjson
import twitter
import zipfile
# Insert your API secrets here.
api = twitter.Api(
consumer_key='rhYvKSSciMHA9c8Dnp6OjzhRn',
consumer_secret='1vRFTts73Mkaq2dgAGf5XYamKWHN10EihOH1zbJwp2U2TKzET1',
access_token_key='37139784-SsXYm6jU5xqifX1hKiNfqynTEzRAoHKFPJfb1Wgbw',
access_token_secret='bh71h9S17lvDxdRV6TrgAuNUlGc3EkwSDDSu8vcfBFn3n',
)
# Insert the name of the zip file with your Twitter archive here.
twitter_archive = 'twitter.zip'
# If you had to interrupt the script, just put the last
# status ID here and it will resume from that point on.
last = 0

def read_fake_json(zip, filename):
    data = zip.open(filename, 'rU').read()
    first_line, data = data.split("\n", 1)
    first_line = first_line.split("=", 1)[1]
    data = first_line + "\n" + data
    return anyjson.deserialize(data)

def parse_tweets_zipfile(filename):
    print "Parsing file: %s" % filename
    tweet_ids = {}
    with zipfile.ZipFile(filename, 'r') as zip:
        tweet_index = read_fake_json(zip, 'data/js/tweet_index.js')
        for item in tweet_index:
            tweets_this_month = read_fake_json(zip, item['file_name'])
            assert len(tweets_this_month) == item['tweet_count']
            tweet_ids["%d/%02d" % (item['year'], item['month'])] = [x['id'] for x in tweets_this_month]
    return tweet_ids

if __name__ == "__main__":
    begin = False
    tweet_ids = parse_tweets_zipfile(twitter_archive)
    for date in sorted(tweet_ids.keys(), reverse=True):
        year, month = date.split("/")
        if int(year) < 2016:
            print "Deleting tweets from: %s" % date
            for tid in tweet_ids[date]:
                if begin or last == 0 or tid == last:
                    begin = True
                    error_counter = 0
                    while True:
                        try:
                            api.DestroyStatus(tid)
                            print "%d: DELETED" % tid
                            break
                        except twitter.error.TwitterError, e:
                            try:
                                message = e.message[0]['message']
                                retry = False
                            except:
                                message = repr(e.message)
                                retry = True
                            print "%d: ERROR %s" % (tid, message)
                            error_counter += 1
                            if error_counter > 5:
                                print "Too many errors, aborting!"
                                exit(1)
                            if not retry:
                                break

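The archive's .js files are not quite JSON: each one starts with a JavaScript assignment, which is why `read_fake_json` throws away everything before the first `=`. The script above is Python 2; what follows is a rough Python 3 sketch of the same parsing step using only the standard library. It splits the whole file at the first `=` instead of handling the first line separately. The `build_sample_archive` and `collect_tweet_ids` helpers and the sample file contents are invented for the demonstration, with only the index path and the field names taken from the script.

```python
import io
import json
import zipfile


def read_fake_json(archive, filename):
    """Parse an archive .js file: JSON preceded by a JavaScript assignment."""
    text = archive.read(filename).decode("utf-8")
    # Drop everything up to the first "=" so that only the JSON remains.
    _, payload = text.split("=", 1)
    return json.loads(payload)


def build_sample_archive():
    """Create a tiny in-memory zip with made-up contents for testing."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "data/js/tweet_index.js",
            'var tweet_index = [{"file_name": "data/js/tweets/2015_12.js",'
            ' "year": 2015, "month": 12, "tweet_count": 2}]',
        )
        archive.writestr(
            "data/js/tweets/2015_12.js",
            'var sample_tweets =\n[{"id": 101}, {"id": 102}]',
        )
    buffer.seek(0)
    return buffer


def collect_tweet_ids(archive_file):
    """Map "YYYY/MM" to the list of tweet IDs found for that month."""
    tweet_ids = {}
    with zipfile.ZipFile(archive_file, "r") as archive:
        for item in read_fake_json(archive, "data/js/tweet_index.js"):
            tweets = read_fake_json(archive, item["file_name"])
            key = "%d/%02d" % (item["year"], item["month"])
            tweet_ids[key] = [tweet["id"] for tweet in tweets]
    return tweet_ids
```

Calling `collect_tweet_ids(build_sample_archive())` exercises the parsing without touching a real archive or the API.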
I am getting ERROR: Bad Authentication Data.
can you tell me how to fix it?
Comment by Mr Robot — January 24, 2018 @ 3:41 pm
Thank you i got it 🙂
I had an extra space in my keys…
Comment by Mr Robot — January 24, 2018 @ 3:50 pm