
A script for backing up Tumblr posts and likes

backup_tumblr

This is a set of scripts for downloading your posts and likes from Tumblr.

The scripts try to download as much as possible, including:

- Every post and like
- All the metadata about a post that's available through the Tumblr API
- Any media files attached to a post (e.g. photos, videos)

I've had these for private use for a while, and in the wake of Tumblr going on a deletion spree, I'm trying to make them usable by other people.



Pictured: a group of Tumblr users fleeing the new content moderation policies. Image credit: Wellcome Collection, CC BY.

Getting started

Install Python 3.6 or later. Instructions are on the Python website.

Check you have pip installed by running the following command at a command prompt:

$ pip3 --version
pip 18.1 (python 3.6)

If you don't have it installed, or the command errors, follow the pip installation instructions.

Clone this repository:

$ git clone git@github.com:alexwlchan/backup_tumblr.git
$ cd backup_tumblr

Install the Python dependencies:

$ pip3 install -r requirements.txt

Get yourself a Tumblr API key by registering an app at https://www.tumblr.com/oauth/apps.

You need the OAuth Consumer Key listed for your app on that page.


Usage

There are three scripts in this repo:

(1) save_posts_metadata.py
(2) save_likes_metadata.py
(3) save_media_files.py

They're split into separate scripts because saving metadata is much faster than downloading media files.

You should run (1) and/or (2), then run (3). Something like:

$ python3 save_posts_metadata.py
$ python3 save_likes_metadata.py
$ python3 save_media_files.py

If you know what command-line flags are: you can pass arguments (e.g. API key) as flags. Use --help to see the available flags.

If that sentence meant nothing: don't worry, the scripts will ask you for any information they need.
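For readers curious how a script can accept a value either as a flag or by asking for it, here is a minimal sketch of the general pattern. It is not taken from these scripts: the flag name `--api-key`, the prompt wording, and the helper `get_api_key` are all hypothetical, so run the real scripts with --help to see the flags they actually accept.

```python
import argparse


def parse_args(argv):
    """Parse command-line flags. The flag name --api-key is hypothetical."""
    parser = argparse.ArgumentParser(description="Illustrative sketch only")
    parser.add_argument("--api-key", default=None, help="OAuth Consumer Key")
    return parser.parse_args(argv)


def get_api_key(argv, ask=input):
    """Return the key from the flag if it was given, otherwise ask for it."""
    args = parse_args(argv)
    if args.api_key:
        return args.api_key
    # No flag supplied: fall back to an interactive prompt.
    return ask("What is your Tumblr API key? ").strip()


# Non-interactive demonstration: the flag is supplied, so nothing is asked.
print(get_api_key(["--api-key", "example-key"]))
```

Passing the prompt function in as `ask` keeps the fallback easy to exercise without a keyboard, which is a choice made for this sketch only.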

Unanswered questions and notes

I have no idea how Tumblr's content blocks interact with the API, or if blocked posts are visible through the API.

I've seen mixed reports saying that ordering in the dashboard has been broken for the last few days. Again, no idea how this interacts with the API.

Media files can get big. I have ~12k likes which are taking ~9GB of disk space. The scripts will merrily fill up your disk, so make sure you have plenty of space before you start!
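If you want a rough check before starting, the figures above work out to about 0.75 MB per like (9 GB across 12k likes). The sketch below turns that into an estimate and compares it with the free space on disk. The per-item figure is only an extrapolation from that one data point, and the function names are illustrative rather than part of these scripts.

```python
import shutil

# Assumption: ~9 GB for ~12k likes, i.e. roughly 750,000 bytes per item.
# Your own blog may be very different (e.g. lots of video).
ASSUMED_BYTES_PER_ITEM = 750_000


def estimate_bytes(item_count, bytes_per_item=ASSUMED_BYTES_PER_ITEM):
    """Rough estimate of the disk space needed for `item_count` posts/likes."""
    return item_count * bytes_per_item


def has_enough_space(path, required_bytes):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


needed = estimate_bytes(12_000)
print(f"Estimated space needed: {needed / 1e9:.1f} GB")
if not has_enough_space(".", needed):
    print("Warning: the current directory may not have enough free space.")
```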

These scripts are provided "as is". File an issue if you have a problem, but I don't have much time for maintenance right now.

Sometimes the Tumblr API claims to have more posts than it actually returns, and the effect is that the script appears to stop early, e.g. at 96%.

I'm reading the total_posts value from the API responses, and paginating through it as expected; I have no idea what causes the discrepancy.
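To make the symptom concrete, here is a small sketch of offset-based pagination against a stand-in for the API. It is not the code from these scripts: `fetch_page` and `fake_fetch_page` are hypothetical, and the fake simply claims 100 posts while only ever returning 96, which is one way a run could end at 96%.

```python
def fetch_all_posts(fetch_page, page_size=20):
    """Collect posts page by page.

    `fetch_page(offset, limit)` is a hypothetical callable returning a dict
    with "posts" (a list) and "total_posts" (the count the API claims).
    """
    posts = []
    total_claimed = None
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        total_claimed = page["total_posts"]
        if not page["posts"]:
            break  # the API has nothing more to give
        posts.extend(page["posts"])
        offset += page_size
        if offset >= total_claimed:
            break  # we have asked for every post the API claimed to have
    return posts, total_claimed


def fake_fetch_page(offset, limit):
    """Stand-in for the API: claims 100 posts but only has 96."""
    real_posts = list(range(96))
    return {"posts": real_posts[offset:offset + limit], "total_posts": 100}


posts, claimed = fetch_all_posts(fake_fetch_page)
print(f"Fetched {len(posts)} of {claimed} claimed posts ({len(posts) / claimed:.0%})")
# prints "Fetched 96 of 100 claimed posts (96%)"
```

In this sketch the loop itself is behaving correctly; the shortfall comes entirely from the gap between the claimed total and what the pages contain.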

Acknowledgements

Hat tip to @cesy for nudging me to post it, and providing useful feedback on the initial version.

Licence

MIT.

