
I wrote about mine in past posts (here and here), but ever since PB updated their design, my implementation has been broken: the method I used no longer works. Overall the tool was very popular, was reposted a few times, and I received lots of much-valued feedback for it :) but I didn’t continue, among other things because I didn’t need it myself.
However, my dear friend Daxda was kind enough to find a new way to do it, and he rewrote the tool in a more error-tolerant manner that I believe should work for much longer than mine.
Right now it’s still in the development stage, but it more or less works and is very simple to use. It’s called “PB Shovel” (https://github.com/Daapii/pb_shovel).
Here’s the usage output of the script:
usage: pb_shovel.py [-h] [-r] [-o OUTPUT_DIRECTORY] [--omit-existing]
                    [-v VERBOSE] (-f FILE | -u URLS [URLS ...])
                    [--images-only | --videos-only] [-n USERNAME]
                    [-p PASSWORD]

optional arguments:
  -h, --help            show this help message and exit
  -r, --recursive       Recursively extracts images and videos from all
                        passed sources.
  -o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                        The directory the extracted images getting saved in.
  --omit-existing
  -v VERBOSE, --verbose VERBOSE
  -f FILE, --file FILE  A file containing one or more Photobucket links which
                        you want to download.
  -u URLS [URLS ...], --urls URLS [URLS ...]
                        One or more links which point to an album or image
                        which is hosted on Photobucket.
  --images-only         Do not download any other filetype besides image.
  --videos-only         Do not download any other filetype besides video.

Authentication:
  -n USERNAME, --username USERNAME
                        The username or email which is used to authenticate
                        with Photobucket.
  -p PASSWORD, --password PASSWORD
                        The matching password for your account.
For a private album the format is like this:
python pb_shovel.py -u "password@http://photobucket.com/example"
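How pb_shovel.py itself parses this format is not shown in this post. As a rough sketch only, assuming the password is everything before the first “@http”, one way to split such an argument could look like this (the name split_guest_source and the pattern are made up for the illustration):

```python
import re

# Hypothetical pattern: everything before the first "@http://" or "@https://"
# is treated as the guest password, the rest as the album URL.
_GUEST_SOURCE = re.compile(r"^(?P<password>.*?)@(?P<url>https?://.*)$")


def split_guest_source(source):
    """Split 'password@http://...' into (password, url).

    Returns (None, source) when no password prefix is present.
    """
    match = _GUEST_SOURCE.match(source)
    if match is None:
        return None, source
    return match.group("password"), match.group("url")
```

A plain album URL passes through unchanged, with None in place of the password.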
To run it, you will need two libraries, “Requests” and “BeautifulSoup”. For your convenience, both have been uploaded here; simply extract the archive into the “PB Shovel” folder and you’re done.
Edit 2014.04.17: added an option to skip existing images. They will be skipped if image names match. If that option is not set then the script will not replace or skip images. To skip images just add “--omit-existing” to the command line.
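The “skip if image names match” rule can be sketched roughly like this. It is only an illustration of the idea, not the code from pb_shovel.py; the function should_skip and its parameters are hypothetical, and it compares file names only, never file contents:

```python
import os


def should_skip(url, output_dir, omit_existing):
    """Return True when a file with the same name is already in output_dir.

    Hypothetical helper: only the file name is compared, not the contents.
    """
    if not omit_existing:
        return False
    # Drop any query string, then take the last path segment as the file name.
    filename = url.split("?", 1)[0].rstrip("/").rsplit("/", 1)[-1]
    return os.path.exists(os.path.join(output_dir, filename))
```

Matching by name is cheap, but it also means a half-downloaded file with the right name would be skipped on the next run.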
Edit 2014.04.07: lots of things have changed and a lot of things have been added. It’s now more functional than it was. It supports downloading from your own account, downloading locked (private) archives, choosing which types of files (video or image) to get, and downloading whole albums, including sub-albums.
So, to use it, typically you run this command:
python pb_shovel.py -u "http://photobucket.com/example" -r -o PB_Shovel
If anything comes up, please post here so the tool can be improved :)
I tried running the tool to pull an album (starting with a small one to test), and got the following output in the command line:
Estimated files in current album: 3
Traceback (most recent call last):
  File "pb_shovel.py", line 355, in <module>
    pb.extract()
  File "pb_shovel.py", line 132, in extract
    image_links = self._album(source)
  File "pb_shovel.py", line 332, in _album
    obj["likeCount"], obj["commentCount"],
KeyError: 'likeCount'
I’m not sure why it seems to be choking on likeCount…
Hello Old Windways. The script has been updated just now. Try it again :)
If it breaks again, it would be very helpful if you could share the album URL so we could see for ourselves.
I had to install beautifulsoup4, but it seems to be working now (both using a single link, and with a file listing 4 links)! Hooray!!
Hi, me again.
I noticed your script is not downloading the original size of photos.
The original size path should end with ~original.
By the way, the extension I made has a free version.
It is limited to 100 photos per album and to the compressed size of photos (fullsizeUrl).
http://goo.gl/2aGxLC
The extension is written entirely in JavaScript, but it will not be open-sourced
until Photobucket officially releases an album download function :)
AFAIK PB already has an album downloading feature, however only for your own albums.
Thanks for the tips though, we’ll see what can be done :)
Since the author is too shy to post here, I’ll say that the script has been updated to download full-size images now, and the connection timeout has been increased for slow internet connections. Enjoy! :)
Hi just stopped by to say thanks, you just saved me hours of boring work.
Keep rocking
Hey you’re welcome :) keep checking, the author might include some great features in the future ;)
Any way to skip already downloaded images?
Hey ripper,
Such a function isn’t available right now, but Daxda is working on it and it should be implemented in a few days or so :)
Script has been updated.
Added an option to skip existing images. They will be skipped if image names match. If that option is not set then the script will not replace or skip images. To skip images just add “--omit-existing” to the command line.
Does PB Shovel work for albums that are private and don’t have an input form for a password? Do you have to get the password for private albums?
@KL: It should work for all kinds of private albums. If the album is also locked with a password, it will have to be provided. PB Shovel doesn’t crack them :P
Just wanted to say “thanks” where it’s due. In this case, it’s thanks to YOU (and “Daxda” maybe) for leading me to how to login to guest-pass protected albums. I hadn’t really been sweating it much since most people tend not to have guest passes to use. But it’s a nice feature to have either way.
I’m not using Python, so it took a bit of translation. But your code was clean and well-commented which made life very easy. :-)
If you can compile this (so people don’t need to download a Python interpreter and various libraries) and put it inside a nice GUI … in other words dummy-proofing it for the general public … you might be able to compete with me. ;-)
Cheers!
Oh, and if anyone out there DOES have the hacking wizardry skills to crack private albums and is interested in being buried in money, feel free to click on my name above to go to my home page and click on the “Contact” button. My old hacking partner decided to take his ball and go home (immigrated to the US and left bucket hacking behind). So I am searching for a new partner.
I’m a developer primarily and NOT a hacker. I’ll admit it; no shame in my game. ;-)
@phatWares: Daxda is the author, and about making a GUI, well that might come in the future, however the tool wasn’t intended to be the best one out there, it just works very well :)
Btw you seem like a knowledgeable dude, if you don’t mind, hop onto Evilzone’s irc channel (irc.evilzone.org, #evilzone) to chat and maybe find fellow devs :)
So I couldn’t actually get this to download any private albums. It would connect and all that but it would just say download files 0/0. I tried a bunch of private profiles but no luck. Does it no longer work?
AFAIK it should still be working. It would help a lot if you could post some album that doesn’t work for you :)
Yea im having the same 0/0 issue. Not sure if private albums need a specific url or not. No issue with publics. This returned no results “http://s69.photobucket.com/user/xxxxx/library/”
Any words on an update for this problem of 0/0 downloads?
If I may ask how are you invoking the tool? obviously to download a private album you need to provide a password for it.
For guest albums command is:
python pb_shovel.py -u "password@http://photobucket.com/example"
And to download a completely private album, you need to use -n (username) and -p (password) switches, otherwise you get the same 0/0 issue.
^yea I think guest pass isn’t working as a whole for pb.com. doesn’t work through the webpage either
Well then we’ll have to wait and see :)
Hey could anyone “shovel” this for me? http://s275.photobucket.com/user/vanessa60201/library/
No one can unless you have a password for it…
Does this still work? I was using your old one and just started getting back into PB. I only need it to download complete buckets, because the download button sometimes gets stuck in a loop.
My old one doesn’t work AFAIK, however this one (PB Shovel) was working last time I checked. Try it and see.
do you have a tutorial on how to make this work? not my field of expertise python wise
You do not need a tutorial to make this work. Install python and run
from command line
I wrote that at the very end of the post… You have to be completely clueless if you can’t understand from here.
Just keep getting syntax errors from ‘pb_shovel.py’ :[
Would be nice if you posted those syntax errors so I can help more.
Btw, the author of this tool has deprecated it and I’m afraid he’s no longer going to work on it :( However, it should still be working.
Wish I could help more, but I’m not a tech person, to be honest. This is one of the first times I’m using the command line to do anything. I installed Python fine. According to your last post, I should input this into the command line: [python pb_shovel.py -u "http://photobucket.com/example" -r -o PB_Shovel], substituting, of course, the actual Photobucket address I want to use.
It says file (stdin) line 1.
Invalid syntax
Besides Python, you also need to install two other libraries; missing them might be the cause. I repacked those libs for convenience; you can download them from here and extract them into the PB Shovel folder.
Edit: Just tried it, still works good. You are most likely missing the libs, which means you didn’t read my post to the end :)
Still can’t get it to work. I’m working on a Mac and I downloaded Canopy. Then I opened shovel.
In the top box I get a bunch of code, and the lower box is where I can type in the request.
I have one folder; in there I have the file pb_shovel.py and two folders, bs4 and request.
When I enter
python pb_shovel.py -u http://s1237.photobucket.com/user/fcknbrii/library/ -r -o PB_Shovel
in the lower box, I get “SyntaxError: invalid syntax”. Same for
python pb_shovel.py -u http://s1237.photobucket.com/user/fcknbrii/library/
and
-u http://s1237.photobucket.com/user/fcknbrii/library/
Any suggestions ?
You need to put the URL in double quotes, like so:
That’s essentially how the command line works :)
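Quoting matters most when the URL contains characters the shell treats specially, such as “&” (cmd.exe and most Unix shells use it to separate or background commands). A rough sketch of a way around shell quoting altogether is to start the script from Python with an argument list. The helper names build_command and run_shovel are made up here; the flags are the ones from the usage output, and it assumes “python” on your PATH is Python 2.7 and that you run it from the PB Shovel folder:

```python
import subprocess


def build_command(url, output_dir="PB_Shovel", recursive=True):
    """Build the argument list for pb_shovel.py (flags as in its usage text)."""
    cmd = ["python", "pb_shovel.py", "-u", url, "-o", output_dir]
    if recursive:
        cmd.append("-r")
    return cmd


def run_shovel(url):
    """Run the script without a shell, so '&' in the URL is passed through as-is."""
    return subprocess.call(build_command(url))
```

Calling run_shovel("http://photobucket.com/example?sort=3&page=1") would hand the whole URL to the script as a single argument.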
this is what i’m getting (btw thanks for the fast reply)
This is a really weird problem. Might be related to that Canopy thing you use, why are you even using that? Just Python and Command line is enough…
I am getting the following error:
  File "pb_shovel.py", line 6, in <module>
    from urlparse import urljoin, urlparse
ImportError: No module named 'urlparse'
Then you have a flawed Python installation. Reinstall and try again, using Python 2.7.
do you know how to get it to work on a mac? I’m using python 2.7.8
Like I said… uninstall whatever you installed that is supposed to be Python and install normal Python: https://www.python.org/downloads/mac-osx/
Then try again.
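For what it’s worth, the ImportError above is what you typically see when a Python 2 script is started with Python 3: the urlparse module was renamed to urllib.parse there. A minimal sketch of an import that works on both is shown below. Note this is not a patch for pb_shovel.py; other parts of the script may well be 2.7-only, so installing Python 2.7 remains the simple route.

```python
try:
    # Python 2: the module named in the traceback.
    from urlparse import urljoin, urlparse
except ImportError:
    # Python 3: the same functions live in urllib.parse.
    from urllib.parse import urljoin, urlparse

album = "http://photobucket.com/example/"
print(urljoin(album, "page2"))   # resolve a relative link against the album URL
print(urlparse(album).netloc)    # host part of the URL
```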
Hi,
Is there an argument to download full size files? I’m only getting the compressed versions. I can click through on the webpage to the fullsizes. Thanks.
Uhm… AFAIK it gets full-sized pics, if not then too bad.
I’ve actually read and followed your instructions quite a few times. I always take my time to read them over and I’m still stumped. I have installed the libs into the folder and still can’t get it to work.
http://imgur.com/Gn6OxyA – Installation folder
http://imgur.com/Z9Tozk6 – The Invalid syntax error
The problem here is that you launch the interactive Python shell before running the command… Don’t run python first, just paste that command into the command prompt. Also, you have Python 3.4, when I’ve said over and over that you need Python 2.7.
Un-Installed 3.4 and Installed 2.7.
Now I’m getting this
http://imgur.com/WWbJDBi
This is from the command line; I stopped running python prior to the command.
Jesus man… you need to CD into the dir before you can use it…
Just an FYI, this tool isn’t for completely clueless people. At this point I’ll tell you to RTFM, please, if you can’t figure this stuff out.
Sorry. I did say earlier I had never used the command prompt before, but I figured out what you meant. It’s now working, but I had not read the earlier comments where people were getting 0/0 downloads from private photobuckets without the password. But thanks for your patience. Much appreciated!
I’ve just tried this today and I’m having an odd issue. I am using the -r recursive switch and then just -u and the URL of a user — something like “http://s105.photobucket.com/user/BlahBlah/library/?sort=3&page=1”
When I do this, it says “Processing” the URL and then says a quick summary:
Images: 273
Sub-albums: 0
Videos: 17
It then loops forever doing:
Collected links: 24
Images: 273
Sub-albums: 0
Videos: 17
Then “Collected links: 48”… Then 72… And so on.
The problem is that it goes on forever. It never seems to find the end of the album, even when it passes the 290 mark (which is the 273 images plus the 17 videos). It never stops.
So, once I know it is past the 290 mark, I send a Ctrl-C and it then says:
Downloaded files: 0/600 (where 600 is where I finally stopped it from looping)
It will then download the entire thing — so it “works” — but it just gets confused once it gets past the 290 actual files.
When it finishes downloading the actual files, it gives an error:
‘page’ is not recognized as an internal or external command.
Basically, it looks like it gets confused looking for some keyword and then eventually tries to actually execute the word “page”.
Also, taking a quick look at the code, it seems to be looping looking for “end of album” to exist in the HTML. But, looking at the PB source, I don’t see that that ever occurs (which would explain the endless looping).
Maybe an update is needed?
EDIT: Upon further review, it looks to me like it actually only does the FIRST page of downloads — then tries to do them over and over again. So, something is definitely broken with the paging — it doesn’t find the “next” page to get more images — and it doesn’t find the “last” page to know when to stop looking.
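A loop like the one described above can be guarded so that it stops on its own. The sketch below is not the script’s actual code; fetch_page is a hypothetical callable that returns the list of media links found on a given page number. It stops when a page brings nothing new (an empty page or a repeat of an earlier one), when the expected total is reached, or when a hard page limit is hit:

```python
def collect_links(fetch_page, expected_total=None, max_pages=1000):
    """Collect links page by page until nothing new turns up.

    fetch_page(page_number) is a hypothetical callable returning the list
    of media links found on that page of the album.
    """
    seen = set()
    links = []
    for page in range(1, max_pages + 1):
        new_on_page = 0
        for link in fetch_page(page):
            if link not in seen:
                seen.add(link)
                links.append(link)
                new_on_page += 1
        if new_on_page == 0:
            # Empty page or a repeat of an earlier page: assume the album ended.
            break
        if expected_total is not None and len(links) >= expected_total:
            break
    return links
```

With a fetch_page that keeps returning the same 24 links, this would return after the second request with 24 links instead of counting up to 48, 72 and so on.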
Thank you for your very informative reply, MTAK :) Yes, the tool could be broken, I wouldn’t be surprised. Parsing HTML is a tricky thing, and PB with this whole new design is bound to change some things around from time to time; when that happens, tools like this tend to break as well. I know that very well :P
However, neither I nor the original author, Daxda, has time to work on it and fix it. I believe it would be an easy fix for someone who knows Python, though…
Maybe you could give it a go, MTAK? :) Clone the project from GitHub.