Questions, Feedback, and Suggestions #4 #5262
For most sites I'm able to sort files into year/month folders like this:
However for redgifs it doesn't look like there's a date keyword available for |
There's a typo in
|
There's also another typo in |
Can you grab all the media from quoted tweets? Example. |
#5262 (comment) It's implemented as a search for 'quoted_tweet_id:…' on Twitter.
#5262 (comment) This one was on the same line as the previous one ... (9fd851c)
Regarding typos, thanks for pointing them out. @biggestsonicfan |
EDIT: Actually, I think there's just something wrong with that URL. I had it saved for a long time and searching that tag normally gives a different URL ( |
You could use |
Is there support to remove metadata like this?
Post-processor: "filter-metadata":
{
"name": "metadata",
"mode": "delete",
"event": "prepare",
"fields": ["preview[images][0][resolutions]"]
}
I've tried a few variations but no dice:
"fields": ["preview[images][][resolutions]"]
"fields": ["preview[images][N][resolutions]"]
"fields": ["preview['images'][0]['resolutions']"] |
Hello, I left a comment in #4168 . Does the |
@taskhawk
def remove_resolutions(metadata):
    for image in metadata["preview"]["images"]:
        del image["resolutions"]
(untested, might need some check whether
@YuanGYao |
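The reply above notes that the helper is untested and may need an extra check. A defensive variant might look like the sketch below. It assumes the metadata is a plain dict shaped like the `preview[images][0][resolutions]` path in the question; how the function is wired into gallery-dl is not shown here.

```python
def remove_resolutions(metadata):
    """Delete the 'resolutions' list from every preview image, if present."""
    preview = metadata.get("preview")
    if not isinstance(preview, dict):
        return  # this post has no preview data at all
    for image in preview.get("images") or []:
        if isinstance(image, dict):
            # pop() with a default does nothing when the key is missing
            image.pop("resolutions", None)


# Sample metadata (made up for illustration)
post = {"preview": {"images": [{"id": "a", "resolutions": [1, 2]}, {"id": "b"}]}}
remove_resolutions(post)
remove_resolutions({"title": "text-only post"})  # no 'preview' key, no error
```

Using `pop` with a default instead of `del` means posts without a preview, or images without a `resolutions` entry, are skipped silently rather than raising `KeyError`.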
@mikf |
Not sure if I'm missing something, but are directory specific configurations exclusive to running gallery-dl via the executable? Basically, I have a directory for regular tags, and a directory for artist tags. For regular tags I use So right now the only way I know to get this per-directory configuration to work, is to copy the gallery-dl executable everywhere I want to use a master configuration override. Am I missing something? It feels like there should be a better way. |
Huh? No, the configuration works always in the same way. You're simply using different configuration files? |
From the readme:
I want to override my master configuration |
You can load additional configuration files from the console with:
You just need to specify the path to the file and any options there will overwrite your main configuration file. Edit: From my understanding, yeah, automatic loading of local config files in each directory is only possible having the standalone executable in each directory. Are different directory options the only thing you need? |
Thanks, that's exactly what I was looking for! Guess I didn't read the documentation thoroughly enough. For now the only thing I'd want to override is the directory structure for artist tags. I don't think it's possible to determine from the metadata alone if a given tag is the name of an artist or not, so I thought the best way to go about it is to just have a separate directory for artists, and use a configuration override. So yeah, loading that override with the -c flag works great for that purpose, thanks again! |
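To illustrate what "any options there will overwrite your main configuration file" means in practice, here is a simplified model of layering an override on top of a main config. It is not gallery-dl's loader; the recursive merge of nested sections is an assumption, and the directory values are made up for the example.

```python
import copy


def merge_config(base, override):
    """Return a new dict where values from `override` win over `base`.

    Nested dicts are merged key by key; everything else is replaced.
    """
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_config(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical main configuration and a per-directory override
main_config = {
    "extractor": {
        "gelbooru": {"directory": ["{category}", "{search_tags}"], "tags": True}
    }
}
artist_override = {
    "extractor": {
        "gelbooru": {"directory": ["{category}", "artists", "{search_tags}"]}
    }
}

effective = merge_config(main_config, artist_override)
```

In this model only `directory` changes for the artist run, while `tags` and every other option from the main file stay in effect.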
You kinda can, but you need to enable
"gelbooru": {
    "directory": {
        "search_tags in tags_artists": ["{category}", "{search_tags[0]!u}", "{search_tags}", "{date:%Y}", "{date:%m}"],
        ""                           : ["{category}", "{search_tags}", "{date:%Y}", "{date:%m}"]
    },
    "tags": true
},
Set Of course, this depends on the artists being correctly tagged. Not sure if it happens on Gelbooru, but at least in other boorus and booru-like sites I've come across posts with the artist tagged as a general tag instead of an artist tag. Another limitation is that your search tag can only include one artist at a time, doing more will require a more complex expression to check all tags are present in
What I do instead is that I inject a keyword to influence where it will be saved, like this:
And in my config I have
"gelbooru": {
    "directory": ["boorus", "{search_tags_type}", "{search_tags}"]
},
You can have:
"gelbooru": {
    "directory": {
        "search_tags_type == 'artists'": ["{category}", "{search_tags[0]!u}", "{search_tags}", "{date:%Y}", "{date:%m}"],
        ""                             : ["{category}", "{search_tags}", "{date:%Y}", "{date:%m}"]
    }
},
You can do this for other tag types, like general, copyright, characters, etc. Because it's a chore to type that option every time I made a wrapper script, so I just call it like this because artists is my default:
For other tag types I can do:
|
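The conditional `directory` mappings in this exchange pair an expression with a path template and use `""` as the fallback. The sketch below is a toy model of that selection logic, not gallery-dl's implementation: `pick_directory` is a hypothetical helper, and evaluating the conditions with `eval` against the metadata is an assumption made for illustration.

```python
def pick_directory(rules, metadata):
    """Return the template of the first rule whose condition is true.

    The rule keyed by "" acts as the fallback when nothing else matches.
    """
    for condition, template in rules.items():
        if condition == "":
            continue  # fallback is handled after all real conditions
        try:
            # Metadata fields are exposed as variable names; no builtins
            if eval(condition, {"__builtins__": {}}, dict(metadata)):
                return template
        except (NameError, TypeError):
            continue  # a field used by the condition is missing or unusable
    return rules.get("")


rules = {
    "search_tags_type == 'artists'": ["{category}", "artists", "{search_tags}"],
    "": ["{category}", "{search_tags}"],
}

artist_path = pick_directory(
    rules, {"search_tags_type": "artists", "search_tags": "some_artist"}
)
default_path = pick_directory(rules, {"search_tags": "landscape"})
```

The second call has no `search_tags_type` field at all, so the condition cannot be evaluated and the `""` entry is used.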
Thanks for pointing out there's a tags option available for the gelbooru extractor. I already used it in the kemono extractor to get the name of the artist, but it didn't occur to me that gelbooru might also have such an option (and just accepted that the tags aren't categorized). For artists I store all the URLs in their respective gelbooru.txt, rule34.txt, etc. files like so:
And then just run |
When I'm making an extractor, what do I do if the site doesn't have different URL patterns for different page types? Every single page is just a numerical ID that could be a forum post, image, blog post, or something completely different. |
@Wiiplay123 You handle everything with a single extractor and decide what type of result to return on the fly. The |
Hi, what options should I use in my config file to change the format of dates in metadata files? I would like to use And would it also be possible to do this for json files that ytdl creates? I downloaded some videos with gallery-dl but the dates got saved as |
How do I make sure reddit GIFs are downloaded using yt-dlp? I've set extractor.reddit.videos to "ytdl", but it looks like gallery-dl still downloads reddit animated GIFs directly as if they were images. yt-dlp does seem to work correctly, because some redgifs videos are downloaded to a ytdl folder (although most are downloaded using the redgifs downloader). I want to download .mp4 files instead of .gif and I've set the appropriate ytdl format settings under extractor.ytdl.format (also under reddit>ytdl) but it seems reddit GIF urls are not being sent to ytdl. |
I have put an ffmpeg postprocessor in gallery-dl to convert the GIFs locally, which works, but now I'm having issues with my archive database. For 32 subreddits, gallery-dl only downloads new files as expected. But for 3 subreddits some 300-400 files are downloaded every time, going back months and for these, newly downloaded files are not being saved to the database. This is with I thought something in the database had been corrupted, but Using a backup sqlite3 archive from 3 weeks ago, only new files from the past 3 weeks are downloaded for those 3 "broken" subreddits, but newly downloaded files still are not saved to the database. What's going on? Edit 4 (or 5? I lost track) - I think I have it fixed now. I set |
If I import gallery-dl as a Python dependency, can I use its filename sanitization function? |
I use a shadowsocks proxy to circumvent bans in my country. All of the sites can be viewed in browser, but gallery-dl throws errors for some of them (like kemono and furaffinity). |
@biggestsonicfan Yes, you can import and use functions from gallery-dl just as you like. |
You need to be more specific. What's your command? g-dl supports |
@fireattack, I link proxy in the config file: Furaffinity:
Kemono:
|
@Skyofflad |
@mikf, thanks, it works |
hi, so, hm... are there any plans to add basic support for Facebook? |
There's an open pull request: #5626 |
I'm trying to download galleries from Is there a way to get gallery-dl to automatically recognize these cookies from my profile? Or am I missing something here? |
Hello.
In the above example, if there is a directory starting with {user[id]}, use it, otherwise create a directory Thank you! |
This is a postprocessor I use for every gallery type:
I have now run into a situation where I updated a gallery which I am no longer subscribed, and all json files were replaced. The problem being, all the json data contained passwords to access the content of each post. So now each post is relocked until I resubscribe and redownload the json metadata. Is there a flag in which I can say "if exists, do not overwrite" for a json postprocessor? |
@klazoklazo @Coro365 @biggestsonicfan |
@mikf
This seems to work for every other website I poll through gallery-dl, except for Fur Affinity where it doesn't grab any NSFW images. I'd like to be able to sync all my cookies with my browser automatically if possible since, like you said, even if it takes a while it does expire eventually, and I'd rather not be taken by surprise if possible. |
@mikf |
@mikf |
Guys, I need some help. I tried to look for this but I couldn't find anything similar. Maybe there is and I'm just a noob, so I apologize in advance. Here's my situation: Let's say I downloaded some artworks from "random website A" a while ago. tldr: I would like to know if there is a way to skip files based on what's inside the folder instead of the archive. Just like yt-dlp for instance. Or is wiping my archive the only way? Thanks for reading through. UPDATE: I got this. So I basically edited "archive.sqlite3" with sqlitebrowser. Deleted all the entries related to the artist. Next time I entered the command line, it basically recognized what was already in the folder and downloaded what was missing. Maybe there is an easier way? I don't know, but it worked like a charm. |
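The manual fix described in the update above (deleting one artist's rows from "archive.sqlite3") could also be scripted. This is a sketch under stated assumptions: that the archive is a SQLite file with a table named `archive` and a text column named `entry`, and that the rows to remove share a known prefix. Both the schema and the `siteA_artist1_` prefix are assumptions here, so inspect your own file and back it up before trying anything like this.

```python
import sqlite3
import tempfile
from pathlib import Path


def forget_entries(archive_path, prefix):
    """Delete archive rows whose entry starts with `prefix`; return the count."""
    db = sqlite3.connect(archive_path)
    try:
        with db:  # commits on success, rolls back on error
            cursor = db.execute(
                "DELETE FROM archive WHERE substr(entry, 1, ?) = ?",
                (len(prefix), prefix),
            )
        return cursor.rowcount
    finally:
        db.close()


# Demo on a throwaway database with made-up entries
with tempfile.TemporaryDirectory() as tmp:
    demo_path = str(Path(tmp) / "archive.sqlite3")
    db = sqlite3.connect(demo_path)
    db.execute("CREATE TABLE archive (entry TEXT PRIMARY KEY)")
    db.executemany(
        "INSERT INTO archive VALUES (?)",
        [("siteA_artist1_001",), ("siteA_artist1_002",), ("siteA_artist2_001",)],
    )
    db.commit()
    db.close()

    removed = forget_entries(demo_path, "siteA_artist1_")

    db = sqlite3.connect(demo_path)
    remaining = [row[0] for row in db.execute("SELECT entry FROM archive")]
    db.close()
```

After the matching rows are gone, the next run no longer treats those files as archived, which matches the behaviour reported in the update: files already in the folder are recognized and only the missing ones are downloaded.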
Yeah, that's what I meant. My bad haha. I will edit the original post. |
One question, why was the following line: self.out.skip(pathfmt.path) moved after the I'm manipulating the output of gallery-dl to give me a bit more information and after upgrading to a later version it broke my output and I'm wondering if it will cause any issue if I just move it back in my local copy. I think what's happening is that the code in the |
is it possible to specify multiple cookies in the config such that g-dl will cycle through them as needed? eg: i'm also curious if it would be possible to randomly cycle through a cookie list to help prevent account bans eg when downloading instagram. |
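Whether gallery-dl can rotate cookies by itself is not answered here. One workaround would be an external wrapper that assigns a different cookie file to each invocation via the `--cookies FILE` command-line option. The sketch below only builds the command lines; the file names and URLs are placeholders.

```python
import itertools
import random


def build_commands(urls, cookie_files, shuffle=False):
    """Pair each URL with a cookie file, cycling through the files in turn."""
    files = list(cookie_files)
    if not files:
        raise ValueError("at least one cookie file is required")
    if shuffle:
        random.shuffle(files)  # randomize the starting order of the rotation
    rotation = itertools.cycle(files)
    return [["gallery-dl", "--cookies", next(rotation), url] for url in urls]


# Placeholder inputs for illustration
commands = build_commands(
    ["https://example.org/user/a", "https://example.org/user/b", "https://example.org/user/c"],
    ["account1.txt", "account2.txt"],
)
```

Each resulting list can be passed to `subprocess.run`. The rotation is per URL, so it does not switch accounts in the middle of a single download.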
Heavily related to my previous post, I've now encountered a new patron who edits the text, image attachment, and file attachment of a single post to update rewards from month to month. Since they do change the title of the post, my filename schema shouldn't match and it might be redownloaded as a new post. I won't know until next month, but I guess I'll cross that bridge when I get there. How expensive would it be, computationally, to check specific fields within JSON dumps to determine whether an enumerated file should be downloaded or not? |
is there a way to make there are times when my IG cookies get expired but it doesn't show me any errors, so it just keeps downloading files, lol. is there a way to stop it when cookies get expired? even if it doesn't show any errors? this only works when there are explicit errors: "error:NotFoundError|AuthorizationError|HttpError|HTTP redirect to login page": "exit 0" |
I am getting the error I see many mentions of this error: but I read through many of them trying to understand what to do, and I cannot figure it out. Will someone please tell me how to fix this. Also, just to vent, I had no idea how long this had been happening, or if any of my attempts to download pixiv profiles prior had been subject to this. I can't retroactively check any logs, since I think I used to have logs, but it would cause redownloading profiles to skip media it already downloaded, which annoyed me. I didn't know if I could disable that specifically, so I just gave up on having logs. So, I potentially am missing media when I intended to get everything. I am a bit sad about it. Also, the "logs" I am describing might actually be something entirely different, and might not have told me of this error anyway. I don't know. I barely manage to get gallery-dl working for myself, so it working at all is essentially where my knowledge on the program ends. |
I've come across this too often. Regular auditing of your archives sucks but is almost a necessary thing to do if you want to make sure you have it all. I'd recommend polishing up on some Python skills, and while you don't have to work with gallery-dl's code itself necessarily, you can write your own little audit scripts as needed. I wish we all were at a point where we could say a program is bulletproof, but not knowing everyone's scenarios and every gallery type out there throws curveballs and exceptions into the mix. |
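As one example of the kind of small audit script suggested above, the sketch below lists media files that have a metadata file but are missing on disk. It assumes each metadata file sits next to its media file and is named `<media filename>.json`. That naming is an assumption about your setup, so adjust it to match your own post-processor settings.

```python
import tempfile
from pathlib import Path


def find_missing_media(root):
    """Return paths of media files that have a .json sidecar but do not exist."""
    missing = []
    for meta in sorted(Path(root).rglob("*.json")):
        media = meta.with_suffix("")  # "pic.jpg.json" -> "pic.jpg"
        if not media.exists():
            missing.append(media)
    return missing


# Demo on a temporary folder with made-up files
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.jpg").write_bytes(b"")
    (root / "a.jpg.json").write_text("{}")
    (root / "b.png.json").write_text("{}")  # metadata without its media file
    missing_names = [path.name for path in find_missing_media(root)]
```

Running a check like this after each sync gives a list of posts to retry, without relying on console output that may have scrolled away.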
@biggestsonicfan part of the problem was I delayed updating to Windows 10 for a very long time, so the cmd window allowing seemingly an infinite amount of text (or at least enough that it dwarfs Windows 7's not even allowing a |
Trying to download this: using: produces this error: until it hits 5/5 then fails. It happens for all misskey.gg links. In contrast, misskey.io links work without even needing to preface the link with "misskey:". For example: Is there anything I can do to make misskey.gg links work? |
Continuation of the previous issue as a central place for any sort of question or suggestion not deserving their own separate issue.
Links to older issues: #11, #74, #146.