Reddit extractor not calling child extractors and how to format complex reddit titles for filename export #7111

Open
oliveremodeler opened this issue Mar 3, 2025 · 3 comments

Comments

@oliveremodeler

oliveremodeler commented Mar 3, 2025

I'm trying to give gallery-dl a list of post links and have it automatically call other extractors based on the content of each post, but it isn't working and I can't figure out why. The docs are pretty light on details about parent/child extractors.
Example: I want to download a post and the linked audio from gonewildaudio, so I run gallery-dl https://www.reddit.com/r/gonewildaudio/comments/1j2pxfn/f4m_your_timid_neighbor_asks_you_to_turn_your/. There is a soundgasm link in the post, which should trigger the soundgasm extractor, but nothing happens and no files are created.
Here are my configs and logs. The configs probably won't be much help: they have never worked, and I've changed them so many times while trying to fix this that I've lost track of the changes.
Reddit extractor config:

        "reddit":
        {
            "client-id"    : "********",
            "user-agent"   : "********",
            "refresh-token": "********",

            "directory"   : ["reddit", "{author}"],
            "comments"    : 0,
            "morecomments": false,
            "embeds"      : true,
            "selftext"    : true,
            "date-min"    : 0,
            "date-max"    : 253402210800,
            "date-format" : "%y.%m.%d-T%H:%M:%S",
            "id-min"      : null,
            "id-max"      : null,
            "previews"    : true,
            "recursion"   : 5,
            "metadata"    : true,
            "videos"      : true,
            "filename"    : "{subreddit}.{title}.{extension}",
            "parent-directory": true,
            "category-transfer": true,
            "metadata-parent": true,
            "soundgasm"   : {
                "sleep-request": "1.5-2.5"
            }
        }

Soundgasm extractor config:

        "soundgasm":
        {
            "sleep-request": "0.5-1.5",
            "filename"     : "{subreddit:?/./}{title}.{extension}"
        }

Thank you in advance for any help, and please tell me if I need to include or explain anything else.

@mikf
Owner

mikf commented Mar 3, 2025

You need to set comments to a value greater than 0 for gallery-dl to extract links from posted text, including the text of the original post itself.

$ gallery-dl https://www.reddit.com/r/gonewildaudio/comments/1j2pxfn/f4m_your_timid_neighbor_asks_you_to_turn_your/
# ... nothing

$ gallery-dl -o comments=1 https://www.reddit.com/r/gonewildaudio/comments/1j2pxfn/f4m_your_timid_neighbor_asks_you_to_turn_your/
./soundgasm/chuwa/Your Timid Neighbor A…r Music Down So You Fuck Her Stupid.m4a

@oliveremodeler
Author

Thanks. Now it does pick up the soundgasm link, but I'm getting the following error, with no file downloaded:

PS C:\Users\User> gallery-dl -v --no-skip https://www.reddit.com/r/gonewildaudio/comments/1j2pxfn/f4m_your_timid_neighbor_asks_you_to_turn_your/
[gallery-dl][debug] Version 1.28.5 - Executable (stable/windows)
[gallery-dl][debug] Python 3.8.10 - Windows-10-10.0.19044
[gallery-dl][debug] requests 2.32.3 - urllib3 2.2.3
[gallery-dl][debug] Configuration Files ['%USERPROFILE%\\gallery-dl\\config.json']
[gallery-dl][debug] Starting DownloadJob for 'https://www.reddit.com/r/gonewildaudio/comments/1j2pxfn/f4m_your_timid_neighbor_asks_you_to_turn_your/'
[reddit][debug] Using RedditSubmissionExtractor for 'https://www.reddit.com/r/gonewildaudio/comments/1j2pxfn/f4m_your_timid_neighbor_asks_you_to_turn_your/'
[reddit][debug] Using custom API credentials (client-id ********)
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): oauth.reddit.com:443
[urllib3.connectionpool][debug] https://oauth.reddit.com:443 "GET /comments/1j2pxfn/.json?limit=1&raw_json=1 HTTP/11" 200 2659
[reddit][debug] Using download archive '********\gallery-dl.sqlite3'
[soundgasm][debug] Using SoundgasmAudioExtractor for 'https://soundgasm.net/u/chuwa/Your-Timid-Neighbor-Asks-You-To-Turn-Your-Music-Down-So-You-Fuck-Her-Stupid'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): soundgasm.net:443
[urllib3.connectionpool][debug] https://soundgasm.net:443 "GET /u/chuwa/Your-Timid-Neighbor-Asks-You-To-Turn-Your-Music-Down-So-You-Fuck-Her-Stupid HTTP/11" 200 None
[soundgasm][debug] Using download archive '********\gallery-dl.sqlite3'
[soundgasm][error] An unexpected error occurred: TypeError - can only concatenate tuple (not "str") to tuple. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
[soundgasm][debug]
Traceback (most recent call last):
  File "gallery_dl\job.pyc", line 153, in run
  File "gallery_dl\job.pyc", line 201, in dispatch
  File "gallery_dl\job.pyc", line 380, in handle_directory
  File "gallery_dl\job.pyc", line 623, in initialize
  File "gallery_dl\extractor\common.pyc", line 127, in _config_shared_accumulate
TypeError: can only concatenate tuple (not "str") to tuple
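
For what it's worth, the message on the last line is the error Python itself raises when a string is added to a tuple. Below is a minimal sketch of how an accumulation loop could run into it, assuming one configured value is a plain string where a tuple is expected. `accumulate` and `try_accumulate` are hypothetical stand-ins, not gallery-dl's code, and the traceback does not show which option was involved.

```python
def accumulate(values_per_section):
    """Hypothetical stand-in: join each section's values into one tuple."""
    result = ()
    for values in values_per_section:
        result += values  # works only while every item is itself a tuple
    return result


def try_accumulate(values_per_section):
    """Return the accumulated tuple, or the error message if concatenation fails."""
    try:
        return accumulate(values_per_section)
    except TypeError as exc:
        return str(exc)


print(try_accumulate([("a",), ("b",)]))  # every section holds a tuple
print(try_accumulate([("a",), "b"]))     # one section holds a bare string
```

The first call prints the combined tuple; the second prints the same message as the traceback above.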

@oliveremodeler
Author

I was able to solve this for myself by removing the "category-transfer": true, line from my config.

What's the best way to extract all those tags from the Reddit title and condense them to fit filename length limits? The soundgasm titles are often not very accurate, so I'd like to take the basic Reddit title, delete any emojis, and move all the tags to the end of the title, transforming them to use fewer characters (standard abbreviations, with whitespace, commas, and square brackets replaced by single characters that mark where a new tag starts).
Would something like this be a good start, or would I need to take a different approach?

"filename": "{_reddit[title]:(regex to delete everything except the main title)} {_reddit[title]:(regex to delete everything outside of brackets and then convert everything in brackets to a recognizable tagging format)}.{extension}"

Using the example post from above, I'd like to take the title

[F4M] Your Timid Neighbor Asks You To Turn Your Music Down So You Fuck Her Stupid 🎉💢💦 [Script Fill] [Timid Speaker] [Confident Listener] [Degradation] [Mindbreak] [Facefucking] [Cock Kissing] [Ego Feeding] [Creampie] [Hair Pulling] [Spanking] [Face Down Ass Up] [Speaker and Listener Orgasm]

and turn it into

Your Timid Neighbor Asks You To Turn Your Music Down So You Fuck Her Stupid #Script-Fill#F4M#Timid-Speaker#Confident-Listener#Degradation#Mindbreak#Facefucking#Cock-Kissing#Ego-Feeding#Creampie#Hair-Pulling#Spanking#Face-Down-Ass-Up#Speaker-and-Listener-Orgasm.m4a

(replacement of common tags not shown here, but think something along the lines of turning #Script-Fill or [Script Fill] into #S-F) while still limiting the total filename length. Extra bonus points to anyone who can help me figure out the following:

  1. How to pull any additional tags from the post and add them.
  2. If the tags (after condensing) would make the filename too long, how to write all tags to a .txt file with the same name in the same location, so that later I can either select tags manually or run a script to format them (a JSON file or whatever other format works best would also be fine).
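
The title transformation described above (before the two bonus items) can be sketched in plain Python, as a rough starting point under several assumptions. All names here (`ABBREVIATIONS`, `strip_emoji`, `split_title`, `format_tag`, `condensed_name`) are made up for illustration, and only the "Script Fill" to "S-F" abbreviation comes from this post. The example title is a shortened stand-in. Emoji are removed by Unicode category, which is an approximation. Tags keep their order of appearance, so F4M comes first rather than second as in the sample output above. How to hook this into gallery-dl's filename option is not shown.

```python
import re
import unicodedata

# Hypothetical abbreviation table; extend it with your own standard short forms.
ABBREVIATIONS = {"Script Fill": "S-F"}

# Matches one [bracketed tag] and captures the text inside the brackets.
TAG_RE = re.compile(r"\[([^\[\]]+)\]")

# Unicode categories treated as emoji-like: symbols, modifiers, format
# characters such as the zero-width joiner, and stray surrogates.
DROP_CATEGORIES = {"So", "Sk", "Cf", "Cs"}


def strip_emoji(text):
    """Remove non-ASCII symbol characters and variation selectors."""
    return "".join(
        ch for ch in text
        if ch.isascii()
        or (unicodedata.category(ch) not in DROP_CATEGORIES
            and not "\ufe00" <= ch <= "\ufe0f")
    )


def split_title(title):
    """Return (main title without tags or emoji, list of tag strings)."""
    tags = [tag.strip() for tag in TAG_RE.findall(title)]
    main = strip_emoji(TAG_RE.sub(" ", title))
    return " ".join(main.split()), tags


def format_tag(tag):
    """Abbreviate a known tag, then join its words with hyphens after a '#'."""
    tag = ABBREVIATIONS.get(tag, tag)
    return "#" + re.sub(r"[\s,]+", "-", tag)


def condensed_name(title):
    """Main title followed by all tags in '#Tag-Name' form."""
    main, tags = split_title(title)
    return (main + " " + "".join(format_tag(tag) for tag in tags)).strip()


example = "[F4M] An Example Title 🎉💦 [Script Fill] [Timid Speaker] [Face Down, Ass Up]"
print(condensed_name(example))
```

For the example title this prints `An Example Title #F4M#S-F#Timid-Speaker#Face-Down-Ass-Up`. Since `TAG_RE` works on any string, the same extraction could in principle be run on the post text as well to collect additional tags (item 1), assuming they are also written in square brackets.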

I'm planning to get pretty complicated with this and attempt to standardize/condense some of the more common tags in my outputs. I welcome any recommendations to help make this work.
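
For item 2 in the list above, one possible approach is to decide up front whether the tagged name fits, and otherwise fall back to the bare title plus a sidecar text file. This is only a sketch: `plan_output`, `write_tags_sidecar`, and `MAX_NAME_BYTES` are hypothetical names, the 255-byte limit is an assumption about the target filesystem, and a main title that is itself too long is not handled. The tags are expected to be already formatted (for example `#Timid-Speaker`).

```python
from pathlib import Path

# Assumption: 255 is a common per-name limit. Counting UTF-8 bytes is a
# conservative measure, since it never undercounts UTF-16 code units.
MAX_NAME_BYTES = 255


def plan_output(main, tags, extension, limit=MAX_NAME_BYTES):
    """Return (filename, sidecar_name); sidecar_name is None if the tags fit."""
    full = f"{main} {''.join(tags)}.{extension}"
    if len(full.encode("utf-8")) <= limit:
        return full, None
    # Too long: keep only the title and move every tag to a .txt sidecar.
    return f"{main}.{extension}", f"{main}.txt"


def write_tags_sidecar(directory, sidecar_name, tags):
    """Write one tag per line next to the download and return the file path."""
    path = Path(directory) / sidecar_name
    path.write_text("\n".join(tags) + "\n", encoding="utf-8")
    return path
```

A caller would invoke `write_tags_sidecar` only when `plan_output` returns a sidecar name. One tag per line keeps the file easy to edit by hand and easy to parse in a later script. gallery-dl's own `--write-metadata` option, which writes metadata to separate JSON files, may also be enough as raw material for such a script.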

@oliveremodeler oliveremodeler changed the title Reddit extractor not calling child extractors Reddit extractor not calling child extractors and how to format complex reddit titles for filename export Mar 4, 2025