[Feature Request] Ability to slurp multiple pages using [] range format #51
Comments
could you provide a few examples of sites you would use this on?
Well, currently found one:
Hmm, slurp seems to have grabbed it, but loading the markdown crashes obsidian for me on Android. Anyway, the reason I wanted to try it: Slurp/Readability is built to parse news articles, blog posts, and things like that, not tabular data. If you're mostly thinking of using this feature for that kind of page, I suspect it will disappoint you.
Worked perfectly for me, buddy, but of course the volume was too large for Obsidian. But going through the numbers one by one is tedious. Any way to hook your plugin to a Templater script, maybe?
yeah, when i was back at my PC, i noticed that the mobile client was able to parse, save, and even sync the file. kind of surprised that it crashed the app though; it's only ~2MB. big for a plaintext file, sure, but not excessive or complex. 🤷
i'm not sure about templater scripting, haven't really used it. i do see the use case for this, there's just going to be some landmines to avoid. eg: if you assume it takes an average of 3s to download, parse, and save a page like that, then 200 of those pages is going to take ~10 minutes, and i'm not sure how obsidian or the various OSs it's running on will react. hammering a server like that could also get slurp's user agent or the user's IP (or both) banned for bot scraping. i'll have a look into it at some point and see what's possible. in the short term, i'd probably recommend running a simple script. you should be able to do something like this:

```bash
#!/bin/bash

URL="https://...your url here.../"
START=1
END=200

for i in $(seq $START $END); do
    # use slurp's obsidian:// URL integration
    # on MacOS *i think* you can use "open" instead of "xdg-open"
    xdg-open "obsidian://slurp?url=${URL}${i}"

    # give slurp time to do the thing
    sleep 3
done
```

save that somewhere with whatever name you like, eg:
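for instance (the filename below is just a placeholder i'm making up, use whatever you called it), make it executable and run it:

```bash
# placeholder filename -- substitute whatever you actually saved it as
chmod +x slurp-pages.sh
./slurp-pages.sh
```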
i'm sure the same could be accomplished with PowerShell on Windows as well, but i'm not really sure how. i do know that running the bash script in WSL on Windows will not work, though.
Good (and I daresay "un-inhuman") of you to take the time to provide this information.
This method would suffice for me, surely, so I'd say only implement something if the FR racks up a dozen likes or so. Cheers, mate! All the best,
I've used this in the past with the browser extension DownThemAll.
There, the syntax to extract PDF pages was:
https://adt.arcanum.com/check-access-save/MNYTESZ_Hun_1/?pg=[0:1149]
Now, you understand I don't want PDFs with this plugin; I just wanted to show the syntax.
So the syntax would be something like
`[0:9]`
or `[0-9]`.
The plugin creator would not be responsible for knowing how many digits or items there are in the range (some sites use 1, 01, or even 001 as the first item, or sometimes the range you'd want is only from 22 to 45) -- that's for the user to feel out.
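Just to make the idea concrete, here is a rough sketch in plain bash of how a `[start:end]` pattern could be expanded into individual URLs. The example URL and the helper function are made up for illustration only; this is not plugin code.

```bash
#!/bin/bash
# rough sketch of the requested [start:end] expansion -- the URL and the
# helper name are invented for illustration, not how the plugin works today
expand_range() {
    local pattern="$1"
    local start end
    # pull the two numbers out of the [start:end] part
    start=$(echo "$pattern" | sed -E 's/.*\[([0-9]+):([0-9]+)\].*/\1/')
    end=$(echo "$pattern" | sed -E 's/.*\[([0-9]+):([0-9]+)\].*/\2/')
    # emit one concrete URL per number in the range
    for i in $(seq "$start" "$end"); do
        echo "${pattern/\[${start}:${end}\]/$i}"
    done
}

expand_range "https://example.com/archive/?pg=[1:5]"
# prints:
#   https://example.com/archive/?pg=1
#   ...
#   https://example.com/archive/?pg=5
```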
Great plugin,
Cheers
Z.