Bluesky.from_as1: convert links, @-mentions, and hashtags to facets #675

Open
snarfed opened this issue Feb 5, 2024 · 15 comments

Comments

@snarfed
Owner

snarfed commented Feb 5, 2024

We should support facet output in from_as1! Specifically, convert mention and hashtag tags to app.bsky.richtext.facet#mention and #tag features, respectively, and HTML links in content to #link features.
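
For reference, here is a minimal sketch of what the three facet types could look like as plain dicts. It assumes the field names from the `app.bsky.richtext.facet` lexicon (`index.byteStart`, `index.byteEnd`, `features`), with indices counting bytes of the UTF-8 encoded text. The `make_facet` helper, the sample text, and the DID are hypothetical and are not part of granary.

```python
def make_facet(text, substring, feature):
    """Build a facet dict covering the first occurrence of substring in text.

    Returns None if substring does not appear in text.
    """
    char_start = text.find(substring)
    if char_start < 0:
        return None
    # Facet indices count UTF-8 bytes, not characters.
    byte_start = len(text[:char_start].encode("utf-8"))
    byte_end = byte_start + len(substring.encode("utf-8"))
    return {
        "$type": "app.bsky.richtext.facet",
        "index": {"byteStart": byte_start, "byteEnd": byte_end},
        "features": [feature],
    }


# Hypothetical sample post; "é" takes two bytes in UTF-8.
text = "héllo @alice.example #indieweb https://example.com/"
facets = [
    make_facet(text, "@alice.example", {
        "$type": "app.bsky.richtext.facet#mention",
        "did": "did:plc:hypothetical",  # placeholder DID, an assumption
    }),
    make_facet(text, "#indieweb", {
        "$type": "app.bsky.richtext.facet#tag",
        "tag": "indieweb",
    }),
    make_facet(text, "https://example.com/", {
        "$type": "app.bsky.richtext.facet#link",
        "uri": "https://example.com/",
    }),
]
```

Note that the mention starts at character 6 but byte 7, which is why character offsets can't be used directly.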

This will be a bit tricky, since we don't currently generate indices for those tags in e.g. microformats.json_to_object and Source.postprocess_object, and we don't generate anything at all for plain links yet. We'd also want to look into how AS1 tag indices work: do they always index into plain text content, e.g. content.value?

cc @JoelOtter

snarfed added the now label Feb 5, 2024
@JoelOtter
Contributor

Possibly not the place for this(?) but we also need to think about link cards on Bluesky, as those aren't implicitly created like they are on e.g. Mastodon. Should the card use the first or last link in a post by default? My gut says last, based on how I usually format posts.

@snarfed
Owner Author

snarfed commented Feb 5, 2024

@JoelOtter Ooh good point! Last link sounds fine to me, but I wonder if there's a more native mf2 way to do it. Will ask on #microformats.

@snarfed
Owner Author

snarfed commented Feb 7, 2024

From snarfed/bridgy#1661 (comment):

Apart from @-mentions specifically, I'll also echo here what I mentioned in #675 in general: this is going to be difficult to implement. We have to "disassemble" span-based HTML markup into Bluesky index-based facets. For arbitrary content HTML, we have to parse it, extract just the tags we care about (currently links and microformats2 hashtag u-categories), discard other tags (including overlapping ones like the span here), extract the plain text, calculate the start and end indices of the tags we care about into the plain text, convert those indices to byte offsets in the UTF-8 encoded text, and populate all of that into Bluesky facets. Phew.

@kevinmarks sent the old XOXO parser as an example that does this, which is great! Still though. This feels like a nontrivial project.
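
To make the steps above concrete, here is a rough sketch using only the standard library's `html.parser`. It is not granary's implementation and not the XOXO parser. It handles `<a href>` links only, drops every other tag while keeping its text, and tracks offsets as UTF-8 bytes, which is what Bluesky facet indices count. The `FacetExtractor` class and `html_to_text_and_facets` function are hypothetical names.

```python
from html.parser import HTMLParser


class FacetExtractor(HTMLParser):
    """Collects plain text plus UTF-8 byte ranges for <a href> links."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []      # plain text pieces, in document order
        self.byte_len = 0     # UTF-8 length of the text collected so far
        self.open_links = []  # stack of (href, byte_start) for open <a> tags
        self.facets = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.open_links.append((dict(attrs).get("href"), self.byte_len))
        # Every other tag (span, b, ...) is dropped; only its text is kept.

    def handle_endtag(self, tag):
        if tag == "a" and self.open_links:
            href, start = self.open_links.pop()
            if href and self.byte_len > start:  # skip empty or href-less links
                self.facets.append({
                    "$type": "app.bsky.richtext.facet",
                    "index": {"byteStart": start, "byteEnd": self.byte_len},
                    "features": [{
                        "$type": "app.bsky.richtext.facet#link",
                        "uri": href,
                    }],
                })

    def handle_data(self, data):
        self.chunks.append(data)
        self.byte_len += len(data.encode("utf-8"))


def html_to_text_and_facets(html):
    """Return (plain_text, facets) for an HTML fragment."""
    parser = FacetExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.chunks), parser.facets


html = 'Café: <span>see <a href="https://example.com/">my <b>site</b></a></span>!'
text, facets = html_to_text_and_facets(html)
```

This sketch ignores whitespace normalization, `<br>` and block-level tags, and hashtag or mention detection, all of which a real converter would need to handle.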

snarfed removed the now label Feb 7, 2024
@snarfed
Owner Author

snarfed commented Feb 8, 2024

@JoelOtter we discussed link preview cards briefly on #microformats and ended up with the proposal that users could use u-featured to indicate the link(s) to preview: https://indieweb.org/link-preview#which_link_to_preview. If none of the links have that, we could default to first, or whatever.

@JoelOtter
Contributor

Makes good sense to me. I think I'd still opt to make it the last one but as long as it's documented and configurable either should be good!

@JoelOtter
Contributor

I guess the other question is: do we still include the URL in the post itself, or do we remove it in favour of just the link card? Bluesky lets you do this.

@snarfed
Owner Author

snarfed commented Feb 8, 2024

Oh sorry, yes, definitely last!

Whether to remove the link or not, good question. Maybe yes when it's the very last text in content, and we generate a preview for it, otherwise no?
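
The rule discussed so far (prefer a u-featured link, otherwise fall back to the last link, and drop the URL from the text only when it's the very last thing in content) could be sketched roughly as below. Both function names and the `featured` key are hypothetical. Picking the first featured link when there are several is also just an assumption, since the thread doesn't settle that case.

```python
def choose_preview_link(links):
    """Pick the URL to preview from links, given in document order.

    Each link is a dict with a 'url' key and an optional boolean 'featured'
    key (standing in for u-featured). Returns None if there are no links.
    """
    if not links:
        return None
    featured = [link for link in links if link.get("featured")]
    if featured:
        return featured[0]["url"]  # assumption: first featured link wins
    return links[-1]["url"]        # default discussed above: last link


def strip_trailing_link(text, url):
    """Remove url from text only if it is the very last text in the content."""
    stripped = text.rstrip()
    if url and stripped.endswith(url):
        return stripped[: -len(url)].rstrip()
    return text


links = [{"url": "https://a.example/"}, {"url": "https://b.example/post"}]
preview = choose_preview_link(links)
trimmed = strip_trailing_link("Read this https://b.example/post", preview)
```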

snarfed added a commit that referenced this issue Apr 1, 2024
...even if it was converted from HTML content. for #675
@snarfed
Owner Author

snarfed commented Apr 1, 2024

@JoelOtter
Contributor

I fear this has broken Bluesky publish - it seems to be creating facets for all tags, not just hashtags, including tags that may not be present in the text. Example for this post:

https://www.joelotter.com/notes/2024/04/05-japan1/

https://brid.gy/log?module=default&start_time=1712287175&key=agdicmlkLWd5clkLEg1QdWJsaXNoZWRQYWdlIjJodHRwczovL3d3dy5qb2Vsb3R0ZXIuY29tL25vdGVzLzIwMjQvMDQvMDUtamFwYW4xLwwLEgdQdWJsaXNoGICA-O_I7JYLDA
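
One way to guard against the problem described above is to emit a tag facet only for tags that are hashtags and whose `#name` actually appears in the post text. The sketch below illustrates that idea; it is not necessarily how the fix was implemented. It assumes AS1 tags are dicts with `objectType` and `displayName` keys, and `hashtag_facets` is a hypothetical name.

```python
import re


def hashtag_facets(text, tags):
    """Return #tag facets only for hashtag tags found in text."""
    facets = []
    for tag in tags:
        name = (tag.get("displayName") or "").lstrip("#")
        if tag.get("objectType") != "hashtag" or not name:
            continue  # not a hashtag, e.g. a plain category or a mention
        # Case-sensitive match; (?!\w) keeps #japan from matching #japanese.
        match = re.search(re.escape("#" + name) + r"(?!\w)", text)
        if not match:
            continue  # hashtag isn't in the visible text, so no facet
        byte_start = len(text[:match.start()].encode("utf-8"))
        byte_end = byte_start + len(match.group().encode("utf-8"))
        facets.append({
            "$type": "app.bsky.richtext.facet",
            "index": {"byteStart": byte_start, "byteEnd": byte_end},
            "features": [{"$type": "app.bsky.richtext.facet#tag", "tag": name}],
        })
    return facets


text = "Landed in Tokyo #japan"
tags = [
    {"objectType": "hashtag", "displayName": "japan"},
    {"objectType": "hashtag", "displayName": "travel"},   # not in the text
    {"objectType": "category", "displayName": "Tokyo"},   # not a hashtag
]
facets = hashtag_facets(text, tags)
```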

@snarfed
Owner Author

snarfed commented Apr 5, 2024

Oh no! Sorry about that, you're right. Will fix.

@snarfed
Owner Author

snarfed commented Apr 5, 2024

OK @JoelOtter that should be fixed, feel free to try again.

@JoelOtter
Contributor

JoelOtter commented Apr 6, 2024 via email

@snarfed
Owner Author

snarfed commented Apr 6, 2024

@snarfed
Owner Author

snarfed commented Apr 6, 2024

@JoelOtter feel free to try hashtags with p-category, and @-mentions with any link to a bsky.app user with link text starting with @, if you want!
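
As an illustration of the @-mention rule described above (a link to a bsky.app user whose link text starts with @), a check along these lines might work. The URL pattern and the `mention_target` name are assumptions for this sketch, not granary's actual code. It returns the handle or DID from the profile URL; resolving a handle to the DID that a #mention facet needs is a separate step not shown here.

```python
import re

# Assumed shape of a bsky.app profile URL: https://bsky.app/profile/<handle-or-did>
BSKY_PROFILE_RE = re.compile(r"^https://bsky\.app/profile/([^/?#]+)/?$")


def mention_target(href, link_text):
    """Return the profile identifier if this link looks like an @-mention.

    A link counts as a mention only if its text starts with "@" and its href
    points at a bsky.app profile. Otherwise returns None.
    """
    if not link_text.strip().startswith("@"):
        return None
    match = BSKY_PROFILE_RE.match(href or "")
    return match.group(1) if match else None


target = mention_target("https://bsky.app/profile/alice.example", "@alice.example")
```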

@snarfed
Owner Author

snarfed commented Jul 17, 2024

The one remaining bit here is to convert arbitrary HTML links, ie <a href> tags, to Bluesky facets.
