Open Graph link preview image according to the document to open #224

Open
smoya opened this issue Jan 9, 2022 · 69 comments
Labels
enhancement New feature or request keep-open

Comments

@smoya
Member

smoya commented Jan 9, 2022

OpenGraph Studio issue

Reason/Context

Thanks to the ?url=<url-of-file> and ?base64=<base64-encoded-doc> query params, Studio can load most files (yes, not all of them, see #127). I expect users will use that to share their AsyncAPI docs.

Whenever a link to Studio (with or without those query params) is pasted into social media (Twitter, Linkedin, Facebook, Slack...), the preview image is this one:

https://user-images.githubusercontent.com/1083296/148680670-98b88679-c2b4-449b-a671-d03aa06c1f83.png

It is a great pic, however it says nothing about the file being shared.

What if we could dynamically generate the preview image based on the file being shared? For example, the title, description and some stats could be shown.

I created a POC based on https://github.com/vercel/og-image (deprecated atm), available in my fork (it's just a POC). It is a server that generates dynamic images to be used in Open Graph image meta tags. It works by generating dynamic HTML, taking a screenshot of it through headless Chromium, and serving the resulting image.

The server accepts a ?base64=<base64-encoded-doc> query param, and generates an image that contains the AsyncAPI doc Title, Description, number of servers, channels and messages.

Despite the horrible design, the service is able to generate the following:

Based on the following AsyncAPI doc:

See
asyncapi: '2.2.0'
info:
  title: Account Service
  version: 1.0.0
  description: This service is in charge of processing user signups
channels:
  user/signedup:
    subscribe:
      message:
        $ref: '#/components/messages/UserSignedUp'
components:
  messages:
    UserSignedUp:
      payload:
        type: object
        properties:
          displayName:
            type: string
            description: Name of the user
          email:
            type: string
            format: email
            description: Email of the user

Open in Studio

Studio will need to modify the og:image tag so it points to this new service.

<meta property="og:image" content="http://<service-url>/*.png?theme=light&base64=YXN5bmNhcGk6ICcyLjIuMCcKaW5mbzoKICB0aXRsZTogQWNjb3VudCBTZXJ2aWNlCiAgdmVyc2lvbjogMS4wLjAKICBkZXNjcmlwdGlvbjogVGhpcyBzZXJ2aWNlIGlzIGluIGNoYXJnZSBvZiBwcm9jZXNzaW5nIHVzZXIgc2lnbnVwcwpjaGFubmVsczoKICB1c2VyL3NpZ25lZHVwOgogICAgc3Vic2NyaWJlOgogICAgICBtZXNzYWdlOgogICAgICAgICRyZWY6ICcjL2NvbXBvbmVudHMvbWVzc2FnZXMvVXNlclNpZ25lZFVwJwpjb21wb25lbnRzOgogIG1lc3NhZ2VzOgogICAgVXNlclNpZ25lZFVwOgogICAgICBwYXlsb2FkOgogICAgICAgIHR5cGU6IG9iamVjdAogICAgICAgIHByb3BlcnRpZXM6CiAgICAgICAgICBkaXNwbGF5TmFtZToKICAgICAgICAgICAgdHlwZTogc3RyaW5nCiAgICAgICAgICAgIGRlc2NyaXB0aW9uOiBOYW1lIG9mIHRoZSB1c2VyCiAgICAgICAgICBlbWFpbDoKICAgICAgICAgICAgdHlwZTogc3RyaW5nCiAgICAgICAgICAgIGZvcm1hdDogZW1haWwKICAgICAgICAgICAgZGVzY3JpcHRpb246IEVtYWlsIG9mIHRoZSB1c2Vy" />
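As a minimal sketch of how such an og:image URL could be assembled from a raw AsyncAPI document: the helper name `buildOgImageUrl`, the `/og.png` path and the `og.example.com` host are assumptions for illustration only, not part of the POC. Only the `theme` and `base64` query params come from the tag above.

```javascript
// Hypothetical helper (not from the POC): builds the og:image URL for a document.
// The "/og.png" path is an assumption; the real service may use any *.png path.
function buildOgImageUrl(serviceBaseUrl, asyncapiDoc, theme = 'light') {
  // Encode the raw YAML/JSON document as base64 (Node.js Buffer API).
  const base64 = Buffer.from(asyncapiDoc, 'utf8').toString('base64');
  const url = new URL('/og.png', serviceBaseUrl);
  // searchParams.set percent-encodes "+", "/" and "=" so the value survives the round trip.
  url.searchParams.set('theme', theme);
  url.searchParams.set('base64', base64);
  return url.toString();
}

// Usage: the returned string is what would go into the content attribute.
console.log(buildOgImageUrl('https://og.example.com', "asyncapi: '2.2.0'\n"));
```

Because the base64 value is percent-encoded, the resulting URL will not look byte-for-byte like the hand-written tag above, but it decodes to the same document.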

The preview image would then look like (note that https://shaggy-stingray-56.loca.lt/ was a local tunnel to my localhost serving a simple html with the og-image tag):

https://user-images.githubusercontent.com/1083296/148680677-a6c037fd-f2be-476e-94be-eecada0702bb.png

By the way, all of this could run on serverless functions such as the Netlify functions (which are AWS Lambda) available in free tier :)

Description

Here is a sequence diagram showing the overall flow of a request made by an Open Graph crawler (the crawlers that fetch the Open Graph image whenever you share a link):

sequenceDiagram
Open Graph Crawler->>+Studio: /?base64=<encoded_doc>
Studio->>Studio: Set og:title, and og:description metatags. Set og:image to <og-generator-service>/generate.png?title=foo&description=bar&operations=4&servers=2 
Studio->>-Open Graph Crawler: Pre-rendered Studio HTML webpage
Open Graph Crawler->>+OpenGraph Generator: <og-generator-service>/generate.png?title=foo&description=bar&operations=4&servers=2 
OpenGraph Generator->>-Open Graph Crawler: og-image.png

Note that, as explained in this comment, we would need to configure pre-rendering in Netlify for doing the og:image content URL replacement on each request made by a crawler.
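The Studio step of the sequence diagram could be sketched roughly as follows. This is only an illustration: `renderOgTags`, `buildGeneratorUrl` and the `og.example.com` host are hypothetical names, and the query params (`title`, `description`, `operations`, `servers`) are simply the ones used in the diagram.

```javascript
// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical: build <og-generator-service>/generate.png?title=...&servers=...
function buildGeneratorUrl(generatorBaseUrl, summary) {
  const url = new URL('/generate.png', generatorBaseUrl);
  for (const [key, value] of Object.entries(summary)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Hypothetical: render the three Open Graph tags Studio would set for a crawler.
function renderOgTags(generatorBaseUrl, summary) {
  const imageUrl = buildGeneratorUrl(generatorBaseUrl, summary);
  return [
    `<meta property="og:title" content="${escapeAttr(summary.title)}" />`,
    `<meta property="og:description" content="${escapeAttr(summary.description)}" />`,
    `<meta property="og:image" content="${escapeAttr(imageUrl)}" />`,
  ].join('\n');
}

console.log(
  renderOgTags('https://og.example.com', {
    title: 'foo',
    description: 'bar',
    operations: 4,
    servers: 2,
  })
);
```

Escaping matters here because the title and description come from a user-supplied document and end up inside HTML attributes.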

Alternatively, whatever technology we use (for example NextJS), the flow for rendering the Studio page would be something like the following:

flowchart TD
    A[User] --> B(https://studio.asyncapi.com)
    B --> C{contains ?base64 or ?url}
    C -->|No| D[Static rendering]
    C -->|Yes| E[Dynamic rendering]
    E --> F(Parsing AsyncAPI doc + etc)

In case the image can't be generated for whatever reason, the default AsyncAPI Studio image should be served instead: https://studio.asyncapi.com/img/meta-studio-og-image.jpeg
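One way this fallback could look, as a sketch under assumptions: `resolveOgImage` is a hypothetical wrapper, `generate` stands for whatever function ends up producing the image URL, and the 3000 ms default mirrors the rough budget mentioned later in this issue rather than a measured value.

```javascript
// Default image URL taken from the issue text.
const DEFAULT_OG_IMAGE = 'https://studio.asyncapi.com/img/meta-studio-og-image.jpeg';

// Hypothetical wrapper: returns the generated image URL, or the default one
// if generation throws, rejects, or takes longer than timeoutMs.
async function resolveOgImage(generate, timeoutMs = 3000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('og image generation timed out')), timeoutMs);
  });
  try {
    // Whichever settles first wins: the generator or the timeout.
    return await Promise.race([generate(), timeout]);
  } catch (err) {
    return DEFAULT_OG_IMAGE;
  } finally {
    // Always clear the timer so it does not keep the process alive.
    clearTimeout(timer);
  }
}

// Usage: a failing generator falls back to the default image.
resolveOgImage(() => Promise.reject(new Error('parser failed'))).then((url) => console.log(url));
```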

What you will need to do

Note that the design of the Open Graph image card is also part of this task. Ask @Mayaleeeee for help on this (Thanks! 🙌 ).

Prerequisites

  1. Fork Studio.
  2. Deploy it to your own Netlify free account. I recommend doing it via Netlify’s website UI and not via Netlify CLI. With a few clicks your site will be configured to be deployed on each push to the branch you specify.
  3. Enable Prerendering in your new Netlify site. This will allow web crawlers (such as the ones used for fetching the OpenGraph meta tags) to receive a fully rendered version of the website, including content loaded by Javascript.

Work to do

  1. Create a new Github repository where your Open Graph image generator service will be tracked.

  2. Then create a new service that exposes an HTTP API that generates an Open Graph image based on a few query params (use the names you want, the following are just suggestions):

    1. doc_url: a URL pointing to a raw AsyncAPI document.
    2. doc_base64: an AsyncAPI document encoded in base64.

    Some hints:

    • You will need to use the AsyncAPI Parser-JS to parse your document and extract the data you need from it
    • In order to generate the image, you can use the @vercel/og package (og-image is deprecated now). Documentation on how to use it is available here. Alternatively, if that package is not compatible in a non-vercel world, you might want to take a look at https://github.com/vercel/satori, which is what that package uses under the hood.
  3. Deploy this new service somewhere. I recommend you to deploy it via Netlify Functions. Or even better if we can get it as a Netlify Edge Function (support of npm packages is still experimental) since I believe we will be able to implement a caching mechanism easily.

  4. Once we have a public URL for that service, include some JavaScript code somewhere in the Studio website that modifies the og:image meta tag content to point to the new service URL, including the doc_url or doc_base64 query param with the right content. That is the trick that will make the OpenGraph image change dynamically based on those parameters.

  5. Performance is a must. Serving the Open Graph tags plus generating the image should not take more than a few seconds (~3), otherwise crawlers will time out (for example, Slack's crawler times out at 5 seconds).

  6. Investigate caching. If hosted as a Netlify function, I believe we could just trust in cached responses. See https://docs.netlify.com/platform/caching/#supported-cache-control-headers. Otherwise, we could give Netlify Blobs a try and store each generated image using the base64 hash (or a reproducible and atomic hash), so every new request first checks whether that image has already been generated and, if it has, serves the blob directly (I'm not 100% sure this use case can be supported, but I guess it can).

    Anyway, more investigation into how to implement the service is needed, so please do not take my words here as the right way to do it, as I didn’t spend time on it when I created this issue.
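The caching idea in step 6 could be sketched like this. Everything here is an assumption for illustration: `cacheKeyFor`, `cachingHeaders` and `getOrGenerate` are hypothetical names, SHA-256 is just one reproducible hash, the one-day `max-age` is arbitrary, and a plain `Map` stands in for whatever store (for example Netlify Blobs) is eventually chosen.

```javascript
const { createHash } = require('node:crypto');

// Reproducible key: the same base64 document always maps to the same hash.
function cacheKeyFor(base64Doc) {
  return createHash('sha256').update(base64Doc).digest('hex');
}

// Standard HTTP caching headers for the generated image response.
// The max-age value is an arbitrary example, not a recommendation.
function cachingHeaders(base64Doc, maxAgeSeconds = 86400) {
  return {
    'Content-Type': 'image/png',
    'Cache-Control': `public, max-age=${maxAgeSeconds}, immutable`,
    ETag: `"${cacheKeyFor(base64Doc)}"`,
  };
}

// Check the store first; generate and store only on a miss.
// `store` is any Map-like object standing in for a real blob store.
function getOrGenerate(store, base64Doc, generate) {
  const key = cacheKeyFor(base64Doc);
  if (!store.has(key)) {
    store.set(key, generate(base64Doc));
  }
  return store.get(key);
}

// Usage: the second call is served from the store without regenerating.
const store = new Map();
const render = (doc) => `image-for-${doc.length}-chars`;
console.log(getOrGenerate(store, 'YXN5bmNhcGk=', render));
console.log(getOrGenerate(store, 'YXN5bmNhcGk=', render));
```

Keying on a hash of the document keeps the key short and fixed-length, regardless of how large the base64 payload is.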

GSoC 2024

This issue got accepted as part of the GSoC 2024. @helios2003 is assigned as mentee.

We are using the following read-only Project board to track the current status of the work: https://github.com/orgs/asyncapi/projects/49/views/1

@smoya smoya added the enhancement New feature or request label Jan 9, 2022
@loteoo

loteoo commented Jan 10, 2022

Hope I'm not butting in too much but in case you guys want a quick and easy solution for this (until you develop your own), you can use https://thumbsmith.com/

(Full disclaimer - I'm the founder 😬 )

Usually this is more for websites that can't easily deploy services like you guys tho (eg: wordpress sites)

@magicmatatjahu
Member

@smoya Thanks for that awesome idea! I understand the implementation, but the main problem with how Studio currently works is that it is a full SPA: everything is loaded and resolved at runtime in JS. Even if we update the og: meta tags in the head of the page, web crawlers (like Facebook's) will have already read the metadata and will not wait for the JS to load and execute. I don't know how to get around this problem. One possible solution could be to intercept the request on the server side, check whether the requester is a crawler, and return the appropriate HTML with the appropriate metadata, but for this we would need a simple server.

@loteoo Hi and thanks! :) It may sound stupid, but does your solution support SPA applications that change this metadata at runtime (see my comment above)?

@loteoo

loteoo commented Jan 10, 2022

Hi @magicmatatjahu, my pleasure! Unfortunately, as you mentioned, the meta tags really need to be in the initial HTML for the crawlers to pick them up. You will have to do SSR, or wait for something similar to this that we actually have planned on our roadmap. (But you will have to share a custom URL, not the original URL)

@smoya
Member Author

smoya commented Jan 10, 2022

@magicmatatjahu as we are hosting the Studio in Netlify, we can make use of prerendering which will prerender pages when those are requested by crawlers. See https://docs.netlify.com/site-deploys/post-processing/prerendering/

EDIT: I tested this by creating a new site from Studio in my personal Netlify account. Just by enabling prerendering and adding a simple JS script, we can make it work:

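<script>
  // Point og:image at the preview image service when a document is shared via ?base64=
  const params = new URLSearchParams(window.location.search);
  const ogImage = document.querySelector('meta[property="og:image"]');
  if (ogImage && params.has('base64')) {
    // params.get() returns the decoded value, so re-encode it before embedding it in another URL
    ogImage.setAttribute('content', 'https://example.com/?base64=' + encodeURIComponent(params.get('base64')));
  }
</script>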

Source: https://github.com/smoya/studio/blob/master/public/index.html#L32-L37
replace example.com with the preview image service URL we would deploy

curl "https://hopeful-liskov-2a657c.netlify.app/?_escaped_fragment_=&base64=BASE64HERE" | grep og:image

...

<meta property="og:image" content="https://example.com/?base64=BASE64HERE">

@magicmatatjahu
Member

@smoya https://www.youtube.com/watch?v=9CS7j5I6aOc Can I say that we are in home? 😄

@magicmatatjahu
Member

@smoya So, do you wanna handle that feature and create such a lambda for Netlify? I don't know where we should keep the source code of this lambda, in the Studio or as a new repo in the organization?

@smoya
Member Author

smoya commented Jan 10, 2022

> @smoya So, do you wanna handle that feature and create such a lambda for Netlify? I don't know where we should keep the source code of this lambda, in the Studio or as a new repo in the organization?

The pro of having it in this repository is that it will always be in sync with the code and everything will be handled by Netlify at the deploy level.
The con is that we will be adding more code to this repository that is not really needed for running Studio.
I still need to clean up the code, so if the final code is not too big, I would advocate for adding it to this repository.

WDYT?

@magicmatatjahu
Member

@smoya then we have it in this repo :)

@smoya
Member Author

smoya commented Jan 12, 2022

@mcturco I would like to ask you if you could help me with the design part for this.

@mcturco
Member

mcturco commented Jan 12, 2022

@smoya yes, I can help out with a design for this! Are there any limitations as far as layout/styling goes? I will use your example images to reference what content will be included, but was just wondering how it would be implemented since I see the open graph image is gathering meta information and not using HTML (unless I am incorrect?)

@smoya
Member Author

smoya commented Jan 12, 2022

> @smoya yes, I can help out with a design for this! Are there any limitations as far as layout/styling goes? I will use your example images to reference what content will be included, but was just wondering how it would be implemented since I see the open graph image is gathering meta information and not using HTML (unless I am incorrect?)

Those images I attached are generated from HTML. What we need is to design those cards in HTML (CSS, TS, whatever).
The HTML used for those images is located here.
I guess we could ask @magicmatatjahu or any other user from the community to help with that part, but still the design is needed.
The minimum size would be 1200 x 630 (as recommended for high-res displays).

About the data we can display on it, I'm open to suggestions. I thought of:

  • Title and version
  • Description (truncated) - (Should we render markdown?)
  • Num of Servers
  • Num of Channels
  • Num of Send operations (previously listed as Publish)
  • Num of Receive operations (previously listed as Subscribe)
  • Num of Messages

Anything you all think we could add/remove from it? (we should try to avoid overloading the image, so some could be dropped if needed).
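To make the list above concrete, here is a rough sketch of how those numbers could be derived. It is a simplification under stated assumptions: `summarize` and `truncate` are hypothetical helpers, the input is a plain object shaped like an already-parsed AsyncAPI 2.x document (a real implementation would go through the AsyncAPI parser instead), messages are counted only under `components.messages`, and the 120-character limit is arbitrary.

```javascript
// Cut long descriptions so they fit on the card; the limit is an arbitrary example.
function truncate(text, maxLength = 120) {
  if (text.length <= maxLength) return text;
  return text.slice(0, maxLength - 1).trimEnd() + '…';
}

// Hypothetical: collect the card data from a plain AsyncAPI 2.x-shaped object.
function summarize(doc) {
  const channels = Object.values(doc.channels ?? {});
  return {
    title: doc.info?.title ?? 'Untitled',
    version: doc.info?.version ?? '',
    description: truncate(doc.info?.description ?? ''),
    servers: Object.keys(doc.servers ?? {}).length,
    channels: channels.length,
    publishOperations: channels.filter((channel) => channel.publish).length,
    subscribeOperations: channels.filter((channel) => channel.subscribe).length,
    // Simplification: inline (non-component) messages are not counted.
    messages: Object.keys(doc.components?.messages ?? {}).length,
  };
}

// Usage with the Account Service example from the issue description.
console.log(
  summarize({
    asyncapi: '2.2.0',
    info: {
      title: 'Account Service',
      version: '1.0.0',
      description: 'This service is in charge of processing user signups',
    },
    channels: {
      'user/signedup': {
        subscribe: { message: { $ref: '#/components/messages/UserSignedUp' } },
      },
    },
    components: { messages: { UserSignedUp: { payload: { type: 'object' } } } },
  })
);
```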

@mcturco
Member

mcturco commented Jan 12, 2022

@smoya sounds good! yeah just wanted to make sure that I can apply some of the new styles that we have been using as part of the brand refresh. Cool! I can get to work on that 😄

@mcturco
Member

mcturco commented Mar 8, 2022

Hi all! Sorry for the delay on my part for this issue. Going to add this back onto my list as we are working towards launching the new brand stuff. I will be using the new logo/colors/typography for this open graph 👍

@github-actions

github-actions bot commented Jul 7, 2022

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added the stale label Jul 7, 2022
@mcturco mcturco moved this to Contributions Needed in Design [old] Jul 7, 2022
@github-actions github-actions bot added the stale label Nov 5, 2022
@smoya
Member Author

smoya commented Nov 8, 2022

still relevant

@BOLT04
Member

BOLT04 commented Nov 12, 2022

Hello everyone 👋,

I was looking for some issues to contribute to and help out with, and found this one related to SEO 😃.
@smoya @mcturco Is there anything code related that we can create a PR for, even just a draft PR to start with?
From what I understood, the code you have on your fork @smoya could just go into this PR, and we'd deploy that Netlify function.

Would love your feedback. If there is anything I can help with, let me know 👍

@magicmatatjahu
Member

@BOLT04 It's hard to say where to start because we need to test the Netlify preview feature, write a lambda function that would generate such images (using this vercel-og project and changing it a bit for our use case) and then connect it together. It's hard for me to write where to start and how difficult it is. However, if you want we can discuss it :)

@BOLT04
Member

BOLT04 commented Nov 27, 2022

one question @magicmatatjahu, could we start development using a sort of mock design? Then when @mcturco has the final version for this open graph preview, we change the code to use that?

@magicmatatjahu
Member

@BOLT04 Yeah, we can mock the "preview image" during development and change it at the end. Sergio did exactly that: he focused on the logic and mocked the preview image, see #224 (comment). We can even reuse his code, but it is a year old, so we will probably need to adjust it to the latest Netlify "standard" :) Do you wanna handle it? Btw, sorry for the delay in responding!

@github-actions github-actions bot added the stale label May 7, 2023
@smoya
Member Author

smoya commented May 10, 2023

still relevant

@RegretfulWinter

> I did test sharing a link through your deployed Studio instance and indeed it is so slow that Slack didn't render the preview image of the link 🤔

Thank you for the feedback @smoya. Sorry for the late reply. I tested again today and see it's not producing the same result as when I commented. Will debug today and see how I can streamline the logic of my code.

@Athul0491

Athul0491 commented Mar 12, 2024

> Oh, I see now it works slow, but it works. 👍 Thanks for sharing. Anyway, I encourage you to debug where the time is spent and see how can be improved.

I optimized my code and tested the hosted API on Postman. On average, it now takes about 2.5 seconds to respond. The image generation part of the code takes less than 0.01 seconds, and the rest of the time is spent on parsing.
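A breakdown like the one above (parsing versus image generation) can be obtained by timing each stage separately. The sketch below is only an illustration: `timeStage` is a hypothetical helper, and the two stage functions are trivial stand-ins, not the code that was actually measured.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: run fn, record how long it took (in ms) under `label`.
function timeStage(timings, label, fn) {
  const start = performance.now();
  try {
    return fn();
  } finally {
    // Recorded even if fn throws, so failed stages still show up.
    timings[label] = performance.now() - start;
  }
}

// Usage with stand-in stages; real code would call the parser and the renderer here.
const timings = {};
const parsed = timeStage(timings, 'parse', () => JSON.parse('{"info":{"title":"Account Service"}}'));
timeStage(timings, 'render', () => `card for ${parsed.info.title}`);
console.log(timings);
```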

@GiteshDewangan

@smoya Hello sir, #224 (comment) I started working on this...

@tihom4537

Greetings sir @smoya, this project (#224) seems very interesting to work on. I have been researching and working on it, and will surely come up with my proposal.

@Athul0491

It was mentioned in the application template that I could submit a draft proposal. How can I submit a draft proposal for this project? I wanted to get feedback regarding my proposal.

@smoya
Member Author

smoya commented Mar 25, 2024

> It was mentioned in the application template that I could submit a draft proposal. How can I submit a draft proposal for this project? I wanted to get feedback regarding my proposal.

EDIT: Hi @Athul0491. Please see my message below. 🚀

@smoya
Member Author

smoya commented Mar 25, 2024

FYI, you all have the following GSoC Application Template in case you want to craft impressive proposals.
You can submit a draft proposal early to get feedback and iterate early. Be sure to read Google's guide to writing a proposal.

You can share the proposal via DM (to me) in Slack, or share it here (depending on whether it is data sensitive or not).

@smoya
Member Author

smoya commented Apr 2, 2024

Added a note about the possibility of using https://github.com/vercel/satori in case @vercel/og is not compatible with a non-vercel environment (I doubt it but just in case).

> Alternatively, if that package is not compatible in a non-vercel world, you might want to take a look at https://github.com/vercel/satori, which is what that package uses under the hood.

@smoya
Member Author

smoya commented Apr 2, 2024

Added a mention to the possibility of using cached responses when using Netlify functions as a first possible solution for caching.

> If hosted as a Netlify function, I believe we could just trust in cached responses. See https://docs.netlify.com/platform/caching/#supported-cache-control-headers.

@smoya
Member Author

smoya commented Apr 2, 2024

@RegretfulWinter are you finally applying? The deadline is today https://developers.google.com/open-source/gsoc/timeline#april_2_-_1800_utc

@Mayaleeeee Mayaleeeee moved this to In Progress in Design May 23, 2024
@Mayaleeeee Mayaleeeee removed this from Design May 23, 2024
@Mayaleeeee Mayaleeeee moved this to Upcoming in Design May 23, 2024
@smoya
Member Author

smoya commented May 31, 2024

Just for the record, @helios2003 got selected as GSoC 2024 mentee and is working on this issue.

@helios2003
Contributor

helios2003 commented Jun 11, 2024

I have run some performance tests between the current production instance of studio-next and my deployed instance of studio, which parses the content at the studio level (see the issue description) to dynamically set the open graph tags. My deployed instance of studio can be found here: https://studio-helios2003.netlify.app.

Without base64 document in the URL params

| Metric | https://studio-next.netlify.app | https://studio-helios2003.netlify.app |
| --- | --- | --- |
| Time to first byte | 356 ms | 345 ms |
| First Contentful Paint | 606 ms | 549 ms |
| Onload time | 1400 ms | 1300 ms |
| Largest Contentful Paint | 2900 ms | 2600 ms |
| Time to be Interactive | 3200 ms | 2900 ms |
| Fully loaded time | 4100 ms | 4100 ms |

With base64 document in the URL params

| Metric | https://studio-next.netlify.app | https://studio-helios2003.netlify.app |
| --- | --- | --- |
| Time to first byte | 284 ms | 2600 ms |
| First Contentful Paint | 426 ms | 2800 ms |
| Onload time | 1200 ms | 3500 ms |
| Largest Contentful Paint | 4100 ms | 6700 ms |
| Time to be Interactive | 3200 ms | 5800 ms |
| Fully loaded time | 4200 ms | 6700 ms |

the base64 doc used can be found here: https://tinyurl.com/57dexzrd

@KhudaDad414
Member

@helios2003 what was the approach here? parsing the document twice?

@helios2003
Contributor

Nope, the document is being parsed once.

@smoya
Member Author

smoya commented Jun 11, 2024

After checking @helios2003 tests, I ran a simple comparison test measuring the response time from https://studio.asyncapi.com and https://studio-next.netlify.app.

The Studio URL was always the same one @helios2003 provided in their tests, which loads an AsyncAPI document via the base64 query param.

Look at the results:

| Version | Cache-Status | Time Total (seconds) |
| --- | --- | --- |
| Studio (regular) | "Netlify Edge"; fwd=miss | 0.316334 |
| Studio-next (NextJS) | "Next.js"; hit,"Netlify Edge"; fwd=stale | 4.156618 |

As you can see, the Next-JS version takes almost 4 seconds more than the regular version. I could understand the timing because almost everything is rendered at server level. However, I don't understand the Cache-Status response header then. It says there is a cache HIT. But how so? If it's a hit, I would expect the response to be a prebuilt one. In that case, the response would have to take much less than that.

Any idea why this is happening? @KhudaDad414 @Amzani

@smoya
Member Author

smoya commented Jun 11, 2024

@helios2003 What about intercepting the requests made by Open Graph crawlers and, in that case, printing only the headers? Then no extra JavaScript would need to be rendered, and I expect the response time to be way lower than 5s.
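A minimal sketch of the interception idea, assuming crawlers are recognised by their User-Agent header. `isOpenGraphCrawler` and `chooseResponse` are hypothetical names, and the token list is a non-exhaustive assumption covering a few well-known link-preview bots, not a list taken from Studio.

```javascript
// Assumed, non-exhaustive list of User-Agent substrings used by link-preview crawlers.
const CRAWLER_TOKENS = [
  'facebookexternalhit',
  'twitterbot',
  'slackbot',
  'linkedinbot',
  'discordbot',
  'whatsapp',
];

// True when the User-Agent looks like one of the known crawlers.
function isOpenGraphCrawler(userAgent) {
  const ua = (userAgent ?? '').toLowerCase();
  return CRAWLER_TOKENS.some((token) => ua.includes(token));
}

// Hypothetical routing decision: crawlers get only the meta tags,
// everyone else gets the full Studio application.
function chooseResponse(userAgent) {
  return isOpenGraphCrawler(userAgent) ? 'meta-only' : 'full-app';
}

console.log(chooseResponse('Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)'));
console.log(chooseResponse('Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0 Safari/537.36'));
```

User-Agent matching is easy to get wrong as crawlers change, so unknown agents fall through to the full application rather than the reduced response.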

@helios2003
Contributor

> @helios2003 What about intercepting the request made by opengraph crawlers and print, in that case, only the headers? In that case, no extra javascript would be needed to be rendered. Then, I expect the response time should be way lower than 5s

On doing that this is the result.
image
The time taken is much larger for the first request but reduces substantially for the subsequent calls to the same endpoint. This contains the doc specified in the above message.

@smoya
Member Author

smoya commented Jun 13, 2024

> On doing that this is the result.

~5 seconds to parse the doc + print basic headers is too much. Are you sure only the required headers are being printed? (no headers loading scripts, etc.)

> The time taken is much larger for the first request but reduces substantially for the subsequent calls to the same endpoint.

That's due to the cache at Netlify's edge. If you print the cache-status response header, you will notice a hit.

@smoya
Member Author

smoya commented Jun 17, 2024

Update: The difference in response time between Netlify and Vercel is very noticeable. After the creation of #1118, @helios2003 is going to keep working on the main assigned task (this issue) and will most probably keep deploying their changes to both Netlify and Vercel to avoid unexpected performance issues.

Meanwhile, I expect the owners of Studio to prioritize the investigation.

cc @magicmatatjahu @KhudaDad414 @Amzani

@smoya
Member Author

smoya commented Jul 29, 2024

@Mayaleeeee seems to be out until the first week of August (judging by her absence in Slack and her Slack status message). I hope she can then work on providing the design for the OG card and be on time for the GSoC timeline.

cc @helios2003
