Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modularize youtube-transcript's fetchTranscript API #35

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

rshmhrj
Copy link

@rshmhrj rshmhrj commented Jul 29, 2024

Hi @Kakulukian @mefengl @balmacefa, here is an initial draft at refactoring to modularize the fetchTranscript method, which allows users to override specific parts -- related to #34.

Using YTTranscript from node or from a server works fine, but in the Obsidian Plugin web-view context, we are getting CORS errors. These changes should allow me to do something like this:

import { YoutubeTranscript } from 'youtube-transcript';
import { request } from "obsidian";

class ObsidianYoutubeTranscript extends YoutubeTranscript {
  videoId: string;
  videoPageBody: string;
  transcriptBody: string;
  public async getPageBody(): Promise<string> {
    const videoPageResponse = await request(
    `https://www.youtube.com/watch?v=${this.videoId}`
    );
    this.videoPageBody = videoPageResponse;
    return this.videoPageBody;
  }

  public async getTranscriptResponse(transcriptURL: string): Promise<string> {
    const transcriptResponse = await request(transcriptURL);
    this.transcriptBody = transcriptResponse;
    return this.transcriptBody;
  }
}

ObsidianYoutubeTranscript.fetchTranscript(url).then(console.log);

So I can use the obsidian.request module to perform the fetch for the pageBody and for the transcriptBody and continue to use the rest of the functionality as defined.

Also took the opportunity to add a bunch of tests, as I didn't want to break anything (and to also get a better idea of how everything worked). I used the command npm run test to execute all the tests.

image

What do you all think?

- changed method signature from private to protected to enable testing
- extract each piece of work from fetchTranscript into separate functions
- add class variables to store state during processing and making passing data between methods easier (started with only videoId and config, but extended it to include all key data processing outputs)
- refactor static fetchTranscript to instantiate new `ytt` object and perform the same original processing steps, to keep original interface intact and backward compatible
- add docstrings for each method
- add tests for original static method using fetch and created mocks / mock data for testing other ways of getting data from urls -- didn't want to add `obsidian` as a dev dependency only for testing
- previous version didn't work properly since the static fetchTranscript was called and it reinstantiated the ytt object of type YoutubeTranscript (instead of the overridden / extended child ObsidianYoutubeTranscript)
@rshmhrj
Copy link
Author

rshmhrj commented Aug 22, 2024

Hi @Kakulukian @mefengl @balmacefa -- any chance you can make some time to review this please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant