[YouTube] Throttling parameter decryption is broken, decrypt function is not again fully extracted #902

AudricV · 2022-08-18T13:55:36Z

With player 1f7d5369, decryption of the throttling parameter fails because the function is once again not fully extracted:

Left: what is extracted by the extractor; right: the real function

The extractor still works, because this time the exception is properly caught.


Theta-Dev · 2022-08-18T20:49:19Z

I just noticed the same issue. This time regex literals are to blame:

/,,[/,913,/](,)}/,

Avoiding these is not as easy as avoiding braces in strings. We can't simply treat slashes like quotes, because regex character classes can contain unescaped slashes.
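To make the failure concrete, here is a minimal sketch of a scanner that skips quoted strings but knows nothing about regex literals. It is not the NewPipeExtractor code; `naive_body_end` is a hypothetical name, and the sample input in the usage note is made up around the regex literal quoted above.

```rust
/// Naive scan: counts braces and skips quoted strings only.
/// Regex literals are treated as ordinary code, which is the flaw being illustrated.
/// Returns the byte offset just past the brace that brings the depth back to zero.
fn naive_body_end(js: &str) -> Option<usize> {
    let mut depth = 0usize;
    let mut quote: Option<char> = None;
    let mut escaped = false;

    for (i, c) in js.char_indices() {
        if let Some(q) = quote {
            // Inside a string: only an unescaped matching quote ends it.
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == q {
                quote = None;
            }
            continue;
        }
        match c {
            '"' | '\'' | '`' => quote = Some(c),
            '{' => depth += 1,
            '}' => {
                // A close brace before any open brace means the input is unbalanced.
                depth = depth.checked_sub(1)?;
                if depth == 0 {
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }
    None
}
```

Called on `function(a){var b=a.split(/,,[/,913,/](,)}/);return b.join("")}`, the scan stops at the `}` inside the regex literal, so the extracted text ends at `...(,)}` and the rest of the function is lost.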

Theta-Dev · 2022-08-18T20:52:19Z

At this point, wouldn't it be the best solution to use an actual JavaScript lexer to extract the function?

SamantazFox · 2022-08-18T22:17:52Z

> At this point, wouldn't it be the best solution to use an actual JavaScript lexer to extract the function?

Yep, that seems like the only reasonable option to me. And I'm pretty sure that functions will get harder and harder to parse as time goes on.

Theta-Dev · 2022-08-18T22:50:06Z

I am currently working on a YouTube downloader/client library in Rust (that's how I noticed the issue).
So I wrote a test implementation of the fix for it, using the ress lexer.

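use anyhow::{anyhow, Result};

/// Extracts the source of the JS function assigned to `name` (`name = function(...) {...}`).
fn extract_js_fn(js: &str, name: &str) -> Result<String> {
    let scan = ress::Scanner::new(js);
    // 0: looking for the name, 1: expecting `=`, 2: matching braces, 3: done
    let mut state = 0;
    // Current brace nesting depth
    let mut level = 0;

    // Byte offsets of the extracted function within `js`
    let mut start = 0;
    let mut end = 0;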

    for item in scan {
        let it = item?;
        let token = it.token;
        match state {
            // Looking for fn name
            0 => {
                if token.matches_ident_str(name) {
                    state = 1;
                    start = it.span.start;
                }
            }
            // Looking for equals
            1 => {
                if token.matches_punct(ress::tokens::Punct::Equal) {
                    state = 2;
                } else {
                    state = 0;
                }
            }
            // Looking for begin/end braces
            2 => {
                if token.matches_punct(ress::tokens::Punct::OpenBrace) {
                    level += 1;
                } else if token.matches_punct(ress::tokens::Punct::CloseBrace) {
                    level -= 1;

                    if level == 0 {
                        end = it.span.end;
                        state = 3;
                        break;
                    }
                }
            }
            _ => break,
        };
    }

    if state != 3 {
        return Err(anyhow!("could not extract js fn"));
    }

    Ok(js[start..end].to_owned())
}

This works fine with the new player.js.
And it looks like Mozilla Rhino, the JS interpreter we are using, has an API for its parser. So it should be possible to implement this for NewPipe without additional dependencies.

https://javadoc.io/doc/org.mozilla/rhino/latest/index.html
http://ramkulkarni.com/blog/understanding-ast-created-by-mozilla-rhino-parser/

pukkandan · 2022-08-19T04:35:19Z

A lexer isn't really needed. The function body can be extracted by carefully keeping track of the quotes and braces. Equivalent code in yt-dlp: https://github.com/yt-dlp/yt-dlp/blob/b76e9cedb33d23f21060281596f7443750f67758/yt_dlp/jsinterp.py#L229-L254

But if your dependency already has a Lexer, ig why not use it
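As a rough illustration of the quote-and-brace tracking approach (this is not the yt-dlp code linked above), the sketch below also tracks regex literals and their character classes. The names `find_body_end` and `regex_can_start` are hypothetical. Deciding whether a `/` starts a regex or is a division is done with a simple heuristic on the previous significant byte, which is an assumption of this sketch rather than a complete rule: it misses cases such as `return /re/`, and it does not handle comments or `${...}` substitutions in template literals.

```rust
/// Heuristic (assumption): a `/` starts a regex literal if the previous
/// significant byte is an operator or opening punctuator; otherwise it is a division.
fn regex_can_start(prev: u8) -> bool {
    matches!(
        prev,
        b'(' | b',' | b'=' | b':' | b'[' | b'!' | b'&' | b'|' | b'?' | b'{'
            | b'}' | b';' | b'+' | b'-' | b'*' | b'%' | b'<' | b'>' | b'~' | b'^'
    )
}

#[derive(Clone, Copy)]
enum Mode {
    Code,
    Str(u8),                  // inside a string delimited by this quote byte
    Regex { in_class: bool }, // inside a regex literal, possibly inside `[...]`
}

/// Returns the byte offset just past the brace that closes the first `{` in `js`.
fn find_body_end(js: &str) -> Option<usize> {
    let bytes = js.as_bytes();
    let mut mode = Mode::Code;
    let mut depth = 0usize;
    let mut prev = b'('; // treat a leading `/` as the start of a regex
    let mut i = 0;

    while i < bytes.len() {
        let c = bytes[i];
        match mode {
            Mode::Str(q) => {
                if c == b'\\' {
                    i += 1; // skip the escaped byte
                } else if c == q {
                    mode = Mode::Code;
                    prev = q;
                }
            }
            Mode::Regex { in_class } => match c {
                b'\\' => i += 1, // skip the escaped byte
                b'[' => mode = Mode::Regex { in_class: true },
                b']' => mode = Mode::Regex { in_class: false },
                // A slash only ends the regex outside a character class.
                b'/' if !in_class => {
                    mode = Mode::Code;
                    prev = b'/';
                }
                _ => {}
            },
            Mode::Code => {
                match c {
                    b'"' | b'\'' | b'`' => mode = Mode::Str(c),
                    b'/' if regex_can_start(prev) => mode = Mode::Regex { in_class: false },
                    b'{' => depth += 1,
                    b'}' => {
                        depth = depth.checked_sub(1)?;
                        if depth == 0 {
                            return Some(i + 1);
                        }
                    }
                    _ => {}
                }
                if !c.is_ascii_whitespace() {
                    prev = c;
                }
            }
        }
        i += 1;
    }
    None
}
```

With this scanner, an input such as `function(a){var b=a.split(/,,[/,913,/](,)}/);return b.join("}")}` is matched up to its final brace, because the `}` inside the regex and the one inside the string are both skipped. The heuristic is the weak point: a real lexer tracks the previous token rather than the previous byte, which is why using one is the more robust option when it is available.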

Theta-Dev · 2022-08-19T17:57:03Z

I now have a working prototype. It is not pretty and definitely needs cleanup, so I have to do that first before I make a PR. I ended up having to copy Rhino's tokenizer class because it is private. The higher-level parser is accessible, but it only parses entire JS documents into syntax trees, which would take too much time.

I also found an issue with the Rhino JS interpreter. Version 1.7.14 uses javax.lang.model.SourceVersion, which is not available on Android. This causes the app to load indefinitely when opening a video. If you have any idea how to fix this without downgrading, please help me. I have no idea why this error did not occur before.
mozilla/rhino#1149

litetex · 2022-08-21T18:20:36Z

The problem described here will also be partially fixed with #882 (comment)

triallax · 2022-08-24T12:20:46Z

> A lexer isn't really needed. The function body can be extracted by carefully keeping track of the quotes and braces.

I think that's a good approach.

> But if your dependency already has a Lexer, ig why not use it

It does, but as mentioned by @Theta-Dev, it is unfortunately private, and I don't think we should copy the lexer to our codebase.

An alternative is to fork Rhino and make the lexer public.

litetex · 2022-08-24T17:05:32Z

> An alternative is to fork Rhino and make the lexer public.

Or maybe contribute the changes to Mozilla ;)

triallax · 2022-08-24T17:20:24Z

If they would accept it, sure. ;)

Stypox · 2023-04-05T16:43:30Z

> I am currently working on a YouTube downloader/client library in Rust

@Theta-Dev are you still rewriting NewPipeExtractor in Rust? Is it public yet? ;-)

(Sorry for writing this comment here, but since you're not on IRC I didn't know how to write to you otherwise.)

Theta-Dev · 2023-04-05T16:55:16Z

@Stypox yes, RustyPipe is basically finished. You can get it here:

https://code.thetadev.de/ThetaDev/rustypipe

btw: how can I join you on IRC?

Stypox · 2023-04-05T17:12:36Z

Check out Contributing.md

AudricV added the bug and youtube labels Aug 18, 2022

coletdjnz mentioned this issue Aug 18, 2022: [youtube] nsig extraction failed: You may experience throttling for some formats yt-dlp/yt-dlp#4635 (closed)

AudricV mentioned this issue Aug 20, 2022: [YouTube] Some formats and resolutions buffer very frequently TeamNewPipe/NewPipe#8839 (closed)

Theta-Dev mentioned this issue Aug 20, 2022: [YouTube] Add JavaScript lexer to parse completely throttling decryption function #905 (merged)

litetex mentioned this issue Aug 21, 2022: Fix all tests #882 (merged)

AudricV closed this as completed in #905 Sep 24, 2022
