Add Enterprise Hub Serverless API Notebook #105
Conversation
Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@merveenoyan @MoritzLaurer @stevhliu This PR is ready for review; however, the one thing that could hold up merging is that there seems to be an issue with suno/bark behind the Inference API. It might make sense to hold off on merging until that is fixed. One other quick question - does anyone know of a way to get the audio widget to be displayed / usable in section "4. Generating Speech from Text"? Thanks in advance for the review!! |
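For anyone hitting the same question: a minimal sketch of one way to render the audio inline, assuming the endpoint returns raw audio bytes (the token placeholder is illustrative):

```python
# Sketch: fetch speech from the Serverless Inference API and render it inline.
# Assumes the endpoint returns raw audio bytes; the token placeholder is illustrative.
import requests
from IPython.display import Audio, display

API_URL = "https://api-inference.huggingface.co/models/suno/bark"
headers = {"Authorization": "Bearer hf_xxx"}  # replace with a real User Access Token

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello from the Serverless API!"})
response.raise_for_status()

# IPython's Audio widget can embed the returned audio file bytes directly.
display(Audio(data=response.content))
```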
View / edit / reply to this conversation on ReviewNB merveenoyan commented on 2024-06-05T13:54:08Z VLMs*
|
View / edit / reply to this conversation on ReviewNB merveenoyan commented on 2024-06-05T13:54:09Z nit: To begin using the Serverless Inference API, you'll need a Hugging Face Hub profile: you can register if you don't have one, or log in if you do.
nit: we highly encourage use of fine-grained tokens (it's the default now), but not sure if it makes sense for a tutorial. maybe we could prompt the user to invoke after this. andrewrreed commented on 2024-06-05T21:30:05Z Good call - I added this language that's hopefully not too overwhelming for users:
Next, you'll need to create a [User Access Token](https://huggingface.co/docs/hub/security-tokens). A token with |
View / edit / reply to this conversation on ReviewNB merveenoyan commented on 2024-06-05T13:54:10Z nice, also TIL of interpreter_login() andrewrreed commented on 2024-06-05T21:30:49Z It's my savior with VSCode notebooks :) |
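For readers following the thread, a minimal sketch of the login call being praised here; it prompts for the token inline, which is why it works well in VS Code notebooks:

```python
# Sketch: authenticate to the Hugging Face Hub from inside a notebook.
# interpreter_login() prompts for a User Access Token in plain text in the cell
# output, so it works where notebook_login()'s widget doesn't render (e.g. VS Code).
from huggingface_hub import interpreter_login

interpreter_login()
```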
View / edit / reply to this conversation on ReviewNB merveenoyan commented on 2024-06-05T13:54:11Z nit: subtleties andrewrreed commented on 2024-06-05T21:44:19Z Just too subtle :) |
Very well put; I didn't find anything to add other than small nits!
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:38Z I'd suggest that we don't show author names on the enterprise recipes and that we maintain them as a team |
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:39Z We should probably recommend creating fine-grained tokens here as a security best practice |
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:40Z Before this section on querying the API, I think it would be useful to have a brief section (or just 1-2 cells) that explains how the serverless API works in the backend. It's relevant for users to understand that only a small selection of highly used models are actually loaded in memory on our infrastructure and available for use (I'd also probably slightly rephrase the sentence on 500,000 being available in the very beginning, because de facto it's a much smaller selection). Otherwise people get the expectation that any model works, and they will get frustrated when most actually don't work via the serverless API.
You can also add a cell that displays this command for checking which models are available. (Adding the caveat that this is only an approximation and unfortunately the command is not a live reflection of deployed models at the moment)
And it might be good to add a warning somewhere that this code was run on Date X with the currently available models, but available models will change in the future (and instruct people how to find more recent models when they run this in 6 months and e.g. CodeLlama might not be on the API anymore, e.g. via andrewrreed commented on 2024-06-05T22:14:27Z I'd also probably slightly rephrase the sentence on 500 000 being available in the very beginning Replaced 500k with "thousands"
that explain how the serverless API works in the backend I do have some language just after the section that explains it. The goal was to introduce the simplicity and then explain how it worked. But I agree that adding some info about |
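The exact command referenced above is elided in the thread; one option available in huggingface_hub at the time of writing is sketched below, with the same caveat that the list is an approximation, not a live view:

```python
# Sketch: approximate which models are currently deployed behind the Serverless
# Inference API. As noted above, this is not a live reflection of deployed models.
from huggingface_hub import InferenceClient

client = InferenceClient()
deployed = client.list_deployed_models()  # dict: task -> list of model ids

for task, models in sorted(deployed.items()):
    print(f"{task}: {len(models)} models, e.g. {models[:3]}")
```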
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:40Z I feel like it might be better to show example code for an HTTP request in Python here instead of curl. I'm often using Python HTTP requests with our APIs, because the InferenceClient doesn't have recent TGI features implemented (or implements them with a relevant time lag), doesn't support all arguments, etc. So if people encounter limitations with InferenceClient, they could build upon the requests example code from here. andrewrreed commented on 2024-06-05T22:09:55Z Yes, this is a good call. I swapped out |
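For reference, the kind of plain-requests call being suggested might look like the sketch below (model id, token, and parameters are illustrative):

```python
# Sketch: query the Serverless Inference API over plain HTTP with `requests`.
# This exposes every parameter the backend supports, without waiting for
# InferenceClient to implement newer TGI features.
import requests

API_URL = "https://api-inference.huggingface.co/models/codellama/CodeLlama-7b-hf"
headers = {"Authorization": "Bearer hf_xxx"}  # replace with a real User Access Token

payload = {
    "inputs": "def fibonacci(n):",
    "parameters": {"max_new_tokens": 50, "temperature": 0.2},
}
response = requests.post(API_URL, headers=headers, json=payload)
response.raise_for_status()
print(response.json()[0]["generated_text"])
```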
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:41Z maybe add |
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:42Z These subtle differences are important to understand because they affect the way in which we should query a particular model. Instruct models are trained with chat templates that are specific to the model, so you need to be careful about the format the model expects and replicate it in your queries. Maybe add a hyperlink to the chat templates docs or blog post here. Otherwise readers just read about the complexity/challenge, but don't have a source for learning more about it.
andrewrreed commented on 2024-06-05T21:46:18Z Good call! |
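To make the linked concept concrete, a minimal sketch of applying a chat template before querying an instruct model (the model id is illustrative):

```python
# Sketch: format a prompt with the model's own chat template so the query
# matches the exact format the instruct model was trained with.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")  # illustrative model
messages = [{"role": "user", "content": "Explain chat templates in one sentence."}]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # ready to send as `inputs` to the Serverless Inference API
```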
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:43Z transformers was not installed above |
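A one-cell fix for the missing dependency might be (the package list is assumed from the notebook's imports):

```python
# Sketch: install the notebook's dependencies up front, including transformers.
%pip install -q transformers huggingface_hub
```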
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:43Z maybe add that an advantage of this is increased inference speed |
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:44Z Maybe make explicit: "They can take both text and images as input simultaneously and produce text as output." |
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:45Z Re your question on the audio widget: The audio widget is displaying correctly for me when I look at the notebook in reviewnb. (I also have the same audio quality issues) andrewrreed commented on 2024-06-05T21:56:02Z Interesting, it renders fine here... I guess it's just a limitation of our docs builder. I'm going to look into it a bit further. |
View / edit / reply to this conversation on ReviewNB MoritzLaurer commented on 2024-06-05T14:49:46Z typos: "use the Serverless"; "what's possible" |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:24Z Same comment about using the |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:25Z "...from our bash shell." |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:26Z "The Serverless Inference API exposes..."
"For example, codellama/CodeLlama-7b-hf becomes https://api-inference.huggingface.co/models/codellama/CodeLlama-7b-hf" |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:27Z While showing how to call the API with an HTTP request is cool, I wonder if we actually need to show this in the recipe since all the other examples use the andrewrreed commented on 2024-06-05T22:09:09Z The motivation for including this was to show the simplicity before introducing the client, which abstracts things away. I think I'll leave it, but take Moritz's advice and convert to Python requests rather than curl |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:28Z Maybe add a link to the |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:28Z Also use |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:29Z I think this is more precise and accurate about this section.
"The Serverless Inference API can be used to generate images with Stable Diffusion." |
View / edit / reply to this conversation on ReviewNB stevhliu commented on 2024-06-05T19:50:30Z Maybe it'd make more sense to have the caching section in the section where you discuss querying with the |
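For context on the caching behavior being discussed, a sketch of opting out of the API's response cache, assuming the documented `x-use-cache` header is passed through InferenceClient:

```python
# Sketch: disable the Serverless Inference API response cache so repeated
# identical requests are recomputed rather than served from cache.
# Assumes the `x-use-cache` header opt-out; the model id is illustrative.
from huggingface_hub import InferenceClient

client = InferenceClient(headers={"x-use-cache": "false"})
out = client.text_generation(
    "Tell me a joke.",
    model="HuggingFaceH4/zephyr-7b-beta",
    max_new_tokens=50,
)
print(out)
```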
Cool examples of the Serverless Inference API! 👏
Great TIP :) |
While creating the dedicated endpoint notebook, I noticed that fine-grained token access to serverless API calls alone is not enough; the fine-grained token also needs access to the model repositories for downloading the tokenizer/processor. In your case that's llama-3 and idefics2, I think. Maybe add a note on this to avoid people being confused when they cannot download the tokenizer. I might actually recommend using a read-only token for quick experimentation; otherwise users need to add each new model to their fine-grained token permissions whenever they need to download a tokenizer/processor. The fine-grained token should then be used in production. WDYT? |
Great catch @MoritzLaurer - I just pushed a commit to add some language on this |
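For readers hitting the same confusion, a sketch of the failure mode and fix: the token passed when loading the tokenizer must have read access to the model repo, not just Inference API permissions (the model id and token placeholder are illustrative):

```python
# Sketch: downloading a tokenizer/processor requires a token with read access
# to the model repository; a fine-grained token scoped only to serverless API
# calls will fail here.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # illustrative gated repo
    token="hf_xxx",  # read token, or a fine-grained token with access to this repo
)
```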
I think this is all ready to go now - feel free to merge @MoritzLaurer @merveenoyan @stevhliu! |
Thanks for the addition :)
What does this PR do?
Adds the Serverless Inference API notebook to the Enterprise Hub section
Who can review?
@MoritzLaurer @merveenoyan