
add dedicated endpoints recipe #122

Conversation

@MoritzLaurer (Collaborator):

What does this PR do?

Adds the Enterprise Hub cookbook recipe for dedicated endpoints.

Who can review?

@merveenoyan and @stevhliu.


@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MoritzLaurer changed the title from "add recipe .ipynb with toctree and overview update" to "add dedicated endpoints recipe" on Jun 17, 2024.
@stevhliu (Member) commented on Jun 17, 2024:

"Have you ever wanted to create your own machine learning API?"

"...thousands of models on the HF Hub, create your own API on a deployment platform you control, and on hardware you choose."


@stevhliu (Member) commented on Jun 17, 2024:

"...credit card to the billing settings or your HF organization."


@stevhliu (Member) commented on Jun 17, 2024:

"...which can take several minutes for large models."


@stevhliu (Member) commented on Jun 17, 2024:

"...the TGI container running on the endpoint applies..."

"Here is the slightly modified code..."


@stevhliu (Member) commented on Jun 17, 2024:

I think it makes more sense to nest this under Creating your first endpoint since they're related topics.


@stevhliu (Member) commented on Jun 17, 2024:

This can probably be a second level section ## Additional information since it is pretty general and not specific to creating endpoints for a variety of models.


@stevhliu (Member) left a comment:

Great work! 👍

A minor suggestion is to standardize the capitalization of "endpoint". In some places, it is capitalized, and in others, it is not.

@merveenoyan (Collaborator) commented on Jun 18, 2024:

Readability rewrite:

You can use InferenceClient to easily send requests to your endpoint. It's a convenient utility in the huggingface_hub Python library that lets you make calls to both Dedicated Inference Endpoints and the Serverless Inference API.
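As an illustration of that suggested wording (not code from the notebook itself), a minimal sketch of InferenceClient against a dedicated endpoint might look like this; the endpoint URL and token are placeholders:

```python
from huggingface_hub import InferenceClient

# Placeholders -- replace with your endpoint URL and HF token.
client = InferenceClient(
    model="https://<your-endpoint>.endpoints.huggingface.cloud",
    token="hf_...",
)

# Send a text-generation request to the dedicated endpoint.
output = client.text_generation(
    "What is a dedicated inference endpoint?",
    max_new_tokens=100,
)
print(output)

# The same client can target the Serverless Inference API by passing
# a model id instead of an endpoint URL, for example:
# client = InferenceClient(model="HuggingFaceH4/zephyr-7b-beta", token="hf_...")
```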


@merveenoyan (Collaborator) commented on Jun 18, 2024:

Now let's create an endpoint for a vision language model (VLM). VLMs are very similar to LLMs, except that they take both text and image inputs. These models generate text autoregressively, just like a standard LLM, and can tackle many tasks, ranging from visual question answering to document understanding. For this example, we will use Idefics2, a powerful 8B-parameter VLM.

I feel like document reasoning is a very cool industry use case for multimodal, so I thought I'd add it; IDEFICS2 has good document-understanding capabilities. Other than that, I mostly rewrote for clarity.
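To make that concrete (a hedged sketch, not the notebook's code), calling a VLM endpoint such as one serving Idefics2 could look roughly like this; the endpoint URL, token, and image URL are placeholders, and the exact prompt template depends on the deployed model and TGI version:

```python
from huggingface_hub import InferenceClient

# Placeholders -- replace with your VLM endpoint URL and HF token.
client = InferenceClient(
    model="https://<your-vlm-endpoint>.endpoints.huggingface.cloud",
    token="hf_...",
)

# TGI's vision-language support lets you reference an image inside the
# text prompt with Markdown image syntax. The chat markers below follow
# the Idefics2 style; adjust them to the model you actually deploy.
image_url = "https://example.com/document-page.png"  # placeholder image URL
prompt = (
    f"User:![]({image_url})"
    "What information does this document contain?<end_of_utterance>\n"
    "Assistant:"
)

output = client.text_generation(prompt, max_new_tokens=250)
print(output)
```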


@merveenoyan (Collaborator) left a comment:

Thanks a lot for this recipe!

@MoritzLaurer merged commit 5bcdc04 into huggingface:main on Jul 3, 2024. 1 check passed.
Labels: none
Projects: none
Linked issues: none
4 participants