Add llama C++ example #926

natke · 2024-09-25T21:16:43Z

No description provided.

RyanUnderhill · 2024-09-25T21:48:27Z

examples/c/src/llama.cpp

+      // Show usage of GetOutput
+      std::unique_ptr<OgaTensor> output_logits = generator->GetOutput("logits");
+
+      // Assuming output_logits.Type() is float as it's logits


Just fyi, the raw model output can be float16 if the model runs on cuda. Our internal "processed logits" are always float32

So is this correct, or not?

RyanUnderhill · 2024-09-25T21:48:54Z

examples/c/src/llama.cpp

+      generator->GenerateNextToken();
+
+      // Show usage of GetOutput
+      std::unique_ptr<OgaTensor> output_logits = generator->GetOutput("logits");


Is the std::unique_ptr<OgaTensor> for clarity vs auto?

So auto would be simpler? I'll update it

RyanUnderhill · 2024-09-25T21:51:36Z

examples/c/src/llama.cpp

+  std::cout << "Run Llama" << std::endl;
+  std::cout << "-------------" << std::endl;
+
+#ifdef USE_CXX


What is the USE_CXX here for, given the whole file is C++?

baijumeswani · 2024-09-27T17:46:53Z

What changes from one decoder-only model to another? Shall we instead create a standardized example that can work with other decoder-only models?

kunal-vaishnavi · 2024-09-28T00:37:48Z

What changes from one decoder-only model to another? Shall we instead create a standardized example that can work with other decoder-only models?

I agree that one standardized example should be sufficient. From a user's perspective, the main change between decoder-only models is the chat template.

kunal-vaishnavi · 2024-09-28T00:51:50Z

examples/c/CMakeLists.txt

@@ -6,6 +6,7 @@ set(CMAKE_CXX_STANDARD 20)
 option(USE_CUDA "Build with CUDA support" OFF)
 option(USE_CXX "Invoke the C++ example" ON)
 option(PHI3 "Build the Phi example" OFF)
+option(LLAMA "Build the Llama example" OFF)


Could we have something like the following?

Suggested change

option(LLAMA "Build the Llama example" OFF)

option(LLM "Build the large-language model example" OFF)

option(VLM "Build the vision-language model example" OFF)

option(ALM "Build the audio-language model example" OFF)

The abbreviations and names can be different, but the idea would be to create examples grouped by input and output modality.

The LLM could be for decoder-only models (e.g. LLaMA, Phi)

Inputs: text

Outputs: text

The VLM could be for vision-text models (e.g. Phi-3/Phi-3.5 vision)

Inputs: images, text

Outputs: text

The ALM could be for audio-text models (e.g. Whisper)

Inputs: audios, text

Outputs: text

I think this is a great proposal for a new PR!

natke · 2024-10-01T20:50:36Z

What changes from one decoder-only model to another? Shall we instead create a standardized example that can work with other decoder-only models?

I agree that we should have a base decoder class, maybe? And these scripts can inherit from that class. But I'd like to get this sample in soonish so that folks can run Llama

natke added 4 commits September 17, 2024 11:17

Llama example

58a56ab

Changes to build file

03d43f8

Merge branch 'main' into add-llama-example

ab27dfc

Add Llama to README

e6f1500

RyanUnderhill reviewed Sep 25, 2024

View reviewed changes

kunal-vaishnavi reviewed Sep 28, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add llama C++ example #926

Add llama C++ example #926

natke commented Sep 25, 2024

RyanUnderhill Sep 25, 2024

natke Oct 1, 2024

RyanUnderhill Sep 25, 2024

natke Oct 1, 2024

RyanUnderhill Sep 25, 2024

baijumeswani commented Sep 27, 2024

kunal-vaishnavi commented Sep 28, 2024

kunal-vaishnavi Sep 28, 2024

natke Oct 1, 2024

natke commented Oct 1, 2024

-option(LLAMA "Build the Llama example" OFF)
+option(LLM "Build the large-language model example" OFF)
+option(VLM "Build the vision-language model example" OFF)
+option(ALM "Build the audio-language model example" OFF)

Add llama C++ example #926

Are you sure you want to change the base?

Add llama C++ example #926

Conversation

natke commented Sep 25, 2024

RyanUnderhill Sep 25, 2024

Choose a reason for hiding this comment

natke Oct 1, 2024

Choose a reason for hiding this comment

RyanUnderhill Sep 25, 2024

Choose a reason for hiding this comment

natke Oct 1, 2024

Choose a reason for hiding this comment

RyanUnderhill Sep 25, 2024

Choose a reason for hiding this comment

baijumeswani commented Sep 27, 2024

kunal-vaishnavi commented Sep 28, 2024

kunal-vaishnavi Sep 28, 2024

Choose a reason for hiding this comment

natke Oct 1, 2024

Choose a reason for hiding this comment

natke commented Oct 1, 2024