
Anything after <file_sep> to be stripped from completion #268

Closed

robertpiosik opened this issue Jul 6, 2024 · 8 comments

@robertpiosik

Describe the bug
When using starcoder2:3b, my completions contain FIM tokens.
My workaround is to split the output on <file_sep>, as sketched below.
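Roughly, the workaround is just truncating the completion at the first <file_sep> token (a minimal sketch; the function name is illustrative, not Twinny's actual code):

  // Keep only the text before the first <file_sep> token;
  // `completion` is the raw model output (illustrative name).
  function stripFileSep(completion: string): string {
    return completion.split("<file_sep>")[0];
  }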

To Reproduce
starcoder2:3b with ollama

Screenshots
[Two screenshots of completions containing <file_sep> and trailing FIM tokens.]

@rjmacarthy
Collaborator

rjmacarthy commented Jul 8, 2024

Hello, I just released a new version which adds <file_sep> to the stop words for starcoder models. Many thanks.

057c1a1
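The idea, roughly (a sketch only, not the actual diff in 057c1a1; names and the extra token are illustrative):

  // Sketch: include <file_sep> among the stop words used for
  // starcoder-family models so it is filtered from completions.
  const STARCODER_STOP_WORDS = ["<|endoftext|>", "<file_sep>"];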

@robertpiosik
Author

robertpiosik commented Jul 8, 2024

@rjmacarthy Your change removes the <file_sep> token from the output, but what's needed is to also strip all the tokens that follow <file_sep>. It looks broken on the starcoder end, but it is what it is 🤷

So now, with this change, I can't use my hacky method of splitting on <file_sep> because the token is no longer there :P

@robertpiosik
Author

Ok, I did a little research, and what's needed is the ability to set a stop sequence via the options key in the request body.

Here is what works with the llm-vscode extension:

  "llm.requestBody": {
    "stream": true,
    "options": {
      "stop": [
        "<file_sep>"
      ],
      "temperature": 0,
    }
  },
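For anyone curious, these options land in Ollama's generate endpoint, so the equivalent direct call looks roughly like this (a sketch, assuming Ollama's default localhost port; the FIM prompt content is a placeholder):

  // Sketch: POST to Ollama's /api/generate with a stop sequence,
  // assuming the default endpoint. Generation halts before
  // <file_sep>, so the token and everything after it never appear.
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: JSON.stringify({
      model: "starcoder2:3b",
      prompt: "<fim_prefix>...<fim_suffix>...<fim_middle>",
      stream: false,
      options: { stop: ["<file_sep>"], temperature: 0 },
    }),
  });
  const { response: completion } = await res.json();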

@rjmacarthy
Collaborator

rjmacarthy commented Jul 8, 2024

Hello @robertpiosik, the change to remove the stop word is the correct approach, as we don't want it in the final output. To keep working with starcoder2, you need to create the correct Modelfile for Ollama and specify your stop words in its configuration. All the best. https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values
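A minimal Modelfile along those lines could look like this, per the linked docs:

  # Minimal Modelfile: base model plus a stop parameter.
  FROM starcoder2:3b
  PARAMETER stop "<file_sep>"

Then create and use the derived model, e.g. `ollama create starcoder2-fim -f Modelfile` (the model name starcoder2-fim is illustrative), and point the extension at it.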

@robertpiosik
Author

Ok, thank you!

@rjmacarthy
Collaborator

Hey @robertpiosik, did that work for you? Many thanks.

@robertpiosik
Author

I decided to use llm-vscode. Setting up a custom request body with a stop sequence is more convenient for me, and I don't need Twinny's sidebar features, as my VRAM can only fit phi3-mini, which performs quite poorly in my use case. Anyway, fantastic work on the extension, cheers!

@rjmacarthy
Collaborator

Ok no worries. Thanks for the help, all the best!
