Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the isjsonvalid() and the wordstream() functions #52

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

atantos
Copy link

@atantos atantos commented Nov 18, 2023

Hi there!

You might want to consider adding a default wordstream() function. The function parses the stream and it seems to work as is. If you think it is ok, I could also contribute to the docu of the function.

Regards.

Alex

@roryl23
Copy link
Collaborator

roryl23 commented Nov 18, 2023

Hey Alex, thanks for the PR! I'm getting caught up after a hectic couple weeks. Will take a look by the end of today

@roryl23 roryl23 mentioned this pull request Nov 18, 2023
@svilupp
Copy link
Contributor

svilupp commented Nov 19, 2023

Out of curiousity, what is the use case here? What would be the advantages of implementing it?

It seems that it basically parses JSON twice anyway, once inside a try-catch (in isvalidjson) and then when actually parsing. What is the advantage over just having the try-catch around the parsing itself?

Side note: perhaps you could disable your formatter to only change the proposed functionality? There are a few places picked up by the diff where only spaces have changed.

@atantos
Copy link
Author

atantos commented Nov 19, 2023

Thank you for the insight, @svilupp! I'm thinking of a "default" streamcallback function for word-by-word streaming that would be very useful for a user for a number of reasons. Crafting such a function is not straightforward, as the data stream isn't just a series of JSON-parsable objects prefixed with "Data:". The stream may split, with a packet starting mid-way through an object from the previous packet. You're correct about the double JSON parsing; there might indeed be a more efficient method to verify if the input is JSON before continuing with the wordstream() function. Also, I appreciate your feedback on the formatter—I'll definitely keep that in mind!

@svilupp
Copy link
Contributor

svilupp commented Nov 27, 2023

Thank you for the insight, @svilupp! I'm thinking of a "default" streamcallback function for word-by-word streaming that would be very useful for a user for a number of reasons. Crafting such a function is not straightforward, as the data stream isn't just a series of JSON-parsable objects prefixed with "Data:". The stream may split, with a packet starting mid-way through an object from the previous packet. You're correct about the double JSON parsing; there might indeed be a more efficient method to verify if the input is JSON before continuing with the wordstream() function. Also, I appreciate your feedback on the formatter—I'll definitely keep that in mind!

Apologies for the slow response!

Would you mind outlining some of the use cases for building JSONs on the fly as they are getting streamed? I lack imagination and I tend to associate streaming with nicer UX design for chat interfaces. Are you thinking of some other use case?

I'm not very familiar with the functionality, so I'd appreciate if you could capture a few key test cases and codify them in the test set. It will help us prevent any future regressions and it will help with onboarding new devs, because they'll see a practical example.
I can imagine that parsing text vs JSON mode response vs function_call will require a slightly different approach, so I'm keen to see that covered in the test suite.

Copy link
Collaborator

@cpfiffer cpfiffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the PR and think we should merge it in. I left two comments I hope we can incorporate to make it a little more obvious how this stuff works.

Comment on lines +240 to +246
isvalidjson(str) =
try
JSON3.read(str)
true
catch
false
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor note but I typically prefer function ... end syntax for multiline functions.

Suggested change
isvalidjson(str) =
try
JSON3.read(str)
true
catch
false
end
function isvalidjson(str)
try
JSON3.read(str)
true
catch
false
end
end

Comment on lines +249 to +251
"""
Default streamcallback function for create_<action> functions.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a little usage section here to show how people should use this?

@svilupp
Copy link
Contributor

svilupp commented Feb 1, 2024

Perhaps we should learn from what others are doing?

Langchain:

astream_events

It’s worth calling out that we’re using our new astream_eventsmethod to easily stream back all events (new tokens, as well as function calls and function results) and surface them to the user. We do some filtering of this stream to get relevant messages or message chunks, and then render them nicely in the UI. If you aren’t familiar with astream_events, it is definitely worth checking it out in more detail here.

Code: https://github.com/langchain-ai/langchain/blob/a0ec04549575de5547faa5381fa69fef61405ac8/libs/core/langchain_core/runnables/base.py#L697

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants