Updated Cookbook: Example for Fetching Scores from Langfuse #857
Sohammhatre10 wants to merge 12 commits into langfuse:main from
👍 Looks good to me! Reviewed everything up to 02ebc24 in 42 seconds
More details
- Looked at 1680 lines of code in 26 files
- Skipped 3 files when reviewing.
- Skipped posting 1 drafted comment based on config settings.

1. pages/docs/integrations/dspy.md:242
- Draft comment: Remove trailing whitespace for cleaner code. This issue is present in multiple files, such as example-javascript.md, example-python-langgraph.md, example-python-instrumentation-module.md, example-python.md, example-vercel-ai.md, external-evaluation-pipelines.md, integration_dspy.md, integration_instructor.md, integration_langgraph.md, integration_llama-index_instrumentation.md, integration_llama_index_posthog_mistral.md, integration_mirascope.md, integration_mistral_sdk.md, integration_ollama.md, integration_openai_structured_output.md, js_integration_langchain.md, js_tracing_example_vercel_ai_sdk.md, prompt_management_langchain.md.
- Reason this comment was not posted: Confidence changes required: 50%

The PR introduces a new example for fetching scores from Langfuse, but there are several instances of trailing whitespace in the markdown files. These should be removed for cleaner code.
Disclaimer: Experimental PR review
PR Summary
This pull request adds a comprehensive example of using the fetch_scores() function from Langfuse to retrieve and analyze evaluation metrics, integrating UpTrain and Ragas for model evaluation.
- Added pages/docs/scores/example_usage_of_fetch_score.md with detailed code snippets for setting up, evaluating models, logging scores, and visualizing correlations
- Updated pages/guides/cookbook/example_external_evaluation_pipelines.md with a guide on creating external evaluation pipelines using Langfuse, including synthetic data creation and custom evaluations
- Made minor formatting and content improvements across multiple integration cookbooks (DSPy, Instructor, LangGraph, etc.) to enhance readability and consistency
- Updated various Langchain examples to demonstrate better integration with Langfuse for tracing and prompt management
26 file(s) reviewed, 7 comment(s)
marcklingen left a comment
Thanks for the contribution. It seems like you mostly want to showcase correlation analysis of different scores in Langfuse (which is a good notebook example). Are you sure that your example correlates the scores on a single-trace basis for the analysis at the bottom of this notebook?
@marcklingen Yupp, this was based on a single trace, and the scores were fetched accordingly. I haven't filtered for specific traces; this was the first trace I created, so the example defaulted to it. Should I make the example more specific to a single trace? Apologies for the late reply.
@jannikmaierhoefer can you review this?
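The per-trace question raised above can be made concrete with a small sketch. It groups score records by trace and keeps only traces that carry every metric being compared, so that later correlation works on values that belong to the same trace. The record shape (dicts with `trace_id`, `name`, `value`), the helper names, and the sample values are all assumptions made for illustration; they are not taken from the notebook in this PR.

```python
from collections import defaultdict


def pivot_scores_by_trace(scores, metric_names):
    """Group score records by trace and keep only complete traces.

    `scores` is an iterable of dicts with the (assumed) keys
    "trace_id", "name" and "value". Returns {trace_id: {metric: value}}
    restricted to traces that have a value for every requested metric.
    """
    wanted = set(metric_names)
    by_trace = defaultdict(dict)
    for score in scores:
        if score["name"] in wanted:
            by_trace[score["trace_id"]][score["name"]] = score["value"]
    return {
        trace_id: metrics
        for trace_id, metrics in by_trace.items()
        if wanted.issubset(metrics)
    }


def paired_columns(pivoted, metric_names):
    """Turn the pivoted mapping into aligned lists, one per metric.

    Position i in every list refers to the same trace.
    """
    trace_ids = sorted(pivoted)
    return {
        name: [pivoted[trace_id][name] for trace_id in trace_ids]
        for name in metric_names
    }


# Hypothetical records; the metric names and values are made up.
sample_scores = [
    {"trace_id": "t1", "name": "metric_a", "value": 0.9},
    {"trace_id": "t1", "name": "metric_b", "value": 0.8},
    {"trace_id": "t2", "name": "metric_a", "value": 0.4},
    {"trace_id": "t2", "name": "metric_b", "value": 0.5},
    {"trace_id": "t3", "name": "metric_a", "value": 0.7},  # no metric_b
]

pivoted = pivot_scores_by_trace(sample_scores, ["metric_a", "metric_b"])
columns = paired_columns(pivoted, ["metric_a", "metric_b"])
```

Traces missing one of the metrics (here `t3`) are dropped rather than padded, which avoids correlating values that come from different traces.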
Description
This update provides an example of using the fetch_scores() function from Langfuse to retrieve evaluation metrics. The example integrates UpTrain and Ragas for model evaluation and demonstrates how to log and fetch scores within Langfuse, as mentioned in langfuse/langfuse#3505.
Key Features
Evaluation with UpTrain and Ragas:
Fetching Scores:
fetch_scores_from_langfuse.
Correlation Matrix Visualization:
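As a rough illustration of the correlation matrix step named above, the following sketch computes pairwise Pearson correlations between metric columns using only the standard library. It assumes the scores have already been aligned so that position i in every column refers to the same trace. The function names, metric names, and numbers are hypothetical and do not reproduce the notebook's code or results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {metric_a: {metric_b: r}} for every pair of columns."""
    names = list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in names}
        for a in names
    }


# Hypothetical, already trace-aligned metric columns (made-up values).
example_columns = {
    "metric_x": [1.0, 2.0, 3.0, 4.0],
    "metric_y": [2.0, 4.0, 6.0, 8.0],
    "metric_z": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(example_columns)
```

The nested-dict result can be handed to a plotting library as a heatmap; a constant column raises an error here instead of silently producing a meaningless coefficient.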
Important
Adds an example for using Langfuse to fetch scores, evaluate models with UpTrain and Ragas, and visualize results using a correlation matrix.
fetch_scores() from Langfuse to retrieve evaluation metrics.
dspy.md, instructor.md, example-javascript.md, example-python-langgraph.md, example-python-instrumentation-module.md, example-python.md, example-vercel-ai.md, example_external_evaluation_pipelines.md, integration_dspy.md, integration_instructor.md, integration_langgraph.md, integration_llama-index_instrumentation.md, integration_llama_index_posthog_mistral.md, integration_mirascope.md, integration_mistral_sdk.md, integration_ollama.md, integration_openai_structured_output.md, example-langchain.md, js_integration_langchain.md, js_tracing_example_vercel_ai_sdk.md, prompt_management_langchain.md.
This description was created for 02ebc24. It will automatically update as commits are pushed.