ddltn/giuseppe-prompting-pizzeria

Welcome to Sloppy Giuseppe's - part pizza parlor, part prompting palace!

This project tests LLM responses for pizza order processing using promptfoo. The LLMs are prompted to convert voice transcripts of pizza orders into structured JSON data.

Setup

  1. Install promptfoo:
npm install -g promptfoo
  2. Make sure you have the necessary API keys set up for the LLM providers you're using.
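
For example, if you are using the OpenAI or Anthropic providers, promptfoo reads those vendors' standard environment variables (the key values below are placeholders):

```bash
# Placeholders only: set the variables for whichever providers your config uses.
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
```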

Configuration

The project uses the following configuration:

  • promptfooconfig.yaml: Main configuration file for promptfoo
  • prompts.py: Contains the prompt templates for different LLM approaches
  • pizza_orders_tests.json: Test cases with transcripts and expected JSON outputs
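
The repo's actual promptfooconfig.yaml isn't reproduced here, but a minimal promptfoo config wiring these three files together looks roughly like the sketch below; the provider id and the prompt function name are illustrative guesses, not necessarily what this project uses.

```yaml
# Sketch only: the real promptfooconfig.yaml may differ.
prompts:
  - file://prompts.py:pizza_order_prompt   # hypothetical function name in prompts.py
providers:
  - openai:gpt-4o-mini                     # example provider; use the models you have keys for
tests: file://pizza_orders_tests.json      # point at single-test.json to run one case
outputPath: results.json                   # where eval results are written
```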

Running Tests

To run the tests:

promptfoo eval

To run a single case instead of the full suite, point the tests entry in promptfooconfig.yaml at single-test.json.

This will:

  1. Process each transcript in the test file
  2. Send it to each LLM provider with each prompt template
  3. Compare the JSON output with the expected output
  4. Save results to results.json
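
The file names above are this repo's; the flags below are promptfoo's standard CLI options, useful if you want to be explicit about which config is used and where results are written:

```bash
# Explicitly select the config and the results file.
promptfoo eval -c promptfooconfig.yaml -o results.json
```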

Test Structure

The test file (pizza_orders_tests.json) contains an array of test cases, each with:

  • description: A clear description of what the test case is checking
  • vars: Contains the input variables (transcript)
  • assert: Contains assertions to validate the output

Each test case includes an equality assertion that compares the LLM output with the expected JSON structure.
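
As a rough illustration, a test case follows promptfoo's usual shape; the transcript and the field names inside the expected JSON are invented for this example, not copied from the repo.

```json
{
  "description": "Medium pepperoni order with no extras",
  "vars": {
    "transcript": "Hi, can I get a medium pepperoni pizza please?"
  },
  "assert": [
    {
      "type": "equals",
      "value": {
        "pizza_type": "pepperoni",
        "size": "medium",
        "extra_toppings": [],
        "status": "processed"
      }
    }
  ]
}
```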

Assertions

The configuration uses the equals assertion type to validate that the LLM output exactly matches the expected JSON structure. This includes checking:

  • Pizza type (cheese, pepperoni, vegetarian)
  • Size (small, medium, extra large)
  • Extra toppings
  • Processing status

Results

Results are saved to results.json and include:

  • The original prompt
  • The LLM output
  • Test variables
  • Assertion results

Run promptfoo view to visualize the results in the Promptfoo interface.

Adding New Tests

To add new tests:

  1. Add a new object to the array in pizza_orders_tests.json
  2. Include a clear description of what the test is checking
  3. Add the transcript in the vars object
  4. Add an equality assertion with the expected output
  5. Run the tests again to validate

Exercises

Try these exercises to deepen your understanding of prompt engineering and LLM behavior:

  1. Prompt Optimization

    • Can you improve the prompt to achieve a higher success rate with a cheaper model?

    • Try making the prompt shorter while maintaining or improving the success rate

    • Experiment with different prompt structures (few-shot examples, step-by-step reasoning, etc.); a few-shot sketch appears after this list

    • Identify patterns in where models succeed or fail

  2. Prompt Engineering Techniques

    • Try implementing chain-of-thought prompting
    • Experiment with different formats for the expected JSON output
    • Test the impact of including/excluding certain context in the prompt
  3. Performance Optimization

    • Measure and compare response times across different prompt versions
    • Try to optimize for both accuracy and speed
    • Experiment with temperature settings and their impact on consistency
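
As a starting point for the prompt-structure experiments in exercise 1, here is a hypothetical few-shot template in the style of prompts.py. It assumes promptfoo's Python prompt-function convention (the function receives a context dict whose "vars" entry holds the test variables and returns the prompt string); the function name, example orders, and JSON field names are all made up for illustration.

```python
# Hypothetical few-shot prompt for prompts.py. Assumes promptfoo passes a
# context dict with the test vars; field names in the examples are guesses.

FEW_SHOT_EXAMPLES = """\
Transcript: "Hey, one small cheese pizza please."
JSON: {"pizza_type": "cheese", "size": "small", "extra_toppings": [], "status": "processed"}

Transcript: "Medium pepperoni with extra mushrooms, thanks."
JSON: {"pizza_type": "pepperoni", "size": "medium", "extra_toppings": ["mushrooms"], "status": "processed"}
"""


def few_shot_pizza_prompt(context: dict) -> str:
    """Build a few-shot prompt from the test case's transcript variable."""
    transcript = context["vars"]["transcript"]
    return (
        "Convert the pizza order transcript into JSON with the keys "
        "pizza_type, size, extra_toppings, and status. Respond with JSON only.\n\n"
        f"{FEW_SHOT_EXAMPLES}\n"
        f'Transcript: "{transcript}"\n'
        "JSON:"
    )
```

To try it, reference the function from the config (e.g. file://prompts.py:few_shot_pizza_prompt in the prompts list) and compare its pass rate against the existing templates.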

Share your findings and improvements with the community!
