unifyai/unify

Fully hackable LLMOps. Build custom interfaces for: logging, evals, guardrails, labelling, tracing, agents, human-in-the-loop, hyperparam sweeps, and anything else you can think of ✨

Just unify.log your data, and add an interface using the four building blocks:

  1. tables 🔒
  2. views 🔍
  3. plots 📊
  4. editor 🕹️ (coming soon)
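
For instance, a single log is just a set of named fields, and your tables, views and plots are then built on top of whichever fields you choose to log. Here is a minimal sketch, using the same unify.activate and unify.log calls as the Quickstart below (the field values are made up for illustration):

import unify

# activate (or create) the project that the logs and interfaces live under
unify.activate("Maths Assistant")

# each log is just a set of named fields; tables, views and plots
# in your interface are built on top of these fields
unify.log(
    question="What is 3 + 4?",
    response="7",
    score=1.0,
)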

Every LLM product has unique and changing requirements, and so do its users. Your infra should reflect this!

We've tried to make Unify as (a) simple, (b) modular and (c) hackable as possible, so you can quickly probe, analyze, and iterate on the data that's important for you, your product and your users ⚡

Quickstart

Sign up, pip install unifyai, run your first eval ⬇️, and then check out the logs in your first interface 📊

import unify
from random import randint, choice

# initialize project
unify.activate("Maths Assistant")

# build agent
client = unify.Unify("o3-mini@openai", traced=True)
client.set_system_message(
    "You are a helpful maths assistant, "
    "tasked with adding and subtracting integers."
)

# add test cases
qs = [
    f"{randint(0, 100)} {choice(['+', '-'])} {randint(0, 100)}"
    for _ in range(10)
]

# define evaluator
@unify.traced
def evaluate_response(question: str, response: str) -> float:
    correct_answer = eval(question)
    try:
        response_int = int(
            "".join(
                [
                    c for c in response.split(" ")[-1]
                    if c.isdigit()
                ]
            ),
        )
        return float(correct_answer == response_int)
    except ValueError:
        return 0.0

# define evaluation
@unify.traced
def evaluate(q: str):
    response = client.generate(q)
    score = evaluate_response(q, response)
    unify.log(
        question=q,
        response=response,
        score=score
    )

# execute + log your evaluation
with unify.Experiment():
    unify.map(evaluate, qs)
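
As a quick sanity check of the parser, evaluate_response simply extracts the trailing integer from the response and compares it against the evaluated question. The sample responses below are made up for illustration (and note these calls will themselves show up as traces, since the function is decorated with unify.traced):

assert evaluate_response("3 + 4", "The answer is 7.") == 1.0
assert evaluate_response("10 - 4", "I believe it is 5") == 0.0
assert evaluate_response("10 - 4", "I am not sure.") == 0.0  # no digits -> ValueError -> 0.0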

Focus on your product, not the LLM 🎯

Despite all of the hype, abstractions, and jargon, the process for building quality LLM apps is pretty simple.

create simplest possible agent 🤖
while True:
    create/expand unit tests (evals) 🗂️
    while run(tests) failing: 🧪
        Analyze failures, understand the root cause 🔍
        Vary system prompt, in-context examples, tools etc. to rectify 🔀
    Beta test with users, find more failures 🚦
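
Concretely, one pass through the inner loop might just mean tweaking the system prompt and re-running the same evaluation from the Quickstart. A minimal sketch, reusing the code above (we're assuming here that each with unify.Experiment(): block is grouped as its own run when you compare results in your interface):

# vary the system prompt based on the failures you found
client.set_system_message(
    "You are a helpful maths assistant. "
    "Work through the sum step by step, then give the final integer on its own."
)

# re-run the same tests and log a fresh set of results
with unify.Experiment():
    unify.map(evaluate, qs)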

We've tried to strip away all of the excessive LLM jargon, so you can focus on your product, your users, and the data you care about, and nothing else 📈

Whether you're technical or non-technical, we hope Unify can help you to rapidly build top-notch LLM apps, and to remain fully focused on your product (not the LLM).

Learn More

Check out our docs, and if you have any questions feel free to reach out to us on Discord 👾

Unify is under active development 🚧, and feedback in all shapes and sizes is very welcome! 🙏

Happy prompting! 🧑‍💻