Integrate Observability #199

Open
wants to merge 35 commits into
base: main
Changes from 1 commit
c3eb507
Add SQLite and Langfuse trackers; Refactor test system
bonk1t Dec 9, 2024
c423a3a
Import fix; Revert Pydantic version
bonk1t Dec 9, 2024
5ecbe46
Import fix
bonk1t Dec 9, 2024
dbaad0f
Pass the model name
bonk1t Dec 9, 2024
c8e91a0
Cleanup, minor refactor
bonk1t Dec 9, 2024
bb27f8c
Implement and test Langfuse and SQLite tracking
bonk1t Dec 10, 2024
d5ca39f
Add more tracing
bonk1t Dec 10, 2024
c72897c
Minor improvement
bonk1t Dec 14, 2024
ec9e381
Reduce unnecessary complexity, Improve local tracking
bonk1t Dec 17, 2024
9d61c85
Implement Langchain Callbacks
bonk1t Dec 19, 2024
2493830
Minor fix in SendMessageSwarm
bonk1t Dec 19, 2024
dbb5ba8
Local Callback Handler fix
bonk1t Dec 19, 2024
e3f5e31
Minor fixes
bonk1t Dec 19, 2024
baced53
Merge pull request #2 from bonk1t/dev/integrate-langchain-callbacks
bonk1t Dec 19, 2024
ae44002
Merge branch 'main' into dev/integrate-observability
bonk1t Dec 19, 2024
8612c3b
Add docs, minor fixes and improve Developer Experience
bonk1t Dec 19, 2024
64f2b54
Bug fixes and improvements
bonk1t Dec 20, 2024
630fa65
Resolve Langfuse issues
bonk1t Dec 20, 2024
ae2c5ac
Minor fix
bonk1t Dec 22, 2024
f53429f
Local Callback Handler improvements
bonk1t Dec 23, 2024
a0e926a
Integrate AgentOps; Minor fixes in thread.py
bonk1t Dec 24, 2024
c9bc665
Minor fixes
bonk1t Dec 24, 2024
a7f02ee
Refactor thread.py for maintainability and extensibility:
bonk1t Dec 26, 2024
fc8ca07
Fix model name
bonk1t Dec 26, 2024
8f99575
Refactor, bug fix
bonk1t Jan 1, 2025
10da41d
Implement Tracking for Streaming
bonk1t Jan 2, 2025
5a438c7
Minor improvements
bonk1t Jan 2, 2025
0ad9be0
Local callback handler: Track tokens when using tools
bonk1t Jan 2, 2025
c98fbe8
Improve tracking messages
bonk1t Jan 2, 2025
f9e9d37
Improve tracking messages
bonk1t Jan 2, 2025
ba1c4e2
Fix yield_messages=True, async_mode=threading, type hints
bonk1t Jan 3, 2025
a125442
Call on_llm_end to track cost on AgentOps
bonk1t Jan 6, 2025
377c902
Apply thread.py bug fix
bonk1t Jan 8, 2025
b85764b
Update observability.md
bonk1t Jan 8, 2025
3f05aaa
Merge branch 'main' into dev/integrate-observability
bonk1t Jan 19, 2025
Add docs, minor fixes and improve Developer Experience
bonk1t committed Dec 19, 2024
commit 8612c3b0e6fef22847b6db42d9668ebeb4829966
10 changes: 5 additions & 5 deletions agency_swarm/__init__.py
@@ -14,14 +14,14 @@

 __all__ = [
     "Agency",
+    "AgencyEventHandler",
     "Agent",
     "BaseTool",
-    "AgencyEventHandler",
+    "get_callback_handler",
     "get_openai_client",
-    "set_openai_client",
-    "set_openai_key",
-    "llm_validator",
     "init_tracking",
-    "get_callback_handler",
+    "llm_validator",
     "set_callback_handler",
+    "set_openai_client",
+    "set_openai_key",
 ]
13 changes: 6 additions & 7 deletions agency_swarm/agency/agency.py
@@ -11,7 +11,6 @@
     Dict,
     List,
     Literal,
-    Optional,
     Type,
     TypedDict,
     TypeVar,
@@ -67,7 +66,7 @@ def __init__(
         max_prompt_tokens: int = None,
         max_completion_tokens: int = None,
         truncation_strategy: dict = None,
-        callback_handler: Optional[Any] = None,
+        callback_handler: Any = None,
     ):
         """
         Initializes the Agency object, setting up agents, threads, and core functionalities.
@@ -214,7 +213,7 @@ def get_completion(
     def get_completion_stream(
         self,
         message: str,
-        event_handler: Optional[Type[AgencyEventHandler]] = None,
+        event_handler: Type[AgencyEventHandler] | None = None,
         message_files: List[str] = None,
         recipient_agent: Agent = None,
         additional_instructions: str = None,
@@ -348,7 +347,7 @@ def demo_gradio(self, height=450, dark_mode=True, **kwargs):
         recipient_agent = self.main_recipients[0]

         chatbot_queue = queue.Queue()
-        gradio_handler = create_gradio_handler(chatbot_queue=chatbot_queue)
+        gradio_handler_class = create_gradio_handler(chatbot_queue=chatbot_queue)

         with gr.Blocks(js=js) as demo:
             chatbot = gr.Chatbot(height=height)
@@ -531,7 +530,7 @@ def bot(original_message, history, dropdown):
                     target=self.get_completion_stream,
                     args=(
                         original_message,
-                        gradio_handler,
+                        gradio_handler_class,
                         [],
                         recipient_agent,
                         "",
@@ -654,7 +653,7 @@ def run_demo(self):
         """
         Executes agency in the terminal with autocomplete for recipient agent names.
         """
-        term_handler = create_term_handler(agency=self)
+        term_handler_class = create_term_handler(agency=self)

         self.recipient_agents = [str(agent.name) for agent in self.main_recipients]

@@ -687,7 +686,7 @@ def run_demo(self):

             self.get_completion_stream(
                 message=text,
-                event_handler=term_handler,
+                event_handler=term_handler_class,
                 recipient_agent=recipient_agent,
             )

Original file line number Diff line number Diff line change
@@ -34,7 +34,7 @@ def run(self):
             f.close()

         res = client.chat.completions.create(
-            model="gpt-4o-mini",
+            model="gpt-3.5-turbo",
             messages=examples
             + [
                 {"role": "user", "content": agency_py},
Original file line number Diff line number Diff line change
@@ -22,7 +22,7 @@ def run(self):
         content = " ".join(content.split()[:10000])

         completion = client.chat.completions.create(
-            model="gpt-4o-mini",
+            model="gpt-3.5-turbo",
             messages=[
                 {
                     "role": "system",
62 changes: 33 additions & 29 deletions agency_swarm/threads/thread.py
@@ -5,10 +5,9 @@
 import re
 import time
 from concurrent.futures import ThreadPoolExecutor, as_completed
-from typing import List, Optional, Type, Union
+from typing import List, Type, Union
 from uuid import UUID

-from langchain.schema import AgentAction, AgentFinish, HumanMessage
 from openai import APIError, BadRequestError
 from openai.types.beta import AssistantToolChoice
 from openai.types.beta.threads.message import Attachment
@@ -21,6 +20,11 @@
 from agency_swarm.util.oai import get_openai_client
 from agency_swarm.util.streaming.agency_event_handler import AgencyEventHandler
 from agency_swarm.util.tracking import get_callback_handler
+from agency_swarm.util.tracking.langchain_types import (
+    AgentAction,
+    AgentFinish,
+    HumanMessage,
+)


 class Thread:
@@ -86,13 +90,13 @@ def init_thread(self):
     def get_completion_stream(
         self,
         message: Union[str, List[dict], None],
-        event_handler: Optional[Type[AgencyEventHandler]] = None,
+        event_handler: Type[AgencyEventHandler] | None = None,
         message_files: List[str] = None,
-        attachments: Optional[List[Attachment]] = None,
+        attachments: List[Attachment] | None = None,
         recipient_agent: Agent = None,
         additional_instructions: str = None,
         tool_choice: AssistantToolChoice = None,
-        response_format: Optional[dict] = None,
+        response_format: dict | None = None,
     ):
         return self.get_completion(
             message,
@@ -110,14 +114,14 @@ def get_completion(
         self,
         message: Union[str, List[dict], None],
         message_files: List[str] = None,
-        attachments: Optional[List[dict]] = None,
+        attachments: List[dict] | None = None,
         recipient_agent: Union[Agent, None] = None,
         additional_instructions: str = None,
-        event_handler: Optional[Type[AgencyEventHandler]] = None,
+        event_handler: Type[AgencyEventHandler] | None = None,
         tool_choice: AssistantToolChoice = None,
         yield_messages: bool = False,
-        response_format: Optional[dict] = None,
-        parent_run_id: Optional[UUID] = None,
+        response_format: dict | None = None,
+        parent_run_id: UUID | None = None,
     ):
         self.init_thread()

@@ -220,25 +224,7 @@ def get_completion(
                     )
                     tool_outputs_and_names = []  # list of tuples (name, tool_output)

-                    # on_agent_action before each tool call
-                    for tc in tool_calls:
-                        # Construct an AgentAction
-                        args = (
-                            json.loads(tc.function.arguments)
-                            if tc.function.arguments
-                            else {}
-                        )
-                        action = AgentAction(
-                            tool=tc.function.name,
-                            tool_input=args,
-                            log=f"Using tool {tc.function.name} with arguments: {args}",
-                        )
-                        if self.callback_handler:
-                            self.callback_handler.on_agent_action(
-                                action=action,
-                                run_id=self._run.id,
-                                parent_run_id=chain_run_id,
-                            )
+                    self._track_tool_calls(tool_calls, chain_run_id)

                     sync_tool_calls, async_tool_calls = self._get_sync_async_tool_calls(
                         tool_calls, recipient_agent
@@ -581,7 +567,7 @@ def _create_run(
         event_handler,
         tool_choice,
         temperature=None,
-        response_format: Optional[dict] = None,
+        response_format: dict | None = None,
     ):
         try:
             if event_handler:
@@ -920,3 +906,21 @@ def get_messages(self, limit=None):
                 break

         return all_messages
+
+    def _track_tool_calls(self, tool_calls: List[ToolCall], chain_run_id: str) -> None:
+        """Send agent_action before each tool call"""
+        if not self.callback_handler:
+            return
+
+        for tc in tool_calls:
+            args = json.loads(tc.function.arguments) if tc.function.arguments else {}
+            action = AgentAction(
+                tool=tc.function.name,
+                tool_input=args,
+                log=f"Using tool {tc.function.name} with arguments: {args}",
+            )
+            self.callback_handler.on_agent_action(
+                action=action,
+                run_id=self._run.id,
+                parent_run_id=chain_run_id,
+            )
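
Review note on `_track_tool_calls`: the extracted helper returns early when no handler is configured, then reports one action per tool call. The sketch below reproduces that pattern in isolation so it can be run without OpenAI or LangChain. `FakeFunction`, `FakeToolCall`, `RecordingHandler`, and `track_tool_calls` are hypothetical stand-ins, and a plain dict replaces `AgentAction`.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class FakeFunction:
    name: str
    arguments: str  # JSON-encoded string; may be empty


@dataclass
class FakeToolCall:
    function: FakeFunction


@dataclass
class RecordingHandler:
    """Hypothetical handler that records each action it receives."""

    actions: List[dict] = field(default_factory=list)

    def on_agent_action(self, action: dict, **kwargs: Any) -> None:
        self.actions.append(action)


def track_tool_calls(
    handler: Optional[RecordingHandler], tool_calls: List[FakeToolCall]
) -> None:
    # Guard clause: without a handler there is nothing to report.
    if not handler:
        return
    for tc in tool_calls:
        # An empty argument string becomes an empty dict instead of a JSON error.
        args = json.loads(tc.function.arguments) if tc.function.arguments else {}
        handler.on_agent_action(action={"tool": tc.function.name, "tool_input": args})


handler = RecordingHandler()
calls = [
    FakeToolCall(FakeFunction("search", '{"query": "docs"}')),
    FakeToolCall(FakeFunction("ping", "")),
]
track_tool_calls(handler, calls)
track_tool_calls(None, calls)  # no handler configured: silently skipped
```

Compared with the removed inline loop, the guard now runs once before any arguments are parsed, so no JSON decoding happens when tracking is disabled.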
31 changes: 19 additions & 12 deletions agency_swarm/util/tracking/__init__.py
@@ -1,14 +1,13 @@
 import threading
 from typing import Callable, Literal

-from langfuse.callback import CallbackHandler as LangfuseCallbackHandler
-
-from .local_callback_handler import LocalCallbackHandler
-
 _callback_handler = None
 _lock = threading.Lock()


+SUPPORTED_TRACKERS = Literal["langfuse", "local"]
+
+
 def get_callback_handler():
     global _callback_handler
     with _lock:
@@ -21,17 +20,25 @@ def set_callback_handler(handler: Callable):
         _callback_handler = handler()


-def init_tracking(name: Literal["local", "langfuse"]):
-    if name == "local":
-        set_callback_handler(LocalCallbackHandler)
-    elif name == "langfuse":
-        set_callback_handler(LangfuseCallbackHandler)
-    else:
-        raise ValueError(f"Invalid tracker name: {name}")
+def init_tracking(tracker_name: SUPPORTED_TRACKERS, **kwargs):
+    if tracker_name not in SUPPORTED_TRACKERS:
+        raise ValueError(f"Invalid tracker name: {tracker_name}")
+
+    from .langchain_types import use_langchain_types
+
+    use_langchain_types()
+
+    if tracker_name == "local":
+        from .local_callback_handler import LocalCallbackHandler
+
+        set_callback_handler(lambda: LocalCallbackHandler(**kwargs))
+    elif tracker_name == "langfuse":
+        from langfuse.callback import CallbackHandler as LangfuseCallbackHandler
+
+        set_callback_handler(lambda: LangfuseCallbackHandler(**kwargs))
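
Review note on the name check in `init_tracking`: `SUPPORTED_TRACKERS` is a `typing.Literal` alias, which is a type annotation object rather than a collection of its values, so `tracker_name not in SUPPORTED_TRACKERS` may not test membership the way it reads. One possible alternative is to extract the allowed values with `typing.get_args`. The helper name `validate_tracker_name` below is hypothetical and not part of this PR.

```python
from typing import Literal, get_args

SUPPORTED_TRACKERS = Literal["langfuse", "local"]


def validate_tracker_name(tracker_name: str) -> str:
    # get_args() returns the literal values as a plain tuple: ("langfuse", "local").
    allowed = get_args(SUPPORTED_TRACKERS)
    if tracker_name not in allowed:
        raise ValueError(f"Invalid tracker name: {tracker_name}")
    return tracker_name
```

This keeps the `Literal` as the single source of truth for both static type checking and the runtime check.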


 __all__ = [
-    "LocalCallbackHandler",
     "init_tracking",
     "get_callback_handler",
     "set_callback_handler",
23 changes: 23 additions & 0 deletions agency_swarm/util/tracking/langchain_types.py
@@ -0,0 +1,23 @@
from typing import TYPE_CHECKING, TypeVar

if TYPE_CHECKING:
    from langchain.schema import AgentAction as LangchainAgentAction
    from langchain.schema import AgentFinish as LangchainAgentFinish
    from langchain.schema import HumanMessage as LangchainHumanMessage

# Define TypeVars that can be either our placeholder or langchain types
AgentAction = TypeVar("AgentAction", bound="LangchainAgentAction")
AgentFinish = TypeVar("AgentFinish", bound="LangchainAgentFinish")
HumanMessage = TypeVar("HumanMessage", bound="LangchainHumanMessage")


def use_langchain_types():
    """Switch to using langchain types after langchain is imported"""
    global AgentAction, AgentFinish, HumanMessage
    from langchain.schema import AgentAction as LangchainAgentAction
    from langchain.schema import AgentFinish as LangchainAgentFinish
    from langchain.schema import HumanMessage as LangchainHumanMessage

    AgentAction = LangchainAgentAction
    AgentFinish = LangchainAgentFinish
    HumanMessage = LangchainHumanMessage
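
Review note on `use_langchain_types()`: the function rebinds this module's globals, but a module that already ran `from ... import AgentAction` (as `thread.py` does in this commit) holds its own reference to the original placeholder. The sketch below illustrates that general Python import behavior with a hypothetical in-memory module named `fake_types`. It has not been run against this PR's code, so whether it affects `thread.py` in practice is an assumption worth verifying.

```python
import sys
import types

# Hypothetical stand-in for langchain_types.py: a module whose global is rebound later.
mod = types.ModuleType("fake_types")
exec(
    "AgentAction = 'placeholder'\n"
    "def use_real_types():\n"
    "    global AgentAction\n"
    "    AgentAction = 'real'\n",
    mod.__dict__,
)
sys.modules["fake_types"] = mod

from fake_types import AgentAction  # copies the current binding into this namespace
import fake_types

fake_types.use_real_types()

early_bound = AgentAction            # still the value captured at import time
late_bound = fake_types.AgentAction  # attribute lookup sees the rebound value
```

If the placeholder does leak this way, looking the names up through the module (for example `langchain_types.AgentAction`) at call time is one way to pick up the switched types.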
73 changes: 73 additions & 0 deletions docs/advanced-usage/observability.md
@@ -0,0 +1,73 @@
# Observability

Agency Swarm supports tracking your agents with LangChain callbacks, which lets you monitor and analyze their behavior and performance.

Tracking requires the `langchain` package:

```bash
pip install langchain
```

## Langfuse

Langfuse is an observability platform that allows you to track and analyze the execution of your agents in detail. It provides features like tracing, metrics, and debugging tools.

To use Langfuse with Agency Swarm, follow these steps:

1. Install the langfuse package:

```bash
pip install langfuse
```

2. Set the LANGFUSE_SECRET_KEY and LANGFUSE_PUBLIC_KEY environment variables:

```bash
export LANGFUSE_SECRET_KEY=<your-secret-key>
export LANGFUSE_PUBLIC_KEY=<your-public-key>
```

You can get your keys from the [Langfuse dashboard](https://cloud.langfuse.com/).

3. Initialize the tracking in your code:

```python
from agency_swarm import init_tracking

init_tracking("langfuse")
```

You can pass additional configuration options to the Langfuse callback handler:

```python
init_tracking(
    "langfuse",
    debug=True,
    host="custom-host",
    user_id="user-123",
)
```

For additional parameters and more information on the Langfuse callback handler, see the [Langfuse documentation](https://langfuse.com/docs/integrations/langchain/tracing#add-langfuse-to-your-langchain-application).
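
Before calling `init_tracking("langfuse")`, it can help to confirm that both environment variables from step 2 are set. The following is a minimal sketch. The helper `missing_langfuse_keys` is hypothetical and not part of Agency Swarm or Langfuse.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY")


def missing_langfuse_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Example with an explicit mapping; the key value is a placeholder, not a real secret.
example_env = {"LANGFUSE_SECRET_KEY": "sk-example"}
missing = missing_langfuse_keys(example_env)
```

Calling `missing_langfuse_keys()` with no argument checks the real process environment, so a non-empty result can be turned into an early, readable error.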

## Local

The local tracker provides a lightweight solution for logging agent activities to a SQLite database.

To use the local tracker, simply initialize it in your code:

```python
from agency_swarm import init_tracking

init_tracking("local")
```

This will create a SQLite database in the current working directory.

To store the database in a different location, pass `db_path`:

```python
from agency_swarm import init_tracking

init_tracking("local", db_path="path/to/your/database.db")
```
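
To look at what the local tracker has written, the database can be opened with Python's built-in `sqlite3` module. The table layout is not documented here, so this sketch only lists the tables that exist instead of assuming a schema. `list_tables` is a hypothetical helper.

```python
import sqlite3
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Note that `sqlite3.connect` creates an empty file when the path does not exist, so an empty result may simply mean the path is wrong.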
1 change: 0 additions & 1 deletion pyproject.toml
@@ -19,7 +19,6 @@ dependencies = [
     "deepdiff==6.7.1",
     "docstring_parser==0.16",
     "jsonref==1.1.0",
-    "langfuse==2.55.0",
     "openai>=1.55.3,<2.0.0",
     "pydantic==2.8.2",
     "python-dotenv==1.0.1",
2 changes: 1 addition & 1 deletion requirements_test.txt
@@ -1,2 +1,2 @@
-langchain==0.0.345
+langchain==0.3.13
 pytest