Terminal colors #905

Draft
wants to merge 8 commits into
base: main
Changes from 4 commits
44 changes: 44 additions & 0 deletions guidance/_utils.py
@@ -6,9 +6,18 @@
import sys
import textwrap
import types
import re

import numpy as np

from html.parser import HTMLParser
try:
    from colored import Fore  # type: ignore[import-untyped]
except ImportError:
    colored_is_imported = False
else:
    colored_is_imported = True


class _Rewrite(ast.NodeTransformer):
    def visit_Constant(self, node):
@@ -261,3 +270,38 @@ def softmax(array: np.ndarray, axis: int = -1) -> np.ndarray:
    array_maxs = np.amax(array, axis=axis, keepdims=True)
    exp_x_shifted = np.exp(array - array_maxs)
    return exp_x_shifted / np.sum(exp_x_shifted, axis=axis, keepdims=True)


class ModelStateHTMLParser(HTMLParser):
Collaborator

If this is only used for console output, the class and module names should say so.

On a similar note, might be a good time to consider re-architecting visualization to have sibling classes between notebook and terminal.

Collaborator Author

@nking-1 and I are rethinking how we represent internal model state in order to make this possible (which is why this PR appears a bit stalled at the moment). Currently, formatting information is very closely entangled with the internal representation, but yes, ideally we can abstract that away.

    """Parse jupyter-flavored HTML containing colored text and re-color it for the command line."""

    def __init__(self):
        super().__init__()
        self.colored_text = ''

    def feed(self, data):
        self.colored_text = ''
        # Strip the html insertion markers (presumably there so that jupyter recognizes the text as html)
        data = data.replace('<||_html:', '').replace('_||>', '')
        super().feed(data)
        return self.colored_text

    def handle_starttag(self, tag, attrs):
        # Start ANSI text coloring when a span tag opens
        if tag == 'span':
            # Use the bg color we would have used in jupyter as the fg color (taken from the style attribute)
            style = dict(attrs).get('style') or ''
            # Capture only the integer parts of the r, g, b components of the rgba color.
            # Note the third group is (\d+), not (\d)+, which would keep only the last digit of blue.
            match = re.search(r'background-color:\s*rgba\((\d+)\.?\d*,\s*(\d+)\.?\d*,\s*(\d+)\.?\d*,\s*\d+\.?\d*\)', style)
            if colored_is_imported and match is not None:
                self.colored_text += Fore.rgb(*match.groups())
            else:
                # Default to ANSI green (32m) if colored is not available or no rgba color was found
                self.colored_text += '\033[32m'

    def handle_endtag(self, tag):
        # ANSI reset color when the span tag closes
        if tag == 'span':
            self.colored_text += '\033[0m'

    def handle_data(self, data):
        self.colored_text += data
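
As a rough illustration of the color conversion in the parser above, the sketch below turns the rgba() background color of a jupyter-style span into an ANSI foreground escape using only the standard library. It assumes the terminal understands 24-bit ("truecolor") sequences of the form ESC[38;2;R;G;Bm. The names `RGBA_RE` and `rgba_to_ansi_fg` are hypothetical and are not part of guidance or colored.

```python
import re

# Hypothetical helper names; only the regex idea comes from the diff above.
# Each color component keeps its integer part and drops any fractional part.
RGBA_RE = re.compile(
    r"background-color:\s*rgba\(\s*"
    r"(\d+)(?:\.\d*)?\s*,\s*"
    r"(\d+)(?:\.\d*)?\s*,\s*"
    r"(\d+)(?:\.\d*)?\s*,"
)
ANSI_GREEN = "\033[32m"
ANSI_RESET = "\033[0m"


def rgba_to_ansi_fg(style: str, default: str = ANSI_GREEN) -> str:
    """Return a 24-bit ANSI foreground escape for the rgba() color in `style`.

    Falls back to `default` (ANSI green) when no rgba() color is found.
    """
    match = RGBA_RE.search(style)
    if match is None:
        return default
    r, g, b = (int(part) for part in match.groups())
    return f"\033[38;2;{r};{g};{b}m"


if __name__ == "__main__":
    style = "background-color: rgba(0.0, 165.0, 0, 0.15)"
    print(rgba_to_ansi_fg(style) + "hello" + ANSI_RESET)
```

Falling back to a fixed color keeps the output readable when a span carries no style attribute at all, which a bare `dict(attrs)['style']` lookup would turn into a `KeyError`.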
27 changes: 22 additions & 5 deletions guidance/models/_model.py
@@ -17,11 +17,19 @@
import numpy as np

 try:
+    from IPython import get_ipython
     from IPython.display import clear_output, display, HTML

-    ipython_is_imported = True
 except ImportError:
     ipython_is_imported = False
Collaborator Author

Do we still need this, or is it replaced by notebook_mode?

+    notebook_mode = False
+else:
+    ipython_is_imported = True
+    _ipython = get_ipython()
+    notebook_mode = (
+        _ipython is not None
+        and "IPKernelApp" in _ipython.config
Collaborator Author

Based on the way that tqdm.auto determines notebook context -- would be good to test this on multiple machines/platforms...

Collaborator

We had to tackle this detection back in InterpretML too, here's how we did it there: https://github.com/interpretml/interpret/blob/develop/python/interpret-core/interpret/provider/_environment.py. We could re-use some of this logic here perhaps

cc @nopdive

Collaborator

Sorry, I hadn't been watching my notifications -- the code seems reasonable. Yes, this should be tested (either manually or automatically is fine) on multiple platforms before merge: terminal (Windows/Linux/Mac), vscode, jupyter notebook/lab, and azure/google/amazon/databricks notebooks. We should be relatively okay if these target environments work.

+    )
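
For anyone testing the detection across the environments listed above, the same check can be pulled into a standalone function and run in each one. This is only a sketch of the approach in the diff (the `IPKernelApp` test that tqdm.auto is said to use); the function name `in_notebook` is hypothetical.

```python
def in_notebook() -> bool:
    """Best-effort check for a Jupyter kernel, mirroring the logic in the diff."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be a notebook.
        return False
    shell = get_ipython()
    if shell is None:
        # Plain `python` interpreter or script: no active IPython shell.
        return False
    # Kernel-backed frontends register IPKernelApp in the shell config;
    # a terminal `ipython` session typically does not.
    return "IPKernelApp" in shell.config


if __name__ == "__main__":
    print("notebook" if in_notebook() else "terminal")
```

Evaluating the check inside a function, rather than once at import time, would also make it easier to cover in automated tests.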

try:
     import torch

@@ -39,7 +47,7 @@
 )
 from .. import _cpp as cpp
 from ._guidance_engine_metrics import GuidanceEngineMetrics
-from .._utils import softmax, CaptureEvents
+from .._utils import softmax, CaptureEvents, ModelStateHTMLParser
 from .._parser import EarleyCommitParser, Parser
 from .._grammar import (
     GrammarFunction,
@@ -862,6 +870,8 @@ def __init__(self, engine, echo=True, **kwargs):
         self._event_parent = None
         self._last_display = 0 # used to track the last display call to enable throttling
         self._last_event_stream = 0 # used to track the last event streaming call to enable throttling
+        self._state_html_parser = ModelStateHTMLParser() # used to parse the state for cli display
Collaborator

Same as before, consider making naming specific to console if it's only used for console mode.

+        self._last_state_len = 0 # used to track the last state length for appending to cli display

     @property
     def active_role_end(self):
@@ -975,11 +985,18 @@ def _update_display(self, throttle=True):
                 else:
                     self._last_display = curr_time

-            if ipython_is_imported:
+            if notebook_mode:
                 clear_output(wait=True)
                 display(HTML(self._html()))
             else:
-                pprint(self._state)
+                print(
+                    self._state_html_parser.feed(
+                        self._state[self._last_state_len:]
+                    ),
+                    end='',
+                    flush=True
+                )
+                self._last_state_len = len(self._state)
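
The console branch above prints only the part of the state that was appended since the previous call. A minimal standalone sketch of that pattern follows; the class name `DeltaPrinter` is hypothetical, and the reset handling is an assumption that the diff itself does not include.

```python
import io
import sys
from typing import TextIO


class DeltaPrinter:
    """Hypothetical helper: write only the text appended since the last call."""

    def __init__(self, stream: TextIO = sys.stdout) -> None:
        self._stream = stream
        self._last_len = 0

    def update(self, state: str) -> None:
        if len(state) < self._last_len:
            # Assumption: a shorter state means it was reset, so start over.
            self._last_len = 0
        self._stream.write(state[self._last_len:])
        self._stream.flush()
        self._last_len = len(state)


if __name__ == "__main__":
    printer = DeltaPrinter()
    for state in ("Hello", "Hello, wor", "Hello, world\n"):
        printer.update(state)
```

One thing that may be worth checking in the real code is what happens when the slice boundary falls inside a tag or an html insertion marker, since the chunk handed to the parser is then incomplete.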

     def reset(self, clear_variables=True):
         """This resets the state of the model object.
1 change: 1 addition & 0 deletions setup.py
@@ -37,6 +37,7 @@
    "openai": ["openai>=1.0"],
    "schemas": ["jsonschema"],
    "server": ["fastapi", "uvicorn"],
    "cli": ["colored"]
Collaborator Author

Not a super widely-used package, so I am open to alternative solutions if there are any concerns.

Collaborator

I've used colorama before, which claimed better Windows support if I remember right?

@nopdive might have thoughts here

Collaborator Author

colorama is a great library for printing "named" ANSI codes (like "green"), but afaik it doesn't support mapping rgb triples to ANSI codes. However, as you note, colorama does have better Windows support via its init or just_fix_windows_console functions.

We can manually put together a lookup table if we only want to support a few shades from red to green or something like that, but I'll also take a look at some of the other libraries that colorama itself recommends :)
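
The lookup-table idea could look roughly like the sketch below, which needs no third-party package. It uses the standard 8-color ANSI foreground codes; the bucket boundaries, the choice of shades, and the names `SHADES`, `shade_for` and `colorize` are all illustrative assumptions rather than anything proposed in this PR.

```python
# Standard ANSI foreground codes, ordered from "unlikely" to "likely".
# The bucket boundaries below are illustrative assumptions.
SHADES = (
    (0.25, "\033[31m"),  # red: probability below 0.25
    (0.50, "\033[33m"),  # yellow: probability below 0.50
    (0.75, "\033[36m"),  # cyan: probability below 0.75
)
MOST_LIKELY = "\033[32m"  # green: probability 0.75 and above
RESET = "\033[0m"


def shade_for(prob: float) -> str:
    """Map a probability in [0, 1] to one of a few named ANSI colors."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"probability out of range: {prob}")
    for upper, code in SHADES:
        if prob < upper:
            return code
    return MOST_LIKELY


def colorize(text: str, prob: float) -> str:
    """Wrap `text` in the color for `prob`, followed by an ANSI reset."""
    return f"{shade_for(prob)}{text}{RESET}"


if __name__ == "__main__":
    for p in (0.1, 0.4, 0.6, 0.95):
        print(colorize(f"token with p={p}", p))
```

Because only named colors are emitted, a table like this would also work with a library such as colorama that does not map rgb triples.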

Collaborator

\begin{mutter}If you refactor so that notebook output and console output are separate implementations of the same base class, this problem largely goes away\end{mutter}

Collaborator Author

The problem still stands of choosing a way to map probabilities to colors, but yes at the very least the problem of parsing a particular rgba value goes away :)

Collaborator

@Harsha-Nori, I'm not familiar with colored or with colorama in its current state. I'm not fussed about which package is used, as long as it's maintained and we can confirm it works for our target set of environments.

}
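
To make the refactoring suggested in the thread above more concrete, here is one possible shape for sibling notebook and terminal renderers behind a shared base class. Everything in it is hypothetical: the `Chunk` record, the class names, and the red-to-green interpolation are assumptions for illustration, not guidance's actual internal representation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from html import escape
from typing import Iterable, Optional


@dataclass(frozen=True)
class Chunk:
    """Hypothetical format-free unit of model state."""
    text: str
    prob: Optional[float] = None  # None means "not generated, no coloring"


class Renderer(ABC):
    @abstractmethod
    def render(self, chunks: Iterable[Chunk]) -> str:
        """Turn format-free chunks into display-specific text."""


class NotebookRenderer(Renderer):
    def render(self, chunks: Iterable[Chunk]) -> str:
        parts = []
        for chunk in chunks:
            text = escape(chunk.text)
            if chunk.prob is None:
                parts.append(text)
            else:
                # Use the probability as the alpha channel of a green background.
                style = f"background-color: rgba(0, 165, 0, {chunk.prob:.2f})"
                parts.append(f"<span style='{style}'>{text}</span>")
        return "".join(parts)


class TerminalRenderer(Renderer):
    def render(self, chunks: Iterable[Chunk]) -> str:
        parts = []
        for chunk in chunks:
            if chunk.prob is None:
                parts.append(chunk.text)
            else:
                # Interpolate from red (p=0) to green (p=1) as a 24-bit color.
                red = round(255 * (1 - chunk.prob))
                green = round(255 * chunk.prob)
                parts.append(f"\033[38;2;{red};{green};0m{chunk.text}\033[0m")
        return "".join(parts)


if __name__ == "__main__":
    chunks = [Chunk("The answer is "), Chunk("42", prob=0.9)]
    print(TerminalRenderer().render(chunks))
```

With a split like this the terminal path never has to parse an rgba value back out of HTML, though the choice of how to map probabilities to colors remains open, as noted in the thread.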

# Create the union of all our requirements