Skip to content

Conversation

@Zsailer
Copy link
Contributor

@Zsailer Zsailer commented Oct 17, 2025

This adds the ability for personas to specify an avatar image file directly in their package instead of needing to implement their own static file handler.

Personas now set avatar_path to an absolute path pointing to an image file (SVG, PNG, JPG). The server dynamically looks up and serves these files through /api/ai/avatars/{filename}. File lookups are cached using lru_cache to keep things fast.

If multiple personas happen to use the same filename, a warning gets logged and the first one found is used.

Fixes #6

Personas can now specify an avatar_path pointing to an image file in their package. The server will dynamically serve these avatars through a new /api/ai/avatars endpoint. Avatar lookups are cached for performance, and filename conflicts are detected and logged as warnings.

Fixes jupyter-ai-contrib#6
@Zsailer Zsailer added the enhancement New feature or request label Oct 17, 2025
Use JupyterHandler instead of APIHandler to avoid default application/json content-type
Use Optional[str] instead of str | None for Python 3.9 compatibility
@Zsailer Zsailer marked this pull request as ready for review October 17, 2025 22:10
Copy link
Contributor

@3coins 3coins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zsailer
This looks great!
We might be able to avoid the collision issue by using the persona_id hash along with the filename.

# base_persona.py
  @property
  def avatar_path(self) -> str:
      """
      Returns the URL route that serves the avatar shown on messages from this
      persona in the chat. Uses persona ID hash to ensure uniqueness.
      """
      import hashlib

      filename = os.path.basename(self.defaults.avatar_path)
      persona_hash = hashlib.md5(self.id.encode()).hexdigest()[:8]
      return f"/api/ai/avatars/{persona_hash}/{filename}"
# handlers.py
class AvatarHandler(JupyterHandler):
      @tornado.web.authenticated
      async def get(self, persona_hash: str, filename: str):
          """Serve an avatar file by persona hash and filename."""
          # Get the avatar file path from persona managers
          avatar_path = self._find_avatar_file(persona_hash, filename)

          # Rest of the serving logic remains the same...

      @lru_cache(maxsize=128)
      def _find_avatar_file(self, persona_hash: str, filename: str) -> Optional[str]:
          """Find the avatar file path by persona hash and filename."""
          persona_managers = self.settings.get('jupyter-ai', {}).get('persona-managers', {})

          for room_id, persona_manager in persona_managers.items():
              for persona in persona_manager.personas.values():
                  try:
                      # Generate hash for this persona's ID
                      import hashlib
                      computed_hash = hashlib.md5(persona.id.encode()).hexdigest()[:8]

                      if computed_hash == persona_hash:
                          avatar_path = persona.defaults.avatar_path
                          if avatar_path and os.path.basename(avatar_path) == filename:
                              if os.path.exists(avatar_path):
                                  return avatar_path
                              else:
                                  self.log.warning(f"Avatar file not found at path: {avatar_path}")
                  except Exception as e:
                      self.log.warning(f"Error checking avatar for persona {persona.name}: {e}")
                      continue

          return None

Copy link
Contributor

@dlqqq dlqqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, BasePersona.id includes the package name & class name. Therefore this should always be a unique ID for any persona in an environment:

@property
def id(self) -> str:
"""
Returns a static & unique ID for this persona. This sets the `username`
field in the data model returned by `self.as_user()`.
The ID is guaranteed to follow the format
`jupyter-ai-personas::<package-name>::<persona-class-name>`. The prefix
allows consumers to easily distinguish AI personas from human users.
If a package provides multiple personas, their class names must be
different to ensure that their IDs are unique.
"""
package_name = self.__module__.split(".")[0]
class_name = self.__class__.__name__
return f"jupyter-ai-personas::{package_name}::{class_name}"

Perhaps we could add another property called hash onto BasePersona, and have the avatar route just point to /api/ai/persona_avatars/{persona.hash}?

Copy link
Contributor

@ellisonbg ellisonbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see this PR, we definitely need a nice way like this to manage personas. See comments inline...

self.log.error(f"Error serving avatar file {filename}: {e}")
raise tornado.web.HTTPError(500, f"Error serving avatar file: {str(e)}")

@lru_cache(maxsize=128)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cache will be keyed on both self and filename. Because this is on the handler you will get many of these caches rather than a small number that will do the caching you want. Can you extract into a function that does the caching you want?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe invalidate the cache on reloading personas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent catch! You're absolutely right - the lru_cache on an instance method would create a new cache for each request handler instance, making it useless.

I've implemented a module-level cache instead:

  • _avatar_cache: Dict[str, str] dictionary at module level
  • build_avatar_cache() function called when personas are initialized or refreshed
  • _find_avatar_file() now does O(1) dictionary lookup instead of iterating
  • Cache is automatically rebuilt when /refresh-personas is called

This addresses your performance concern about walking all personas on each request.

async def get(self, filename: str):
"""Serve an avatar file by filename."""
# Get the avatar file path from persona managers
avatar_path = self._find_avatar_file(filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this is on the handler, it will walk all personas on each request. Any way to remove this from the per-request logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe update all of this on persona loading?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed! Implemented a module-level cache that's built once at initialization and rebuilt on persona refresh. Avatar requests now do O(1) dictionary lookup instead of iterating all personas. See the latest commit for details.

This adds the ability for personas to specify an avatar image file directly in their package instead of needing to implement their own static file handler.

Personas now set `avatar_path` to an absolute path pointing to an image file (SVG, PNG, JPG). The server dynamically looks up and serves these files through `/api/ai/avatars/{url_encoded_id}`.

- Uses persona.id (URL-encoded) for avatar URLs to ensure uniqueness
- File size validation (5MB max)
- Proper content-type headers based on file extension
- Module-level cache for O(1) avatar path lookups (built at init, rebuilt on refresh)

Fixes jupyter-ai-contrib#6
@Zsailer
Copy link
Contributor Author

Zsailer commented Oct 21, 2025

@3coins Thanks for the detailed suggestion! I explored this approach, but ended up going with @dlqqq's suggestion to use the persona ID directly (URL-encoded) since it's already guaranteed to be unique. This simplifies the implementation by avoiding hash computation while achieving the same goal of unique, collision-free avatar URLs.

@Zsailer
Copy link
Contributor Author

Zsailer commented Oct 21, 2025

@dlqqq Great suggestion! I've implemented this approach - the avatar route now uses /api/ai/avatars/{url_encoded_persona_id} where the persona ID is URL-encoded to handle the special characters (::) in the ID format. This ensures uniqueness without needing a separate hash property.

The health endpoint was removed when RouteHandler was deleted, so the frontend no longer needs to check it on activation.
@Zsailer Zsailer requested review from 3coins, dlqqq and ellisonbg October 21, 2025 19:02
Copy link
Contributor

@3coins 3coins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zsailer
This looks great. Thanks for making all updates.

Note: The "absolute path" to avatar file might still trip some users, as in the current form the avatar path should dynamically resolve by specifically using os.path.join(os.path.dirname(__file__), as using any other absolute path (e.g., /Users/myid/mypersona/assets/avatar.svg) would fail for personas distributed for installation.

@Zsailer
Copy link
Contributor Author

Zsailer commented Oct 24, 2025

Thanks!

@Zsailer Zsailer merged commit 651d3b4 into jupyter-ai-contrib:main Oct 24, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow personas to just specify an avatar path instead of avatar URL

4 participants