@delenzhang delenzhang commented Nov 10, 2025

This PR introduces a file-based caching mechanism to persist API responses locally, addressing the limitations of in-memory caching. The goal is to reduce redundant API calls and improve data retrieval efficiency, especially after service restarts.

Motivation

Previously, the cache lived only in local memory, so all cached data was lost after a system restart. Large financial datasets then had to be retrieved again, increasing both latency and API usage.
Persisting the cache to files keeps frequently accessed data available between sessions, so a restart no longer forces a full re-fetch.

Key Changes

Implemented file-based caching for financial dataset APIs.

Added cache date comparison logic: the system decides whether to refresh data by comparing the query date with the cached date.

Introduced partial data fetching: data is retrieved incrementally, and later requests are served from the cache first to minimize API calls.

Enhanced cache tracking and statistics — cache usage, hit/miss rates, and update frequency are now logged and displayed.

Refactored related functions:

get_financial_metrics

search_line_items

get_insider_trades

get_company_news

These functions now set `latest_cached_date` from the query time (when the query falls within the same day) rather than from the data retrieval time, so the cache remains valid for daily queries.
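
For reviewers, the date comparison and incremental fetching described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `needs_refresh` and `merge_incremental` are hypothetical helper names, and the sketch assumes each cached row carries an ISO-formatted `date` field.

```python
from datetime import date
from typing import Optional


def needs_refresh(query_date: date, latest_cached_date: Optional[date]) -> bool:
    """Return True when the cache cannot answer a query for query_date.

    Hypothetical helper: the cache is treated as stale when nothing has been
    cached yet or when the query asks for a day later than the cached date.
    """
    return latest_cached_date is None or query_date > latest_cached_date


def merge_incremental(cached: list, fetched: list, key: str = "date") -> list:
    """Merge newly fetched rows into cached rows, newest first.

    Rows are deduplicated on `key` (assumed to be an ISO date string, which
    sorts correctly as plain text). Fetched rows win on conflict.
    """
    by_key = {row[key]: row for row in cached}
    by_key.update({row[key]: row for row in fetched})
    return sorted(by_key.values(), key=lambda row: row[key], reverse=True)
```

With helpers like these, a query made twice on the same day would hit the API at most once, and only the rows newer than the cached date would need to be requested.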

Additional Updates

Added .cache directory and updated .gitignore accordingly.

Improved readability and organization of line-item caching logic.

Standardized all cache-related messages and statistics output to English.
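
As a rough picture of how entries could be persisted under the `.cache` directory, here is a minimal sketch. The file naming scheme, the JSON layout, and the `save_entry` / `load_entry` helpers are assumptions made for illustration; the actual on-disk format in this PR may differ.

```python
import json
from pathlib import Path
from typing import Any, Optional

CACHE_DIR = Path(".cache")  # assumed location, matching the directory added here


def cache_path(cache_type: str, ticker: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Build a per-type, per-ticker file name (hypothetical naming scheme)."""
    return cache_dir / f"{cache_type}_{ticker}.json"


def save_entry(cache_type: str, ticker: str, data: Any,
               latest_cached_date: str, cache_dir: Path = CACHE_DIR) -> None:
    """Write data plus its cached date to disk so it survives a restart."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_path(cache_type, ticker, cache_dir)
    payload = {"latest_cached_date": latest_cached_date, "data": data}
    # Write to a temporary file first, then rename, so a crash mid-write
    # does not leave a truncated cache file behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(payload), encoding="utf-8")
    tmp.replace(path)


def load_entry(cache_type: str, ticker: str,
               cache_dir: Path = CACHE_DIR) -> Optional[dict]:
    """Return the stored payload, or None when it is missing or unreadable."""
    path = cache_path(cache_type, ticker, cache_dir)
    if not path.exists():
        return None
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None  # treat a corrupt file as a cache miss
```

Treating a corrupt file as a miss keeps a bad cache from breaking a run; the caller simply falls back to the API.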

Benefits

Reduced redundant API calls.

Improved response time for frequently queried datasets.

Persistent cache across restarts.

Better observability of caching behavior.

delenzhang added 9 commits November 10, 2025 14:36
…news, and market cap

- Added support for market cap caching with new methods for setting and retrieving market cap data.
- Updated financial metrics methods to include period-based caching.
- Improved insider trades and company news caching with sorting and filtering capabilities.
- Implemented logic to refresh cache based on the latest available data.
- Implemented cache statistics tracking in the Cache class, including total hits and hits by cache type.
- Added methods to load and save cache statistics to a JSON file.
- Enhanced cache retrieval methods to record hits for various data types.
- Introduced a new function to display cache statistics in a user-friendly format.
- Changed cache hit statistics output from Chinese to English for better accessibility.
- Updated cache type display names to English for consistency in the Cache class.
- Updated `get_line_items` and `set_line_items` methods to support period-based caching.
- Introduced `get_latest_line_items_date` method to retrieve the latest report period from cached data.
- Improved API data fetching logic to cache results based on ticker and period, with enhanced filtering and sorting capabilities.
- Added tracking for total API calls and API calls by type in the Cache class.
- Introduced `record_api_call` method to log API calls for various data types.
- Updated cache retrieval methods to record API calls when data is not found in the cache.
- Enhanced cache statistics display to include total API calls and cache hit rate.
- Introduced methods to manage last updated dates for various cache types, improving cache freshness.
- Updated existing cache methods to accept an optional update date parameter, allowing for better tracking of data recency.
- Refactored cache retrieval logic to utilize last updated dates instead of relying solely on data timestamps, ensuring more accurate cache refresh conditions.
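
The hit and API call tracking described in these commits can be pictured with a small sketch like the one below. Only the name `record_api_call` comes from the commit messages; the `CacheStats` class, `record_hit`, the JSON layout, and the definition of hit rate as hits divided by hits plus API calls are assumptions for illustration.

```python
import json
from collections import Counter
from pathlib import Path


class CacheStats:
    """Hypothetical stand-in for the statistics kept by the Cache class."""

    def __init__(self) -> None:
        self.hits: Counter = Counter()       # cache hits per cache type
        self.api_calls: Counter = Counter()  # API calls per cache type

    def record_hit(self, cache_type: str) -> None:
        self.hits[cache_type] += 1

    def record_api_call(self, cache_type: str) -> None:
        self.api_calls[cache_type] += 1

    @property
    def total_hits(self) -> int:
        return sum(self.hits.values())

    @property
    def total_api_calls(self) -> int:
        return sum(self.api_calls.values())

    def hit_rate(self) -> float:
        """Assumed definition: hits / (hits + API calls), 0.0 when empty."""
        total = self.total_hits + self.total_api_calls
        return self.total_hits / total if total else 0.0

    def save(self, path: Path) -> None:
        """Persist the counters as JSON."""
        payload = {"hits": dict(self.hits), "api_calls": dict(self.api_calls)}
        path.write_text(json.dumps(payload), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "CacheStats":
        """Restore counters from JSON, starting empty if the file is absent."""
        stats = cls()
        if path.exists():
            payload = json.loads(path.read_text(encoding="utf-8"))
            stats.hits.update(payload.get("hits", {}))
            stats.api_calls.update(payload.get("api_calls", {}))
        return stats
```

Keeping hits and API calls in separate per-type counters makes it straightforward to report both the totals and a per-type breakdown.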