
Cache dictionary #21

Open
kaykhancheckpoint opened this issue Aug 13, 2020 · 8 comments
@kaykhancheckpoint

Can you use this to cache a large dictionary?

@keithrozario
Owner

Depends on how large -- and how would the dictionary be populated? Via a file from S3?

@keithrozario keithrozario added the enhancement New feature or request label Aug 14, 2020
@kaykhancheckpoint
Author

Hmm, not quite. Let's say I have built an object that took a long time due to some complex computation. Now I wish to cache this object, because running that computation every time the Lambda function fires is time-consuming.

cache = {}

if "data" in cache:
    do_some_work()
else:
    some_class_obj = some_large_computation()
    cache["data"] = some_class_obj
    do_some_work()

Do you have any suggestions on how I can deal with this?

@keithrozario
Owner

keithrozario commented Aug 14, 2020

I see. The solution I'd suggest is to set up the code that generates the dictionary outside of the handler.

# everything here will execute during init time
cache = {}
some_class_obj = some_large_computation()
cache["data"] = some_class_obj 
do_some_work()
# end of section

def handler(event, context):
    other_stuff()
    return

Just be careful: if building the cache dictionary takes more than 10 seconds, this might not work.

@kaykhancheckpoint
Author

kaykhancheckpoint commented Aug 15, 2020

I think initially it does take more than 10 seconds, so like you said this probably would not work for my use case. Out of curiosity, why is there a ~10 second limit?

The other problem I see with this approach is that you have to pass the cached object around. The function that actually requires the cached object may be 3-4 levels down from the handler function.

@keithrozario
Owner

OK, I'll look into it.

Another option, while I work on this, is to simply generate the dictionary and save it into a global variable. Global variables persist across invocations within a single execution context :).
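A minimal sketch of the global-variable approach, with lazy initialization so only the first invocation pays the cost. The function names (`some_large_computation`, `get_data`) are illustrative, not part of any library:

```python
# Module scope: survives across invocations within the same execution context.
_cache = {}

def some_large_computation():
    # Stand-in for the expensive work; illustrative only.
    return {i: i * i for i in range(1000)}

def get_data():
    # Build the object once, on first use; later calls reuse it.
    if "data" not in _cache:
        _cache["data"] = some_large_computation()
    return _cache["data"]

def handler(event, context):
    # Any helper, however many levels down, can call get_data() directly
    # instead of having the object passed through every function.
    data = get_data()
    return len(data)
```

Because `_cache` lives at module scope, nothing needs to be threaded through the call chain, which also addresses the "3-4 levels down" concern above.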

@kaykhancheckpoint
Author

kaykhancheckpoint commented Aug 19, 2020

Yes, I have used cachetools (https://pypi.org/project/cachetools/) to save the object into a dictionary, but when the AWS Lambda container shuts down, the cache is reset. The container shuts down if no calls are made to it for ~30 minutes (I think).

Right now I'm thinking of pickling the object, storing it on S3, and then retrieving and unpickling it.
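A minimal sketch of the pickle round-trip being described. The S3 upload/download step is replaced here by in-memory bytes for illustration (in practice that blob is what you would hand to boto3's `put_object`/`get_object`); the object itself is a made-up placeholder:

```python
import pickle

# The expensive object we want to persist between cold starts (illustrative).
expensive_obj = {"weights": list(range(10)), "label": "model"}

# Serialize to bytes -- this blob is what would be uploaded to S3;
# here we just keep it in memory.
blob = pickle.dumps(expensive_obj, protocol=pickle.HIGHEST_PROTOCOL)

# Later, in another invocation: download the bytes and restore the object.
restored = pickle.loads(blob)
```

Note that unpickling arbitrary bytes is unsafe; this only makes sense when you fully control the S3 object being loaded.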

@keithrozario
Owner

Yes, that seems the most reasonable approach, although I wouldn't use pickle 👍

Because the Lambda container will expire in minutes if not used (typically 10-30), you'd lose everything in the container, including any cached items you had.

@kaykhancheckpoint
Author

kaykhancheckpoint commented Aug 21, 2020

Yeah, that's why I was going to pickle the object and store it on S3, and retrieve it from S3, so the cache wouldn't be in the container.

Anyway, after trying the above I decided it wasn't worth it: it took a long time to unpickle the file once I received it from S3.

I decided to move away from serverless for this application I'm working on. Instead I'm deploying it on Kubernetes as an API, where I can store the object in a cache that lasts for however long I set the TTL to be.
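For reference, the TTL behaviour described above can be sketched with a tiny in-process cache (stdlib only; cachetools' `TTLCache` provides the same idea ready-made, and the class/names here are illustrative, not its API):

```python
import time

class TTLCache:
    """Minimal time-to-live cache: entries expire ttl seconds after being set."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None:
            return default
        value, expires = item
        if time.monotonic() >= expires:
            # Entry has expired; drop it and report a miss.
            del self._store[key]
            return default
        return value
```

In a long-lived API process (unlike a Lambda container), such a cache survives between requests for as long as the process runs.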
