Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to Python 3 #3411

Closed
wants to merge 52 commits into from
Closed

Upgrade to Python 3 #3411

wants to merge 52 commits into from

Conversation

willgearty
Copy link
Member

@willgearty willgearty commented Sep 8, 2021

This is a very much work-in-progress PR to migrate from Python 2 to Python 3.

TODO:

  • Run modernize
  • Remove selenium package (now built-in to django)
  • Replace pydns with py3dns
  • Update django-argcache to python3
  • Encode before hashing
  • Upgrade xlwt (might need to replace with openpyxl)
  • Fix StringIO imports
  • Fix migrations (remove byte strings) (python docs)
  • Upgrade other dependencies (Twilio, flake9, stripe, pygments, pillow)
  • Replace __unicode__() with __str__() (django docs)
  • Replace uses of cmp(), fix uses of sort() and sorted() (python docs) (more python docs about cmp to key)
  • Fix strings vs byte strings (pythons docs)
  • Verify integer division: Make sure / (Python 2) has been changed to // (Python 3) in all the right places (I did a first pass on this in Some changes for python 3 migration unrelated to strings and bytes #3412 but should be double-checked)
  • Many more things...

Fixes #2576. Fixes #2820.

@willgearty willgearty added Work in progress dependencies Pull requests that update a dependency file python Pull requests that update Python code labels Sep 8, 2021
@willgearty willgearty added this to the Stable Release 14 milestone Sep 8, 2021
@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

esp/esp/themes/controllers.py Outdated Show resolved Hide resolved
esp/esp/users/models/__init__.py Outdated Show resolved Hide resolved
@lgtm-com

This comment has been minimized.

@willgearty
Copy link
Member Author

willgearty commented Sep 20, 2021

I believe the final errors are caused by these bits of code:

class PersistentQueryFilter(models.Model):
""" This class stores generic query filters persistently in the database, for retrieval (by ID, presumably) and
to pass the query along to multiple pages and retrival (et al). """
item_model = models.CharField(max_length=256) # A string representing the model, for instance User or Program
q_filter = models.TextField() # A string representing a query filter
sha1_hash = models.CharField(max_length=256) # A sha1 hash of the string representing the query filter
create_ts = models.DateTimeField(auto_now_add = True) # The create timestamp
useful_name = models.CharField(max_length=1024, blank=True, null=True) # A nice name to apply to this filter.

def create_from_Q(item_model, q_filter, description = ''):
""" The main constructor, please call this. """
import hashlib
dumped_filter = pickle.dumps(q_filter)
# Deal with multiple instances
query_q = Q(item_model = str(item_model), q_filter = dumped_filter, sha1_hash = hashlib.sha1(dumped_filter).hexdigest())
pqfs = PersistentQueryFilter.objects.filter(query_q)

The q_filter field is a TextField. In Python 2, pickle.dumps() returned a str, which would work with a TextField. However, in Python 3, pickle.dumps() returns a bytes object, so Django tries to coerce it to a str, but that fails because pickle.dumps() objects can only be converted to str objects using pickle.loads().

I think ultimately we want the q_filter field to be a BinaryField. The Django 1.11 docs say you can't perform filters on BinaryFields, but I don't think we ever perform queryset filters with this field, so that shouldn't be an issue.

Also, it's possible this would break Python 2 backwards compatibility, but strangely enough my initial testing doesn't seem to indicate that.

@lgtm-com
Copy link

lgtm-com bot commented Sep 20, 2021

This pull request introduces 50 alerts and fixes 100 when merging 2b6cc20 into 5eeecff - view on LGTM.com

new alerts:

  • 18 for 'input' function used in Python 2
  • 17 for 'import *' may pollute namespace
  • 8 for Clear-text logging of sensitive information
  • 2 for Unused import
  • 1 for `__eq__` not overridden when adding attributes
  • 1 for Except block handles 'BaseException'
  • 1 for Inconsistent equality and hashing
  • 1 for Wrong name for an argument in a class instantiation
  • 1 for Unguarded next in generator

fixed alerts:

  • 64 for Unused import
  • 26 for Testing equality to None
  • 5 for Result of integer division may be truncated
  • 2 for Uncontrolled data used in path expression
  • 1 for Comparison using is when operands support `__eq__`
  • 1 for Modification of parameter with default
  • 1 for Inconsistent equality and hashing

@hwatheod
Copy link
Contributor

Yes, the final errors are pickle issues which I will work on fixing with the plan I wrote many comments ago. So far I have figured out how to switch between running Python 2 and 3 on fab runserver which will be needed to test any such fix.

Also, these may not be the "final" errors. Once these errors are fixed, they may unmask more errors :)

@willgearty
Copy link
Member Author

Ah, yes, there've been so many comments and LGTM comments that I totally forgot you already had a plan! I've got the pickled errors temporarily fixed for my travis builds, so I can start preemptively fixing the new errors.

@willgearty
Copy link
Member Author

willgearty commented Sep 22, 2021

I've run into an issue with MessageRequest.special_headers being changed to a BinaryField. In #3129 I made it so that we include the MessageRequest as a MessageVariable, which means the MessageRequest object gets pickled here:

newMessageVar.pickled_provider = pickle.dumps(obj)

However, you can't pickle a BinaryField because it's converted to a memoryview by Django:
https://github.com/django/django/blob/2da029d8540ab0b2e9edcba25c4d46c52853197f/django/db/models/fields/__init__.py#L2303-L2307

So I guess our options are to either make these JSON fields now (instead of waiting until the future) or partially revert the changes in #3129?

The good news is that I'm not getting any other errors (after the fixes in dc1b1a2)!

@willgearty
Copy link
Member Author

willgearty commented Sep 24, 2021

Here are some comments for each of the currently pickled model fields:

PersistentQueryFilter.q_filter
Looks like this can have some fairly complex pickled objects in it, and I'm guessing it will be a pain (if not impossible) to convert them. It looks like the only time we ever use a preexisting PersistentQueryFilter object is when we use the user search controller (and only for the confirmation or checklist parts). All other calls can just create new objects. I think we can just delete all of the old objects and start from scratch with this field as a JSONField.

MessageRequest.special_headers
As far as I can tell, we only ever use this to store a pickled dictionary. I imagine this should be easy to unpickle and repickle (or JSONify).

special_headers_dict
= { 'Reply-To': replytoemail, }, )

MessageVars.pickled_provider
This includes pickled instances of Program, User, and request objects, which means it'll probably be tricky to convert them. Fortunately, I don't think we ever use these once an email request has been processed, so we can probably just delete the old ones (but don't delete their associated MessageRequest objects).

ResourceType.attributes_pickled
This is always set as just a |-delimited list. I'm not sure why we call the field attributes_pickled when it never even seems to be pickled (or why we also have an un-pickled cached version of the attributes). As far as I can tell, we never call _attributes_cached or attributes (which would call _attributes_cached). I have a feeling that the vast majority of existing ResourceType objects will just have normal strings for this field (not pickled) and we can convert the small minority (if any) that have pickled attributes to a |-separated string (and then remove any of the code related to pickling).

Answer.value
I think these are almost exclusively pickled strings, so they should hopefully be easy to convert?

@milescalabresi
Copy link
Contributor

Resolved by #3616; closing

@milescalabresi milescalabresi deleted the python3 branch May 16, 2024 11:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file python Pull requests that update Python code Work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrade pillow to 7.1.0? Upgrade to Python 3
4 participants