update only changed fields. see issue #2 for details #45

CastixGitHub · 2022-03-18T12:19:27Z

No description provided.

…ent is deepcopied

CastixGitHub · 2022-03-18T12:20:25Z

build is failing due to codecov

codecov · 2022-03-18T12:34:43Z

Codecov Report

Merging #45 (795874d) into master (b22af7c) will increase coverage by 0.13%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #45      +/-   ##
==========================================
+ Coverage   89.96%   90.10%   +0.13%     
==========================================
  Files          43       43              
  Lines        6636     6718      +82     
==========================================
+ Hits         5970     6053      +83     
+ Misses        666      665       -1

Flag	Coverage Δ
tests-3.5	`89.76% <100.00%> (+0.14%)`	⬆️
tests-3.6	`89.97% <100.00%> (+0.14%)`	⬆️
tests-3.7	`89.97% <100.00%> (+0.14%)`	⬆️
tests-3.8	`90.11% <100.00%> (+0.13%)`	⬆️
tests-3.9	`90.11% <100.00%> (+0.13%)`	⬆️
tests-pypy3	`89.97% <100.00%> (+0.14%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
ming/odm/mapper.py	`89.33% <100.00%> (+0.25%)`	⬆️
ming/session.py	`89.02% <100.00%> (+0.20%)`	⬆️
ming/tests/odm/test_declarative.py	`99.58% <100.00%> (+0.03%)`	⬆️
ming/tests/odm/test_mapper.py	`100.00% <100.00%> (ø)`
ming/tests/test_session.py	`99.30% <100.00%> (ø)`
ming/utils.py	`74.13% <100.00%> (+1.91%)`	⬆️
ming/odm/odmsession.py	`68.48% <0.00%> (+0.32%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b22af7c...795874d. Read the comment docs.

CastixGitHub · 2022-03-18T15:36:20Z

see #2

brondsem · 2022-04-01T20:49:30Z

Hi,

I haven't had a chance to look at this yet - but wanted to say I have seen it and it looks quite interesting.

Just a little update to let you know your contribution is appreciated and seen. I hope to take a serious look at it before too much longer.

brondsem · 2022-04-19T15:27:49Z

Hi, thanks for working on this and for your patience.

Could you add a simple test that uses replace_session? At least enough to get some test coverage would be nice.

I ran these changes against a large app that uses Ming (Apache Allura, which I work on). I found two issues:

When a model has a before_save hook which modifies data then that field will get missed. (This was discovered by Allura's forgetracker.tests.functional.test_root:TestFunctionalController.test_touch test) Here's a little code to illustrate what triggers it:

class MyThing:
    class __mongometa__:
        ...
        def before_save(data):
            data['mod_date'] = datetime.utcnow()

We should probably add a Ming test that has this kind of behavior.
And then how to fix it? I'm not sure what a nice solution would be. Your new fields logic runs before session.impl.save which is what calls self._prep_save and the hooks. So we'd have to somehow move the new logic after self._prep_save, or some other moving or passing of vars. Maybe before_save hooks should return a list of fields they modified, and then include that in the save fields? It would be a breaking change and require existing before_save hooks to change, but maybe that'd be the cleanest.

is related to new objects. I haven't debugged it completely, it might have something to do with _id field being special and automatically populated on new objects by pymongo, and ming is skipping it? Anyway I can fix the Allura tests affected by adding an if state.original_document. Its probably better this way anyway: only do the new fields logic if its an update, and new objects work the previous way (saving all the fields):

            if state.original_document:
                # here we do a symmetric difference to see what fields did change
                fields = tuple(set((k for k, v in
                                    doc_to_set(state.original_document)
                                    ^ doc_to_set(state.document))))
            else:
                fields = ()

CastixGitHub · 2022-04-22T22:31:20Z

state.options isn't documented
"the ming foundation layer's save args parameter" is undocumented too
so adding the state parameter, and moving the fields logic there, then why do we take args as a parameter if we are now detecting what changed?
this is a breaking change, and a test was in place.
do you think is there a better approach?
can you re-run with allura?
Thank you for reviewing!

brondsem · 2022-04-29T20:28:51Z

I think this is a good approach to dealing with it.

I get this test error, not sure why the CI tests didn't error on it:

======================================================================
ERROR: test_writable_backref (ming.tests.odm.test_declarative.TestRealMongoRelation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/brondsem/sf/ming/ming/tests/odm/test_declarative.py", line 191, in test_writable_backref
    self.session.flush()
  File "/Users/brondsem/sf/ming/ming/odm/base.py", line 35, in inner
    result = func(obj, *args, **kwargs)
  File "/Users/brondsem/sf/ming/ming/odm/odmsession.py", line 99, in flush
    self.uow.flush()
  File "/Users/brondsem/sf/ming/ming/odm/unit_of_work.py", line 49, in flush
    unow(obj, st)
  File "/Users/brondsem/sf/ming/ming/odm/base.py", line 35, in inner
    result = func(obj, *args, **kwargs)
  File "/Users/brondsem/sf/ming/ming/odm/odmsession.py", line 115, in update_now
    mapper(obj).update(obj, st, self, **kwargs)
  File "/Users/brondsem/sf/ming/ming/odm/base.py", line 35, in inner
    result = func(obj, *args, **kwargs)
  File "/Users/brondsem/sf/ming/ming/odm/mapper.py", line 88, in update
    ret = session.impl.save(doc, validate=False, state=state)
  File "/Users/brondsem/sf/ming/ming/session.py", line 23, in wrapper
    return func(self, doc, *args, **kwargs)
  File "/Users/brondsem/sf/ming/ming/session.py", line 165, in save
    result = self._impl(doc).update(
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/collection.py", line 3325, in update
    return self._update_retryable(
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/collection.py", line 868, in _update_retryable
    return self.__database.client._retryable_write(
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1498, in _retryable_write
    return self._retry_with_session(retryable, func, s, None)
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1384, in _retry_with_session
    return self._retry_internal(retryable, func, session, bulk)
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/mongo_client.py", line 1416, in _retry_internal
    return func(session, sock_info, retryable)
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/collection.py", line 860, in _update
    return self._update(
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/collection.py", line 837, in _update
    _check_write_command_response(result)
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/helpers.py", line 226, in _check_write_command_response
    _raise_last_write_error(write_errors)
  File "/Users/brondsem/sf/ming/.tox/py39/lib/python3.9/site-packages/pymongo/helpers.py", line 208, in _raise_last_write_error
    raise WriteError(error.get("errmsg"), error.get("code"), error)
pymongo.errors.WriteError: ('\'$set\' is empty. You must specify a field like so: {$set: {<field>: ...}}, full error: {\'index\': 0, \'code\': 9, \'errmsg\': "\'$set\' is empty. You must specify a field like so: {$set: {<field>: ...}}"}', "doc:  {'_id': 0, 'parent_id': 1}")

I would like to keep the save() behavior that lets a list of fields be passed in. Even though its not documented, there was a test for it and I know I have some code that uses it. I think the method could work like this. It also fixes the above error if the args/fields ends up empty.

    @annotate_doc_failure
    def save(self, doc, *args, state=None, **kwargs):
        data = self._prep_save(doc, kwargs.pop('validate', True))
        if not args and state is not None and state.original_document:
            args = tuple(set((k for k, v in
                              doc_to_set(state.original_document)
                              ^ doc_to_set(data))))
        if args:
            values = dict((arg, data[arg]) for arg in args)
            result = self._impl(doc).update(
                dict(_id=doc._id), {'$set': values}, **fix_write_concern(kwargs))
        else:
            result = self._impl(doc).save(data, **fix_write_concern(kwargs))
        if result and '_id' not in doc:
            doc._id = result
        return result

However, I have found another bug, a very subtle one. Here's a simple test case. If you do multiple updates within a session and one of the updates puts a field back to the original value, that is not detected as a change so it is not saved. Not sure what the right solution to this would be. Maybe original_document could be updated after a flush/save? But I don't know what other impacts that would make.

class TestBasicMapping(TestCase):
    ...
    def test_multiple_update_flushes(self):
        initial_doc = self.Basic()
        initial_doc.a = 1
        self.session.flush()
        self.session.close()

        doc_updating = self.Basic.query.get(_id=initial_doc._id)
        doc_updating.a = 2
        self.session.flush()
        doc_updating.a = 1  # back to "initial" value
        doc_updating.e = 'foo'  # change something else too
        self.session.flush()
        self.session.close()

        doc_after_updates = self.Basic.query.get(_id=doc_updating._id)
        assert doc_after_updates.a == 1```

CastixGitHub · 2022-05-04T18:32:04Z

Ming seems to use original_document only inside update_if_not_modified
and as long as not modified "since first queried" means "since the last time it was queried" it should be ok to update it... (letting the users know it changed)

Isn't the state meant to recover data from a bad update "transaction"?
I see after an update the state status is set to clean

clean state (which means that they have not been modified since the last flush to the session)

then I consider not updating original_document a bug that you spotted

You're absolutely right about keeping args, so one could eventually bypass field detection or just skip some of them

I'm unable to reproduce the error you reported, is it somehow apple related?

Thank you

brondsem · 2022-05-13T19:16:20Z

Yea as far as I can tell it should be ok to update original_document. I'm kind of surprised that field isn't used in more places. So hopefully no subtle bugs will appear if we update it :)

Regarding the error I got, I'm not sure why it happened really. But if we go with the def save code that I posted, it avoids the error (when args is empty for whatever reason, the code will save the whole object instead of doing $set with nothing)

…nd don't break on inserts. tests provided

CastixGitHub · 2022-05-14T00:58:50Z

Thank you Dave 😊

brondsem · 2022-05-20T19:10:51Z

Hi again. When I ran the Allura tests with these ming changes, I got some really deep exceptions related to deepcopy. I wonder if copy would be good enough? Or no copying at all? So I tried it and the ming tests still pass. However Allura tests got some other errors when I tried just copy or no copying, I'll have to investigate that more next week. What do you think?

Here's the deepcopy errors

  File "/src/allura/Allura/allura/model/monq_model.py", line 240, in __call__
    session(self).flush(self)
  File "/src/ming/ming/odm/base.py", line 35, in inner
    result = func(obj, *args, **kwargs)
  File "/src/ming/ming/odm/odmsession.py", line 105, in flush
    self.update_now(obj, st)
  File "/src/ming/ming/odm/base.py", line 35, in inner
    result = func(obj, *args, **kwargs)
  File "/src/ming/ming/odm/odmsession.py", line 115, in update_now
    mapper(obj).update(obj, st, self, **kwargs)
  File "/src/ming/ming/odm/base.py", line 35, in inner
    result = func(obj, *args, **kwargs)
  File "/src/ming/ming/odm/mapper.py", line 92, in update
    state.original_document = deepcopy(doc)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 307, in _reconstruct
    value = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 307, in _reconstruct
    value = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 281, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 221, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/usr/local/lib/python3.7/copy.py", line 221, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 281, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 221, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/usr/local/lib/python3.7/copy.py", line 221, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 281, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 281, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/usr/local/lib/python3.7/copy.py", line 281, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/local/lib/python3.7/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/usr/local/lib/python3.7/copy.py", line 241, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/local/lib/python3.7/copy.py", line 161, in deepcopy
    y = copier(memo)
  File "/var/local/env-allura/lib/python3.7/site-packages/pymongo/collection.py", line 3444, in __call__
    self.__name)
TypeError: 'Collection' object is not callable. If you meant to call the '__deepcopy__' method on a 'Database' object it is failing because no such method exists.```

CastixGitHub · 2022-05-20T23:39:39Z

What do you think?

that we need copy on write

brondsem · 2022-06-02T15:17:27Z

I will at least merge the replace_session changes, since that is simple enough.

For the rest (updating only changed fields) it certainly is annoying that we've gotten so close yet keep running into problems for certain situations. I haven't had much time to spend on this, but when I get a chance I do want to investigate the error I found where deepcopy tries to copy a Collection since that seems pretty weird. I also wonder if there's any other way to deal with original_document that sidesteps the issue. Like maybe set it to None during flush, and don't even try to maintain it after that?

brondsem · 2022-06-02T16:19:42Z

I have merged the replace_session helper. And all the new tests, since they are good to have no matter what happens with this pull request.

update only changed fields. helper to replace session. original docum…

727a3b8

…ent is deepcopied

CastixGitHub added 3 commits May 14, 2022 02:35

change session save api (foundation) to detect before_save changes, a…

25c2982

…nd don't break on inserts. tests provided

apply @brondsem patch, update now resets state.original_document

b8db156

forgot to save this

795874d

CastixGitHub force-pushed the master branch from 635ebef to 795874d Compare May 14, 2022 00:41

CastixGitHub mentioned this pull request May 22, 2022

Copy On Write for the object state #47

Open

brondsem pushed a commit that referenced this pull request Jun 2, 2022

helper to replace session - from #45

4ae649e

brondsem added a commit that referenced this pull request Jun 2, 2022

test improvements - from #45

531e404

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

update only changed fields. see issue #2 for details #45

update only changed fields. see issue #2 for details #45

CastixGitHub commented Mar 18, 2022

CastixGitHub commented Mar 18, 2022

codecov bot commented Mar 18, 2022 •

edited

Loading

CastixGitHub commented Mar 18, 2022

brondsem commented Apr 1, 2022

brondsem commented Apr 19, 2022

CastixGitHub commented Apr 22, 2022

brondsem commented Apr 29, 2022

CastixGitHub commented May 4, 2022 •

edited

Loading

brondsem commented May 13, 2022

CastixGitHub commented May 14, 2022

brondsem commented May 20, 2022

CastixGitHub commented May 20, 2022

brondsem commented Jun 2, 2022

brondsem commented Jun 2, 2022

update only changed fields. see issue #2 for details #45

Are you sure you want to change the base?

update only changed fields. see issue #2 for details #45

Conversation

CastixGitHub commented Mar 18, 2022

CastixGitHub commented Mar 18, 2022

codecov bot commented Mar 18, 2022 • edited Loading

Codecov Report

CastixGitHub commented Mar 18, 2022

brondsem commented Apr 1, 2022

brondsem commented Apr 19, 2022

CastixGitHub commented Apr 22, 2022

brondsem commented Apr 29, 2022

CastixGitHub commented May 4, 2022 • edited Loading

brondsem commented May 13, 2022

CastixGitHub commented May 14, 2022

brondsem commented May 20, 2022

CastixGitHub commented May 20, 2022

brondsem commented Jun 2, 2022

brondsem commented Jun 2, 2022

codecov bot commented Mar 18, 2022 •

edited

Loading

CastixGitHub commented May 4, 2022 •

edited

Loading