Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Falseful changelog, incorrect README and broken code in 4.3.0 #622

Closed
kshnurov opened this issue Mar 10, 2023 · 23 comments
Closed

Falseful changelog, incorrect README and broken code in 4.3.0 #622

kshnurov opened this issue Mar 10, 2023 · 23 comments

Comments

@kshnurov
Copy link
Contributor

kshnurov commented Mar 10, 2023

@stefankroes what is happening to this gem? Are you ok with this?
I can take over the repo and fix the mess created by @kbrock if you let me.

4.3.0 changelog states:
* arrange is 3x faster and uses 20-30x less memory [#415](https://github.com/stefankroes/ancestry/pull/415)

The PR itself is called Make arrange_nodes more memory efficient.
But arrange and arrange_nodes were not even touched. That's a straight lie!

README now contains absolutely incorrect setup instructions that are either broken or will result in bad performance.
I've provided the correct ones here and here, but @kbrock just doesn't want to listen and learn about database collations.

Also, both of the last 2 versions were released with broken code that just doesn't work as described.
One I've fixed and the second one is still there.

@kbrock
Copy link
Collaborator

kbrock commented Mar 10, 2023

@kshnurov your communication style is quite combative. These attacks are getting tiring. I'd rather be spending my time merging your PRs than fighting you.

You had the option to put in a PR or to attack me and call me a liar. You chose the latter.
It feels like you often choose the attacking route.

I could have merged a PR that you submitted without a second though, but here I am defending myself against your most recent attack. I'm sorry @stefankroes that you are dragged into this.


Thank you for finding another typo. We were working on flatten_arranged_nodes (and the alternative to arrange_as_array. As you'll notice, arrange is the common word there. Yes, I made a mistake. It seems like a reasonable mistake but I will change it to sort_by_ancestry to make sure no one is misguided. Saying I was intentionally lying is a bit of a stretch.

I'd imagine your frustration would be heightened since you have clearly stated that you want to delete this method from the code base.

But we can't delete this method. It is needed. The original form has been in the wiki for a long time with people updating it. And just last week someone submitted a few issues wanting to do this very thing. I need to go in there and s/arrange_as_array/sort_by_ancestry/g. But again, you are capable of contributing instead of attacking.

And that sums up a lot of our tension. You solve a problem in a certain way. I've repeatedly explained that other developers may use different solutions because they have different use cases. There are trade offs. Your responses suggest that my response does not resonate.


You have solved your problem using ruby callbacks. But there are other solutions where an update in pure sql is mandatory. Bringing back all those records into ruby would not complete in a reasonable amount of time, let alone locking tables for that long. I've shown you how rails offers both solutions, but you just insist that I am negligent by allowing "broken" code in this gem.


Now as for collation, I've explained this before.

Ancestry was written in a world where it assumes an ascii character set, with POSIX or C collation. And when it was written, that was a reasonable solution. Now that we are in a UTF-8 world, we need to ask the indexes to act in the way the gem was designed.

The documentation behind collation was not optimal. I took a stab at making it better and you have shared some good knowledge. But back to the backwards compatibility comment, I can't just swap a string field over to binary. I have added a section with your solution to the README. I'm trying my best to introduce your knowledge in a way that does not break existing installations. I tried to run all my suggested scenarios with a benchmark script ensuring that indexes were used on both mysql and postgres.

If you have a problem with how I represented your knowledge, please put in a PR instead of attacking me. The documentation PR was open for a number of days. I didn't think you had a problem with it. You just went radio silent on the project.

Since it is just a README file that is easily updated with a PR and not really tied to a release, I went ahead and released the gem. You have been asking me to release the gem for a while now. I wasn't expecting you to attack me for releasing it.


Your latest PR just does too much. I told you that it changes too much but you just state that I can't see the obvious.

After you stated you would not split up the PR, I have been splitting apart the code and pulling out PRs. Any code change potentially introduces bugs. So I am pulling out the various changes to make it easier to diagnose bugs in the future.

I've been maintaining this gem for many years. So making sure we can address bugs in the future is my number one priority. If that slows down development in the short term, then that is acceptable to me. Contributors come and go. They swoop in demanding that their code gets merged, raise a fuss, and then they leave the maintainer to maintain the code.

I appreciate the knowledge you have shared, but please cut back on the attacks.

@jrafanie
Copy link
Contributor

@kshnurov "PRs welcome." This is not how it works in open source. There are bugs. There are misworded changelogs or pull request descriptions. We open pull requests to fix things that are wrong. We discuss the pros and cons in the open and don't criticize things and especially not people because we cannot articulate our points convincingly.

Can you open issues with what's still broken? Actually, no. Where are the pull requests to fix the things you're still concerned with? Are you implementing and testing your desired changes for all supported databases and versions of ruby/rails? Perhaps you can better demonstrate your ideas in pull requests tested on at least a few databases and then offer to test them more thoroughly if your proposal is desired.

Thanks.

@kshnurov
Copy link
Contributor Author

Where are the pull requests to fix the things you're still concerned with?
Perhaps you can better demonstrate your ideas in pull requests tested on at least a few databases and then offer to test them more thoroughly if your proposal is desired.

@jrafanie have you even looked at my PRs and comments on issues before writing this? I've done all of it, spent days to figure things out, test, point out bugs, and propose solutions. And after all that @kbrock just releases a new broken version without asking anyone.

@kshnurov
Copy link
Contributor Author

@kshnurov your communication style is quite combative. These attacks are getting tiring. I'd rather be spending my time merging your PRs than fighting you.

@kbrock we're from very different cultures in terms of criticism and feedback. What you see as an attack is absolutely normal in my culture (and many others), where solving a problem is much more important than keeping everyone happy.

And when someone acts irresponsible and breaks things - we're saying it out loud, for the sole purpose of fixing the problem. If you take that personal - just don't.

@kshnurov
Copy link
Contributor Author

I can't just swap a string field over to binary

Yes, you can. As I've already said - just read what is collation.

@kbrock
Copy link
Collaborator

kbrock commented Mar 10, 2023

I can't just swap a string field over to binary

Yes, you can. As I've already said - just read what is collation.

I read that the first time you suggested.

I have already agreed that binary is a great option for storing ancestry. Since it does not manipulate any strings, indexes work well. I added it to the documentation.

Our goal is to do basic C/Posix/binary collation string compare. That can be solved by using binary data, using an ascii character set, or setting collation to use ascii comparisons.

All of these options work.

Why are you having trouble acknowledging that there are multiple viable solutions to this problem?

People want to solve this is multiple ways.

The reason why I choose to keep this as character data instead of binary data is because they are characters. I want to leave open the option of using regular expressions and string concatenation in the database to better implement includes(). I'm not ready to close that door yet, especially since there are a number of viable options in both mysql and postgres.

I am not being obtuse. I have put in some time into this topic. Yes I still have a lot to learn, but I believe that is a multi year topic.

I don't fully understand the whole 7bit vs 8bit debate and the glibc vs libc (mostly bsd) approach differences.

But I do understand that the Mac's collation routines are from an outdated version of BSD and basically just an ascii compare. That is why the collation on a mac and linux postgres database work differently. Mysql on the other hand ships locales with the product. I understand that strcoll and strcmp need to work together to define a definitive order in postgres. I understand that collation implementations are different because there are a bunch of nuances in there. Ones I have yet to discover.

@jrafanie
Copy link
Contributor

@jrafanie have you even looked at my PRs and comments on issues before writing this? I've done all of it, spent days to figure things out, test, point out bugs, and propose solutions. And after all that @kbrock just releases a new broken version without asking anyone.

Yes, of course. I couldn't find any pull requests beyond the ones already merged. So, if there are outstanding issues, I don't see them as issues and pull requests linked as the fix for them. I don't even see failing tests demonstrating that it's broken.

@kshnurov
Copy link
Contributor Author

But there are other solutions where an update in pure sql is mandatory. Bringing back all those records into ruby would not complete in a reasonable amount of time, let alone locking tables for that long. I've shown you how rails offers both solutions, but you just insist that I am negligent by allowing "broken" code in this gem.

Ok, just tell me how any of this is gonna work:

has_ancestry cache_depth: true, update_strategy: :sql
has_ancestry counter_cache: true, update_strategy: :sql
has_ancestry update_strategy: :sql
(in view) cache child

It won't, and the gem won't fire a single warning. So, it is broken.

If someone needs sql only updates and really knows what they're doing - there's a without_ancestry_callbacks for that.

@jrafanie
Copy link
Contributor

@kbrock we're from very different cultures in terms of criticism and feedback. What you see as an attack is absolutely normal in my culture (and many others), where solving a problem is much more important than keeping everyone happy.

And when someone acts irresponsible and breaks things - we're saying it out loud, for the sole purpose of fixing the problem. If you take that personal - just don't.

That's an excuse. You can be critical of the code and provide failing tests and solutions in pull requests without being negative towards anyone. If your concerns aren't well understood, demonstrate them in failing tests, issues and draft pull requests.

@kshnurov
Copy link
Contributor Author

Our goal is to do basic C/Posix/binary collation string compare.

No it's not. All we need is to compare bytes, no need for any collation.

I understand that collation implementations are different because there are a bunch of nuances in there

That is one of the reasons collations should not be used for ancestry at all.
Another one - they add nothing except slowing things down.

I want to leave open the option of using regular expressions and string concatenation in the database to better implement includes()

REPLACE and CONCAT work perfectly with binary. I don't see any reason to use regex here.

@kbrock
Copy link
Collaborator

kbrock commented Mar 10, 2023

There are a few issues around the counter cache that people are seeing. These issues may be related to what you are seeing.

It looks like depth_cache should work fine with update: sql and counter caches are probably broken.

We run the whole postgres tests suite using update_strategy: :sql. So our current tests have not found the issue you are raising.

Can you write a test that shows an issue with the code?
Or even just shows where you think an issue will come up?
I will do the same.

If there are bugs in the code, I want tests pointing them out.

@kbrock
Copy link
Collaborator

kbrock commented Mar 10, 2023

Our goal is to do basic C/Posix/binary collation string compare.

No it's not. All we need is to compare bytes, no need for any collation.

https://dba.stackexchange.com/questions/209713/what-is-the-mysql-equivalent-of-postgres-c-collation

I understand that collation implementations are different because there are a bunch of nuances in there

That is one of the reasons collations should not be used for ancestry at all. Another one - they add nothing except slowing things down.

It sounds like using C location on postgres just does byte comparison.

I want to leave open the option of using regular expressions and string concatenation in the database to better implement includes()

REPLACE and CONCAT work perfectly with binary. I don't see any reason to use regex here.

Great. I will see if I can s/regex/replace/g.

@kshnurov
Copy link
Contributor Author

Or even just shows where you think an issue will come up?

It's obvious that depth_cache_column, counter_cache_column and updated_at columns won't be updated for descendants, because it's done with callbacks, which are not triggered.

@kshnurov
Copy link
Contributor Author

It sounds like using C location on postgres just does byte comparison.

And works differently on different platforms, as you've already mentioned many times.
That's why using raw binary is the best (and the fastest) option.

@kbrock
Copy link
Collaborator

kbrock commented Mar 10, 2023

could we pull the collation and the counters discussion out of this issue?
They seem to come up in every thread you join.

counter discussion => #618
I'll create a pr around collation testing

Can we keep this issue on the original topic?

I believe this issue is here so you can drop me from the project1

@kshnurov
Copy link
Contributor Author

I believe this issue is here so you can drop me from the project1

The issue is to keep this gem stable, fast and without broken functionality (which was already released twice).
Since you continue to merge things left and right without even reviewing them - I have to try more harsh options.

@stefankroes
Copy link
Owner

@kshnurov I love the work @kbrock is doing, please behave

@kbrock kbrock closed this as completed Mar 11, 2023
@kshnurov
Copy link
Contributor Author

@kshnurov I love the work @kbrock is doing, please behave

@stefankroes he's not even reading the code before "fixing" it by introducing yet another bug - running the tests twice.
Completely breaks backward compatibility without sufficient reason.
If you "love" that - that's on your conscience.

@kbrock
Copy link
Collaborator

kbrock commented Mar 11, 2023

I am adding back the backwards compatible tests that you dropped in #597.
I'm also adding tests specifically to support mysql binary columns for you.

It sounded like you want to be included in more of the conversations.
So I was reaching out for your feedback, but maybe I misread your intent.

I will reopen this thread as you clearly are not done venting.

@kbrock kbrock reopened this Mar 11, 2023
@kshnurov
Copy link
Contributor Author

kshnurov commented Mar 11, 2023

I am adding back the backwards compatible tests that you dropped in #597.

I didn't drop anything! What is wrong with you? Read the code!
BTW, it is your own code, I just renamed STRATEGY to FORMAT in #597.
It is testing both strategies, without FORMAT it'll be the default materialized_path.

bundle exec rake
FORMAT=materialized_path2 bundle exec rake

You've added format to matrix in #624, which is really good, but you didn't remove the first bundle exec rake, and now all tests are run twice. IDK how you cannot see two consecutive bundle exec rake here.

@stefankroes you still love this?

@kshnurov
Copy link
Contributor Author

Found another breaking change in 4.3.0 not mentioned on the changelog #617

@jrafanie
Copy link
Contributor

You've added format to matrix in #624, which is really good, but you didn't remove the first bundle exec rake, and now all tests are run twice. IDK how you cannot see two consecutive bundle exec rake here.

@stefankroes you still love this?

Please open new issues with the various bugs you are finding. Comments buried in PRs or different issues get lost. Or better yet, open pull requests for bugs you find.

Bugs happen. Compatibility breakages for not understood use cases happen. It's best to report them like most contributors to open source do and articulate how you're using it and how it broke you.

@kshnurov
Copy link
Contributor Author

Please open new issues with the various bugs you are finding.

So, @kbrock is gonna merge broken code literally 5 times a week, release broken versions, and I should debug that?
Thanks but no thanks, I'll just fork it, I don't want to spend any more time stopping you guys from doing stupid things.

@kbrock kbrock closed this as completed Mar 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants