Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate optstrings automatically #864

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

hanefi
Copy link
Contributor

@hanefi hanefi commented Aug 6, 2024

This commit introduces a new function construct_optstring that constructs the optstring for getopt_long based on the long_options array.

We used to manually edit the optstring in each command's getopts function. This was error-prone and tedious. This commit removes the manual editing and replaces it with a call to construct_optstring.

This commit introduces a new function construct_optstring that
constructs the optstring for getopt_long based on the long_options array.

We used to manually edit the optstring in each command's getopts
function. This was error-prone and tedious. This commit removes the
manual editing and replaces it with a call to construct_optstring.
Copy link
Owner

@dimitri dimitri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is nice that static arrays can be used that way to generate the opt string, but now we are doing the operation at run-time instead of compile-time. Is there a way that we could do that operation at compile-time?

I know C is a poor platform for adding compile-time tooling, but there might be a clean way to still do that?

@hanefi
Copy link
Contributor Author

hanefi commented Aug 7, 2024

It is nice that static arrays can be used that way to generate the opt string, but now we are doing the operation at run-time instead of compile-time. Is there a way that we could do that operation at compile-time?

I do not think that we can do that in C. If you think that it is not reasonable to do this operation in run-time, we can close this pull request. IMO, it is risky to assume that people will do a good job maintaining these strings by hand, and it is a meaningful compromise.

I know C is a poor platform for adding compile-time tooling, but there might be a clean way to still do that?

In recent versions of C++, there are several ways that usually include variadic templates, and/or cryptic class definitions. I failed to find an elegant solution in C++, and not any solution in C.

I tried to implement macros that concatenate characters at compile time, but that does not work with variables, only string constants.

@@ -653,8 +681,7 @@ cli_copy_db_getopts(int argc, char **argv)
exit(EXIT_CODE_BAD_ARGS);
}

const char *optstring =
"S:T:D:J:I:b:L:u:mcAPOXj:xBeMlUagkynF:F:Q:irRCN:fp:ws:o:tE:Vvdzqh";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we generate the optstring value for clone command at run-time, we get "S:T:D:J:J:I:b:L:L:u:mcAPOXj:xBBeMMlUagkynF:F:Q:irRCN:fp:ws:o:tE:Vvvdzqh".

This character array contains duplicate characters due to the aliases that we have, and works fine.

Another thing that I noticed is that we had a substring F:F: that also repeated a short option string.

Copy link
Owner

@dimitri dimitri Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to remove duplicates in the code that builds the string here? I am worried about portability of duplicates depending on getopt_long implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added deduplication logic. However to make things simple, I assumed the aliases will always be consecutive in the long_options array.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could use our hash lib to handle that fully, and hide that behind a DEBUG flag so that we don't do the dedup operation at run-time in “production” builds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I think I can take this idea and build some more on top of it to come up with a nice design.

I will rewrite the helper function in a way that:

  • on prod: it will immediately return supplied static strings
  • on debug builds: run dedup logic, and verify that the result is identical to the supplied static string. If they are not identical, it will result in an assertion failure that should guide the developer into how they should update the string.

This way, we will be able to catch potential bugs in the future as CI will catch those assertions

@@ -107,7 +107,9 @@ cli_dump_schema_getopts(int argc, char **argv)
exit(EXIT_CODE_BAD_ARGS);
}

while ((c = getopt_long(argc, argv, "S:T:D:PrReFCNVvdzqh",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have N: as snapshot has a required argument.

{ "snapshot", required_argument, NULL, 'N' }

I wonder if this means we never supported pgcopydb dump schema --snapshot <snapshot_id> properly.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Can we include a fix for the argument parsing in this PR then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we programmatically create the optstring values, it will fix this underlying issue as well.

This bug only exists in the line that I want to remove.

@@ -294,7 +294,7 @@ cli_list_db_getopts(int argc, char **argv)
exit(EXIT_CODE_BAD_ARGS);
}

const char *optstring = "S:D:s:t:F:xPL:u:k:mfyarJRIN:Vdzvqh";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have k: as it has no arg. { "skip-split-by-ctid", no_argument, NULL, 'k' }

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is something that this PR is fixing, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I should have made myself clear.

I wanted to document the human errors when maintaining optstring values by hand. This bug will not be possible after we start generating the optstring values programmatically.

If we have multiple aliases for the same option, we should only include
the first one in the optstring.

To make deduplication easy, we assume aliases are consecutive in the
long_options array.
... and replace the hardcoded optstring with a dynamically generated one
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants