Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3C needs a better way to output multiple files. #347

Closed
kyleheadley opened this issue Dec 7, 2020 · 5 comments · Fixed by #412
Closed

3C needs a better way to output multiple files. #347

kyleheadley opened this issue Dec 7, 2020 · 5 comments · Fixed by #412
Assignees
Labels
command-line enhancement New feature or request help wanted Extra attention is needed

Comments

@kyleheadley
Copy link
Member

3C can output results to stdout, or to file using the -output-postfix= option. This works fine for single file conversions, and reasonably well for small projects. But stdout can't be redirected to multiple files if there were multiple inputs, and post-fixed files have limited options for names and locations. We need to brainstorm, discuss, and implement a better permanent solution.

@kyleheadley kyleheadley added enhancement New feature or request help wanted Extra attention is needed labels Dec 7, 2020
@mattmccutchen-cci
Copy link
Member

mattmccutchen-cci commented Dec 7, 2020

As we discussed at today's meeting, the most basic solution is to have a flag to redirect paths relative to 3C's existing -base-dir directory to an "output directory". This may be a tiny change: replace one string at the beginning of each file path with another and call llvm::sys::fs::create_directories to create the intermediate directories before opening the output file. (I didn't see an existing library function that both creates the intermediate directories and opens the output file.)

In some scenarios that use an output directory, an option to write files even if they are identical to the originals (#328) may be especially helpful.

@mattmccutchen-cci
Copy link
Member

mattmccutchen-cci commented Jan 13, 2021

I'm thinking of going ahead and implementing -output-dir since I know I'll eventually want it as part of #346 at least for the multi-file regression tests so we can avoid hacks like this. Any advice? Some specific design questions:

  • Normally, 3c rewrites a file if it is passed on the command line or under the -base-dir (canWrite), otherwise it is silently skipped. Obviously, if -output-dir is set, we can't rewrite files on the command line that are outside the -base-dir because we don't have output paths for them. Should it be a warning or an error (nonzero exit code) if such a file needs changes? I'm thinking an error. What if the file doesn't need changes? Unless we think it's a reasonable use case to use one file outside the base dir to drive the rewriting of several other files by including them, I'm thinking also an error; it's annoying to have a script that works for a while and suddenly fails when the inference result changes. I'm inclined to move as far as reasonable toward strictness now before we get more real users and backward compatibility is more of a concern.
  • For consistency, should we handle a file on the command line outside the -base-dir the same way if -output-postfix is set? If neither -output-* option is set, should it be an error if more than one file is passed on the command line or any file not passed on the command line needs changes?
  • If I understand correctly that our analysis tries to constrain non-canWrite files to not change, should it be an error if we try to change them? If this constraint logic isn't working as intended, the output might be uncompilable, and in any event I think we'd want to know. (This is getting far afield from the present issue but may be efficient to address at the same time.) Update: Someone in the 2021-01-13 meeting (Mike?) said yes.

I'm unsure whether it will prove efficient to try to address #328 at the same time. Inasmuch as it might, I'd also appreciate design feedback there.

@mwhicks1
Copy link
Member

I'm thinking of going ahead and implementing -output-dir since I know I'll eventually want it as part of #346 at least for the multi-file regression tests so we can avoid hacks like this. Any advice? Some specific design questions:

  • Normally, 3c rewrites a file if it is passed on the command line or under the -base-dir (canWrite), otherwise it is silently skipped. Obviously, if -output-dir is set, we can't rewrite files on the command line that are outside the -base-dir because we don't have output paths for them. Should it be a warning or an error (nonzero exit code) if such a file needs changes? I'm thinking an error. What if the file doesn't need changes? Unless we think it's a reasonable use case to use one file outside the base dir to drive the rewriting of several other files by including them, I'm thinking also an error; it's annoying to have a script that works for a while and suddenly fails when the inference result changes.

Error seems good, yes, even if the file doesn't need changes.

I'm inclined to move as far as reasonable toward strictness now before we get more real users and backward compatibility is more of a concern.

  • For consistency, should we handle a file on the command line outside the -base-dir the same way if -output-postfix is set? If neither -output-* option is set, should it be an error if more than one file is passed on the command line or any file not passed on the command line needs changes?

I don't recall what -base-dir does. It changes directories to the specified one and then runs the tool? How is that expected to interact with other relative paths? I don't know if there are standard expectations about this sort of thing.

If I am guessing this right, an error seems good (i.e., same behavior as when you pass two files to the command line but don't set the output directory/suffix).

  • If I understand correctly that our analysis tries to constrain non-canWrite files to not change, should it be an error if we try to change them? If this constraint logic isn't working as intended, the output might be uncompilable, and in any event I think we'd want to know. (This is getting far afield from the present issue but may be efficient to address at the same time.) Update: Someone in the 2021-01-13 meeting (Mike?) said yes.

I'm unsure whether it will prove efficient to try to address #328 at the same time. Inasmuch as it might, I'd also appreciate design feedback there.

@mattmccutchen-cci
Copy link
Member

  • For consistency, should we handle a file on the command line outside the -base-dir the same way if -output-postfix is set?

I don't recall what -base-dir does. It changes directories to the specified one and then runs the tool? How is that expected to interact with other relative paths? I don't know if there are standard expectations about this sort of thing.

All -base-dir currently does is change what files 3C is allowed to modify; it does not change the working directory and thus does not affect the interpretation of any relative paths. The rule is: "3C is allowed to modify a file if it is listed on the command line or under the -base-dir". With the addition of -output-dir, -base-dir takes on a second purpose of establishing the reference point for mapping input files to the output dir.

When -output-dir is used, we can't handle input files on the command line that are outside the -base-dir because we can't determine output paths for them. If we made it an error to specify input files outside the -base-dir even when -output-dir is not used, then the rule would be simplified to "3C is allowed to modify a file if it is under the -base-dir" (possibly with an exception in stdout mode, which we're discussing at #328 (comment)). Do you think we should do that?

If neither -output-* option is set, should it be an error if more than one file is passed on the command line

I learned that it is already an error, but it is an assertion error and it should just be a nonzero exit. I'll fix that as part of this issue; I don't think it's worth filing another.

or any file not passed on the command line needs changes

This is now being discussed in #328 (comment).

If I understand correctly that our analysis tries to constrain non-canWrite files to not change, should it be an error if we try to change them?

This is moved to #387.

@mwhicks1
Copy link
Member

I agree with what you have: Yes, to your first question: Make it an error, and Yes, fix the assertion error to be a proper error.

@mattmccutchen-cci mattmccutchen-cci self-assigned this Feb 2, 2021
mattmccutchen-cci added a commit that referenced this issue Feb 13, 2021
- Add -output-dir option to write updated files to a directory structure
  parallel to the base dir (#347). When -output-dir is used, a source
  file outside the base dir can't be handled because there is no way to
  compute its output path. For consistency, this is now an error even
  when -output-dir is not used.

- Convert all 3C regression tests from -output-postfix to -output-dir to
  avoid leaving temporary files in the clang/test/3C directory (#378).

- Expand "3c -help" documentation. In particular, direct the user to pass
  "--" when they don't want to use a compilation database to avoid
  accidentally using unwanted compiler options and suppress the warning
  if no compilation database is found (#343).

- For consistency, have stdout mode output the main file even if it is
  unchanged (#328).

- Fix bugs in matching of file paths against the base dir (#327).

- Other minor bug fixes: see the pull request description for details.

Co-authored-by: John Kastner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
command-line enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants