3C needs a better way to output multiple files. #347

kyleheadley · 2020-12-07T21:28:44Z

3C can output results to stdout, or to file using the -output-postfix= option. This works fine for single file conversions, and reasonably well for small projects. But stdout can't be redirected to multiple files if there were multiple inputs, and post-fixed files have limited options for names and locations. We need to brainstorm, discuss, and implement a better permanent solution.

The text was updated successfully, but these errors were encountered:

mattmccutchen-cci · 2020-12-07T21:37:55Z

As we discussed at today's meeting, the most basic solution is to have a flag to redirect paths relative to 3C's existing -base-dir directory to an "output directory". This may be a tiny change: replace one string at the beginning of each file path with another and call llvm::sys::fs::create_directories to create the intermediate directories before opening the output file. (I didn't see an existing library function that both creates the intermediate directories and opens the output file.)

In some scenarios that use an output directory, an option to write files even if they are identical to the originals (#328) may be especially helpful.

mattmccutchen-cci · 2021-01-13T16:27:10Z

I'm thinking of going ahead and implementing -output-dir since I know I'll eventually want it as part of #346 at least for the multi-file regression tests so we can avoid hacks like this. Any advice? Some specific design questions:

Normally, 3c rewrites a file if it is passed on the command line or under the -base-dir (canWrite), otherwise it is silently skipped. Obviously, if -output-dir is set, we can't rewrite files on the command line that are outside the -base-dir because we don't have output paths for them. Should it be a warning or an error (nonzero exit code) if such a file needs changes? I'm thinking an error. What if the file doesn't need changes? Unless we think it's a reasonable use case to use one file outside the base dir to drive the rewriting of several other files by including them, I'm thinking also an error; it's annoying to have a script that works for a while and suddenly fails when the inference result changes. I'm inclined to move as far as reasonable toward strictness now before we get more real users and backward compatibility is more of a concern.
For consistency, should we handle a file on the command line outside the -base-dir the same way if -output-postfix is set? If neither -output-* option is set, should it be an error if more than one file is passed on the command line or any file not passed on the command line needs changes?
If I understand correctly that our analysis tries to constrain non-canWrite files to not change, should it be an error if we try to change them? If this constraint logic isn't working as intended, the output might be uncompilable, and in any event I think we'd want to know. (This is getting far afield from the present issue but may be efficient to address at the same time.) Update: Someone in the 2021-01-13 meeting (Mike?) said yes.

I'm unsure whether it will prove efficient to try to address #328 at the same time. Inasmuch as it might, I'd also appreciate design feedback there.

mwhicks1 · 2021-01-14T17:44:14Z

I'm thinking of going ahead and implementing -output-dir since I know I'll eventually want it as part of #346 at least for the multi-file regression tests so we can avoid hacks like this. Any advice? Some specific design questions:

Normally, 3c rewrites a file if it is passed on the command line or under the -base-dir (canWrite), otherwise it is silently skipped. Obviously, if -output-dir is set, we can't rewrite files on the command line that are outside the -base-dir because we don't have output paths for them. Should it be a warning or an error (nonzero exit code) if such a file needs changes? I'm thinking an error. What if the file doesn't need changes? Unless we think it's a reasonable use case to use one file outside the base dir to drive the rewriting of several other files by including them, I'm thinking also an error; it's annoying to have a script that works for a while and suddenly fails when the inference result changes.

Error seems good, yes, even if the file doesn't need changes.

I'm inclined to move as far as reasonable toward strictness now before we get more real users and backward compatibility is more of a concern.

For consistency, should we handle a file on the command line outside the -base-dir the same way if -output-postfix is set? If neither -output-* option is set, should it be an error if more than one file is passed on the command line or any file not passed on the command line needs changes?

I don't recall what -base-dir does. It changes directories to the specified one and then runs the tool? How is that expected to interact with other relative paths? I don't know if there are standard expectations about this sort of thing.

If I am guessing this right, an error seems good (i.e., same behavior as when you pass two files to the command line but don't set the output directory/suffix).

If I understand correctly that our analysis tries to constrain non-canWrite files to not change, should it be an error if we try to change them? If this constraint logic isn't working as intended, the output might be uncompilable, and in any event I think we'd want to know. (This is getting far afield from the present issue but may be efficient to address at the same time.) Update: Someone in the 2021-01-13 meeting (Mike?) said yes.

I'm unsure whether it will prove efficient to try to address #328 at the same time. Inasmuch as it might, I'd also appreciate design feedback there.

mattmccutchen-cci · 2021-01-19T20:21:19Z

For consistency, should we handle a file on the command line outside the -base-dir the same way if -output-postfix is set?

I don't recall what -base-dir does. It changes directories to the specified one and then runs the tool? How is that expected to interact with other relative paths? I don't know if there are standard expectations about this sort of thing.

All -base-dir currently does is change what files 3C is allowed to modify; it does not change the working directory and thus does not affect the interpretation of any relative paths. The rule is: "3C is allowed to modify a file if it is listed on the command line or under the -base-dir". With the addition of -output-dir, -base-dir takes on a second purpose of establishing the reference point for mapping input files to the output dir.

When -output-dir is used, we can't handle input files on the command line that are outside the -base-dir because we can't determine output paths for them. If we made it an error to specify input files outside the -base-dir even when -output-dir is not used, then the rule would be simplified to "3C is allowed to modify a file if it is under the -base-dir" (possibly with an exception in stdout mode, which we're discussing at #328 (comment)). Do you think we should do that?

If neither -output-* option is set, should it be an error if more than one file is passed on the command line

I learned that it is already an error, but it is an assertion error and it should just be a nonzero exit. I'll fix that as part of this issue; I don't think it's worth filing another.

or any file not passed on the command line needs changes

This is now being discussed in #328 (comment).

If I understand correctly that our analysis tries to constrain non-canWrite files to not change, should it be an error if we try to change them?

This is moved to #387.

mwhicks1 · 2021-01-19T20:46:37Z

I agree with what you have: Yes, to your first question: Make it an error, and Yes, fix the assertion error to be a proper error.

- Add -output-dir option to write updated files to a directory structure parallel to the base dir (#347). When -output-dir is used, a source file outside the base dir can't be handled because there is no way to compute its output path. For consistency, this is now an error even when -output-dir is not used. - Convert all 3C regression tests from -output-postfix to -output-dir to avoid leaving temporary files in the clang/test/3C directory (#378). - Expand "3c -help" documentation. In particular, direct the user to pass "--" when they don't want to use a compilation database to avoid accidentally using unwanted compiler options and suppress the warning if no compilation database is found (#343). - For consistency, have stdout mode output the main file even if it is unchanged (#328). - Fix bugs in matching of file paths against the base dir (#327). - Other minor bug fixes: see the pull request description for details. Co-authored-by: John Kastner <[email protected]>

kyleheadley added enhancement New feature or request help wanted Extra attention is needed labels Dec 7, 2020

mattmccutchen-cci mentioned this issue Jan 13, 2021

Better behavior with unchanged files #328

Closed

mattmccutchen-cci mentioned this issue Jan 28, 2021

Clean up test RUN commands #346

Open

13 tasks

mattmccutchen-cci added the command-line label Feb 2, 2021

mattmccutchen-cci self-assigned this Feb 2, 2021

mattmccutchen-cci mentioned this issue Feb 2, 2021

Add -output-dir and other file handling improvements #412

Merged

14 tasks

mattmccutchen-cci closed this as completed in #412 Feb 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

3C needs a better way to output multiple files. #347

3C needs a better way to output multiple files. #347

kyleheadley commented Dec 7, 2020

mattmccutchen-cci commented Dec 7, 2020 •

edited

Loading

mattmccutchen-cci commented Jan 13, 2021 •

edited

Loading

mwhicks1 commented Jan 14, 2021

mattmccutchen-cci commented Jan 19, 2021

mwhicks1 commented Jan 19, 2021

3C needs a better way to output multiple files. #347

3C needs a better way to output multiple files. #347

Comments

kyleheadley commented Dec 7, 2020

mattmccutchen-cci commented Dec 7, 2020 • edited Loading

mattmccutchen-cci commented Jan 13, 2021 • edited Loading

mwhicks1 commented Jan 14, 2021

mattmccutchen-cci commented Jan 19, 2021

mwhicks1 commented Jan 19, 2021

mattmccutchen-cci commented Dec 7, 2020 •

edited

Loading

mattmccutchen-cci commented Jan 13, 2021 •

edited

Loading