Extend option for removing headers #13

Open
tmuras opened this issue Sep 24, 2013 · 6 comments

Comments

@tmuras

tmuras commented Sep 24, 2013

It would be nice if the tool could clean up any existing header, so that a new one can then be added. Cleaning up could be defined as: remove all lines from the top of the file (or after <?php in the case of PHP, etc.) until the first non-comment line.
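
The cleanup rule proposed here can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the tool works: the function name `strip_leading_comments` and the marker set (`//`, `#`, `/* ... */`) are hypothetical, blank lines are treated as part of the header, and a first-line shebang or `<?php` tag is preserved.

```python
LINE_PREFIXES = ("//", "#")  # assumed line-comment markers, not taken from the tool


def strip_leading_comments(text: str) -> str:
    """Drop comment and blank lines from the top of a file, up to the first code line."""
    lines = text.splitlines(keepends=True)
    kept = []  # preamble to preserve (shebang or PHP open tag)
    i = 0
    if lines and (lines[0].startswith("#!") or lines[0].strip() == "<?php"):
        kept.append(lines[0])
        i = 1

    in_block = False  # True while inside an unterminated /* ... */ comment
    while i < len(lines):
        s = lines[i].strip()
        if in_block:
            if "*/" in s:
                in_block = False
        elif s.startswith("/*"):
            in_block = "*/" not in s[2:]
        elif s == "" or s.startswith(LINE_PREFIXES):
            pass  # blank or line comment: part of the header, drop it
        else:
            break  # first non-comment line: stop removing
        i += 1
    return "".join(kept + lines[i:])


if __name__ == "__main__":
    sample = "<?php\n// old header\n/* more\n * text */\n\necho 1;\n// keep\n"
    print(strip_leading_comments(sample))
```

Note that this sketch would also drop a descriptive comment that directly follows the license, and code that shares a line with a closing `*/`, so the result still needs to be reviewed.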

@osterman
Collaborator

Thanks for the input! This is a tricky problem that I tried to solve as reliably as possible. The solution you propose did cross my mind at some point, but it's error-prone. There are just too many ways that a general solution like this could get tripped up. For example, a description of the source code may come right after the license, and that should not be removed.

The solution I arrived at was to let you create a new license file that matches the header already in the files (less any comment declarations or leading whitespace). Then use the --remove-path argument together with the --license-file argument, passing the location of the new license file you created. With these arguments, it should recursively remove all existing headers of that kind.

Let me know if this solution does not work for you.

-Erik

@tmuras
Author

tmuras commented Sep 24, 2013

Hi Erik,

Thanks for the prompt response.

I would still implement removal of all comments. My use case is to clean up the files and then add the new copyright header. Each file had a different header and it was a mess.
I don't think you have to be perfect here. I would not trust any tool that removes lines from my source code anyway. The way to go is to commit your code to a VCS, then run the tool and review the result. I would be happy enough if it worked as expected 90% of the time.

I was thinking about creating a tool similar to yours as part of my project, moosh. Now that I have found your utility, I think I would rather integrate with it. Thanks for the good work so far.

Tomek

@osterman
Collaborator

One idea I had was to run an automatic analysis of the first N lines of commented source code across all files, after skipping the funky stuff like #! and PEP encoding lines. The idea would be to identify the commented lines that are common to X% of the files, and then make it possible to remove those lines automatically.

-Erik
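
A rough sketch of the analysis described above, under several assumptions: the names `common_header_lines`, `normalize`, and `MARKERS` are hypothetical, the parameters `n` and `threshold` stand in for N and X%, and the encoding pattern is the one given in PEP 263. It is not part of the tool.

```python
import re
from collections import Counter

# Encoding declaration pattern as specified in PEP 263.
ENCODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*[-_.a-zA-Z0-9]+")
# Assumed comment markers; longer markers are listed before their prefixes.
MARKERS = ("//", "/*", "*/", "#", "*")


def normalize(line):
    """Return the comment text without its marker, or None for non-comment lines."""
    s = line.strip()
    for marker in MARKERS:
        if s.startswith(marker):
            return s[len(marker):].strip()
    return None


def common_header_lines(texts, n=20, threshold=0.8):
    """Find comment lines within the first n lines shared by >= threshold of the files."""
    counts = Counter()
    for text in texts:
        seen = set()  # count each distinct line once per file
        for line in text.splitlines()[:n]:
            if line.startswith("#!") or ENCODING_RE.match(line):
                continue  # skip shebang and encoding lines
            body = normalize(line)
            if body:
                seen.add(body)
        counts.update(seen)
    needed = threshold * len(texts)
    return {line for line, count in counts.items() if count >= needed}


if __name__ == "__main__":
    files = [
        "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\n# Copyright 2013 Acme\n# Module A\nimport os\n",
        "// Copyright 2013 Acme\n// Module B\nint x;\n",
        "# Copyright 2013 Acme\nprint(1)\n",
    ]
    print(common_header_lines(files))
```

The returned set could then drive removal of matching lines. Files whose headers all differ would produce an empty set, so this approach only helps when the headers are largely shared.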

@tmuras
Author

tmuras commented Sep 25, 2013

That would not help me with the cleanup - my files were all different.
I think you are overthinking this one; you don't have to be perfect each time :-). Simply removing all comments will be good enough.

@osterman
Collaborator

The other challenge I see with this is that languages often allow multiple comment formats. The syntax file only defines how to decorate the license, not every way a comment can be written. We would need to add definitions for the various comment formats in each language, and I'm not keen on updating the syntax file with all of them.

@tmuras
Author

tmuras commented Nov 12, 2013

I think the pre-defined standard markers // # /* would be good enough in 99% of cases, maybe with an option to override them.

It's your call anyway, please close it if you don't want it for any reason.
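
As an illustration of the "pre-defined markers with an override" idea, here is a minimal sketch. The names `DEFAULT_MARKERS` and `is_comment_start` are hypothetical and do not correspond to any option the tool has.

```python
DEFAULT_MARKERS = ("//", "#", "/*")  # the pre-defined set suggested above


def is_comment_start(line, markers=DEFAULT_MARKERS):
    """Report whether a line begins with one of the comment markers."""
    return line.lstrip().startswith(tuple(markers))


if __name__ == "__main__":
    print(is_comment_start("  // license text"))           # uses the defaults
    print(is_comment_start("-- SQL note", markers=["--"]))  # per-call override
```

Keeping the defaults in a single tuple and accepting an override per call would avoid having to list every comment format in a per-language syntax file.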

2 participants