Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a code formatter for GDScript in the script editor #55835

Closed
wants to merge 1 commit into from

Conversation

Razoric480
Copy link
Contributor

@Razoric480 Razoric480 commented Dec 11, 2021

This implements godotengine/godot-proposals#3630 and puts in a code formatter in the GDScript module. This is accessible via Edit/Format Code, Alt + Shift + F, or when saving while having the new Format On Save editor setting enabled.

The formatting is based off of the official style guide, though there are cases where automation may cause differences. Or coding issues/bugs to be fixed in the formatter :3

Comments are supported. This involved making the GDScriptTokenizer recognize and return comment tokens instead of bypassing them. The GDScriptParser applies those comments in the various parse_ nodes as header comments (above a line of code), inline comment (at the end of a line of code), or footer comment (at the bottom of a suite or class). They are then printed by the formatter as it goes through the code.

Adds three new editor settings:

  • text_editor/behavior/formatter/format_on_save - defaults to false. When saving scripts, whether to automatically run the formatter.

  • text_editor/behavior/formatter/indent_in_multiline_block - defaults to 2. When wrapping a block of code or parameters inside of parenthesis, how many indents to add from the first.

  • text_editor/behavior/formatter/lines_between_functions - defaults to 2. It is the number of blank lines between functions.

  • text_editor/appearance/guidelines/line_length_guideline_hard_column is used for the target line length, which defaults to 100.

There's a suite of unit tests for formatting issues. It could maybe be moved into a test runner later on to make the code cleaner, and possibly some tests merged and consolidated, though I used tests to do a fair bit of TDD.

Likely improvements required

  • I can't forsee all scripts, so odds are there will be various formatting issues that come from edge cases or code combinations that happen to cause bigger than normal gaps, disappearing comments that I forgot to account for, or various other issues like that.
  • Extra features, like black's trailing commas forcing wraps, could be in subsequent PRs

Pushed to subsequent PRs

  • Integration with CLI
    • Since the formatter lives in a module, it can't/shouldn't really be called or instanced in main.cpp where all the CLI stuff happens. It could happen in gdscript.cpp's init(), but the question then is how to make the editor behave like the other CLI options, where the editor then shuts itself down after it finishes the task.
    • Other issue: passing options to the formatter, like indent type. I guess if it's running in tool mode it can just get the editor settings, but could potentially be called to be overwritten.
  • Integration with GDScript language server

@Scony
Copy link
Contributor

Scony commented Dec 13, 2021

Looks really good! @Razoric480 Are you considering harnessing GDScript testing framework to it?

@Razoric480
Copy link
Contributor Author

I didn't make immediate plans for it, but I can certainly investigate adding some tests.

@Razoric480 Razoric480 force-pushed the raz/code-format branch 4 times, most recently from 39750eb to 0ef7230 Compare April 17, 2022 04:15
@gustavi
Copy link

gustavi commented May 16, 2022

First, thank you a lot for working on this feature. As a Python developer black make my life (as a team lead) easier and resolved a lot of "conflicts" on the good way of doing thinks. I like the idea of "it's not perfect but it's the same for everyone".

As someone using black since the first version my feedback will compare the result with this tool. Since Python and GDScript have almost the same type of syntax I thing this is relevant.

I know you are working on some features (like comments) but I'll put my feedback here anyway, you can use my samples for testing if you want!

Feedback

Some line breaks around comments are removed

-##
-## XXX
-##
-
-
+# #
+# # XXX
+# #
 const VERSION := "xxx"

-
 # ----- Debug ------------------------------------------------------------------
-
-

I guess it's a bug.

Documentation comments are not supported

-##
-## XXX
-##
-
-
+# #
+# # XXX
+# #

Extra parenthesis are added around constant declaration

Example 1:

 const VERSION := "XXX"

-
 # ----- Debug ------------------------------------------------------------------
-
-
-const DEBUG := false
+const DEBUG := (
+    false
+)
 const DEBUG_IS_VERBOSE := false
 const DEBUG_SERVER := DEBUG
 const DEBUG_CLIENT := DEBUG

Example 2:

-
 # ----- Generic ----------------------------------------------------------------
-
-
-const DEFAULT_ANIMATIONS_FPS := 30
+const DEFAULT_ANIMATIONS_FPS := (
+    30
+)
 const FPS_XXX := 20.0

Comments at the end of a file are removed

Preferred char for indenting is ignored

I use 4 spaces for indent (Editor setting) but the reformat force tabs. I don't know if it's a bug or not.

3 lines are used between functions instead of 2

cf https://docs.godotengine.org/en/latest/tutorials/scripting/gdscript/gdscript_styleguide.html#blank-lines

Single quotes strings are replaced by double quotes, even when double quotes are present inside the String

 func _attrib_as_string() -> String:
     if attrib:
-        var rv: = ""
+        var rv := ""
         for key in keys():
-            rv += ' {key}="{value}"'.format(
-                {"key": key, "value": get_attr(key)}
-            )
+            rv += " {key}="{value}"".format({"key": key, "value": get_attr(key)})
         return rv
     return ""

Crash report

I'm able to crash the editor (Windows 10 - v4.0.alpha.custom_build [1edd7291e]) with the following scripts:

extends RefCounted
# bbb
var root_element: AStar2D
# aaa
class_name FooBar

func _init(tag_param: String):
    assert(tag_param != "")
class_name XMLElement2
extends RefCounted


func _as_string(indent: int) -> String:
    var element_indent: String
    if indent == -1:
        indent = -2  # small hack for for subelements
        element_indent = ""
    else:
        element_indent = "  ".repeat(indent)

    return ""

Features suggestion

  • Add a setting "Reformat on save"

That's all for the moment! I'll test more on next update but I had a lot of crashes (95% of my files). Thank you again for your work.

@Razoric480
Copy link
Contributor Author

Razoric480 commented May 16, 2022

  • Docstring still need to be implemented, just working on in-line for now.
  • Such short const being broken is weird O_o I'll have a look.
  • Comments at end of file removed... interesting - guess they don't have any object to attach to and get discarded. Will have to find a way to grab them.
  • Right now tabs are hard coded, but changing spacing and style will go into Editor Settings integration.
  • I'd forgotten about single quotes. Will make sure to add support for them.
  • Crashes, haven't run into any yet, but I've mostly been running tests. Thanks for the example script - I'll use it to make sure things don't explode as I go.
  • Reformat on saves is planned :)

Thanks for the review~

@Razoric480
Copy link
Contributor Author

Progress update: This is a little blocked due to lack of time - near full time job makes it a bit tricky to raise the energy to tackle the project. If it wasn't for comments this would be a lot more done, but comments are a very complex endeavour given the parser isn't particularly helpful with them.

@vnen
Copy link
Member

vnen commented Jul 13, 2022

We checked this briefly during the GDScript meeting. The consensus is that this idea is good (along with the proposal), we just need to flesh out this PR to a mergeable state. It doesn't need to have all features at first but it needs to have a base so we can work on it, fixing some issues like the problems with comments for instance. I can help, just ping me on the Godot Rocket chat and we can talk.

On a personal note, sorry it took me so long to check this out. Thanks for working on it.

@vnen
Copy link
Member

vnen commented Jul 13, 2022

BTW, I haven't really took a look in the implementation itself, but if something is blocking you we may discuss some other approach, either in your code or in what's there already in GDScript. If something is missing from the GDScript side we can add it.

@Razoric480 Razoric480 force-pushed the raz/code-format branch 7 times, most recently from dc2d1ff to 56d1e04 Compare July 30, 2022 18:29
@Razoric480 Razoric480 force-pushed the raz/code-format branch 3 times, most recently from ddfc962 to 5678c38 Compare July 31, 2022 18:10
@aaronfranke
Copy link
Member

aaronfranke commented Aug 24, 2022

Couldn't repro the nested if extra lines

I can still reproduce this. There are also still a lot more errors. Some of these below also cause the extra lines.

Constants are destroyed by the formatter. When I format this code:

var f = fposmod(5, TAU)

It automatically changes to this:

var f = fposmod(5, 6.28318530717959)

This code gives an error of unexpected indent:

func _ready():
	if true:
		pass
# comment
	pass

The formatter destroys the type information for typed arrays (it removes the [Node] part):

@onready var children: Array[Node] = get_children()

Comment indentation is not preserved in all cases. The formatter adds an extra tab before this comment:

func _ready():
	if true:
		pass
	# This comment is for the print statement.
	print("hi")

Signals that explicitly have zero arguments are stripped of their parenthesis (possibly intentional, but we have been using () in our codebase to explicitly mark these as such, so I'd argue against removing them):

signal my_signal()

Any comments above @onready vars are deleted:

## I am a comment describing var hi
@onready var hi

The formatter currently formats array arguments like this:

func _ready():
	var arr = []
	arr.append_array(
			[
				"test with a long string 1",
				"test with a long string 2",
				"test with a long string 3",
				"test with a long string 4"
			]
	)

The formatter should keep trailing commas on multi-line lists. This is what the style guide says here.

Also, I would prefer to avoid having opening and closing brackets by themselves:

func _ready():
	var arr = []
	arr.append_array([
		"test with a long string 1",
		"test with a long string 2",
		"test with a long string 3",
		"test with a long string 4",
	])

@Razoric480
Copy link
Contributor Author

Thanks, as always! Will start plugging these into tests and fixing.

@Scony
Copy link
Contributor

Scony commented Aug 24, 2022

@Razoric480 is there an easy way to run formatter vs multiple files?
I'm asking because I got quite a lot of GDScript files here: https://github.com/Scony/godot-gdscript-toolkit/tree/4.0-int/tests/formatter/input-output-pairs (not to mention entire repo) but testing everything by hand is cumbersome.

@Razoric480
Copy link
Contributor Author

@Razoric480 is there an easy way to run formatter vs multiple files? I'm asking because I got quite a lot of GDScript files here: https://github.com/Scony/godot-gdscript-toolkit/tree/4.0-int/tests/formatter/input-output-pairs (not to mention entire repo) but testing everything by hand is cumbersome.

@Scony

At the moment, no - CLI access is planned for a future PR, as this one was running long in the teeth and I wanted to at least get the start of it in the early beta days. So this is the "save on format" and "code editor" PR. Then I will do CLI as a matter of priority next in a follow up PR, and finally language server intergration.

Once CLI is in, a single **/*.gd blob should be able to cover everything.

Copy link
Member

@Paulb23 Paulb23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't comment too much on the GDScript side of things.

Missing a format on save under:

void ScriptEditor::_menu_option(int p_option) {
	FILE_SAVE_AS
}

I'm also slightly concerned about performance here on larger scripts as set_text will cause everything to recalculate.

I don't think it's a good idea to link it to TextEdit like with convert indent and trim trailing whitespace. However, if format_on_save is true, can we skip those calls so we don't end up formatting the same script three times?

Secondly, could we take a page out of EDIT_AUTO_INDENT book, and allow formatting parts of the script, or perhaps only call set_line on parts of the script that have changed?

Comment on lines 366 to 370
tx->set_caret_line(cursor_caret_line);
tx->set_caret_column(cursor_caret_column);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could make this a little smarter e.g. if I have:

enum MOVE_SET {LEFT|, RIGHT, UP, DOWN}

this could make:

enum MOVE_SET {|
    LEFT, 
    RIGHT, 
    UP, 
    DOWN
}

where as I would expect:

enum MOVE_SET {
    LEFT|, 
    RIGHT, 
    UP, 
    DOWN
}

There's also the question of how this will play with #61902 as set_text will remove all secondary carets.


GDP::ClassNode *root = parser.get_tree();

String output = "";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With all this string manipulation, would it be worth looking at StringBuilder?

@Razoric480
Copy link
Contributor Author

Razoric480 commented Aug 29, 2022

Big ol' update.

  • Went through @aaronfranke 's latest findings and fixed those.
  • Replaced String with StringBuilder as suggested by PaulB
  • Wrapped the other on-save actions inside of a not-format-on-save block

Ready for the next set of issues :P

I'm a bit unsure how to go about fixing the caret situation, however. I'll have to give it a think.

Doing the only-part-of-the-code formatting should be possible, however it would take a bit of finnagling, as just feeding it directly would probably result in the parser giving errors. As for using set_line, yeah, we could do that.

@aaronfranke
Copy link
Member

aaronfranke commented Aug 29, 2022

@Razoric480 Nothing happens when I save a script with the current version of this PR. Format on save is enabled in editor settings. I tested both this PR as-is, and rebased on the latest master, both do not do any formatting on save.

@Razoric480
Copy link
Contributor Author

Oh, uhm. Oops. I'll make sure to fix that.

@aaronfranke
Copy link
Member

This code fails to parse (Expected statement, found "else" instead):

func _ready():
	if true:
		pass
# Comment
	else:
		pass

This code fails to parse (Unindent doesn't match the previous indentation level):

func _ready():
	if true:
		pass
# Comment
		pass
	pass

The bug with a lot of extra newlines when leaving a scope is still present:

func _ready():
	if true:
		if true:
			if true:
				if true:
					pass




	pass

func _ready():
	var this_is_a_very_long_boolean_for_test_purposes: bool = false
	if this_is_a_very_long_boolean_for_test_purposes or this_is_a_very_long_boolean_for_test_purposes or this_is_a_very_long_boolean_for_test_purposes:
		pass

The formatter does correctly split the above into multiple lines, but it adds multiple sets of parenthesis for no reason:

func _ready():
	var this_is_a_very_long_boolean_for_test_purposes: bool = false
	if (
		(
			this_is_a_very_long_boolean_for_test_purposes or this_is_a_very_long_boolean_for_test_purposes
			or this_is_a_very_long_boolean_for_test_purposes
		)
	):
		pass

var my_array = [
	"there is a bug with an extra newline at the end of arrays but only when the contents have long lines",
	# Comment

]

@Razoric480 Razoric480 force-pushed the raz/code-format branch 3 times, most recently from 73fb4f6 to 085a162 Compare September 4, 2022 18:25
@Razoric480
Copy link
Contributor Author

Razoric480 commented Sep 4, 2022

New update:

  • Replaced the use of set_text with set_line
  • Moved disabled line handling into the parser, so that should reduce the number of errors with those
  • Fixed formatting issues found by @aaronfranke 's very gracious and patient testing
  • Rebased off of master and fixed a regression with null.

I did manage to recreate the scope extra line bug so that should hopefully be fixed properly now. Fewer parentheses around binary operators, too.

* Add verbose messages for debugging purposes
* Add disabled lines handling to parser
* Force newline at end of code
@aaronfranke
Copy link
Member

aaronfranke commented Sep 10, 2022

Sorry for the delay, here's the next round of problems found by testing my company's codebase.

I can still reproduce the extra lines after the end of blocks issue:

func _ready():
	for i in range(5):
		for j in range(5):
			for k in range(5):
				pass




	pass


# Comment

The formatter is removing the space before this equal sign:

func my_func(param = true):
	pass

The formatter is expanding \n. This code:

func _ready():
	print("hi\n")

Becomes this:

func _ready():
	print("hi
")

Property setters defined using methods are broken after this PR formats them. This:

var my_str: String = "":
	set = set_my_str


func set_my_str(value: String):
	pass

Becomes this (complains that it didn't expect an identifier of setget):

var my_str: String = "" setget set_my_str


func set_my_str(value: String):
	pass

The formatter breaks this code, where addition is performed and then abs is called on the result:

func _ready():
	var a = Vector2(1, 2)
	var b = Vector2(3, 4)
	var c = (a + b).abs()

It becomes this code, where abs is called on b and then it's added to a:

func _ready():
	var a = Vector2(1, 2)
	var b = Vector2(3, 4)
	var c = a + b.abs()

The formatter breaks this case with string substitution:

func _ready():
	var x: int = 50
	print("%4d" % (x / 10))

It becomes this, which fails because it says string can't be divided by 10:

func _ready():
	var x: int = 50
	print("%4d" % x / 10)

The formatter is duplicating a comment at the end of a line in this case:

var grandchild = $Child.get_node(^"Grandchild") # Comment

After formatting a few times, it looks like this:

var grandchild = $Child.get_node(
		^"Grandchild"
) # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment # Comment

The formatter is removing _ separators in code like this:

var big_number: int = 10_000_000

This code fails to parse. And yes it's isn't a well-formatted snippet of code to start with, but it should at least parse.

func _ready():
	var x: int = 0
	match x:
		0:
			pass
	# comment
			pass
		1:
			pass

The formatter is indenting this comment, seems un-ideal (shown here is the before):

func _ready():
	if true:
		pass
	# This comment is describing the below if.
	if true:
		pass

The formatter is interpreting this comment as a doc comment, moving it down and adding a space (so the line starts #, space, tab). My suggestion is that lines starting with # and tab are treated as part of the previous scope.

func _ready():
	pass
#	pass


func other_func():
	pass

Can you add a feature to turn off automatic formatting selectively, for a given line until the end of the file or until it's enabled again? Just like how clang-format has the /* clang-format off */ comment. Something like:

@formatter(false)
func plane_yz() -> Plane: return Plane(transform.basis.x.normalized(), transform.origin)
func plane_xz() -> Plane: return Plane(transform.basis.y.normalized(), transform.origin)
func plane_xy() -> Plane: return Plane(transform.basis.z.normalized(), transform.origin)
@formatter(true)

The syntax can be anything, enable_formatter, skip_format, etc, I don't really have a strong opinion. Also, this is a low-priority, if this is merged without a way to selectively disable formatting then I'll be fine with that.

@NathanLovato
Copy link
Contributor

I just started testing this, I can confirm the extra lines after block. This:

func _ready() -> void:
	if not player:
		set_physics_process(false)
		return
	initial_offset = player.global_position - global_position

Becomes that (2 inserted lines):

func _ready() -> void:
	if not player:
		set_physics_process(false)
		return


	initial_offset = player.global_position - global_position

@NathanLovato
Copy link
Contributor

NathanLovato commented Oct 21, 2022

A parenthesis gets put on a newline... semi-randomly, I'm unsure how to reproduce the bug.

Notice the last parenthesis in the _ready() function):

func _ready() -> void:
	target_detector_3d.area_entered.connect(func(area: Area3D):
		_target = area)
	target_detector_3d.area_exited.connect(func(area: Area3D):
		_target = null
)


func _physics_process(delta: float) -> void:
	if _target:
		pivot.look_at(_target.global_position)
		if timer.is_stopped():
			shoot()
			timer.start()

If I comment and uncomment physics process, once it'll work fine, the next time the parenthesis will move down.

@dalexeev
Copy link
Member

Superseded by #76211.

@akien-mga akien-mga closed this May 12, 2023
@YuriSizov YuriSizov removed this from the 4.x milestone Dec 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.