Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for a BitArray/PackedBoolArray data type for storing a large amount of bools in a small amount of space #10089

Closed
Kirbedev opened this issue Jul 1, 2024 · 7 comments

Comments

@Kirbedev
Copy link

Kirbedev commented Jul 1, 2024

Describe the project you are working on

A Metroidvania game that stores a lot of flags and map data with save files

Describe the problem or limitation you are having in your project

The type of information I need to store (e.g., uncovered map tiles, unlocked achievements, object on/off states, etc.) requires a very large amount of Boolean variables to store. The problem becomes very apparent here, since each value takes up a whole byte of memory, even though only one bit is actually being utilized. This leads to a lot of wasted memory, and while it may not seem to be a lot since most people have around 8-16GB of memory these days, it can add up very fast. For example, imagine you had 131072 (power of 2) map tiles to account for. Storing these in an array of bools will take up 131KB of memory. If you stored these as individual bits instead, it will only consume ~16KB of memory!

While it's true that you can use a BitMap to do just that, it requires 2D indexing and may not seem as intuitive or efficient as a BitArray, which is much more specialized for this purpose since it simply operates on a string of Bytes rather than a 2D matrix of them.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Add support for BitArray/PackedBoolArray to allow users to easily store single-bit flags quickly, without the need for additional logic like looping through a set of coordinates to access a bit from a BitMap. This solution will essentially allow you to store bools in an array just like how you can store Vector2 in a PackedVector2Array, among others. The engine will figure out the logic behind it using bytes and bit manipulation.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

This proposal behave like an array, allowing you to simply store bools in it as you would for any other array, like this:

var bool_array: PackedBoolArray # Or BitArray

# With bools
bool_array.append([true, true, false, true, false, false, false, true])

# With ints
bool_array.append([1, 1, 0, 1, 0, 0, 0, 1])


print(bool_array) # Prints [1, 1, 0, 1, 0, 0, 0, 1], just as seen above. 
# All this occupies only 1 byte, instead of 8.

As you can see, this approach is very simple and easy, and aims to streamline optimization by making BitArray just as easy to use as a normal array.

In my game, I used a similar approach—albeit, a little more complex since you can't make custom array types in GDScript—by utilizing a PackedByteArray and performing bit manipulation on it. It works perfectly for my use case, and has caused no performance issues. Here was my implementation for it:

class BitArray:
	var data: PackedByteArray
	var active_bits: int
	
	func _init(_num_bits: int = 8):
		resize_bits(_num_bits)
	
	func _to_string():
		var string: String
		for i in range(active_bits): string += str(int(get_bit(i)))
		
		return string
	
	func resize_bits(new_size: int, reset_bits: bool = true):
		data.resize(ceili(float(new_size) / 8))
		active_bits = new_size
		
		if reset_bits: fill_bits(false)
	
	func fill_bits(new_value: bool):
		for i in range(active_bits): set_bit(i, new_value)
	
	func append(other: BitArray):
		var offset = active_bits
		resize_bits(active_bits + other.active_bits, false)
		
		for i in range(other.active_bits): set_bit(i + offset, other.get_bit(i))
	
	func get_bit(index: int):
		if index < 0 or index > len(data) * 8 - 1:
			push_error('ERROR: Tried to call `get_bit` with invalid index.')
			return false
		elif len(data) == 0: 
			push_error('ERROR: There are no bits to set.')
			return false
		
		return bool(data[index / 8] & 1 << (index % 8))
	
	func set_bit(index: int, value: bool):
		if index < 0 or index > len(data) * 8 - 1:
			push_error('ERROR: Tried to call `set_bit` with invalid index.')
			return
		elif len(data) == 0: 
			push_error('ERROR: There are no bits to set.')
			return
		
		var mask = 1 << (index % 8) # It's nice that GDScript supports this
		if value: data[index / 8] |= mask
		else: data[index / 8] &= ~mask
	
	func duplicate():
		var new_array = BitArray.new(active_bits)
		new_array.data = data.duplicate()
		return new_array

This obviously requires a little more work to use, but it is very fast and efficient, and gets the job done.

If this enhancement will not be used often, can it be worked around with a few lines of script?

If this isn't used often, for whatever reason, another approach is to just use some of the bit manipulation logic I used above directly on a PackedByteArray:

# Get bit
func get_bit(data: PackedByteArray, index: int) -> bool: 
	if (index < 0 or index > len(data) * 8 - 1) or len(data) == 0): return
	
	# Get the value directly
	return bool(data[index / 8] & 1 << (index % 8))

# Set bit
func set_bit(data: PackedByteArray, index: int, new_value: bool) -> void:
	if (index < 0 or index > len(data) * 8 - 1) or len(data) == 0): return
	
	# Mask out the value and set it
	var mask = 1 << (index % 8)
	if value: data[index / 8] |= mask
	else: data[index / 8] &= ~mask

This approach works fine as well, but it is much less straightforward as an Array since it requires you to manually control the size of data in order to use it properly, which is what by BitArray class automates. A proper implementation of this in the engine code will do the same thing.

Is there a reason why this should be core and not an add-on in the asset library?

Because being able to store a lot of bools/flags efficiently should be just as simple and straightforward to do as with any other PackedArray type, and probably won't be much of a hurdle to implement as more advanced features. bool is a fundamental data type, and it's not unlikely that you will end up having to store a lot of them (>8) at once at some point. Having this in core means that you can easily optimize parts of your game without needing to install an add-on for each project that needs it, and you don't have to do whole implementation of it yourself each time.

@Calinou
Copy link
Member

Calinou commented Jul 2, 2024

Adding new Variant type1 has a performance cost, so it should be carefully considered. It needs to be used often enough within the engine to be worth doing.

While it's true that you can use a BitMap to do just that, it requires 2D indexing and may not seem as intuitive or efficient as a BitArray, which is much more specialized for this purpose since it simply operates on a string of Bytes rather than a 2D matrix of them.

Couldn't you make a BitMap with a very long width and 1 pixel height? There doesn't appear to be an individual limitation on width or height, other than width * height being lower than or equal to INT32_MAX. This means that for a BitMap that is 1 pixel tall, its maximum width would be INT32_MAX (roughly 2.14 billion).

Footnotes

  1. Every packed array type is its own Variant type, unlike the generic Array type.

@Kirbedev
Copy link
Author

Kirbedev commented Jul 2, 2024

That's a good point. But what I wonder is, does accessing a BitMap have any additional overhead, even with a height of 1, compared to an Array? And especially one that is accessed constantly

@Kirbedev
Copy link
Author

Kirbedev commented Jul 2, 2024

Actually, I have another idea. Since BitMaps work well enough, why not add a function to its class that allows you to access it with a single index? This way it can also behave as a flat array regardless of its dimensions. I know C++, but I never really looked at the way Variants are referenced internally, so here is an example of what I mean in GDscript:

func get_bit_flat(bitmap: BitMap, index: int) -> Vector2i:
	var map_size = bitmap.get_size()
	if index >= 0 and index < map_size.x * map_size.y:
		return Vector2i(index / map_size.x, index % map_size.x)
	else: return Vector2i.ZERO

EDIT: Ok, I looked at the source code for BitMap. I think this is how the function would be implemented, but I'm not entirely sure what ERR_FAIL_INDEX_V is--I'm guessing it tests the value and throws an error when it's out of range--so bear with me here.

bool BitMap::get_bit_flat(int index) const {
	ERR_FAIL_INDEX_V(index, width * height, false);
  	return get_bit(index / width, index % width); // Not sure if BitMap:: prefix is required here
}

@KoBeWi
Copy link
Member

KoBeWi commented Jul 2, 2024

You can also use PackedInt64Array, but it requires some wrapping effort to make it easy to use.

@AThousandShips
Copy link
Member

Using a PackedInt64Array would be made easier in combination with:

@AThousandShips
Copy link
Member

Actually, I have another idea. Since BitMaps work well enough, why not add a function to its class that allows you to access it with a single index?

This sounds like a great thing for an add-on, but I don't think it's general enough to warrant core, in most cases people aren't needing this level of data tightness for game development I'd say

@Kirbedev
Copy link
Author

Kirbedev commented Jul 4, 2024

Alright, I am going to close this issue now. If anyone wants a quick implementation of BitArrays in GDScript, feel free to use the code I provided above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants