Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.PriorityQueue.update assumes T supports the equality operator #9918

Closed
paoda opened this issue Oct 8, 2021 · 9 comments · Fixed by #12154
Closed

std.PriorityQueue.update assumes T supports the equality operator #9918

paoda opened this issue Oct 8, 2021 · 9 comments · Fixed by #12154
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@paoda
Copy link
Contributor

paoda commented Oct 8, 2021

Zig Version

0.9.0-dev.1275+ac52e0056

Steps to Reproduce

Given the main.zig described below:

const std = @import("std");

const Event = struct {
    kind: EventKind,
    timestamp: u64,
};

const EventKind = enum {
    u64Overflow,
};

fn lessThan(a: Event, b: Event) std.math.Order {
    return std.math.order(a.timestamp, b.timestamp);
}

pub fn main() !void {
    const alloc = std.heap.page_allocator;

    var pq = std.PriorityQueue(Event).init(alloc, lessThan);
    defer pq.deinit();

    // some contrived reason to use PriorityQueue.update
    const wrong = Event{
        .kind = EventKind.u64Overflow,
        .timestamp = std.math.maxInt(u32),
    };

    const right = Event{
        .kind = EventKind.u64Overflow,
        .timestamp = std.math.maxInt(u64),
    };

    try pq.add(wrong);
    try pq.update(wrong, right);
}

run zig build in order to reproduce the bug.

Expected Behavior

I expected the zig program to compile.

Actual Behavior

I received the following from stderr:

/piston/packages/zig/0.8.0/bin/lib/std/mem.zig:1009:22: error: operator not allowed for type 'Event'
        if (slice[i] == value) return i;
                     ^
/piston/packages/zig/0.8.0/bin/lib/std/mem.zig:993:28: note: called from here
    return indexOfScalarPos(T, slice, 0, value);
                           ^
/piston/packages/zig/0.8.0/bin/lib/std/priority_queue.zig:209:60: note: called from here
            var update_index: usize = std.mem.indexOfScalar(T, self.items[0..self.len], elem) orelse return error.ElementNotFound;
                                                           ^
/piston/packages/zig/0.8.0/bin/lib/std/priority_queue.zig:208:64: note: called from here
        pub fn update(self: *Self, elem: T, new_elem: T) !void {
                                                               ^
/piston/packages/zig/0.8.0/run: line 4: ./out: No such file or directory

Which suggests to me that either this is an oversight or it should be made clear that there is an equality operator constraint on std.PriorityQueue.update or std.PriorityQueue as a whole.

@paoda paoda added the bug Observed behavior contradicts documented or intended behavior label Oct 8, 2021
@paoda
Copy link
Contributor Author

paoda commented Oct 8, 2021

Changing the implementation of indexOfScalarPos from:

pub fn indexOfScalarPos(comptime T: type, slice: []const T, start_index: usize, value: T) ?usize {
    var i: usize = start_index;
    while (i < slice.len) : (i += 1) {
        if (slice[i] == value) return i;
    }
    return null;
}

to:

pub fn indexOfScalarPos(comptime T: type, slice: []const T, start_index: usize, value: T) ?usize {
    var i: usize = start_index;
    while (i < slice.len) : (i += 1) {
        if (meta.eql(slice[i], value)) return i;
    }
    return null;
}

resolves this issue and the program in my original comment will then proceed to compile. However, I can't comment on the performance implications of such a change, or whether this would even be the right way of solving this.

@andrewrk andrewrk added proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library. and removed bug Observed behavior contradicts documented or intended behavior labels Nov 20, 2021
@andrewrk andrewrk added this to the 0.10.0 milestone Nov 20, 2021
@Chris3606
Copy link

Chris3606 commented Dec 16, 2021

I can't comment on the performance of that fix either; but whether the solution is that, or passing in a custom equality function (perhaps via a context like HashMap uses), I believe there are clear use cases for which == is insufficient.

Even basic algorithms like Dijkstra will require the association of a value to its priority; and as far as I'm aware, this requires the use of a struct as the priority queue type, something to the effect of:

pub const DijkstraNode {
    node_val: T,
    priority: u32
};

Namely because, there isn't a good way (that I'm aware of) to implement the required comparison function for a priority queue unless the value and the priority are contained within the same struct. There are obviously some current limitations to closures/lambda capturing in Zig, as discussed particularly near the bottom of #1048, so if you had a priority queue of T values (even assuming T supports == ), I don't see how you could implement that comparison function without a struct like this, even if you maintained the mapping of values to priority in a separate structure.

@Chris3606
Copy link

In Dijkstra's specifically, it is of note that it wouldn't require this association; you could just insert duplicate elements; but I would think that there are still cases where this behavior could be problematic.

@Deecellar
Copy link
Contributor

I think is a missuse of std.mem.indexOfScalar the name suggest it shoud be used only on scalar values that are numerical (or so I understand), so really priorityQueue should use std.mem.indexOfPos since that one uses internally std.mem.eql and would also fix it, think is, that std.mem.indexOfPos need more than one value, so maybe a better proposal would be adding a function that takes 1 element and makes it an slice and pass it to indexOfPos?

@thomastay
Copy link

Ran into this today while doing Djikstra too. I propose that the update function shouldn't do any searching in the queue for you. Instead, it should take in an index of the item to update, and then just update it with the new value. Users can write their own version of std::find customized to whatever they want.

Here's a code sample of what I'm thinking of. This is code from AoC day 15 I was working on, when I wanted to implement addOrUpdate to the priority queue. Bear in mind that this is fantasy code so it might not actually be correct 😅

fn addOrUpdate(queue: *DjikstraPQ, item: QueueDatum) !void {
    var foundIdx: usize = 0;
    // std::find
    var it = queue.iterator();
    while (it.next()) |elt|: (foundIdx += 1) {
        // Users should change this to implement their own find function
        if (elt.pos == item.pos) break;
    }
    if (found != queue.count()) {
        queue.update(foundIdx, item) catch unreachable; // in this case, update takes an index instead
    } else {
        try queue.add(item);
    }
}

As @Chris3606 mentions, adding duplicate elements into the queue is definitely a workaround, but is not a good long term solution.

I took a look at what a Rust crate does, looks like they take the element in but they also have a requirement that the elements inserted must be hashable, and there is an internal hash table that keeps track of which element is in which index.
I don't think that is a good solution in general, but is interesting to note.

@schmee
Copy link
Contributor

schmee commented Dec 25, 2021

Closed in #10342?

@Chris3606
Copy link

Chris3606 commented Dec 25, 2021

Closed in #10342?

I don't believe so. The update function in that implementation is still using indexOfScalar: see here, and that function uses == which is what was causing the original issue.

@Chris3606
Copy link

Additionally, I wanted to add a bit to the discussion on the API from above: I agree with @thomastay insofar as an update function that does a linear search doesn't necessarily support all use cases very well. In fact, in an optimized Dijkstra or particularly an A* implementation, for example, although I haven't formally tested it, I believe that even if the update function did work when T is a struct, I'd still prefer creating duplicate elements over the current implementation of update for the sake of performance.

Particularly since in theory (if the API allowed a user to cache nodes themselves and the data types were cached in a way accessible in constant time) it should be possible to have a constant-time lookup of the node then simply the O(log(n)) update operation, I think there would be significant performance benefits in some cases to being able to use an O(log(n)) update function over adding duplicate nodes, since it would be the same operation but without the additional memory usage and items to process in the queue.

voroskoi added a commit to voroskoi/zig that referenced this issue Jul 17, 2022
update() calls mem.indexOfScalar() which uses `==` for comparing items,
which fails when the operator is not supported.
As PirorityQueue needs a comparing function anyway we can use `.eq` results
to identify matching objects.

closes ziglang#9918
voroskoi added a commit to voroskoi/zig that referenced this issue Jul 17, 2022
update() calls mem.indexOfScalar() which uses `==` for comparing items,
which fails when the operator is not supported.
As PirorityQueue needs a comparing function anyway we can use `.eq` results
to identify matching objects.

closes ziglang#9918
@voroskoi
Copy link
Contributor

voroskoi commented Jul 18, 2022

I have tried my Aoc 2021 solutions with #12154:
day15a: https://nest.pijul.com/voroskoi/aoc2021-zig:main/37QWQTZRMBJPE.EMAAA
day15b: https://nest.pijul.com/voroskoi/aoc2021-zig:main/QBPUYYX77BA2W.VIAAA

In ReleaseFast first part 1100 us -> 2200 us, the second part 33.000us -> 105.000 us.

PriorityQueue use a binary heap under the hood, so we can not use std.sort.binarySearch() for finding elements. I have no better idea, than the rust solution mentioned before: using a HashMap for storing the array positions to make lookup faster. On the other hand the HashMap just creates overhead when You do not use update()...

Vexu pushed a commit that referenced this issue Jul 22, 2022
update() calls mem.indexOfScalar() which uses `==` for comparing items,
which fails when the operator is not supported.
As PirorityQueue needs a comparing function anyway we can use `.eq` results
to identify matching objects.

closes #9918
wooster0 pushed a commit to wooster0/zig that referenced this issue Jul 24, 2022
update() calls mem.indexOfScalar() which uses `==` for comparing items,
which fails when the operator is not supported.
As PirorityQueue needs a comparing function anyway we can use `.eq` results
to identify matching objects.

closes ziglang#9918
@Vexu Vexu modified the milestones: 0.12.0, 0.10.0 Jul 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants