Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect functional dependency analysis in index costing #8947

Open
nicktobey opened this issue Mar 5, 2025 · 0 comments
Open

Incorrect functional dependency analysis in index costing #8947

nicktobey opened this issue Mar 5, 2025 · 0 comments
Labels

Comments

@nicktobey
Copy link
Contributor

The following function in costed_index_scan.go is incorrect:

func (c *conjCollector) getFds() *sql.FuncDepSet {
	constCols := sql.ColSet{}
	c.constant.ForEach(func(i int) {
		constCols.Add(sql.ColumnId(i))
	})
	return sql.NewLookupFDs(c.stat.FuncDeps(), c.stat.ColSet(), sql.ColSet{}, constCols, nil)
}

The purpose of this function is to run a functional dependency analysis on a set of conjunctions, based on identifying which columns are bound to constant values by the filter expressions. But there's a subtle problem: conjCollector numbers fields based on their position in the index, whilec.stat numbers fields based on their position in the table, and there is no mapping in scope that can translate between them. As a result, the wrong columns are marked as constant. This can influence how the index is costed.

Amazingly, it's very difficult to design a test case where this produces unexpected results, because in the most common case, we use the FDS to determine whether the query has at most one result and can be a point lookup, and that only happens if every column in the index gets marked as const. Since the size of c.constant can't exceed the number of columns in the index, then there are two main cases:

  • An N-column index has fewer than N constant columns, so no matter how those constants get mapped, we'll never incorrectly deduce that the index can be used as a point lookup.
  • An N-column index has exactly N constant columns, which gets detected and handled as a special case before index costing even begins.

So I don't think this issue can effect the result of analyzing simple lookups, but it can affect join planning.

The simplest way to observe this is just to comment out constCols.Add(sql.ColumnId(i)) and see which plan tests change. Most of the changed tests feel like a lateral move: removing these incorrect inputs to the FDS modifies the costing for some plans, causing a slightly different join type to be chosen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants