
Internal sorting of parameter keys leads to unexpected behaviour when registering an array #530

Open
till-m opened this issue Oct 22, 2024 · 13 comments

till-m (Member) commented Oct 22, 2024

Describe the bug
Internally, the target space sorts the keys of the pbounds dict before creating the target space bounds. This was presumably done because Python dicts were unordered before Python 3.7, so sorting guaranteed a canonical parameter order. Since Python 3.7, however, dicts preserve insertion order, making this sorting unnecessary. Moreover, the fact that the internal representation changes the order can lead to unexpected behaviour when registering a numpy array (see example below). This seems especially likely to happen when using a DataFrame representation of the data.
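To illustrate the underlying mismatch (a minimal sketch, independent of bayes_opt): Python 3.7+ dicts iterate in insertion order, while sorting the keys, as the target space does internally, produces a different order.

```python
# Python 3.7+ dicts preserve insertion order.
pbounds = {"sepal_length": (4.3, 7.9), "petal_width": (0.1, 2.5)}

insertion_order = list(pbounds)  # order as written by the user
sorted_order = sorted(pbounds)   # order used internally by the target space

print(insertion_order)  # ['sepal_length', 'petal_width']
print(sorted_order)     # ['petal_width', 'sepal_length']
```

Any array registered against these bounds is interpreted in the second order, not the first.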

To Reproduce

Ex:

from sklearn import datasets
import pandas as pd
from bayes_opt import BayesianOptimization

df = pd.DataFrame(datasets.load_iris().data, columns=datasets.load_iris().feature_names)
df.columns = [col.replace(' (cm)', '') for col in df.columns]
df.columns = [col.replace(' ', '_') for col in df.columns]

pbounds = {
    'sepal_length': (df['sepal_length'].min(), df['sepal_length'].max()),
    'sepal_width': (df['sepal_width'].min(), df['sepal_width'].max()),
    'petal_length': (df['petal_length'].min(), df['petal_length'].max()),
    'petal_width': (df['petal_width'].min(), df['petal_width'].max())
}
optimizer = BayesianOptimization(
    f=None,
    pbounds=pbounds,
    random_state=1,
)
optimizer.register(df.iloc[0].values, 0)
print(dict(optimizer.space.array_to_params(optimizer.space.params[0])))
print(dict(df.iloc[0]))

Output:

{'petal_length': np.float64(5.1), 'petal_width': np.float64(3.5), 'sepal_length': np.float64(1.4), 'sepal_width': np.float64(0.2)}
{'sepal_length': np.float64(5.1), 'sepal_width': np.float64(3.5), 'petal_length': np.float64(1.4), 'petal_width': np.float64(0.2)}

Expected behavior
The optimizer should not change the order of the keys provided by the user. Registering a numpy array instead of a dict should assume the same order as given by the pbounds.

Nkehoe-QUB commented Oct 25, 2024

Hello, I believe this bug is affecting my usage of the package. I have the following instance of the BayesianOptimization class, and after the first iteration it seems to swap the second and third bounds: the parameter 'thick' is meant to have bounds of 0.1e-6 to 15.0e-6 and 'g' is meant to have 1.0 to 2.0, but they get swapped.

# Simulation Parameters min and max values
param_Min = [1.0, 0.1 * micron, 1.0]
param_Max = [20.0, 15.0 * micron, 2.0]

# Bayesian Optimization Class, random_state is the seed for the random number generator
acq = acquisition.UpperConfidenceBound(kappa=5.0)

optimizer = BayesianOptimization(
    f=None,
    acquisition_function=acq,
    pbounds={'den' : (param_Min[0], param_Max[0]),
             'thick' : (param_Min[1], param_Max[1]),
             'g' : (param_Min[2], param_Max[2])
             },
    verbose=2,
    random_state=1,
)

till-m (Member, Author) commented Oct 25, 2024

Hey @Nkehoe-QUB, if you give me a full example that reproduces the error I can have a look and see if this is really what happens.

Nkehoe-QUB commented:
Hi, I'm sorry, it turns out it's an issue with how I'm handling the output of optimizer.suggest() and passing it to my simulation. I was doing:

next_points = optimizer.suggest()
ID = str(Run_PIC(i, j, [value for value in next_points.values()]))

But I didn't realise that the keys of the suggest() output would be in alphabetical order, so it was passing the values as den, g and then thick.
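A robust way to avoid this kind of ordering surprise (a sketch; Run_PIC is the commenter's own simulation function, and the values here are made up) is to index the suggestion dict by explicit parameter names rather than relying on the iteration order of its values:

```python
# Index the suggested point by explicit names instead of relying on
# the (alphabetical) iteration order of suggest()'s output.
param_order = ["den", "thick", "g"]  # order expected by the simulation

next_points = {"den": 5.0, "g": 1.5, "thick": 2e-6}  # e.g. from optimizer.suggest()
ordered_values = [next_points[name] for name in param_order]
# ordered_values == [5.0, 2e-6, 1.5], regardless of the dict's key order
```

This stays correct even if a future release changes the key ordering convention.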

perezed00 (Contributor) commented:

I'm glad this has been noted. I agree that this is a wrinkle worth smoothing out. If nothing else, better compatibility with pandas seems worthwhile. I'll take a look and report back.

bwheelz36 (Collaborator) commented Nov 16, 2024

I do remember running into this and being tripped up by it (example below 😝).

I agree it would be better if this didn't happen, but note that this would be a breaking change for many users of this library who are assuming that it sorts the parameters. So, in my opinion, fixing this should be deferred until the next major release.

[screenshot of the example omitted]

till-m (Member, Author) commented Nov 17, 2024

This should only affect people who register arrays, which is hopefully a small part of the userbase. One thing we could do is publish a minor version that raises a warning when someone registers an array, telling them that this operation will work differently in a future release.
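The proposed deprecation warning could be sketched like this (hypothetical code, not the library's actual implementation; the function name and message are illustrative):

```python
import warnings

def register(params, target):
    # Hypothetical sketch: warn when a non-dict (e.g. a numpy array) is
    # registered, since array entries are currently interpreted in
    # alphabetical key order, and that convention is slated to change.
    if not isinstance(params, dict):
        warnings.warn(
            "Registering an array assumes alphabetically sorted parameter "
            "order; this will change to insertion order in a future release.",
            FutureWarning,
        )
    ...  # existing registration logic would follow here
```

FutureWarning is used because, unlike DeprecationWarning, it is shown to end users by default.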

anates commented Dec 18, 2024

This change added confusing warnings even when running the example code given on the web page, e.g. https://bayesian-optimization.github.io/BayesianOptimization/2.0.1/visualization.html. Would it be possible to add a short hint about this there as well? It might be confusing for future users who want to follow the tutorial but receive the warning.

till-m (Member, Author) commented Dec 18, 2024

Hey @anates,

Thanks for reporting. The warning should definitely not show up in this context; this is due to an error on my part. Let me think about the best way to handle the issue.

For the record: this happens because space.probe actually converts the point to an array before registering it; I thought it would pass the point on as received. As a result, running the maximize loop triggers the warning, which really shouldn't happen and is understandably extremely confusing for any user.

Apologies for the confusion & hi from Basel :)

anates commented Dec 18, 2024

@till-m For my understanding: is there currently any way to avoid the reordering of the keys in pbounds? Or what is the best way to work around the problem until a proper fix lands?
Thank you, and regards back from Thun!

till-m (Member, Author) commented Dec 18, 2024

Hey @anates,

how are you using the library currently?

  • If you just run .maximize — no need to worry; the warning is triggered unnecessarily.
  • If you use suggest-evaluate-register but register points as dictionaries (.suggest returns a dictionary, so unless you manually transform the point, this will be the case), you don't need to worry either; the warning is triggered unnecessarily.
  • If you register a point in array format at some point (i.e. using either .register or .probe manually with an array-valued argument), then this warning is for you.

The plan is to fix the erroneously displayed warnings. The fact that the order is manipulated by the optimizer is something we will change in an upcoming release.

What is the problem?

Consider the case where you're trying to optimize a function

def foobar(baz, qux, quux):
    return something(baz, qux, quux)

and you have evaluated the function at foobar(baz=0., qux=1., quux=2.)=42. If you register this information with the optimizer, you can do it in two ways:

optimizer.register({'baz': 0., 'qux': 1., 'quux': 2.}, target=42) # register a dict
optimizer.register(np.array([0., 1., 2.]), target=42) # register an array

The optimizer assumes that if you register an array, its entries correspond to the alphabetically ordered parameters, i.e. it would treat the second option as equivalent to optimizer.register({'baz': 0., 'quux': 1., 'qux': 2.}, target=42). So if you want to register an array, you need to ensure that its entries are in alphabetical order of the parameter names. Or, much simpler, just convert it to a dictionary and don't worry about it.
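To make the array case concrete (a sketch assuming the current alphabetical convention; the bounds are made up), this is the dict the optimizer would effectively infer from an array:

```python
import numpy as np

pbounds = {"baz": (0.0, 1.0), "qux": (0.0, 3.0), "quux": (0.0, 3.0)}
point = np.array([0.0, 1.0, 2.0])

# Reconstruct the dict the optimizer would infer from the array:
# note that alphabetically 'quux' sorts before 'qux'.
as_dict = dict(zip(sorted(pbounds), point.tolist()))
print(as_dict)  # {'baz': 0.0, 'quux': 1.0, 'qux': 2.0}
```

Building the dict yourself with the intended names, rather than relying on this implicit ordering, sidesteps the problem entirely.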

I hope this helps. Let me know if there are any other questions.

anates commented Dec 18, 2024

Hei @till-m, currently I'm just using .maximize(), thus I was confused how to fix that warning.
Thank you for the explanation!

till-m (Member, Author) commented Dec 18, 2024

@anates if you install from PyPI, an updated version of the package should be available that doesn't produce the warnings. On conda it might take some time for the release to show up.

anates commented Dec 18, 2024

@till-m Perfect, thank you very much for the quick reply and fix!
