A process to find new solutions #4

m-a-t-h-e-u-s · 2024-04-20T21:51:41Z

Hi there, I'm testing a way to implement a FMC technique to search others solutions based on a single one. It is called NISS. From the example solve you put in "usage" here are the outputs:

scramble="D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2"
beam_width=1024

Original output: [21] D L D2 R' U2 D B' D' U2 B U2 B' U' B2 D B2 D' B2 F2 U2 F2

New output:
[20] U2 R L' U2 F2 L' U2 F2 D' L U R2 F2 D' R D' B' R B' R'
[21] D R2 D B' D' B L' B' R F L2 R F' D' R B D2 R L2 U' D

With a higher beam width:

scramble="D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2"
beam_width=11024

Original output: [20] R' F L2 D R2 U' B' L' U2 F2 U L D U2 B2 D2 R2 U2 B2 R2

New output:
[19] R' F R B2 F2 R2 D2 F2 D2 B2 R' D B R L U2 R' D U
[19] D2 R U' F U2 F' U F2 R2 D2 R2 L U L' U2 D' R' F2 R'
[18] D2 L F' D F2 D' F D2 R2 B2 R' U L' U2 D' R' F2 R'

I did it by applying NISS with gradually increasing sequence of the original solution, from the beginning to the end, and finding a new solution on the inverse scramble, like we usually do in FMC.

kyo-takano · 2024-04-21T07:35:46Z

@m-a-t-h-e-u-s, thanks for exploring AlphaCube.
It seems you're proposing a new feature: implementing the NISS (Normal Inverse Scramble Switch) technique to find alternative solutions based on a single one.

Understanding NISS

From what I just learned on the Internet, NISS involves:

Applying a few prefix moves to a scramble,
Solving the instance, and
Canceling the prefix moves.

This technique works just because we're interested in relative transitions and the starting point of a scramble-solution loop can be arbitrarily rotated.

Implication

By applying NISS, you're essentially solving a given state from different perspectives multiple times (e.g., with 1 prefix move, 18 times; with 2 moves, 18 × 15 = 270 times).

Complexity-wise, this is equivalent to solving a given state with a larger beam width, which correlates strongly with the number of expanded nodes.

Quick Experiment

To assess NISS, we should compare it against the default with an at least 18x wider beam search.
In your case, solving the scramble with beam_width=1024 * 18 yields two 20-move solutions (expanding 317,196 nodes):

R' F L2 D R2 U' B' L' U2 F2 U L D U2 F2 U2 L2 D2 F2 R2
L' D' R' D2 L B' U F2 U R' U' B' F R2 B R B2 F D2 B

With beam_width=11024 * 18, we get a 17-move solution (expanding 2,631,096 nodes):

D F L' F' U2 B2 U F' L R2 B2 U D' F2 U2 R D'

Based on these results, the default beam search without NISS seems to perform better; if you added more than 1 prefix move, then the gap is even wider.
Although we don't have enough samples to draw conclusions, solution lengths are largely bounded by the beam width parameter.
Therefore, I'm skeptical about the prospects of using NISS compared to simply increasing the beam width.

Feel free to object or test NISS more extensively.

m-a-t-h-e-u-s · 2024-04-21T15:45:59Z

Thanks for replying my message!

And thank you for your beautiful system to solve the Rubik's cube! It's fascinating!!

I still think NISS could help finding shorter solutions at a cost of little additional time. Here are some data from the previous experiment:

scramble = "D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2"
inv_scram = NISS (scramble)
beam_width=11024

The original solution:

num_nodes: 191260 , moves: 20 , time: 48.17395185500027 , solution: R' F L2 D R2 U' B' L' U2 F2 U L D U2 B2 D2 R2 U2 B2 R2

Now, to apply NISS, I set the beam_width = 1024

It used the inverse of the first move of the original solution and attach to the inverse scramble -> R + inverse scramble, solved this new scramble, then it used R' + inverse solution as a new solution:

num_nodes: 20768 , moves: 22 , time: 3.3426037309991443 , solution: R B' D2 F L2 B' F' D2 B D' F' U2 F L' D' L' U2 R' D' U' R B

Now it used the inverse of the first and second move of the original solution and attach to the inverse scramble -> F' R + inverse scramble, solved this new scramble, then it used R' F + inverse solution as a new solution:

num_nodes: 15648 , moves: 19 , time: 2.362270780000472 , solution: R' F R B2 F2 R2 D2 F2 D2 B2 R' D B R L U2 R' D U

And so on:

num_nodes: 15648 , moves: 20 , time: 2.940838955999425 , solution: R' F L2 F2 U D' L2 F U2 F D2 L2 D' F L2 B2 L R2 U D

num_nodes: 16672 , moves: 20 , time: 3.0280230239995944 , solution: R' F L2 F2 U D' L2 F U2 F D2 L2 D' F L2 B2 L R2 U D

num_nodes: 17696 , moves: 24 , time: 2.7628772530006245 , solution: R' F L2 D R2 U' B' F2 D U' L R2 D B2 L B2 R' B2 U R F2 R2 U2 F2

num_nodes: 13600 , moves: 21 , time: 2.2045703710000453 , solution: R' F L2 D R2 U' B' L' U2 F2 U L D L2 U' D' L2 R2 B2 U D'

From here, like in the human case, it will always find the same solution

num_nodes: 11552 , moves: 20 , time: 1.7436429689996658 , solution: R' F L2 D R2 U' B' L' U2 F2 U L U B2 U2 R2 D' U B2 R2

Ignoring the time used to calculate NISS, we found a new solution with fewer moves with ~7s of additional computation! If we set a correspondent beam width (11024 + x) to give the same time consuming (48s+7s) it will no longer give us a 19 moves solution...

I think it's a valid way to search for great solutions.

If you are willing to wait double the time, it is possible to use the same strategy for the inverse scramble as a new scramble:

The very first solution is already better than the original one:

Only solving the inverse scramble:

Beam width = 11024
moves: 19 , time: 40.4495673750007 , solution: D2 R U' F U2 F' U F2 R2 D2 R2 L U L' U2 D' R' F2 R'

...

beam width = 1024

moves: 18 , time: 2.863695836998886 , solution: D2 L F' D F2 D' F D2 R2 B2 R' U L' U2 D' R' F2 R'

moves: 18 , time: 2.768658573999346 , solution: D2 L F' D F2 D' F D2 R2 B2 R' U L' U2 D' R' F2 R'
moves
...

It took additional ~40s to find a 19 move solution and than ~20s latter a 18 move solution!

I don't know if this will convince you, but I had to try hahahaha.

I used the following code to extract those results:

Solves[[result['num_nodes'],result['time'],result['solutions'][0]]]

a = result['solutions'][0].split(" ")
sol=''
for i in range((len(a)-1)):
  sol+=a[i]
  niss = NISS(sol)+" "+inv_scram
  niss = normalizar(niss)
  sol=sol+" "
  alphacube.load()
  result1 = alphacube.solve(scramble=niss,beam_width=1024,)
  Solves.append([result1['num_nodes'],result1['time'],normalizar(sol+" "+NISS(result1['solutions'][0]))])

for i in Solves:
  print('num_nodes:',i[0],', moves:',len(i[2].split(" ")),', time:',i[1], ', solution:',i[2])

Thank you for your time!

kyo-takano · 2024-04-21T16:31:46Z

I see... It seems like you're suggesting to apply NISS after obtaining initial solution(s), not from the very beginning without prior knowledge.

If that's the case, then your suggestion -- reducing the initial complexity with NISS and re-solving the instance -- might be in fact reasonable.

At least in the case "D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2", as you mentioned, rescaling the beam width by (191260 + 15648) / 191260 to account for the extra time only reached 20-move solutions at best.

Note, however, it's still possible that increasing the beam width to account for the extra time works better on average.

Maybe that's something you may want to look into further.
If you will, you can get a dataset by calling dataset = alphacube._evaluator.get_dataset() and test your extension.

m-a-t-h-e-u-s · 2024-04-23T09:19:40Z

Hi there,

Upon verifying the effectiveness of this strategy on a random dataset sample, I discovered that in the majority of cases, it does not yield a better solution. When a better solution is found by NISS implantation, increasing the beam width would likely be more beneficial in many cases. Regrettably, the example I provided to you turned out to be an outlier.

I sincerely apologize for any inconvenience or time wasted due to this oversight.

kyo-takano · 2024-04-23T10:09:16Z

@m-a-t-h-e-u-s

Thanks for reporting the evaluation result.

Let me first tell you that this is definitely not something you should be apologizing for and that I appreciate your suggestion.
As a matter of fact, I am still not sure if we should reject your proposal to (optionally) apply NISS post hoc.

Would you be willing to provide the code you used for the evaluation?

My concern here is that you might not have thoroughly tested your proposal.
More specifically, there actually might be a sweet spot where the trade-off between search complexity and optimality could be improved. The parameters to explore are:

The number of prefix moves you apply with NISS
The proportion of beam width relative to the initial solve

m-a-t-h-e-u-s · 2024-04-23T16:06:39Z

Thank you for beeing nice!

In the rare case that NISS finds a better solution (depends on bw original and bw niss), It is hard to me to find a correspondent beam width that matches the number of moves of the better solution from NISS, and then compare the time taken.

It would be nice to find a good cutoff number of moves to check using NISS as well. I think it is linear with respect to the length of the original solution.

I know it's another topic, but I'm trying to perform symbolic regression on your model. However, dealing with 54 entries and 18 outputs is quite overwhelming. I'm contemplating training another model to predict the number of moves required for the AlphaCube to solve a given scramble (I recall you did this, but I'm unsure if you have the model saved). Then, I'll use it as an evaluation function, similar to chess engines, and proceed with symbolic regression on this model, which involves 54 entries and 1 output.

Here is the humble code I used, you can put it in a loop and extract the data you think is important.

def NISS(scramble):  # inverting a sequence of moves NISS("R U' F2") = "F2 U R'"
    a = scramble.split(" ")
    b = ""
    for i in range(len(a)):
        if len(a[len(a) - 1 - i]) == 1:  # U -> U'
            lin = a[len(a) - 1 - i] + "'"
        else:
            if a[len(a) - 1 - i][1] == "'":  # U' -> U
                lin = a[len(a) - 1 - i][0]
            else:  # U2 -> U2
                lin = a[len(a) - 1 - i]
        b = b + lin + " "
    return b


def NISS_candidates(scramble, bw_initial, bw):
    import time

    time1 = time.time()  # to see the total time of computation
    inverse_scramble = NISS(scramble)  # inverse of the scramble
    alphacube.load()
    result = alphacube.solve(
        scramble=scramble,
        beam_width=bw_initial,
    )  # initial solution
    total_time = result["time"]  # to see time of searching new solutions only
    num_moves = len(result["solutions"][0].split(" "))  # to track the number of moves of the original solution
    Solves = [[num_moves, result["time"], result["solutions"][0]]]  # number of moves; time; solution
    Original_solution = result["solutions"][0].split(" ")
    fragment_solution = ""  # a sequence from the beginning up to a part of the solution
    for i in range(
        num_moves - 5
    ):  # it is certain that omitting the 5 last moves of the solution it will give the original solution
        fragment_solution += Original_solution[i]
        new_scramble = NISS(fragment_solution) + " " + inverse_scramble  # New scramble generated
        # niss = normalizar(niss)
        alphacube.load()
        result1 = alphacube.solve(
            scramble=new_scramble,
            beam_width=bw,
        )  # bw may be different to bw_initial
        total_time += result1["time"]  # add cumulative time of solution only
        new_solution = fragment_solution + " " + NISS(result1["solutions"][0])  # according to NISS technique
        Solves.append(
            [len(new_solution.split(" ")), total_time, new_solution]
        )  # number of moves of new solution; time to get new solution; new solution
        fragment_solution += " "
    time2 = time.time()
    return Solves, time2 - time1


scramble = dataset[5]  # a example that I found a better solution with NISS with those parameters
bw_initial = 11024
bw = 1024
Solutions, total_time = NISS_candidates(scramble, bw_initial, bw)
print("Total time of NISS approach:", total_time)
for i in Solutions:
    print(i)

equal_original = len(Solutions) - 1
while (Solutions[equal_original - 1][0] == Solutions[equal_original][0]) and (equal_original > 2):
    equal_original -= 1

print("From", equal_original, "to beyond, the solutions were the same.")

best = Solutions[0][0]
best_position = 0

for i in range(1, len(Solutions)):
    if Solutions[i][0] < best:  # finding the best solution
        best = Solutions[i][0]
        best_position = i

if best_position == 0:
    print("No better solution was found")
    print("Best solution:", Solutions[0])
    print("Wasted time:", total_time - Solutions[0][1])
else:
    print("A better solution was found")
    print("Original solution:", Solutions[0])
    print("Best solution:", Solutions[best_position])
    print("Total add time required:", total_time - Solutions[0][1])
    print("Specific add time required:", Solutions[best_position][1] - Solutions[0][1])

Thank you for your attention.

kyo-takano · 2024-04-24T12:06:44Z

I modified your code so that NISS_candidates returns the best solution length and the time taken, whether NISS is employed or leads to a better one, given the initial and NISS beam widths as well as the number of NISS moves.

import time
import numpy as np
from tqdm import tqdm
import alphacube
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def NISS(moves: list[str]):
    """
    Convert a solution fragment to NISS.

    Args:
        moves (list[str]): List of moves in standard notation.

    Returns:
        list[str]: List of moves in NISS format.
    """
    new = []
    for i in range(len(moves)):
        if len(moves[len(moves) - 1 - i]) == 1:
            lin = moves[len(moves) - 1 - i] + "'"
        else:
            if moves[len(moves) - 1 - i][-1] == "'":
                lin = moves[len(moves) - 1 - i][0]
            else:
                lin = moves[len(moves) - 1 - i]
        new.append(lin)
    return new


def NISS_candidates(scramble: list[str], bw_initial: int, bw: int, num_niss_moves: int):
    """
    Solve a scramble optionally (when `num_niss_moves` > 0) using NISS.
    Always return the best even if not found with NISS

    Args:
        scramble (list): Scramble to solve.
        bw_initial (int): Initial beam width.
        bw (int): Beam width for NISS.
        num_niss_moves (int): Number of NISS moves.

    Returns:
        tuple: (solution length, latency)
    """
    t_0 = time.monotonic()
    # Solve normally
    result_init = alphacube.solve(scramble=scramble, beam_width=bw_initial)
    assert result_init is not None
    solution = result_init["solutions"][0].split()

    if num_niss_moves:
        # Permutate the scramble-solution chain starting with a fragment of solution
        solution_fragment = solution[:num_niss_moves]
        scramble_NISS = NISS(solution_fragment) + NISS(scramble)
        result_NISS = alphacube.solve(scramble=scramble_NISS, beam_width=bw, max_depth=len(solution) - 1)
        if result_NISS is not None:
            # Undo the NISS permutation
            solution = solution_fragment + NISS(result_NISS["solutions"][0].split())

    return len(solution), time.monotonic() - t_0

"""Setup"""
alphacube.load()
dataset = alphacube._evaluator.get_dataset()
df = pd.DataFrame(columns="num_niss_moves,bw_initial,NISS_proportion,latency,length".split(","))

num_samples = 30  # The minimum number of samples to approximate a normal distribution

"""Test with different parameters"""
bw_initials = 2 ** np.arange(8, 14 + 1)
bw_NISS_proportions = [0.0, 0.25, 0.5, 0.75, 1.0]

for bw_initial in bw_initials:
    for num_niss_moves in range(1, 3 + 1):
        for NISS_proportion in bw_NISS_proportions:
            if len(df.query(f"{bw_initial=} & {num_niss_moves=} & {NISS_proportion=} & {bw_initial=}".replace("=", "=="))):
                continue
            length_mean, latency_mean = 0.0, 0.0
            for scramble in tqdm(dataset[:num_samples]):
                scramble = scramble.split()
                length, latency = NISS_candidates(scramble, bw_initial, round(bw_initial * NISS_proportion), num_niss_moves)
                length_mean += length / num_samples
                latency_mean += latency / num_samples

            df.loc[len(df)] = {
                "bw_initial": bw_initial,
                "num_niss_moves": num_niss_moves,
                "NISS_proportion": NISS_proportion,
                "latency": latency_mean,
                "length": length_mean,
            }


"""Visualize the result"""
fig, axs = plt.subplots(1, df.num_niss_moves.max(), figsize=(3.5 * df.num_niss_moves.max(), 4))
for i, num_niss_moves in enumerate(range(1, df.num_niss_moves.max() + 1)):
    sns.lineplot(
        data=df[df.num_niss_moves == num_niss_moves], x="latency", y="length", hue="NISS_proportion", marker="o", ax=axs[i]
    )
    axs[i].set_title(f"{num_niss_moves=}")
    axs[i].set_xlabel("Latency (s)")
    axs[i].set_ylabel("Mean solution length")
    axs[i].set_xscale("log")


plt.suptitle(f"Latency-optimality trade-off\n(device='{alphacube.device}'; `num_niss_moves==0` indicates no NISS)")
plt.tight_layout()
plt.show()

On CPU:

On GPU:

As you can see from the figures, there appears to be an optimal point where applying a single NISS move can be beneficial.
Please feel free to more extensively test your proposal by increasing the number of samples, using num_nodes as the metric, etc.

I think it is linear with respect to the length of the original solution.

Based on the above result, this appears to be false: always 1 at best.

kyo-takano · 2024-04-24T12:13:29Z

Please create a new issue for a different topic. I'm not responding here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A process to find new solutions #4

A process to find new solutions #4

m-a-t-h-e-u-s commented Apr 20, 2024

kyo-takano commented Apr 21, 2024

m-a-t-h-e-u-s commented Apr 21, 2024 •

edited by kyo-takano

Loading

kyo-takano commented Apr 21, 2024

m-a-t-h-e-u-s commented Apr 23, 2024

kyo-takano commented Apr 23, 2024

m-a-t-h-e-u-s commented Apr 23, 2024 •

edited by kyo-takano

Loading

kyo-takano commented Apr 24, 2024

kyo-takano commented Apr 24, 2024

A process to find new solutions #4

A process to find new solutions #4

Comments

m-a-t-h-e-u-s commented Apr 20, 2024

kyo-takano commented Apr 21, 2024

Understanding NISS

Implication

Quick Experiment

m-a-t-h-e-u-s commented Apr 21, 2024 • edited by kyo-takano Loading

kyo-takano commented Apr 21, 2024

m-a-t-h-e-u-s commented Apr 23, 2024

kyo-takano commented Apr 23, 2024

m-a-t-h-e-u-s commented Apr 23, 2024 • edited by kyo-takano Loading

kyo-takano commented Apr 24, 2024

kyo-takano commented Apr 24, 2024

m-a-t-h-e-u-s commented Apr 21, 2024 •

edited by kyo-takano

Loading

m-a-t-h-e-u-s commented Apr 23, 2024 •

edited by kyo-takano

Loading