Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A process to find new solutions #4

Open
m-a-t-h-e-u-s opened this issue Apr 20, 2024 · 8 comments
Open

A process to find new solutions #4

m-a-t-h-e-u-s opened this issue Apr 20, 2024 · 8 comments

Comments

@m-a-t-h-e-u-s
Copy link

Hi there, I'm testing a way to implement a FMC technique to search others solutions based on a single one. It is called NISS. From the example solve you put in "usage" here are the outputs:

scramble="D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2"
beam_width=1024

Original output: [21] D L D2 R' U2 D B' D' U2 B U2 B' U' B2 D B2 D' B2 F2 U2 F2

New output:
[20] U2 R L' U2 F2 L' U2 F2 D' L U R2 F2 D' R D' B' R B' R'
[21] D R2 D B' D' B L' B' R F L2 R F' D' R B D2 R L2 U' D

With a higher beam width:

scramble="D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2"
beam_width=11024

Original output: [20] R' F L2 D R2 U' B' L' U2 F2 U L D U2 B2 D2 R2 U2 B2 R2

New output:
[19] R' F R B2 F2 R2 D2 F2 D2 B2 R' D B R L U2 R' D U
[19] D2 R U' F U2 F' U F2 R2 D2 R2 L U L' U2 D' R' F2 R'
[18] D2 L F' D F2 D' F D2 R2 B2 R' U L' U2 D' R' F2 R'

I did it by applying NISS with gradually increasing sequence of the original solution, from the beginning to the end, and finding a new solution on the inverse scramble, like we usually do in FMC.

@kyo-takano
Copy link
Owner

@m-a-t-h-e-u-s, thanks for exploring AlphaCube.
It seems you're proposing a new feature: implementing the NISS (Normal Inverse Scramble Switch) technique to find alternative solutions based on a single one.


Understanding NISS

From what I just learned on the Internet, NISS involves:

  1. Applying a few prefix moves to a scramble,
  2. Solving the instance, and
  3. Canceling the prefix moves.

This technique works just because we're interested in relative transitions and the starting point of a scramble-solution loop can be arbitrarily rotated.

Implication

By applying NISS, you're essentially solving a given state from different perspectives multiple times (e.g., with 1 prefix move, 18 times; with 2 moves, 18 × 15 = 270 times).

Complexity-wise, this is equivalent to solving a given state with a larger beam width, which correlates strongly with the number of expanded nodes.

Quick Experiment

To assess NISS, we should compare it against the default with an at least 18x wider beam search.
In your case, solving the scramble with beam_width=1024 * 18 yields two 20-move solutions (expanding 317,196 nodes):

  • R' F L2 D R2 U' B' L' U2 F2 U L D U2 F2 U2 L2 D2 F2 R2
  • L' D' R' D2 L B' U F2 U R' U' B' F R2 B R B2 F D2 B

With beam_width=11024 * 18, we get a 17-move solution (expanding 2,631,096 nodes):

  • D F L' F' U2 B2 U F' L R2 B2 U D' F2 U2 R D'

Based on these results, the default beam search without NISS seems to perform better; if you added more than 1 prefix move, then the gap is even wider.
Although we don't have enough samples to draw conclusions, solution lengths are largely bounded by the beam width parameter.
Therefore, I'm skeptical about the prospects of using NISS compared to simply increasing the beam width.


Feel free to object or test NISS more extensively.

@m-a-t-h-e-u-s
Copy link
Author

m-a-t-h-e-u-s commented Apr 21, 2024

Thanks for replying my message!

And thank you for your beautiful system to solve the Rubik's cube! It's fascinating!!

I still think NISS could help finding shorter solutions at a cost of little additional time. Here are some data from the previous experiment:

scramble = "D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2"
inv_scram = NISS (scramble)
beam_width=11024

The original solution:

num_nodes: 191260 , moves: 20 , time: 48.17395185500027 , solution: R' F L2 D R2 U' B' L' U2 F2 U L D U2 B2 D2 R2 U2 B2 R2

Now, to apply NISS, I set the beam_width = 1024

It used the inverse of the first move of the original solution and attach to the inverse scramble -> R + inverse scramble, solved this new scramble, then it used R' + inverse solution as a new solution:

num_nodes: 20768 , moves: 22 , time: 3.3426037309991443 , solution: R B' D2 F L2 B' F' D2 B D' F' U2 F L' D' L' U2 R' D' U' R B

Now it used the inverse of the first and second move of the original solution and attach to the inverse scramble -> F' R + inverse scramble, solved this new scramble, then it used R' F + inverse solution as a new solution:

num_nodes: 15648 , moves: 19 , time: 2.362270780000472 , solution: R' F R B2 F2 R2 D2 F2 D2 B2 R' D B R L U2 R' D U

And so on:

num_nodes: 15648 , moves: 20 , time: 2.940838955999425 , solution: R' F L2 F2 U D' L2 F U2 F D2 L2 D' F L2 B2 L R2 U D

num_nodes: 16672 , moves: 20 , time: 3.0280230239995944 , solution: R' F L2 F2 U D' L2 F U2 F D2 L2 D' F L2 B2 L R2 U D

num_nodes: 17696 , moves: 24 , time: 2.7628772530006245 , solution: R' F L2 D R2 U' B' F2 D U' L R2 D B2 L B2 R' B2 U R F2 R2 U2 F2

num_nodes: 13600 , moves: 21 , time: 2.2045703710000453 , solution: R' F L2 D R2 U' B' L' U2 F2 U L D L2 U' D' L2 R2 B2 U D'

From here, like in the human case, it will always find the same solution

num_nodes: 11552 , moves: 20 , time: 1.7436429689996658 , solution: R' F L2 D R2 U' B' L' U2 F2 U L U B2 U2 R2 D' U B2 R2

Ignoring the time used to calculate NISS, we found a new solution with fewer moves with ~7s of additional computation! If we set a correspondent beam width (11024 + x) to give the same time consuming (48s+7s) it will no longer give us a 19 moves solution...

I think it's a valid way to search for great solutions.

If you are willing to wait double the time, it is possible to use the same strategy for the inverse scramble as a new scramble:

The very first solution is already better than the original one:

Only solving the inverse scramble:

Beam width = 11024
moves: 19 , time: 40.4495673750007 , solution: D2 R U' F U2 F' U F2 R2 D2 R2 L U L' U2 D' R' F2 R'

...

beam width = 1024

moves: 18 , time: 2.863695836998886 , solution: D2 L F' D F2 D' F D2 R2 B2 R' U L' U2 D' R' F2 R'

moves: 18 , time: 2.768658573999346 , solution: D2 L F' D F2 D' F D2 R2 B2 R' U L' U2 D' R' F2 R'
moves
...

It took additional ~40s to find a 19 move solution and than ~20s latter a 18 move solution!

I don't know if this will convince you, but I had to try hahahaha.

I used the following code to extract those results:

Solves[[result['num_nodes'],result['time'],result['solutions'][0]]]

a = result['solutions'][0].split(" ")
sol=''
for i in range((len(a)-1)):
  sol+=a[i]
  niss = NISS(sol)+" "+inv_scram
  niss = normalizar(niss)
  sol=sol+" "
  alphacube.load()
  result1 = alphacube.solve(scramble=niss,beam_width=1024,)
  Solves.append([result1['num_nodes'],result1['time'],normalizar(sol+" "+NISS(result1['solutions'][0]))])

for i in Solves:
  print('num_nodes:',i[0],', moves:',len(i[2].split(" ")),', time:',i[1], ', solution:',i[2])

Thank you for your time!

@kyo-takano
Copy link
Owner

I see... It seems like you're suggesting to apply NISS after obtaining initial solution(s), not from the very beginning without prior knowledge.

If that's the case, then your suggestion -- reducing the initial complexity with NISS and re-solving the instance -- might be in fact reasonable.

At least in the case "D U F2 L2 U' B2 F2 D L2 U R' F' D R' F' U L D' F' D R2", as you mentioned, rescaling the beam width by (191260 + 15648) / 191260 to account for the extra time only reached 20-move solutions at best.

Note, however, it's still possible that increasing the beam width to account for the extra time works better on average.

Maybe that's something you may want to look into further.
If you will, you can get a dataset by calling dataset = alphacube._evaluator.get_dataset() and test your extension.

@m-a-t-h-e-u-s
Copy link
Author

Hi there,

Upon verifying the effectiveness of this strategy on a random dataset sample, I discovered that in the majority of cases, it does not yield a better solution. When a better solution is found by NISS implantation, increasing the beam width would likely be more beneficial in many cases. Regrettably, the example I provided to you turned out to be an outlier.

I sincerely apologize for any inconvenience or time wasted due to this oversight.

@kyo-takano
Copy link
Owner

@m-a-t-h-e-u-s

Thanks for reporting the evaluation result.

Let me first tell you that this is definitely not something you should be apologizing for and that I appreciate your suggestion.
As a matter of fact, I am still not sure if we should reject your proposal to (optionally) apply NISS post hoc.

Would you be willing to provide the code you used for the evaluation?

My concern here is that you might not have thoroughly tested your proposal.
More specifically, there actually might be a sweet spot where the trade-off between search complexity and optimality could be improved. The parameters to explore are:

  • The number of prefix moves you apply with NISS
  • The proportion of beam width relative to the initial solve

@m-a-t-h-e-u-s
Copy link
Author

m-a-t-h-e-u-s commented Apr 23, 2024

Thank you for beeing nice!

In the rare case that NISS finds a better solution (depends on bw original and bw niss), It is hard to me to find a correspondent beam width that matches the number of moves of the better solution from NISS, and then compare the time taken.

It would be nice to find a good cutoff number of moves to check using NISS as well. I think it is linear with respect to the length of the original solution.

I know it's another topic, but I'm trying to perform symbolic regression on your model. However, dealing with 54 entries and 18 outputs is quite overwhelming. I'm contemplating training another model to predict the number of moves required for the AlphaCube to solve a given scramble (I recall you did this, but I'm unsure if you have the model saved). Then, I'll use it as an evaluation function, similar to chess engines, and proceed with symbolic regression on this model, which involves 54 entries and 1 output.

Here is the humble code I used, you can put it in a loop and extract the data you think is important.

def NISS(scramble):  # inverting a sequence of moves NISS("R U' F2") = "F2 U R'"
    a = scramble.split(" ")
    b = ""
    for i in range(len(a)):
        if len(a[len(a) - 1 - i]) == 1:  # U -> U'
            lin = a[len(a) - 1 - i] + "'"
        else:
            if a[len(a) - 1 - i][1] == "'":  # U' -> U
                lin = a[len(a) - 1 - i][0]
            else:  # U2 -> U2
                lin = a[len(a) - 1 - i]
        b = b + lin + " "
    return b


def NISS_candidates(scramble, bw_initial, bw):
    import time

    time1 = time.time()  # to see the total time of computation
    inverse_scramble = NISS(scramble)  # inverse of the scramble
    alphacube.load()
    result = alphacube.solve(
        scramble=scramble,
        beam_width=bw_initial,
    )  # initial solution
    total_time = result["time"]  # to see time of searching new solutions only
    num_moves = len(result["solutions"][0].split(" "))  # to track the number of moves of the original solution
    Solves = [[num_moves, result["time"], result["solutions"][0]]]  # number of moves; time; solution
    Original_solution = result["solutions"][0].split(" ")
    fragment_solution = ""  # a sequence from the beginning up to a part of the solution
    for i in range(
        num_moves - 5
    ):  # it is certain that omitting the 5 last moves of the solution it will give the original solution
        fragment_solution += Original_solution[i]
        new_scramble = NISS(fragment_solution) + " " + inverse_scramble  # New scramble generated
        # niss = normalizar(niss)
        alphacube.load()
        result1 = alphacube.solve(
            scramble=new_scramble,
            beam_width=bw,
        )  # bw may be different to bw_initial
        total_time += result1["time"]  # add cumulative time of solution only
        new_solution = fragment_solution + " " + NISS(result1["solutions"][0])  # according to NISS technique
        Solves.append(
            [len(new_solution.split(" ")), total_time, new_solution]
        )  # number of moves of new solution; time to get new solution; new solution
        fragment_solution += " "
    time2 = time.time()
    return Solves, time2 - time1


scramble = dataset[5]  # a example that I found a better solution with NISS with those parameters
bw_initial = 11024
bw = 1024
Solutions, total_time = NISS_candidates(scramble, bw_initial, bw)
print("Total time of NISS approach:", total_time)
for i in Solutions:
    print(i)

equal_original = len(Solutions) - 1
while (Solutions[equal_original - 1][0] == Solutions[equal_original][0]) and (equal_original > 2):
    equal_original -= 1

print("From", equal_original, "to beyond, the solutions were the same.")

best = Solutions[0][0]
best_position = 0

for i in range(1, len(Solutions)):
    if Solutions[i][0] < best:  # finding the best solution
        best = Solutions[i][0]
        best_position = i

if best_position == 0:
    print("No better solution was found")
    print("Best solution:", Solutions[0])
    print("Wasted time:", total_time - Solutions[0][1])
else:
    print("A better solution was found")
    print("Original solution:", Solutions[0])
    print("Best solution:", Solutions[best_position])
    print("Total add time required:", total_time - Solutions[0][1])
    print("Specific add time required:", Solutions[best_position][1] - Solutions[0][1])

Thank you for your attention.

@kyo-takano
Copy link
Owner

I modified your code so that NISS_candidates returns the best solution length and the time taken, whether NISS is employed or leads to a better one, given the initial and NISS beam widths as well as the number of NISS moves.

import time
import numpy as np
from tqdm import tqdm
import alphacube
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def NISS(moves: list[str]):
    """
    Convert a solution fragment to NISS.

    Args:
        moves (list[str]): List of moves in standard notation.

    Returns:
        list[str]: List of moves in NISS format.
    """
    new = []
    for i in range(len(moves)):
        if len(moves[len(moves) - 1 - i]) == 1:
            lin = moves[len(moves) - 1 - i] + "'"
        else:
            if moves[len(moves) - 1 - i][-1] == "'":
                lin = moves[len(moves) - 1 - i][0]
            else:
                lin = moves[len(moves) - 1 - i]
        new.append(lin)
    return new


def NISS_candidates(scramble: list[str], bw_initial: int, bw: int, num_niss_moves: int):
    """
    Solve a scramble optionally (when `num_niss_moves` > 0) using NISS.
    Always return the best even if not found with NISS

    Args:
        scramble (list): Scramble to solve.
        bw_initial (int): Initial beam width.
        bw (int): Beam width for NISS.
        num_niss_moves (int): Number of NISS moves.

    Returns:
        tuple: (solution length, latency)
    """
    t_0 = time.monotonic()
    # Solve normally
    result_init = alphacube.solve(scramble=scramble, beam_width=bw_initial)
    assert result_init is not None
    solution = result_init["solutions"][0].split()

    if num_niss_moves:
        # Permutate the scramble-solution chain starting with a fragment of solution
        solution_fragment = solution[:num_niss_moves]
        scramble_NISS = NISS(solution_fragment) + NISS(scramble)
        result_NISS = alphacube.solve(scramble=scramble_NISS, beam_width=bw, max_depth=len(solution) - 1)
        if result_NISS is not None:
            # Undo the NISS permutation
            solution = solution_fragment + NISS(result_NISS["solutions"][0].split())

    return len(solution), time.monotonic() - t_0

"""Setup"""
alphacube.load()
dataset = alphacube._evaluator.get_dataset()
df = pd.DataFrame(columns="num_niss_moves,bw_initial,NISS_proportion,latency,length".split(","))

num_samples = 30  # The minimum number of samples to approximate a normal distribution

"""Test with different parameters"""
bw_initials = 2 ** np.arange(8, 14 + 1)
bw_NISS_proportions = [0.0, 0.25, 0.5, 0.75, 1.0]

for bw_initial in bw_initials:
    for num_niss_moves in range(1, 3 + 1):
        for NISS_proportion in bw_NISS_proportions:
            if len(df.query(f"{bw_initial=} & {num_niss_moves=} & {NISS_proportion=} & {bw_initial=}".replace("=", "=="))):
                continue
            length_mean, latency_mean = 0.0, 0.0
            for scramble in tqdm(dataset[:num_samples]):
                scramble = scramble.split()
                length, latency = NISS_candidates(scramble, bw_initial, round(bw_initial * NISS_proportion), num_niss_moves)
                length_mean += length / num_samples
                latency_mean += latency / num_samples

            df.loc[len(df)] = {
                "bw_initial": bw_initial,
                "num_niss_moves": num_niss_moves,
                "NISS_proportion": NISS_proportion,
                "latency": latency_mean,
                "length": length_mean,
            }


"""Visualize the result"""
fig, axs = plt.subplots(1, df.num_niss_moves.max(), figsize=(3.5 * df.num_niss_moves.max(), 4))
for i, num_niss_moves in enumerate(range(1, df.num_niss_moves.max() + 1)):
    sns.lineplot(
        data=df[df.num_niss_moves == num_niss_moves], x="latency", y="length", hue="NISS_proportion", marker="o", ax=axs[i]
    )
    axs[i].set_title(f"{num_niss_moves=}")
    axs[i].set_xlabel("Latency (s)")
    axs[i].set_ylabel("Mean solution length")
    axs[i].set_xscale("log")


plt.suptitle(f"Latency-optimality trade-off\n(device='{alphacube.device}'; `num_niss_moves==0` indicates no NISS)")
plt.tight_layout()
plt.show()

On CPU:
NISS-cpu

On GPU:
NISS-cuda

As you can see from the figures, there appears to be an optimal point where applying a single NISS move can be beneficial.
Please feel free to more extensively test your proposal by increasing the number of samples, using num_nodes as the metric, etc.

I think it is linear with respect to the length of the original solution.

Based on the above result, this appears to be false: always 1 at best.

@kyo-takano
Copy link
Owner

Please create a new issue for a different topic. I'm not responding here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants