Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: 58% faster Project Euler 070 #10558

Conversation

ManpreetXSingh
Copy link
Contributor

@ManpreetXSingh ManpreetXSingh commented Oct 15, 2023

Describe your change:

#8594

  • Benchmark:
    • Old solution: 118.2/10 seconds
    • New solution: 49.3/10 seconds
  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new Python files are placed inside an existing directory.
  • All filenames are in all lowercase characters with no spaces or dashes.
  • All functions and variable names follow Python naming conventions.
  • All function parameters and return values are annotated with Python type hints.
  • All functions have doctests that pass the automated testing.
  • All new algorithms include at least one URL that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the description above includes the issue number(s) with a closing keyword: "Fixes #ISSUE-NUMBER".

@algorithms-keeper algorithms-keeper bot added enhancement This PR modified some existing files awaiting reviews This PR is ready to be reviewed labels Oct 15, 2023
@ManpreetXSingh ManpreetXSingh changed the title Performance: 52% faster Project Euler 070 Performance: 58% faster Project Euler 070 Oct 16, 2023
@quant12345
Copy link
Contributor

@manpreetsingh2004 a good solution with filtering by prime numbers. Look at 072, I made it, you can probably apply it there too.

@ManpreetXSingh
Copy link
Contributor Author

ManpreetXSingh commented Oct 18, 2023

Hi @quant12345,
Thank you for your suggestion! Your input is much appreciated!
I noticed that the prime filtering for 072 is already implemented in sol2.py. Although it is as fast as sol1 with both taking 1 sec. What should be done in this case?
Also, I've identified a bug in 'sol1.py' that appears to be related to integer overflow. This issue specifically arises on Windows platforms because, by default, numpy on Windows 64-bit uses 32-bit integers, whereas on Linux, it employs 64-bit integers, so it also went unnoticed in automated checking of this repo. See #10672. I already have 3 PRs open so bot closes new ones.

@quant12345
Copy link
Contributor

quant12345 commented Oct 18, 2023

@manpreetsingh2004 now I checked it in Windows, it’s really an incorrect number if you don’t specify the type. I used your filtering(072) by prime numbers. And with this it’s faster, significantly. Below code:

test

import datetime
import numpy as np
from math import isqrt

def solution_2(limit: int = 1_000_000) -> int:
    """
    Return the number of reduced proper fractions with denominator less than limit.
    >>> solution(8)
    21
    >>> solution(1000)
    304191
    """
    primes = set(range(3, limit, 2))
    primes.add(2)
    for p in range(3, limit, 2):
        if p not in primes:
            continue
        primes.difference_update(set(range(p * p, limit, p)))

    phi = [float(n) for n in range(limit + 1)]

    for p in primes:
        for n in range(p, limit + 1, p):
            phi[n] *= 1 - 1 / p

    return int(sum(phi[2:]))


def np_calculate_prime_numbers(max_number: int) -> list[int]:
    """
    Returns prime numbers below max_number.
    See: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

    >>> np_calculate_prime_numbers(10)
    [2, 3, 5, 7]
    >>> np_calculate_prime_numbers(2)
    []
    """
    if max_number <= 2:
        return []

    # List containing a bool value for every odd number below max_number/2
    is_prime = np.ones(max_number // 2, dtype=bool)

    for i in range(3, isqrt(max_number - 1) + 1, 2):
        if is_prime[i // 2]:
            # Mark all multiple of i as not prime using list slicing
            is_prime[i ** 2 // 2:: i] = False

    primes = np.where(is_prime)[0] * 2 + 1
    primes[0] = 2
    return primes


def solution_new(limit: int = 1_000_000) -> int:
    """
    Returns an integer, the solution to the problem
    >>> solution(10)
    31
    >>> solution(100)
    3043
    >>> solution(1_000)
    304191
    """

    # generating an array from -1 to limit
    phi = np.arange(-1, limit)
    primes = np_calculate_prime_numbers(limit)

    for i in primes:
        #ind = np.arange(2 * i, limit + 1, i)  # indexes for selection
        phi[2*i::i] -= phi[2*i::i] // i

    return int(np.sum(phi[2: limit + 1], dtype=np.int64))


def solution(limit: int = 1_000_000) -> int:
    """
    Returns an integer, the solution to the problem
    >>> solution(10)
    31
    >>> solution(100)
    3043
    >>> solution(1_000)
    304191
    """

    # generating an array from -1 to limit
    phi = np.arange(-1, limit)

    for i in range(2, limit + 1):
        if phi[i] == i - 1:
            ind = np.arange(2 * i, limit + 1, i)  # indexes for selection
            phi[ind] -= phi[ind] // i

    return int(np.sum(phi[2: limit + 1], dtype=np.int64))


if __name__ == "__main__":

    n = 1_000_000

    now = datetime.datetime.now()
    old = solution(n)
    print('old_time', datetime.datetime.now() - now)

    now = datetime.datetime.now()
    new = solution_new(n)
    print('new_time', datetime.datetime.now() - now)

    now = datetime.datetime.now()
    solution_2 = solution_2(n)
    print('solution_2', datetime.datetime.now() - now)

    print('comparison for identity', old == new == solution_2)

Output:

old_time 0:00:00.773152
new_time 0:00:00.410564
solution_2 0:00:01.439802
comparison for identity True

Here is a slightly faster implementation primes of primesfrom2to than 'your' primesfrom3to.

@ManpreetXSingh
Copy link
Contributor Author

@quant12345
Hey, That's really nice. 😊

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! I'm sorry if I was not clear but we should be updating existing solution to make it performant instead of adding new solution(s). The benchmarks can be added in the PR description instead of the code. We only want one solution in the code. Can you please update the code accordingly?

@algorithms-keeper algorithms-keeper bot added awaiting changes A maintainer has requested changes to this PR and removed awaiting reviews This PR is ready to be reviewed labels Oct 29, 2023
@ManpreetXSingh
Copy link
Contributor Author

We only want one solution in the code.

@dhruvmanila
If I understand correctly, the three slower solutions (slow_solution, slicing_solution, py_solution) should be removed. Am I correct?

@algorithms-keeper algorithms-keeper bot added awaiting reviews This PR is ready to be reviewed and removed awaiting changes A maintainer has requested changes to this PR labels Dec 29, 2023
@ManpreetXSingh
Copy link
Contributor Author

Hi @dhruvmanila,
I've made the requested changes. Please let me know if any additional adjustments are needed.
The get_totients function is modified. The remainder of the solution remains unchanged from its original form.
Also I've put the benchmark results in the PR description.

@ManpreetXSingh ManpreetXSingh closed this by deleting the head repository Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting reviews This PR is ready to be reviewed enhancement This PR modified some existing files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants