Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Included linear + binary search for 1D static arrays #454

Merged
merged 18 commits into from
Dec 12, 2021

Conversation

r-avenous
Copy link
Contributor

@r-avenous r-avenous commented Dec 8, 2021

References to other Issues or PRs or Relevant literature

#446

Brief description of what is fixed or changed

A search function is added.
It has 2 necessary parameters(1. element to search, 2. order of the array) and one optional (3. comparator)

  1. what to search
  2. positive values (int or float allowed, anything else raises type error exception) mean ascending order, negative is descending, 0 is unsorted
  3. function as a parameter; this is for when the usual >, < do not work on the datatype of the array

Other comments

@@ -126,7 +126,7 @@ def __new__(cls, dtype=NoneType, *args, **kwargs):
@classmethod
def methods(cls):
return ['__new__', '__getitem__',
'__setitem__', 'fill', '__len__']
'__setitem__', 'fill', '__len__', 'search']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make it a function (like place in algorithms.py in linear_data_structures) instead of a class method?

Copy link
Contributor Author

@r-avenous r-avenous Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pardon me, shall I include this directly? Since all this does is search in 1D arrays only so I did that. I wanted to include this for dynamic arrays too, but that removal of items just makes a void location and does not shift the items in those void locations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless, I will include a jump search as it was mentioned as an idea in the issue. Again, for 1D static arrays. Is that fine?

Copy link
Member

@czgdp1807 czgdp1807 Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be fine. I am planning to add ImplicitArray
(with same interface as explicit arrays available right now in pydatastructs) so it would be useful to have search functions for that as well. And anyways, conceptually, search is an algorithm that just reads the data from an array. It's not responsible for updating the state of OneDimensionalArray class, so having it as a method is not of much use.

Copy link
Contributor Author

@r-avenous r-avenous Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. So should I just move this change to algorithms.py along with addition of the jump search? Just confirming

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's a good idea. The logic for each type of array should be encapsulated in a single function. The user can decide on their own what they might want to use. Please make separate functions for binary_search, linear_search, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok understood

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh sorry to ask this. Should I make a new PR after doing as told?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed. You can update this one by pushing new commits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the new commit.

@@ -1369,3 +1372,58 @@ def bubble_sort(array, **kwargs):
array._modify(force=True)

return array

def linear_search(array, elem):
for i in range(array._size):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is array.size not working? We shouldn't access private attributes (the ones having _ as prefix) unless really needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no no. While seeing the code, I noticed that only _size is being used, so I used that too

Comment on lines 1383 to 1384
L = 0
R = array._size - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make it explicit then. Always use full names instead of short hand notations. Not many key strokes will be saved while typing, L instead of left.

pydatastructs/linear_data_structures/arrays.py Outdated Show resolved Hide resolved
R = M - 1
return -1

def jump_search(array, elem, comp = None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might have noticed that all the APIs in this file have start, end, backend as kwargs arguments. Is there any specific reason for not adding the same in all the APIs you have added.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am afraid of making mistakes since I dont have much expertise, so I did the way I was confident in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for coming short

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any issues?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. All the APIs should be consistent to the maximum extent possible. Now, here, start, end and backend should be included and the implementation should be modified to use these parameters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you had to first pull the changes from forArrays branch. git pull your_remote_name forArrays. That will fetch the suggestion you applied here, then you will be able to push to your forArrays branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there still some issue? Since this is marked as a draft.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for all your APIs. See, tests/test_algorithms.py file under, linear_data_structures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc string are also to be added. See other algorithms in the same file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On it

@r-avenous r-avenous closed this Dec 9, 2021
@r-avenous r-avenous deleted the forArrays branch December 9, 2021 03:33
@r-avenous r-avenous restored the forArrays branch December 9, 2021 06:12
@r-avenous r-avenous reopened this Dec 9, 2021
@czgdp1807 czgdp1807 marked this pull request as draft December 9, 2021 07:06
Left = start
Right = end
while Left <= Right:
Middle = int((Left+Right)/2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about, Left//2 + Right//2?

Copy link
Contributor Author

@r-avenous r-avenous Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think (Left + Right)//2 is fine but not too sure about the sum of floors, for example, when left = 3, right = 5(when both odd, can create problems), it will give 3 as middle

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the loss won't be much. There are four cases,

  1. Left - Odd, Right - Odd - Middle will be lesser (by 1) in Left//2 + Right//2 as compared to current approach.
  2. Left - Odd, Right - Even or Right - Odd, Left - Even - Middle will be exactly the same as in current approach.
  3. Left - Even, Right - Even - Middle will again be exactly the same as in current approach.

The reason is (Left + Right) might get too large for implicit arrays leading to slow down in performance (Python uses big integers for handling larger than 64 bit integers). So, basically if you can just add 1 for Left - Odd, Right - Odd in Left//2 + Right//2 and do only Left//2 + Right//2 for other cases then it should be fine. In fact,

Middle = Left//2 + Right//2 + Left % 2 * Right % 2 should work. Worth trying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this seems much better, I shall do this

Comment on lines 341 to 345
ODA = OneDimensionalArray

array = ODA(int, [1, 2, 5, 7, 10, 29, 40])
for i in range(len(array)):
assert i == jump_search(array, array[i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about corner cases like, empty array, one element array? And see _test_common_sort above. Repetition of tests can be avoided by doing something similar here.

Copy link
Contributor Author

@r-avenous r-avenous Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking for the test function or the actual algorithm written?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. You are using the same test for every search algorithm. So why not write a common test function, _test_common_search(search_function_api) and then in test_jump_search (for example),

def test_jump_search():
    _test_common_search(jump_search)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I will do this

@czgdp1807 czgdp1807 marked this pull request as ready for review December 9, 2021 13:39
@codecov
Copy link

codecov bot commented Dec 9, 2021

Codecov Report

Merging #454 (1a3439b) into master (f82f52d) will increase coverage by 0.015%.
The diff coverage is 100.000%.

@@              Coverage Diff              @@
##            master      #454       +/-   ##
=============================================
+ Coverage   98.588%   98.603%   +0.015%     
=============================================
  Files           29        29               
  Lines         3754      3795       +41     
=============================================
+ Hits          3701      3742       +41     
  Misses          53        53               
Impacted Files Coverage Δ
pydatastructs/linear_data_structures/__init__.py 100.000% <ø> (ø)
pydatastructs/linear_data_structures/algorithms.py 99.765% <100.000%> (+0.024%) ⬆️

Impacted file tree graph

@czgdp1807
Copy link
Member

czgdp1807 commented Dec 9, 2021

See this. The red lines are not covered. Add tests for covering them as well.

@r-avenous
Copy link
Contributor Author

See this. The red lines are not covered. Add tests for covering them as well.

Its showing that bubble sort is not covered, but it is, I did not touch anything other than searching which I added

@czgdp1807
Copy link
Member

Its showing that bubble sort is not covered, but it is, I did not touch anything other than searching which I added

That's the red (lighter shade) of diff. If you scroll and click on Click to load diff then you will see dark red lines. These are the ones that you have to cover by adding tests.

@r-avenous
Copy link
Contributor Author

Its showing that bubble sort is not covered, but it is, I did not touch anything other than searching which I added

That's the red (lighter shade) of diff. If you scroll and click on Click to load diff then you will see dark red lines. These are the ones that you have to cover by adding tests.

Ok on it. But still, bubble sort and another line of importing is in deep red, what about it?

@czgdp1807
Copy link
Member

bubble sort and another line of importing is in deep red, what about it?

Ignore. Though you can generate the same report locally if Codecov UI is confusing. See the testing section of README. Once you execute the command there, you will have to open the index.html file in html/cov in a browser.

@r-avenous
Copy link
Contributor Author

r-avenous commented Dec 9, 2021

bubble sort and another line of importing is in deep red, what about it?

Ignore. Though you can generate the same report locally if Codecov UI is confusing. See the testing section of README. Once you execute the command there, you will have to open the index.html file in html/cov in a browser.

Can I use Codecov? I am comfortable with it

@czgdp1807
Copy link
Member

Sure.

@r-avenous
Copy link
Contributor Author

Sure.

Is there any other issue?

@czgdp1807
Copy link
Member

czgdp1807 commented Dec 9, 2021

I will push some changes to your branch forArrays may be on Sunday. I don't think there is anything else to be done from your side.

@r-avenous
Copy link
Contributor Author

I will push some changes to your branch forArrays may be on Sunday. I don't think there is anything else to be done from your side.

Please let me know if there is any error. Shall I see if I can contribute something else in the project?

@czgdp1807
Copy link
Member

Shall I see if I can contribute something else in the project?

Sure go ahead.

@czgdp1807 czgdp1807 dismissed their stale review December 12, 2021 13:15

Pushed by me.

@czgdp1807 czgdp1807 merged commit 87e0d56 into codezonediitj:master Dec 12, 2021
@czgdp1807
Copy link
Member

Thanks @skullVoid

@r-avenous
Copy link
Contributor Author

Thanks @skullVoid

I didnt do much. Thank you for guiding me, this was a good experience for me as a first timer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants