DOC: correct docstring examples (#3439) #16432
Conversation
pandas/core/reshape/reshape.py
Outdated
0 0.548814 0.544883 0.437587 0.383442
1 0.715189 0.423655 0.891773 0.791725
2 0.602763 0.645894 0.963663 0.528895
|
I think you have to remove this line (even though it seems like it should be there, based on the output).
aw man ... WTF?^^ lol
Thanks for working on this!
pandas/core/reshape/reshape.py
Outdated
@@ -1129,8 +1134,7 @@ def get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False,
1 0 1 0
2 0 0 1

- >>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
-                        'C': [1, 2, 3]})
+ >>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]})
This will give a linting error (we check for PEP8).
You can solve this by starting the continuation line with `...` (then it should still pass the doctests):
>>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
... 'C': [1, 2, 3]})
Specifically, the lines should be less than 80 characters wide.
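A minimal sketch of the continuation-prompt fix, kept dependency-free: a plain dict stands in for `pd.DataFrame` (an assumption so the snippet runs without pandas), and the names `build_frame_input` and `check_docstring` are hypothetical. It checks that a statement wrapped with `...` still runs as one doctest example and that every docstring line stays under 80 characters.

```python
import doctest


def build_frame_input():
    """Return column data, written as a two-line doctest example.

    >>> data = {'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
    ...         'C': [1, 2, 3]}
    >>> sorted(data)
    ['A', 'B', 'C']
    """
    return {'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]}


def check_docstring(func):
    """Return (failed doctest examples, longest docstring line length)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Swallow the failure report; only the counts matter here.
    failed = runner.run(test, out=lambda text: None).failed
    longest = max(len(line) for line in func.__doc__.splitlines())
    return failed, longest


failed, longest = check_docstring(build_frame_input)
print(failed, longest)
```

Without the `...` prefix, doctest would treat the second line as expected output rather than as part of the statement, so the example would fail.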
aw man .... I did that to several ...any suggested way to fix it?
like in mass? lol
@@ -940,6 +941,7 @@ def wide_to_long(df, stubnames, i, j, sep="", suffix='\d+'):
8 3 3 2.1 2.9
>>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age')
>>> l
+ ... # doctest: +NORMALIZE_WHITESPACE
can you give some explanation why this is needed in this case ?
@jorisvandenbossche I think the issue was that when there's a df.index.name, the output has whitespace for the rest of that line (which we don't want to include in the source).
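A small sketch of why the directive matters, using only the standard library. The printed text here is a made-up stand-in for a DataFrame repr whose index-name line is padded with spaces, and the helper `failures` is hypothetical. The expected output in both snippets has the padding stripped, as it would be once saved in a source file.

```python
import doctest


def failures(snippet):
    """Run a doctest snippet and return how many of its examples failed."""
    test = doctest.DocTestParser().get_doctest(snippet, {}, "snippet", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test, out=lambda text: None).failed


# The printed header line ends in padding spaces, but the expected output
# below it was saved without them.
strict = r"""
>>> print('age   \n1')
age
1
"""

# Same example, with the directive placed on the statement line.
relaxed = r"""
>>> print('age   \n1')  # doctest: +NORMALIZE_WHITESPACE
age
1
"""

print(failures(strict), failures(relaxed))
```

The strict snippet fails on the trailing spaces alone, while the relaxed one passes because doctest collapses runs of whitespace on both sides before comparing.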
Aha, ...
Is that actually something we might want to solve in pandas, in the repr? (It's not a bug, but I also don't think it is a feature someone relies upon, so we could change it, if that is easy, and not have to do this here.) But that is certainly for another PR.
Or we could set this flag globally if this occurs a lot? (if that is possible)
That's what @TomAugspurger was thinking, but I don't think he knows either?
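On the question of setting the flag globally: the standard-library doctest runner accepts option flags for a whole run, so a per-example directive is not the only route. The sketch below is one way to see this, assuming a hypothetical function `padded_header` whose output has trailing spaces; how the pandas doc build would pass such flags is not shown in this thread.

```python
import doctest


def padded_header():
    """Print a header that is padded with trailing spaces.

    >>> padded_header()
    age
    1
    """
    print("age   \n1")


def failures(func, optionflags=0):
    """Run func's docstring examples with runner-wide option flags."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    return runner.run(test, out=lambda text: None).failed


default_failures = failures(padded_header)
global_failures = failures(padded_header, doctest.NORMALIZE_WHITESPACE)
print(default_failures, global_failures)
```

`doctest.testmod` and `doctest.testfile` take the same `optionflags` argument, so the flag can be applied once per module or file instead of once per example.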
pandas/core/reshape/reshape.py
Outdated
@@ -689,7 +689,7 @@ def _convert_level_number(level_num, columns):
new_labels = [np.arange(N).repeat(levsize)]
new_names = [this.index.name] # something better?

- new_levels.append(level_vals)
+ new_levels.append(frame.columns.levels[level_num])
This is an actual change in the code. Is this to fix a bug? If so, could you do this as a separate PR (and add a test for it)?
I didn't make that change so I don't know what that is
Since it is in your commit, you somehow made that change :-)
But if it was not the intent, you can just revert it (change it back to how it was based on the diff you see here)
I didn't - I swear I didn't. I don't even know what that means. I'll change it back, but I promise it wasn't me.
pandas/core/reshape/reshape.py
Outdated
>>> s.unstack(level=0)
one two
a 1 2
b 3 4
a 1.0 3.0
Ah, this was actually an error in the example!
Oh no! So ...... how do I fix it? Cause when I try to change it, I get an error
No, no, now it is correct! So your change is perfect. I just noticed that by running the doctests, we actually corrected some errors in the docs, which is its purpose, so that is good :-)
pandas/core/reshape/reshape.py
Outdated
two a 3
b 4
one a 1.0
b 2.0
Another option would be to change the series construction: use np.arange(1, 5) instead of np.arange(1.0, 5.0).
I think using integers makes the example slightly simpler.
Agreed - working on this now.
done - will be committing soon
pandas/core/reshape/reshape.py
Outdated
X id
0 0 0
1 1 1
2 2 2
2 1 2

>>> pd.wide_to_long(df, ['A(quarterly)', 'B(quarterly)'],
i='id', j='year', sep='-')
X A(quarterly) B(quarterly)
Is it possible that the X column below is not correct? (it has no 1's)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
X is random ... which didn't make sense to me.
>>> df = pd.DataFrame({'A(quarterly)-2010': np.random.rand(3),
... 'A(quarterly)-2011': np.random.rand(3),
... 'B(quarterly)-2010': np.random.rand(3),
... 'B(quarterly)-2011': np.random.rand(3),
... 'X' : np.random.randint(3, size=3)})
>>> df['id'] = df.index
>>> df
... # doctest: +NORMALIZE_WHITESPACE
A(quarterly)-2010 A(quarterly)-2011 B(quarterly)-2010 B(quarterly)-2011 \
0 0.548814 0.544883 0.437587 0.383442
1 0.715189 0.423655 0.891773 0.791725
2 0.602763 0.645894 0.963663 0.528895
X id
0 0 0
1 1 1
2 1 2
You can either change X to something not random, or leave it as is (with the random seed, it is also consistent), but the output below should match the input (which I think is not the case now).
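A short sketch of the point about the seed, assuming the legacy global `np.random.seed` interface and a seed value of 0 (the thread does not state which seed the docstring uses). The helper `make_columns` is hypothetical and builds only two of the example's columns.

```python
import numpy as np


def make_columns(seed):
    """Build two of the random example columns after fixing the seed."""
    np.random.seed(seed)
    return {
        'A(quarterly)-2010': np.random.rand(3),
        'X': np.random.randint(3, size=3),
    }


first = make_columns(0)
second = make_columns(0)

# Same seed and same call order: the values repeat exactly.
same = all(np.array_equal(first[key], second[key]) for key in first)
print(same)
```

Because the values repeat, the X column shown in the input frame and the X column shown in the `wide_to_long` output can be checked against each other, even though X is drawn at random.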
pandas/core/reshape/reshape.py
Outdated
1 0.634401 0.611024 0.361789 0.630976
2 0.849432 0.722443 0.228263 0.092105
\
... # doctest: +NORMALIZE_WHITESPACE
You can also put this on the previous line (like the first example in https://docs.python.org/2/library/doctest.html#directives)
I think it was too long then :)
I meant
>>> df # doctest: +NORMALIZE_WHITESPACE
which would not be too long in this case
ok - I'll add in next commit
pandas/core/reshape/reshape.py
Outdated
b 2
two a 3
b 4
dtype: int32
This one will depend on the platform you are using, so to be more robust, we should probably add the dtype to np.arange. I would use 'int64', as this is the default in pandas (in numpy the default is platform dependent).
Or, as an alternative, use range(1, 5); a pandas Series will then always convert this to int64.
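The two suggestions can be compared side by side. This is a minimal sketch with a flat Series rather than the MultiIndex from the docstring; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Default integer dtype: may differ by platform and NumPy version.
platform_default = np.arange(1, 5)

# Pinning the dtype removes that dependence.
explicit = np.arange(1, 5, dtype='int64')

# A Series built from a Python range is stored as int64.
from_range = pd.Series(range(1, 5))

print(platform_default.dtype, explicit.dtype, from_range.dtype)
```

Either of the last two forms should make the `dtype:` line in the docstring output the same wherever the doctest runs.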
As per our discussion, I left it at int32. np.arange and range() did not seem to make a difference, so I kept np.arange.
What does it mean when continuous-integration/travis-ci/py fails?
In this case it is due to some linting errors. Some style issues, based on PEP8 and flake8, are checked in the third build on travis, which is the one that is failing. If you go to travis, click on the failing build, and scroll down, you can see why.
See http://pandas.pydata.org/pandas-docs/stable/contributing.html#python-pep8 (I would recommend setting up the editor you are using to check for this as well; that should be possible with most code editors).
I don't see any way (I may not have searched correctly) to check PEP8 in Notepad++.
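For editors without a PEP8 plugin, a rough stand-in for the line-length part of the lint check can be written in a few lines. This is only a sketch: the function `long_lines` is hypothetical, it covers line length and nothing else, and it is not how the travis build runs its checks.

```python
def long_lines(source, limit=79):
    """Return (line number, length) for every line longer than limit."""
    return [(number, len(line))
            for number, line in enumerate(source.splitlines(), start=1)
            if len(line) > limit]


# The one-line example from the review, followed by the wrapped version.
sample = (
    ">>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], "
    "'C': [1, 2, 3]})\n"
    ">>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],\n"
    "...                    'C': [1, 2, 3]})\n"
)
print(long_lines(sample))
```

Only the unwrapped first line is reported; the wrapped version stays within the limit.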
Codecov Report
@@ Coverage Diff @@
## master #16432 +/- ##
==========================================
- Coverage 90.79% 90.42% -0.38%
==========================================
Files 161 161
Lines 51063 51025 -38
==========================================
- Hits 46363 46139 -224
- Misses 4700 4886 +186
pandas/core/reshape/pivot.py
Outdated
... "one", "two", "two", "two", "one"], dtype=object)
>>> c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny",
... "shiny", "dull", "shiny", "shiny", "shiny"],
... dtype=object)

>>> crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
Can you make this pd.crosstab instead of crosstab?
done - will be in next commit
pandas/core/reshape/tile.py
Outdated
>>> pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]), 3,
labels=["good","medium","bad"])
>>> result, bins = pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]),
... 3, labels=["good","medium","bad"])
Here, you still would need
>>> result
... [output]
>>> bins
... [output]
I'm not sure what that would look like :(
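A sketch of what inspecting the two names could look like outside a docstring. Note one assumption: `retbins=True` is passed here, since that is the `pd.cut` argument that makes it return a `(result, bins)` pair; the diff snippet above does not show it.

```python
import numpy as np
import pandas as pd

values = np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1])

# retbins=True makes pd.cut return both the labelled values and the edges.
result, bins = pd.cut(values, 3, labels=["good", "medium", "bad"],
                      retbins=True)

print(list(result))   # the label assigned to each value
print(len(bins))      # three bins are delimited by four edges
```

In the docstring itself, each name would go on its own `>>>` line (`>>> result`, then `>>> bins`), with the repr that the interpreter prints pasted underneath as the expected output.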
Once confirmed, pivot completed.
232f82d
to
ffd3e3c
Compare
Just pushed some fixes for the pep8 failures, and enabled the doctests on the doc build. We'll check the output of https://travis-ci.org/pandas-dev/pandas/jobs/236410485 when it finishes.
@ProsperousHeart Thanks a lot for working on this! The AppVeyor failure is the matplotlib one, so we can ignore it. There is one failing doctest, but that seems to be a Python version issue (I get different results when testing locally on different Python versions as well). I will do another PR to check whether changing the doc build to Python 3.6 solves things.
xref #3439