DOC: correct docstring examples (#3439) #16432

ProsperousHeart · 2017-05-22T19:48:17Z

TomAugspurger · 2017-05-22T20:11:24Z

pandas/core/reshape/reshape.py

+    0           0.548814           0.544883           0.437587           0.383442
+    1           0.715189           0.423655           0.891773           0.791725
+    2           0.602763           0.645894           0.963663           0.528895
+


I think you have to remove this line (even though it seems like it should be there, based on the output).

aw man ... WTF?^^ lol
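
A note on why the line probably has to go: in a doctest, an empty line ends the expected output, so an empty line printed by the code must either be written as `<BLANKLINE>` or be absorbed by `NORMALIZE_WHITESPACE`. The sketch below illustrates this with a hypothetical `show` function standing in for the wrapped DataFrame repr. It assumes this is the reason behind the comment above, which the thread does not spell out.

```python
import doctest


def show():
    # Hypothetical stand-in for a repr that contains an empty line.
    print("first chunk\n\nsecond chunk")


def failures(docstring, flags=0):
    """Run the examples in `docstring` and return how many of them fail."""
    test = doctest.DocTestParser().get_doctest(
        docstring, {"show": show}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=flags)
    # Discard the failure report; only the count matters here.
    return runner.run(test, out=lambda text: None).failed


# Expected output written without the empty line.
WITHOUT_BLANK = """
>>> show()
first chunk
second chunk
"""

# Expected output with the explicit marker for an empty line.
WITH_MARKER = """
>>> show()
first chunk
<BLANKLINE>
second chunk
"""

plain = failures(WITHOUT_BLANK)
normalized = failures(WITHOUT_BLANK, doctest.NORMALIZE_WHITESPACE)
marked = failures(WITH_MARKER)
print(plain, normalized, marked)
```

Only the strict comparison without the marker fails; the other two variants pass.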

jorisvandenbossche

Thanks for working on this!

jorisvandenbossche · 2017-05-22T20:19:02Z

pandas/core/reshape/reshape.py

@@ -1129,8 +1134,7 @@ def get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False,
    1  0  1    0
    2  0  0    1

-    >>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
-                        'C': [1, 2, 3]})
+    >>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]})


This will give a linting error (we check for PEP8)

You can solve this by breaking the line and starting the continuation line with ... (then it should pass the doctests):

>>> df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
...                    'C': [1, 2, 3]})

Specifically, the lines should be less than 80 characters wide.
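
As a minimal sketch of the point above, the standard `doctest` module accepts a statement split over several lines as long as each continuation line starts with `...`. The example uses a plain dict instead of a DataFrame so that it only needs the standard library; the name `DOCSTRING` is made up for illustration.

```python
import doctest

# A statement wrapped to stay under 80 characters per line.
DOCSTRING = """
>>> data = {'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'],
...         'C': [1, 2, 3]}
>>> sorted(data)
['A', 'B', 'C']
"""

test = doctest.DocTestParser().get_doctest(DOCSTRING, {}, "wrapped", None, 0)
result = doctest.DocTestRunner(verbose=False).run(test)

# Length of the longest source line, to compare against the PEP8 limit.
longest = max(len(line) for line in DOCSTRING.splitlines())
print(result.failed, result.attempted, longest)
```

Both examples run without failures and no line exceeds 79 characters.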

aw man .... I did that to several ...any suggested way to fix it?

like in mass? lol

jorisvandenbossche · 2017-05-22T20:19:40Z

pandas/core/reshape/reshape.py

@@ -940,6 +941,7 @@ def wide_to_long(df, stubnames, i, j, sep="", suffix='\d+'):
    8      3      3  2.1  2.9
    >>> l = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age')
    >>> l
+    ... # doctest: +NORMALIZE_WHITESPACE


can you give some explanation why this is needed in this case ?

@jorisvandenbossche I think the issue was when there's a df.index.name the output has whitespace for the rest of that line (which we don't want to include in the source)

Aha, ...
Is that actually something we might want to solve in pandas, in the repr? It's not a bug, but I also don't think it is a feature anyone relies on, so we could change it (if that is easy) and then would not have to do this here. But that is certainly for another PR.

Or we could set this flag globally if this occurs a lot? (if that is possible)

That's what @TomAugspurger was thinking, but I don't think he knows either?
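
On the question of setting the flag globally: the standard `doctest` runner takes an `optionflags` argument (as does `doctest.testmod`), so a single runner can apply `NORMALIZE_WHITESPACE` to every example. How the pandas doc build would wire this up is not stated in the thread. The sketch below uses a hypothetical `show_padded` function whose first output line ends in spaces, similar to the header line described above.

```python
import doctest


def show_padded():
    # Hypothetical output whose first line ends in trailing spaces.
    print("name      \nvalue  1")


# The expected output is written without the trailing spaces.
DOCSTRING = """
>>> show_padded()
name
value  1
"""


def failures(flags):
    """Run DOCSTRING with the given runner-wide option flags."""
    test = doctest.DocTestParser().get_doctest(
        DOCSTRING, {"show_padded": show_padded}, "padded", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=flags)
    # Discard the failure report; only the count matters here.
    return runner.run(test, out=lambda text: None).failed


strict = failures(0)
relaxed = failures(doctest.NORMALIZE_WHITESPACE)
print(strict, relaxed)
```

With the flag set on the runner, no per-example directive is needed.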

jorisvandenbossche · 2017-05-22T20:21:13Z

pandas/core/reshape/reshape.py

@@ -689,7 +689,7 @@ def _convert_level_number(level_num, columns):
        new_labels = [np.arange(N).repeat(levsize)]
        new_names = [this.index.name]  # something better?

-    new_levels.append(level_vals)
+    new_levels.append(frame.columns.levels[level_num])


This is an actual change in the code. Is this to fix a bug? If so, could you do this as a separate PR (and add a test for it)?

I didn't make that change so I don't know what that is

Since it is in your commit, you somehow made that change :-)
But if it was not the intent, you can just revert it (change it back to how it was based on the diff you see here)

I didn't - I swear I didn't. I don't even know what that means. I'll change it back, but I promise it wasn't me.

jorisvandenbossche · 2017-05-22T20:22:31Z

pandas/core/reshape/reshape.py


    >>> s.unstack(level=0)
       one  two
-    a  1   2
-    b  3   4
+    a  1.0  3.0


Ah, this was actually an error in the example!

Oh no! So ...... how do I fix it? Cause when I try to change it, I get an error

No, no, now it is correct! So your change is perfect. I just noticed that by running the doctests, we actually corrected some errors in the docs, which is its purpose, so that is good :-)
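
To see why the old docstring output was wrong, here is a small sketch that rebuilds a Series like the one in the example. The index construction is an assumption based on the output shown in the diff, and it needs NumPy and pandas installed.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example Series with a two-level index.
index = pd.MultiIndex.from_tuples(
    [("one", "a"), ("one", "b"), ("two", "a"), ("two", "b")])
s = pd.Series(np.arange(1.0, 5.0), index=index)

# level=0 moves the outer labels ('one', 'two') into the columns,
# so each row holds the values for one inner label ('a' or 'b').
wide = s.unstack(level=0)
row_a = wide.loc["a"].tolist()
row_b = wide.loc["b"].tolist()
print(wide)
```

Row `a` therefore contains 1.0 and 3.0, which matches the corrected line in the diff.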

jorisvandenbossche · 2017-05-22T20:23:45Z

pandas/core/reshape/reshape.py

-    two  a   3
-         b   4
+    one  a    1.0
+         b    2.0


Another option would be to change the series construction: use np.arange(1, 5) instead of np.arange(1.0, 5.0).

I think using integers makes the example slightly simpler

Agreed - working on this now.

done - will be committing soon

jorisvandenbossche · 2017-05-22T20:35:13Z

pandas/core/reshape/reshape.py

       X  id
    0  0   0
    1  1   1
-    2  2   2
+    2  1   2
+
    >>> pd.wide_to_long(df, ['A(quarterly)', 'B(quarterly)'],
                        i='id', j='year', sep='-')
             X     A(quarterly)  B(quarterly)


Is it possible that the X column below is not correct? (it has no 1's)

X is random ... which didn't make sense to me.

>>> df = pd.DataFrame({'A(quarterly)-2010': np.random.rand(3),
...                    'A(quarterly)-2011': np.random.rand(3),
...                    'B(quarterly)-2010': np.random.rand(3),
...                    'B(quarterly)-2011': np.random.rand(3),
...                    'X' : np.random.randint(3, size=3)})
>>> df['id'] = df.index
>>> df
... # doctest: +NORMALIZE_WHITESPACE
   A(quarterly)-2010  A(quarterly)-2011  B(quarterly)-2010  B(quarterly)-2011  \
0           0.548814           0.544883           0.437587           0.383442
1           0.715189           0.423655           0.891773           0.791725
2           0.602763           0.645894           0.963663           0.528895
   X  id
0  0   0
1  1   1
2  1   2

You can either change X to something not random, or leave it as is (with the random seed, it is also consistent), but the output below should just match the input (which I think is not the case now).
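
A short sketch of the "with the random seed, it is also consistent" remark: seeding NumPy's global generator before drawing makes the "random" columns identical on every run. The thread does not state which seed the docstring uses; seed 0 is an assumption here, chosen because it reproduces the first column shown above. `draw_columns` is a hypothetical helper.

```python
import numpy as np


def draw_columns(seed):
    """Draw four columns of three random floats after seeding NumPy."""
    np.random.seed(seed)
    return [np.random.rand(3).round(6).tolist() for _ in range(4)]


# Two independent runs with the same seed give the same numbers.
first = draw_columns(0)
second = draw_columns(0)
print(first[0])
```

Because the values repeat exactly, the printed DataFrame in a docstring can be checked by the doctest as long as the seed call is part of the example.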

jorisvandenbossche · 2017-05-22T20:41:18Z

pandas/core/reshape/reshape.py

-    1          0.634401          0.611024          0.361789          0.630976
-    2          0.849432          0.722443          0.228263          0.092105
-    \
+    ... # doctest: +NORMALIZE_WHITESPACE


You can also put this on the previous line (like the first example in https://docs.python.org/2/library/doctest.html#directives)

I think it was too long then :)

I meant

>>> df # doctest: +NORMALIZE_WHITESPACE

which would not be too long in this case
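
Both placements of the directive discussed here are accepted by the standard `doctest` module. The sketch below runs each form once, using a hypothetical `show_padded` function in place of `df`.

```python
import doctest


def show_padded():
    # Hypothetical stand-in for `df`: the header line has trailing spaces.
    print("col_a  col_b   \n1      2")


DOCSTRING = """
>>> show_padded()  # doctest: +NORMALIZE_WHITESPACE
col_a  col_b
1      2
>>> show_padded()
... # doctest: +NORMALIZE_WHITESPACE
col_a  col_b
1      2
"""

test = doctest.DocTestParser().get_doctest(
    DOCSTRING, {"show_padded": show_padded}, "directive", None, 0)
result = doctest.DocTestRunner(verbose=False).run(test)
print(result.failed, result.attempted)
```

The same-line form is shorter; the continuation-line form helps when the statement itself is already close to the line-length limit.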

ok - I'll add in next commit

jorisvandenbossche · 2017-05-22T20:49:30Z

pandas/core/reshape/reshape.py

+         b    2
+    two  a    3
+         b    4
+    dtype: int32


This one will depend on the platform you are using, so to be more robust we should add the dtype to np.arange. I would use 'int64', as this is the default in pandas (in numpy the default is platform dependent).
Alternatively, use range(1, 5); a pandas Series will then always convert this to int64.

As per our discussion, left at int32 - np.arange and range() did not seem to make a difference so I kept it at np.arange
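
For reference, a minimal sketch of the reviewer's suggestion, assuming NumPy and pandas are installed. Passing `dtype` to `np.arange` pins the integer width regardless of platform. The `range` variant is printed but not relied on, since the author reports it made no difference on their machine.

```python
import numpy as np
import pandas as pd

# NumPy's default integer type is platform dependent, so pin it explicitly.
explicit = pd.Series(np.arange(1, 5, dtype="int64"))

# The reviewer's alternative: let pandas convert a built-in range.
from_range = pd.Series(range(1, 5))

print(explicit.dtype, from_range.dtype)
```

With the explicit dtype, the `dtype: int64` line in the docstring output no longer depends on where the doctest runs.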

ProsperousHeart · 2017-05-22T22:27:59Z

What does it mean when the continuous-integration/travis-ci/py failed?

jorisvandenbossche · 2017-05-22T22:33:16Z

In this case it is due to some linting errors. Some style issues, based on PEP8 and flake8, are checked in the third build on travis, which is the one that is failing. If you go to travis, click on the failing build, and scroll down, you can see why.
The output from travis:

$ ci/lint.sh

inside ci/lint.sh
Linting  *.py
pandas/core/reshape/reshape.py:992:80: E501 line too long (84 > 79 characters)
pandas/core/reshape/reshape.py:993:80: E501 line too long (81 > 79 characters)
pandas/core/reshape/reshape.py:994:80: E501 line too long (81 > 79 characters)
pandas/core/reshape/reshape.py:995:80: E501 line too long (81 > 79 characters)
pandas/core/reshape/reshape.py:1000:1: W293 blank line contains whitespace
pandas/core/reshape/reshape.py:1001:80: E501 line too long (88 > 79 characters)
pandas/core/reshape/reshape.py:1015:80: E501 line too long (117 > 79 characters)
pandas/core/reshape/reshape.py:1133:71: W291 trailing whitespace
Linting *.py DONE
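
The three codes in this log can be approximated locally with a few lines of Python. This is a rough sketch of what the checks look for, not the actual flake8 implementation; `lint_lines` and `MAX_LINE_LENGTH` are names made up for illustration.

```python
MAX_LINE_LENGTH = 79


def lint_lines(text):
    """Return (line_number, code) pairs for the three checks in the log."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "E501"))  # line too long
        if line and not line.strip():
            problems.append((number, "W293"))  # blank line contains whitespace
        elif line != line.rstrip():
            problems.append((number, "W291"))  # trailing whitespace
    return problems


# One clean line, one long line, one whitespace-only line, one padded line.
sample = "x = 1\n" + "y = " + "1" * 80 + "\n" + "   \n" + "z = 2  \n"
found = lint_lines(sample)
print(found)
```

Running the real linter locally remains the reliable check, since it covers many more rules than these three.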

jorisvandenbossche · 2017-05-22T22:34:29Z

See http://pandas.pydata.org/pandas-docs/stable/contributing.html#python-pep8 (I would recommend setting up the editor you are using to check for this as well; that should be possible with most code editors)

ProsperousHeart · 2017-05-22T22:49:40Z

I don't see any way to do that for Notepad++ and PEP8 (I may not have searched correctly)

codecov · 2017-05-23T16:55:56Z

Codecov Report

Merging #16432 into master will decrease coverage by 0.37%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #16432      +/-   ##
==========================================
- Coverage   90.79%   90.42%   -0.38%     
==========================================
  Files         161      161              
  Lines       51063    51025      -38     
==========================================
- Hits        46363    46139     -224     
- Misses       4700     4886     +186

Flag	Coverage Δ
#multiple	`88.26% <ø> (-0.38%)`	⬇️
#single	`40.17% <ø> (+0.01%)`	⬆️

Impacted Files	Coverage Δ
pandas/core/reshape/pivot.py	`95.08% <ø> (ø)`	⬆️
pandas/core/reshape/reshape.py	`99.28% <ø> (-0.01%)`	⬇️
pandas/core/reshape/concat.py	`97.62% <ø> (ø)`	⬆️
pandas/core/reshape/tile.py	`90.25% <ø> (ø)`	⬆️
pandas/io/formats/excel.py	`74.24% <0%> (-22.41%)`	⬇️
pandas/io/excel.py	`62.31% <0%> (-18.33%)`	⬇️
pandas/conftest.py	`95.83% <0%> (-0.6%)`	⬇️
pandas/io/parsers.py	`95.32% <0%> (-0.35%)`	⬇️
pandas/core/window.py	`96.24% <0%> (-0.24%)`	⬇️
pandas/util/testing.py	`80.79% <0%> (-0.2%)`	⬇️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b0d9ee0...ffd3e3c. Read the comment docs.

jorisvandenbossche · 2017-05-23T17:35:03Z

pandas/core/reshape/pivot.py

+    ...               "one", "two", "two", "two", "one"], dtype=object)
+    >>> c = np.array(["dull", "dull", "shiny", "dull", "dull", "shiny",
+    ...               "shiny", "dull", "shiny", "shiny", "shiny"], 
+    ...               dtype=object)

    >>> crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])


Can you make this pd.crosstab instead of crosstab ?

done - will be in next commit
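
Writing `pd.crosstab` lets the example run with nothing but the usual `import pandas as pd` in scope. A minimal sketch with small hypothetical arrays (shorter than the ones in the docstring), assuming NumPy and pandas are installed:

```python
import numpy as np
import pandas as pd

# Small hypothetical arrays, not the ones from the docstring.
a = np.array(["foo", "foo", "bar", "bar", "foo"], dtype=object)
b = np.array(["one", "two", "one", "one", "one"], dtype=object)

# Count how often each (a, b) combination occurs.
table = pd.crosstab(a, b, rownames=["a"], colnames=["b"])
foo_one = int(table.loc["foo", "one"])
bar_two = int(table.loc["bar", "two"])
print(table)
```

Combinations that never occur, such as ("bar", "two") here, show up as 0 in the table.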

jorisvandenbossche · 2017-05-23T17:37:15Z

pandas/core/reshape/tile.py

-    >>> pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]), 3,
-               labels=["good","medium","bad"])
+    >>> result, bins = pd.cut(np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1]),
+    ...                                  3, labels=["good","medium","bad"])


Here, you still would need

>>> result
... [output]
>>> bins
... [output]

I'm not sure what that would look like :(
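
One way to see what that would look like is to run the call and inspect both objects. The sketch below assumes `retbins=True` is passed, which is what makes `pd.cut` return a pair; that keyword is not visible in the diff hunk above. It needs NumPy and pandas installed.

```python
import numpy as np
import pandas as pd

values = np.array([.2, 1.4, 2.5, 6.2, 9.7, 2.1])

# retbins=True is an assumption; it makes pd.cut return (result, bins).
result, bins = pd.cut(values, 3, labels=["good", "medium", "bad"],
                      retbins=True)

labels = list(result)
print(labels)
print(bins)
```

In the docstring, each of `>>> result` and `>>> bins` would then be followed by whatever its repr prints, so the doctest has output to compare against.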

ProsperousHeart · 2017-05-23T18:01:15Z

Once confirmed, pivot completed.

TomAugspurger · 2017-05-26T15:09:44Z

Just pushed some fixes for the pep8 failures, and enabled the doctests on the doc build. We'll check the output of https://travis-ci.org/pandas-dev/pandas/jobs/236410485 when it finishes.

jorisvandenbossche · 2017-05-31T09:37:29Z

@ProsperousHeart Thanks a lot for working on this!
Merging this now (follow-up PRs to do further improvements are always welcome!)

The appveyor failure is the matplotlib one, so ignoring that one.

There is one failing doctest, but that seems to be a Python version issue (I also get different results when testing it locally on different Python versions). Will do another PR to check if changing the doc build to py 3.6 solves things.

TomAugspurger added the Docs label May 22, 2017

TomAugspurger added this to the 0.21.0 milestone May 22, 2017

TomAugspurger reviewed May 22, 2017

jorisvandenbossche reviewed May 22, 2017

jorisvandenbossche changed the title ~~1st update for issue 3439~~ DOC: correct docstring examples (#3439) May 22, 2017

jorisvandenbossche reviewed May 22, 2017

jorisvandenbossche reviewed May 23, 2017

ProsperousHeart mentioned this pull request May 23, 2017

BUG: cut does not respect order of passed labels #16459

Closed

ProsperousHeart and others added 14 commits May 26, 2017 09:49

1st update for issue 3439

5608ed9

2nd update for issue 3439

10dda38

3rd update for issue 3439

cdf1197

4th update for issue 3439

02e0fcc

5th update for issue 3439

20fe63a

6th update for issue 3439

a19a729

7th update for issue 3439

5d972ad

8th update for issue 3439

5379089

8th update for issue 3439

f7f3289

9th update for issue 3439

891d42b

9th update for issue 3439

a701026

10th update for issue 3439

7105f0a

PEP8 fixes

24a0f1c

Enable doc build

ffd3e3c

TomAugspurger force-pushed the reshape-3439 branch from 232f82d to ffd3e3c Compare May 26, 2017 15:08

jorisvandenbossche merged commit d4f80b0 into pandas-dev:master May 31, 2017

jorisvandenbossche mentioned this pull request May 31, 2017

DOC: change doc build to python 3.6 #16545

Merged

Kiv pushed a commit to Kiv/pandas that referenced this pull request Jun 11, 2017

DOC: correct docstring examples (pandas-dev#3439) (pandas-dev#16432)

36d6171

stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017

DOC: correct docstring examples (pandas-dev#3439) (pandas-dev#16432)

105f5b8
