Commit

Built site for gh-pages
Quarto GHA Workflow Runner committed Nov 29, 2023
1 parent 3ba5fc7 commit aff347d
Showing 5 changed files with 18 additions and 11 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
Original file line number Diff line number Diff line change
@@ -1 +1 @@
f09c6c9d
64c1c1ee
2 changes: 1 addition & 1 deletion data-science-with-pandas-2.html
@@ -1912,7 +1912,7 @@ <h2 data-number="7.5" class="anchored" data-anchor-id="grouping"><span class="he
<div class="cell" data-execution_count="27">
<div class="sourceCode cell-code" id="cb40"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb40-1"><a href="#cb40-1" aria-hidden="true" tabindex="-1"></a>grouped_data.mean()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/tmp/ipykernel_2494/1133710423.py:1: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
<pre><code>/tmp/ipykernel_2437/1133710423.py:1: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
grouped_data.mean()</code></pre>
</div>
<div class="cell-output cell-output-display" data-execution_count="27">
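The `FutureWarning` captured in this hunk comes from calling `grouped_data.mean()` without `numeric_only`. As the warning text itself suggests, it can be avoided either by passing `numeric_only` explicitly or by selecting the numeric columns first. The following is a minimal sketch of both options, not part of this commit; the small `surveys_df` built here is a hypothetical stand-in for the lesson's dataset, and the column values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's surveys_df (values are invented).
surveys_df = pd.DataFrame(
    {
        "sex": ["F", "M", "F", "M", None],
        "species_id": ["DM", "DM", "PF", "PF", "DM"],
        "weight": [40.0, 44.0, 8.0, None, 42.0],
    }
)

grouped_data = surveys_df.groupby("sex")

# Option 1: state numeric_only explicitly so non-numeric columns are skipped.
means = grouped_data.mean(numeric_only=True)

# Option 2: select only the column(s) the statistic should apply to.
weight_means = grouped_data["weight"].mean()

print(means)
print(weight_means)
```

Either form makes the intent explicit, so the result does not depend on which pandas version builds the site.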
3 changes: 2 additions & 1 deletion references.html
@@ -303,8 +303,9 @@ <h1 class="title">References</h1>

</header>

<div id="refs" role="list" style="display: none">


</div>



6 changes: 3 additions & 3 deletions search.json
@@ -256,7 +256,7 @@
"href": "data-science-with-pandas-2.html#grouping",
"title": "7  Grouping, Indexing, Slicing, and Subsetting DataFrames",
"section": "7.5 Grouping",
"text": "7.5 Grouping\nWe often want to calculate summary statistics grouped by subsets or attributes within fields of our data. For example, we might want to calculate the average weight of all individuals per site.\nAs we have seen above we can calculate basic statistics for all records in a single column using the syntax below:\n\nsurveys_df['weight'].describe()\n\ncount 32283.000000\nmean 42.672428\nstd 36.631259\nmin 4.000000\n25% 20.000000\n50% 37.000000\n75% 48.000000\nmax 280.000000\nName: weight, dtype: float64\n\n\nIf we want to summarize by one or more variables, for example sex, we can use Pandas’ .groupby() method. Once we’ve created a groupby DataFrame, we can quickly calculate summary statistics by a group of our choice.\n\ngrouped_data = surveys_df.groupby('sex')\ngrouped_data.describe()\n\n\n\n\n\n\n\n\nrecord_id\nmonth\n...\nhindfoot_length\nweight\n\n\n\ncount\nmean\nstd\nmin\n25%\n50%\n75%\nmax\ncount\nmean\n...\n75%\nmax\ncount\nmean\nstd\nmin\n25%\n50%\n75%\nmax\n\n\nsex\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nF\n15690.0\n18036.412046\n10423.089000\n3.0\n8917.50\n18075.5\n27250.00\n35547.0\n15690.0\n6.587253\n...\n36.0\n64.0\n15303.0\n42.170555\n36.847958\n4.0\n20.0\n34.0\n46.0\n274.0\n\n\nM\n17348.0\n17754.835601\n10132.203323\n1.0\n8969.75\n17727.5\n26454.25\n35548.0\n17348.0\n6.396184\n...\n36.0\n58.0\n16879.0\n42.995379\n36.184981\n4.0\n20.0\n39.0\n49.0\n280.0\n\n\n\n\n2 rows × 56 columns\n\n\n\nThe output is a bit overwhelming. Let’s just have a look at one statistical value, the mean, to understand what is happening here:\n\ngrouped_data.mean()\n\n/tmp/ipykernel_2494/1133710423.py:1: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. 
Either specify numeric_only or select only columns which should be valid for the function.\n grouped_data.mean()\n\n\n\n\n\n\n\n\n\nrecord_id\nmonth\nday\nyear\nplot_id\nhindfoot_length\nweight\n\n\nsex\n\n\n\n\n\n\n\n\n\n\n\nF\n18036.412046\n6.587253\n15.880943\n1990.644997\n11.440854\n28.836780\n42.170555\n\n\nM\n17754.835601\n6.396184\n16.078799\n1990.480401\n11.098282\n29.709578\n42.995379\n\n\n\n\n\n\n\nWe see that the data is divided into two groups, one group where the value in the column sex equals “F” and another group where the value in the column sex equals “M”. The statistics is then calculated for all samples in that specific group for each of the columns in the dataframe. Note that samples annotated with sex equals NaN and column values with NaN are left out."
"text": "7.5 Grouping\nWe often want to calculate summary statistics grouped by subsets or attributes within fields of our data. For example, we might want to calculate the average weight of all individuals per site.\nAs we have seen above we can calculate basic statistics for all records in a single column using the syntax below:\n\nsurveys_df['weight'].describe()\n\ncount 32283.000000\nmean 42.672428\nstd 36.631259\nmin 4.000000\n25% 20.000000\n50% 37.000000\n75% 48.000000\nmax 280.000000\nName: weight, dtype: float64\n\n\nIf we want to summarize by one or more variables, for example sex, we can use Pandas’ .groupby() method. Once we’ve created a groupby DataFrame, we can quickly calculate summary statistics by a group of our choice.\n\ngrouped_data = surveys_df.groupby('sex')\ngrouped_data.describe()\n\n\n\n\n\n\n\n\nrecord_id\nmonth\n...\nhindfoot_length\nweight\n\n\n\ncount\nmean\nstd\nmin\n25%\n50%\n75%\nmax\ncount\nmean\n...\n75%\nmax\ncount\nmean\nstd\nmin\n25%\n50%\n75%\nmax\n\n\nsex\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nF\n15690.0\n18036.412046\n10423.089000\n3.0\n8917.50\n18075.5\n27250.00\n35547.0\n15690.0\n6.587253\n...\n36.0\n64.0\n15303.0\n42.170555\n36.847958\n4.0\n20.0\n34.0\n46.0\n274.0\n\n\nM\n17348.0\n17754.835601\n10132.203323\n1.0\n8969.75\n17727.5\n26454.25\n35548.0\n17348.0\n6.396184\n...\n36.0\n58.0\n16879.0\n42.995379\n36.184981\n4.0\n20.0\n39.0\n49.0\n280.0\n\n\n\n\n2 rows × 56 columns\n\n\n\nThe output is a bit overwhelming. Let’s just have a look at one statistical value, the mean, to understand what is happening here:\n\ngrouped_data.mean()\n\n/tmp/ipykernel_2437/1133710423.py:1: FutureWarning: The default value of numeric_only in DataFrameGroupBy.mean is deprecated. In a future version, numeric_only will default to False. 
Either specify numeric_only or select only columns which should be valid for the function.\n grouped_data.mean()\n\n\n\n\n\n\n\n\n\nrecord_id\nmonth\nday\nyear\nplot_id\nhindfoot_length\nweight\n\n\nsex\n\n\n\n\n\n\n\n\n\n\n\nF\n18036.412046\n6.587253\n15.880943\n1990.644997\n11.440854\n28.836780\n42.170555\n\n\nM\n17754.835601\n6.396184\n16.078799\n1990.480401\n11.098282\n29.709578\n42.995379\n\n\n\n\n\n\n\nWe see that the data is divided into two groups, one group where the value in the column sex equals “F” and another group where the value in the column sex equals “M”. The statistics is then calculated for all samples in that specific group for each of the columns in the dataframe. Note that samples annotated with sex equals NaN and column values with NaN are left out."
},
{
"objectID": "data-science-with-pandas-2.html#structure-of-a-groupby-object",
@@ -361,13 +361,13 @@
"href": "what-next.html#courses",
"title": "What Next?",
"section": "Courses:",
"text": "Courses:\n\nBest practices for writing reproducible code (UU?)\nVarious intermediate and advanced courses (eScience?) Center\nSoftware Carpentries\nPython for Data Science and Data Wrangling\nPython Data Science Handbook"
"text": "Courses:\n\nBest practices for writing reproducible code by UU RDM support\nVarious intermediate and advanced programming courses by the eScience Center\nSoftware Carpentries\nPython for Data Science and Data Wrangling (online book)\nPython Data Science Handbook (online book)"
},
{
"objectID": "what-next.html#find-us",
"href": "what-next.html#find-us",
"title": "What Next?",
"section": "Find us:",
"text": "Find us:\nWe are happy to help you in your journey to master Python and use it in your own projects. You can find us at the following places: - Walk-In Hours, come with your questions! - Programming Cafe, informal meetup about programming. Bring your laptop, work on your project and get help when you need it! - UU Research Engineers - UU RDM consultants"
"text": "Find us:\nWe are happy to help you in your journey to master Python and use it in your own projects. You can find us at the following places:\n\nWalk-In Hours, come with your questions!\nProgramming Cafe, informal meetup about programming. Bring your laptop, work on your project and get help when you need it!\nUU Research Engineers\nUU RDM consultants"
}
]
16 changes: 11 additions & 5 deletions what-next.html
@@ -311,16 +311,22 @@ <h2 class="anchored" data-anchor-id="libraries">Libraries:</h2>
<section id="courses" class="level2">
<h2 class="anchored" data-anchor-id="courses">Courses:</h2>
<ol type="1">
<li><a href="https://www.uu.nl/en/research/research-data-management/training-workshops/best-practices-for-writing-reproducible-code">Best practices for writing reproducible code <span class="citation" data-cites="UU">(<span><strong>UU?</strong></span>)</span></a></li>
<li><a href="https://www.esciencecenter.nl/events/?f=workshops">Various intermediate and advanced courses <span class="citation" data-cites="eScience">(<span><strong>eScience?</strong></span>)</span> Center</a></li>
<li><a href="https://www.uu.nl/en/research/research-data-management/training-workshops/best-practices-for-writing-reproducible-code">Best practices for writing reproducible code by UU RDM support</a></li>
<li><a href="https://www.esciencecenter.nl/events/?f=workshops">Various intermediate and advanced programming courses by the eScience Center</a></li>
<li><a href="https://carpentries.org/community-lessons/">Software Carpentries</a></li>
<li><a href="https://wesmckinney.com/book/">Python for Data Science and Data Wrangling</a></li>
<li><a href="https://jakevdp.github.io/PythonDataScienceHandbook/">Python Data Science Handbook</a></li>
<li><a href="https://wesmckinney.com/book/">Python for Data Science and Data Wrangling (online book)</a></li>
<li><a href="https://jakevdp.github.io/PythonDataScienceHandbook/">Python Data Science Handbook (online book)</a></li>
</ol>
</section>
<section id="find-us" class="level2">
<h2 class="anchored" data-anchor-id="find-us">Find us:</h2>
<p>We are happy to help you in your journey to master Python and use it in your own projects. You can find us at the following places: - <a href="https://www.uu.nl/en/research/research-data-management/workshops/walk-in-hours-research-data-and-software">Walk-In Hours</a>, come with your questions! - <a href="https://www.uu.nl/en/research/research-data-management/workshops/programming-cafe">Programming Cafe</a>, informal meetup about programming. Bring your laptop, work on your project and get help when you need it! - <a href="https://www.uu.nl/en/research/research-data-management/support/research-engineers">UU Research Engineers</a> - <a href="https://www.uu.nl/en/research/research-data-management/support/data-managers-and-consultants">UU RDM consultants</a></p>
<p>We are happy to help you in your journey to master Python and use it in your own projects. You can find us at the following places:</p>
<ul>
<li><a href="https://www.uu.nl/en/research/research-data-management/workshops/walk-in-hours-research-data-and-software">Walk-In Hours</a>, come with your questions!</li>
<li><a href="https://www.uu.nl/en/research/research-data-management/workshops/programming-cafe">Programming Cafe</a>, informal meetup about programming. Bring your laptop, work on your project and get help when you need it!</li>
<li><a href="https://www.uu.nl/en/research/research-data-management/support/research-engineers">UU Research Engineers</a></li>
<li><a href="https://www.uu.nl/en/research/research-data-management/support/data-managers-and-consultants">UU RDM consultants</a></li>
</ul>


</section>