diff --git a/lab/final/geog-414-final.ipynb b/lab/final/geog-414-final.ipynb new file mode 100644 index 0000000..a35e50e --- /dev/null +++ b/lab/final/geog-414-final.ipynb @@ -0,0 +1,338 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "Dtex9L6CV_aV" + }, + "source": [ + "# GEOG-414 Final Exam\n", + "\n", + "\n", + "**Exam Structure**\n", + "\n", + "The final exam accounts for 20% of the total grade, equivalent to 200 points. The exam consists of five questions, with each question carrying a weight of 40 points. There are four questions about Earth Engine and one question about DuckDB. While it is an open-book exam, it is essential that you complete it independently, without collaborating with others. You are allowed to utilize online resources to find solutions. The exam must be completed within 120 minutes and is due at precisely 12:30 pm. However, please note that for each 10-minute interval of late submission, a penalty of 10% will be deducted from your score.\n", + "\n", + "**Submission Requirements**\n", + "\n", + "1. **Screenshots:** For each question, upload a screenshot of your map/chart. Ensure the screenshot includes your name.\n", + "2. **HTML file:** Submit an HTML version of your notebook. Ensure all code outputs are visible. (Export via VS Code: Notebook > Export > HTML).\n", + "3. **Colab link:** Provide a link to your notebook hosted on Google Colab for interactive review." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WeQJFQdtV_aZ" + }, + "source": [ + "## Question 1\n", + "\n", + "Create annual cloud-free Landsat composites (**2015-2023**) of the state of Tennessee and display them on the map using a false color composite.\n", + "\n", + "Relevant datasets:\n", + "\n", + "* [TIGER: US Census States](https://developers.google.com/earth-engine/datasets/catalog/TIGER_2018_States): `ee.FeatureCollection(\"TIGER/2018/States\")`\n", + "* [USGS Landsat 8 Level 2, Collection 2, Tier 1](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2): `ee.ImageCollection(\"LANDSAT/LC08/C02/T1_L2\")`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "o3XSVJk_V_aZ" + }, + "outputs": [], + "source": [ + "import ee\n", + "import geemap\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HXBWkQjUV_aa" + }, + "source": [ + "![](https://i.imgur.com/UKQUi85.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OQAa0h0AV_ab" + }, + "source": [ + "## Question 2\n", + "\n", + "Based on Question 1, extract annual water areas (**2015-2023**) for the state of Tennessee based on the [Normalized Difference Water Index (NDWI)](https://en.wikipedia.org/wiki/Normalized_difference_water_index) and display them on the map. 
See [this example](https://developers.google.com/earth-engine/guides/image_visualization#color-palettes).\n", + "\n", + "Relevant datasets:\n", + "\n", + "* [TIGER: US Census States](https://developers.google.com/earth-engine/datasets/catalog/TIGER_2018_States): `ee.FeatureCollection(\"TIGER/2018/States\")`\n", + "* [USGS Landsat 8 Level 2, Collection 2, Tier 1](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2): `ee.ImageCollection(\"LANDSAT/LC08/C02/T1_L2\")`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "m31x7mzgV_ab" + }, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KYPJXB7wV_ab" + }, + "source": [ + "![](https://i.imgur.com/GSfICAZ.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Z6et2IKwV_ac" + }, + "source": [ + "## Question 3\n", + "\n", + "Based on Question 2, create the maximum water extent (**2015-2023**) for the state of Tennessee. Each pixel in the maximum water extent indicates that the pixel has been detected as water at least once since 2015. Also extract surface water extent for the state of Tennessee based on the [JRC Global Surface Water Mapping Layers](https://developers.google.com/earth-engine/datasets/catalog/JRC_GSW1_4_GlobalSurfaceWater) (select the `occurrence` band). 
Create a split map to visually compare the water areas extracted from two different methods (i.e., NDWI and JRC).\n", + "\n", + "**Hint:** Use the sum() function on the ImageCollection and then convert it to a binary image to get the maximum water extent.\n", + "\n", + "Relevant datasets:\n", + "\n", + "* [TIGER: US Census States](https://developers.google.com/earth-engine/datasets/catalog/TIGER_2018_States): `ee.FeatureCollection(\"TIGER/2018/States\")`\n", + "* [USGS Landsat 8 Level 2, Collection 2, Tier 1](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2): `ee.ImageCollection(\"LANDSAT/LC08/C02/T1_L2\")`\n", + "* [JRC Global Surface Water Mapping Layers](https://developers.google.com/earth-engine/datasets/catalog/JRC_GSW1_4_GlobalSurfaceWater): `ee.Image(\"JRC/GSW1_4/GlobalSurfaceWater\").select('occurrence')`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "fcDdTv5DV_ac" + }, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xX340vLdV_ac" + }, + "source": [ + "![](https://i.imgur.com/lHtlpB4.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PCvund6KV_ad" + }, + "source": [ + "## Question 4\n", + "\n", + "Create annual composites of 4-band (RGBN) NAIP imagery and the Normalized Difference Vegetation Index (NDVI) for Knox County, Tennessee and display them on the map.\n", + "\n", + "Relevant datasets:\n", + "* [TIGER: US Census Counties](https://developers.google.com/earth-engine/datasets/catalog/TIGER_2018_Counties): `ee.FeatureCollection(\"TIGER/2018/Counties\")`\n", + "* [NAIP: National Agriculture Imagery Program](https://developers.google.com/earth-engine/datasets/catalog/USDA_NAIP_DOQQ): `ee.ImageCollection(\"USDA/NAIP/DOQQ\")`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "N1UcWhM9V_ad" + }, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { 
+ "cell_type": "markdown", + "metadata": { + "id": "15ZdwXi0V_ad" + }, + "source": [ + "![](https://i.imgur.com/yWkyENq.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tvT-MKqZV_ad" + }, + "source": [ + "## Question 5\n", + "\n", + "Analyze the NYC crime data from 2003 to 2011 using DuckDB. The database `nyc_data.db` is available for download from [here](https://github.com/opengeos/data/raw/main/duckdb/nyc_data.db.zip). The database contains two tables: `nyc_homicides` and `nyc_neighborhoods`. The `nyc_homicides` table contains the homicide data from 2003 to 2011, and the `nyc_neighborhoods` table contains the neighborhood boundaries of New York City. Use these two tables to answer the following questions:\n", + "\n", + "1. What is the total number of homicides in New York City from 2003 to 2011?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Wa3WC0UCV_ad" + }, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "v6Jq7PmbV_ad" + }, + "source": [ + "2. Find the top 10 neighborhoods with the highest number of homicides in New York City from 2003 to 2011." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FP-2koBjV_ae" + }, + "source": [ + "3. Create a bar chart to visualize the number of homicides in New York City by year from 2003 to 2011. The bar chart title should contain your name." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "lSW3Ny67V_ae" + }, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HNnw6p9vV_ae" + }, + "source": [ + "![](https://i.imgur.com/HUwUYkY.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ueJr7W4pV_ae" + }, + "source": [ + "4. Create a pie chart to visualize the number of homicides in New York City by borough from 2003 to 2011. The pie chart title should contain your name." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "uLSq0K2hV_ae" + }, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "glA429T3V_ae" + }, + "source": [ + "![](https://i.imgur.com/px7UYTF.png)" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "hide_input": false, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.0" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": true, + "sideBar": true, + "skip_h1_title": false, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": true, + "toc_window_display": false + }, + "varInspector": { + "cols": { + "lenName": 16, + "lenType": 16, + "lenVar": 40 + }, + "kernels_config": { + "python": { + "delete_cmd_postfix": "", + "delete_cmd_prefix": "del ", + "library": "var_list.py", + "varRefreshCmd": "print(var_dic_list())" + }, + "r": { + "delete_cmd_postfix": ") ", + "delete_cmd_prefix": "rm(", + "library": "var_list.r", + "varRefreshCmd": "cat(var_dic_list()) " + } 
+ }, + "types_to_exclude": [ + "module", + "function", + "builtin_function_or_method", + "instance", + "_Feature" + ], + "window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/lab/lab_07.ipynb b/lab/lab_07.ipynb new file mode 100644 index 0000000..3255f4a --- /dev/null +++ b/lab/lab_07.ipynb @@ -0,0 +1,157 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lab 7\n", + "\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/giswqs/geog-414/blob/master/book/labs/lab_07.ipynb)\n", + "\n", + "## Submission requirements\n", + "\n", + "1. Upload a screenshot of your map for each question.\n", + "2. Provide a link to your notebook on Colab. See instructions [here](https://geog-414.gishub.org/book/labs/instructions.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Datasets\n", + "\n", + "The datasets used in this lab are listed below:\n", + "\n", + "- [TIGER: US Census Counties](https://developers.google.com/earth-engine/datasets/catalog/TIGER_2018_Counties)\n", + "- [Landsat-9](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T1_L2)\n", + "- [Sentinel-2](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR)\n", + "- [NAIP](https://developers.google.com/earth-engine/datasets/catalog/USDA_NAIP_DOQQ)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "tags": [] + }, + "source": [ + "## Question 1\n", + "\n", + "Create a fishnet with a 4-degree interval based on the extent of `[-112.5439, 34.0891, -85.0342, 49.6858]`. Use the fishnet to download the Landsat 7 image tiles using the `geemap.download_ee_image_tiles()` function. 
Relevant Earth Engine assets:\n", + "\n", + "- `ee.Image('LANDSAT/LE7_TOA_5YEAR/1999_2003')`\n", + "\n", + "![](https://i.imgur.com/L1IH3fq.png)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 2\n", + "\n", + "Create annual cloud-free Landsat imagery for the years 2017-2023 for a US county of your choice. Download the images to your computer. \n", + "\n", + "![](https://i.imgur.com/MN2UXHx.png)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 3\n", + "\n", + "Create annual cloud-free Sentinel-2 imagery for the years 2017-2023 for a US county of your choice. Download the images to your computer. You can download a coarse resolution image to speed up the download process. Narrow down the date range (e.g., summer months) to reduce the number of images, which can avoid memory errors.\n", + "\n", + "![](https://i.imgur.com/r5RQlEJ.png)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 4\n", + "\n", + "Create annual cloud-free NAIP imagery for the years 2010-2023 for a US county of your choice. Download the images to your computer. You can download a coarse resolution image to speed up the download process. \n", + "\n", + "![](https://i.imgur.com/h66FC8h.png)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 5\n", + "\n", + "Download a US county of your choice and save it as a shapefile or GeoJSON file. \n", + "\n", + "![](https://i.imgur.com/PuK2Vp3.png)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/lab/lab_08.ipynb b/lab/lab_08.ipynb new file mode 100644 index 0000000..2bc374d --- /dev/null +++ b/lab/lab_08.ipynb @@ -0,0 +1,255 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lab 8\n", + "\n", + "**Submission requirements**\n", + "\n", + "1. An HTML version of your notebook (VS Code > Notebook > Export > HTML). The HTML file must show the output of your code.\n", + "2. A link to your notebook on Colab.\n", + "\n", + "**Datasets**:\n", + "\n", + "The following datasets are used in this lab. 
You don't need to download them manually, they can be accessed directly from the notebook.\n", + "\n", + "- [nyc_subway_stations.tsv](https://open.gishub.org/data/duckdb/nyc_subway_stations.tsv)\n", + "- [nyc_neighborhoods.tsv](https://open.gishub.org/data/duckdb/nyc_neighborhoods.tsv)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# %pip install duckdb duckdb-engine jupysql" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import duckdb\n", + "\n", + "%load_ext sql" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%config SqlMagic.autopandas = True\n", + "%config SqlMagic.feedback = False\n", + "%config SqlMagic.displaycon = False" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 1: Creating Tables\n", + "\n", + "Create a database, then write a SQL query to create a table named `nyc_subway_stations` and load the data from the file `nyc_subway_stations.tsv` into it. Similarly, create a table named `nyc_neighborhoods` and load the data from the file `nyc_neighborhoods.tsv` into it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 2: Column Filtering\n", + "\n", + "Write a SQL query to display the `ID`, `NAME`, and `BOROUGH` of each subway station in the `nyc_subway_stations` dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 3: Row Filtering\n", + "\n", + "Write a SQL query to find all subway stations in the `nyc_subway_stations` dataset that are located in the borough of Manhattan." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 4: Sorting Results\n", + "\n", + "Write a SQL query to list the subway stations in the `nyc_subway_stations` dataset in alphabetical order by their names." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 5: Unique Values\n", + "\n", + "Write a SQL query to find the distinct boroughs represented in the `nyc_subway_stations` dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 6: Counting Rows\n", + "\n", + "Write a SQL query to count the number of subway stations in each borough in the `nyc_subway_stations` dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 7: Aggregating Data\n", + "\n", + "Write a SQL query to list the number of subway stations in each borough, sorted in descending order by the count." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 8: Joining Tables\n", + "\n", + "Write a SQL query to join the `nyc_subway_stations` and `nyc_neighborhoods` datasets on the borough name, displaying the subway station name and the neighborhood name." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 9: String Manipulation\n", + "\n", + "Write a SQL query to display the names of subway stations in the `nyc_subway_stations` dataset that contain the word \"St\" in their names." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 10: Filtering with Multiple Conditions\n", + "\n", + "Write a SQL query to find all subway stations in the `nyc_subway_stations` dataset that are in the borough of Brooklyn and have routes that include the letter \"R\"." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/lab/lab_09.ipynb b/lab/lab_09.ipynb new file mode 100644 index 0000000..7034189 --- /dev/null +++ b/lab/lab_09.ipynb @@ -0,0 +1,314 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lab 9\n", + "\n", + "In this lab, you will explore spatial data analysis using Python and DuckDB. You'll work with real-world datasets, ranging from global country statistics to specific building datasets. This will give you a practical understanding of handling, analyzing, and visualizing spatial data.\n", + "\n", + "**Submission requirements**\n", + "\n", + "1. **HTML Version:** Submit an HTML version of your notebook. Ensure all code outputs are visible. (Export via VS Code: Notebook > Export > HTML).\n", + "2. **Colab Link:** Provide a link to your notebook hosted on Google Colab for interactive review.\n", + "\n", + "## Setup\n", + "\n", + "Ensure you have DuckDB and Leafmap installed. 
Run the following command if needed:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# %pip install duckdb leafmap" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import duckdb\n", + "import leafmap" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 1\n", + "\n", + "Connect to a DuckDB database and install the `httpfs` and `spatial` extensions." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 2\n", + "\n", + "Download the [Admin 0 – Countries](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/) vector dataset from Natural Earth using the `leafmap.download_file()` function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 3\n", + "\n", + "Create a new table in your database called `countries` and load the data from the downloaded country shapefile into it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Calculate the total population of all countries in the database using the `POP_EST` column." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Show the top 10 countries with the largest population." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Select countries in Europe with a population greater than 10 million and order them by population in descending order." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Save the results of the previous query as a new table called `europe`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Export the `europe` table as a GeoJSON file." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 4\n", + "\n", + "Create a table called `text_zones` and load the data from the [taxi_zones.parquet](https://beta.source.coop/cholmes/nyc-taxi-zones/taxi_zones.parquet) into it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Find out the unique values in the `borough` column and order them alphabetically." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Export the `text_zones` table as a parquet file." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 5\n", + "\n", + "Explore the [Google Open Buildings](https://beta.source.coop/cholmes/google-open-buildings/v2/geoparquet-admin1/) dataset and select a country of your choice with a relatively small number of buildings (i.e., small file size). Get the three-character country code and replace `[COUNTRY_NAME]` in the following path with the country code. Use it to load all the parquet files for the selected country into a new table called `buildings`.\n", + "\n", + "`s3://us-west-2.opendata.source.coop/google-research-open-buildings/v2/geoparquet-admin1/country=[COUNTRY_NAME]/*.parquet`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Find the number of buildings in the selected country." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Find the total area of all buildings in the selected country." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Export the `buildings` table as a GeoPackage file." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/lab/lab_10.ipynb b/lab/lab_10.ipynb new file mode 100644 index 0000000..58ad841 --- /dev/null +++ b/lab/lab_10.ipynb @@ -0,0 +1,161 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lab 10\n", + "\n", + "**Submission requirements**\n", + "\n", + "1. **HTML Version:** Submit an HTML version of your notebook. Ensure all code outputs are visible. (Export via VS Code: Notebook > Export > HTML).\n", + "2. **Colab Link:** Provide a link to your notebook hosted on Google Colab for interactive review.\n", + "\n", + "## Setup\n", + "\n", + "Uncomment and run the following cell to install the required packages." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "# %pip install duckdb leafmap lonboard" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import duckdb\n", + "import leafmap" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 1\n", + "\n", + "Download the [nyc_data.zip](https://github.com/opengeos/data/raw/main/duckdb/nyc_data.zip) dataset using leafmap. The zip file contains the following datasets. Create a new DuckDB database and import the datasets into the database. Each dataset should be imported into a separate table. 
\n", + "\n", + "- nyc_census_blocks\n", + "- nyc_homicides\n", + "- nyc_neighborhoods\n", + "- nyc_streets\n", + "- nyc_subway_stations" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Visualize the `nyc_subway_stations` and `nyc_streets` datasets on the same map using leafmap and lonboard." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 3\n", + "\n", + "Find out what neighborhood the `BLUE` subway stations are in." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 4\n", + "\n", + "Find out what streets are within 200 meters of the `BLUE` subway stations." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Question 5\n", + "\n", + "Visualize the `BLUE` subway stations and the streets within 200 meters of the `BLUE` subway stations on the same map using leafmap and lonboard." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Add your code here" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}