{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "In [a previous article](https://astraeahelp.zendesk.com/knowledge/articles/360043452652/en-us?brand_id=360003221551), we introduced the `spark.read.chip` function for reading in subsets of scenes from Earth observation data, and in [another article](https://astraeahelp.zendesk.com/knowledge/articles/360051206972/en-us?brand_id=360003221551), we showed how to write out chips in GeoTIFF format. In this article, we will show how to write out your chips in PNG format.\n", "\n", "*Note: if you would like to run through this example in EarthAI Notebook, you can download the companion notebook and vector data source from the attachments provided at the end of this article.*\n", "\n", "# Import Libraries\n", "\n", "We will start by importing all of the Python libraries used in this example." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from earthai.init import *\n", "import earthai.chipping.strategy\n", "import pyspark.sql.functions as F\n", "\n", "import os\n", "import geopandas\n", "import rasterio\n", "import ipyleaflet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Query Imagery at STEP Sites\n", "\n", "In [a previous article](https://astraeahelp.zendesk.com/knowledge/articles/360043452652/en-us?brand_id=360003221551), we introduced the [System for Terrestrial Ecosystem Parameterization](http://www.gofcgold.wur.nl/sites/gofcgold_refdataportal-step.php#:~:text=The%20System%20for%20Terrestrial%20Ecosystem,%2C%20ecosystems%2C%20and%20vegetation%20types.) (STEP) data set, and used it to query the EarthAI Catalog to identify Landsat 8 scenes that intersect with cropland and urban sites around the world. The code in the cell block below replicates those steps for use in the following sections. Please refer to the previous article for more details on these operations." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Read in the STEP data set\n", "step_gdf = geopandas.read_file(\"data/step_september152014_70rndsel_igbpcl.geojson\")\n", "\n", "# Filter to include only the cropland and urban classes\n", "step_subset_gdf = step_gdf[step_gdf.igbp.isin([12, 13])]\n", "\n", "# Query Landsat 8 imagery at STEP sites\n", "cat = earth_ondemand.read_catalog(\n", " step_subset_gdf.geometry,\n", " start_datetime='2014-06-01', \n", " end_datetime='2014-06-15',\n", " max_cloud_cover=10,\n", " collections='landsat8_l1tp'\n", ")\n", "\n", "# Join the imagery catalog back to the STEP data\n", "step_cat = geopandas.sjoin(step_subset_gdf, cat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**step_cat** can include multiple Landsat 8 scenes for each STEP site, taken at different dates/times. For simplicity in demonstrating chip writing, we select just a single scene for each site. The code below selects the scene with the least cloud coverage." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "step_cat['grp_col'] = step_cat['siteid']\n", "step_cat = step_cat.sort_values('eo_cloud_cover').groupby(['grp_col']).first()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Read Chips\n", "\n", "We use the centroid-centered chipping strategy, which creates chips of the specified dimensions centered at a point, or at the centroids of each polygon, depending on what input geometry is passed. The returned RasterFrame will have chips of uniform dimensions - one for each input geometry. This chipping strategy is useful for deep learning applications.\n", "\n", "We pass the chipping strategy, `earthai.chipping.strategy.CentroidCentered`, to the `spark.read.chip` function. 
We specify the chip dimensions as 50 by 50 pixels.\n", "\n", "_To see a list of all chipping strategies and a description of their behavior, run `earthai.chipping.chipping_strategies()`._" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rf = spark.read.chip(step_cat, ['B4', 'B3', 'B2'], \n", " chipping_strategy=earthai.chipping.strategy.CentroidCentered(50, 50)) \\\n", " .withColumnRenamed('B4', 'red') \\\n", " .withColumnRenamed('B3', 'green') \\\n", " .withColumnRenamed('B2', 'blue') \\\n", " .filter(rf_tile_max('red') > 0).cache() # filter out chips with all NoData values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Write Chips\n", "\n", "To write chips in PNG format, we use the `rf.write.chip` function. This function requires a file path and a file name column as input. The file path points to the directory that will store the chips when they are written out. The file name column provides the file name to use for each chip. The file name column can also include a subdirectory structure if desired. \n", "\n", "In the cell below, we create the __file_path_name__ column, which concatenates the __igbp__ label with the unique __siteid__ value to form a subdirectory structure that organizes the chips by label. The cropland chips will be written out to the __\"12\"__ folder and the urban chips will be written out to the __\"13\"__ folder within the main directory." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rf = rf.withColumn('file_path_name', \n", " F.concat_ws('/', F.col('igbp'), F.col('siteid')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As specified below, the main folder containing the chips will be called __\"chips_png\"__. It will be created in the same directory where your notebook resides. \n", "\n", "To write the chips out in PNG format rather than GeoTIFF format, you need to pass __True__ to the `png` parameter. 
When three bands are provided, the output is a downsampled RGB composite. When one band is provided, the output is a greyscale image. \n", "\n", "A single PNG will be written out for each row of your DataFrame. Run the cell below to start writing chips.\n", "\n", "_It takes 3-4 minutes to write out the 149 chips in this RasterFrame on a Dedicated Instance type._" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rf.write.chip('chips_png', filenameCol='file_path_name', png=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the chips are written out, you can navigate through the chip directory in the left menu, right-click any of the files, and select ___Download___ to save the file to your local machine." ] } ], "metadata": { "kernelspec": { "display_name": "EarthAI Environment", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" }, "zendesk": { "draft": true, "id": 360051235732, "section_id": 360008732711, "title": "Writing Image Chips to PNG Files" } }, "nbformat": 4, "nbformat_minor": 4 }