This example demonstrates how to "burn-in" a geometry into a raster. It runs through the steps of 1) acquiring imagery scenes from the EarthAI Catalog, and 2) using RasterFrames to read imagery and rasterize a geometry.
Note: if you would like to run through this example in EarthAI Notebook, you can download the companion notebook and vector data source from the attachments provided at the end of this article.
We will start by importing all of the Python libraries used in this example.
from earthai.init import * import earthai.chipping.strategy import pyspark.sql.functions as F import geopandas
Query Imagery at STEP Sites
In a previous article, we introduced the System for Terrestrial Ecosystem Parameterization (STEP) data set, used it to query the EarthAI Catalog to identify Landsat 8 scenes that intersect with urban and cropland sites around the world, and then used
spark.read.chip to create tiles that intersect the STEP site geometries. The code in the cell block below replicates those steps for use in the following sections. Please refer to the previous article for more details on these operations.
# Read in the STEP data set step_gdf = geopandas.read_file("data/step_september152014_70rndsel_igbpcl.geojson") # Filter to include only the urban and cropland classes step_subset_gdf = step_gdf[step_gdf.igbp.isin([12, 13])] # Query Landsat 8 imagery at STEP sites cat = earth_ondemand.read_catalog( step_subset_gdf.geometry, start_datetime='2014-06-01', end_datetime='2014-06-15', max_cloud_cover=10, collections='landsat8_l1tp' ) # Join the imagery catalog back to the STEP data step_cat = geopandas.sjoin(step_subset_gdf, cat) # Create chips rf = spark.read.chip(step_cat, ['B4', 'B3', 'B2'], chipping_strategy=earthai.chipping.strategy.IntersectingExtent()) \ .withColumnRenamed('B4', 'red') \ .withColumnRenamed('B3', 'green') \ .withColumnRenamed('B2', 'blue') \ .filter(rf_tile_max('red') > 0)
To rasterize the STEP site geometries into a new column called target, we first need to reproject the geometries to the same CRS as the imagery. Then, we can call the
rf_rasterize function and pass it the STEP site geometry in the reprojected CRS (geo_native), the boundaries of the tile column, the igbp column, and the tile dimensions. The value of the igbp column will be “burned-in” to the returned tile where the geo_native intersects the tile. Any pixels outside the geo_native will be assigned a NoData value.
rf = rf.withColumn('geo_native', st_reproject('geometry', rf_mk_crs(str(step_subset_gdf.crs)), rf_crs('red'))) \ .withColumn('target', rf_rasterize('geo_native', rf_geometry('red'), 'igbp', rf_dimensions('red').cols, rf_dimensions('red').rows)) rf.select('red', 'green', 'blue', 'target')
If desired, you can now use the rasterized geometry column, target, to mask the other bands using the