This example demonstrates how to write out a RasterFrame to a GeoTIFF. RasterFrames provides a specialized Spark DataFrame writer for rendering a RasterFrame to a GeoTIFF. It is an expensive operation since creating a GeoTIFF requires that all of the data be in the memory of one computer.
In this example, we use a lower resolution collection and limit the area of interest to create a manageable data size. We run through the steps of 1) acquiring imagery scenes from the EarthAI Catalog, 2) using RasterFrames to read imagery, and 3) writing a RasterFrame to GeoTIFF.
Import Libraries
We will start by importing all of the Python libraries used in this example.
from earthai.init import *
import earthai.chipping.strategy
import pyspark.sql.functions as F
import ipyleaflet
import geopandas
Query the EarthAI Catalog
We read in a GeoJSON file containing U.S. state boundaries and filter the GeoDataFrame to North Carolina.
We use the geometry column in the GeoDataFrame to query the EarthAI Catalog for MODIS surface reflectance data from September 1, 2020, covering North Carolina.
states_url = 'https://raw.githubusercontent.com/datasets/geo-admin1-us/master/data/admin1-us.geojson'
states_gdf = geopandas.read_file(states_url)
nc_gdf = states_gdf[states_gdf["name"] == "North Carolina"]
cat = earth_ondemand.read_catalog(nc_gdf.geometry,
                                  start_datetime='2020-09-01',
                                  end_datetime='2020-09-01',
                                  collections='mcd43a4')
Read in MODIS Imagery
We join the catalog back to the GeoDataFrame containing North Carolina state boundaries in order to match the state boundary to the intersecting image scene. This step is critical for use of the chip reader since the chipping strategy needs the state boundary polygon.
cat = geopandas.sjoin(cat, nc_gdf, how='right').rename(columns={"geometry":"nc_bounds"})
Since MODIS scenes are very large, we use spark.read.chip to read only the imagery intersecting North Carolina state boundaries. We read in the B01 (red), B04 (green), and B03 (blue) bands. The feature-aligned grid strategy creates a grid across North Carolina using the specified tile_dimensions, and returns the generated chips.
To view all of the available bands for the MODIS collection, you can run earth_ondemand.item_assets('mcd43a4').
rf = spark.read.chip(cat,
                     catalog_col_names=['B01', 'B04', 'B03'],
                     geometry_col_name='nc_bounds',
                     chipping_strategy=earthai.chipping.strategy.FeatureAlignedGrid(256),
                     tile_dimensions=(256, 256))
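To get a feel for how many chips a grid of this kind can produce, the sketch below counts the 256 x 256 tiles needed to cover a rectangular pixel extent. It is a simplified illustration only: the function name and the example extent are hypothetical, and it does not reproduce how FeatureAlignedGrid aligns its grid or skips chips that fall outside the state boundary.

```python
import math


def count_grid_chips(width_px, height_px, tile_size=256):
    """Count the square tiles needed to cover a width_px x height_px extent.

    Hypothetical helper for illustration; not part of the EarthAI API.
    Returns (columns, rows, total_chips).
    """
    if width_px <= 0 or height_px <= 0 or tile_size <= 0:
        raise ValueError("dimensions and tile size must be positive")
    # Partial tiles at the right and bottom edges still count as a chip.
    cols = math.ceil(width_px / tile_size)
    rows = math.ceil(height_px / tile_size)
    return cols, rows, cols * rows


# Assumed extent of 1700 x 650 pixels, chosen only as an example.
cols, rows, total = count_grid_chips(1700, 650)
print(f"{cols} columns x {rows} rows = {total} chips")
```

A larger tile size yields fewer, bigger chips, while a smaller one yields more chips that follow the feature outline more closely.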
To view the North Carolina state boundary and chip outlines on a map using ipyleaflet, we need to reproject the chip geometries to the standard "EPSG:4326" projection before plotting the outlines.
rf = rf.select(F.col('B01').alias('red'), F.col('B04').alias('green'), F.col('B03').alias('blue')) \
    .withColumn('chip_geom_wgs84', st_reproject(rf_geometry('red'), rf_crs('red'), F.lit('EPSG:4326')))
m = ipyleaflet.Map(center=(nc_gdf.geometry.centroid.y.mean(), nc_gdf.geometry.centroid.x.mean()), zoom=6)
nc_layer = ipyleaflet.GeoData(geo_dataframe=nc_gdf, name='North Carolina',
                              style={'fillColor': '#f003fc', 'color': '#f003fc'})
chips_gdf = geopandas.GeoDataFrame(rf.select('chip_geom_wgs84').toPandas(),
                                   geometry='chip_geom_wgs84', crs='EPSG:4326')
chips_layer = ipyleaflet.GeoData(geo_dataframe=chips_gdf, name='Chips',
                                 style={'fillColor': '#32a852', 'color': '#32a852'})
m.add_layer(nc_layer)
m.add_layer(chips_layer)
m.add_control(ipyleaflet.LayersControl())
m

Write RasterFrame to GeoTIFF
As mentioned earlier, all of the data to be encoded has to fit in the memory of one computer, so it is important to limit the imagery size or downsample the resolution when creating a single GeoTIFF of your imagery. You can provide a raster_dimensions parameter, which will downsample (or upsample) your pixel resolution to the given dimensions using bilinear resampling. If no raster_dimensions parameter is specified, the RasterFrame contents are written at full resolution.
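When downsampling is needed, one way to choose a value for raster_dimensions is to cap the longer side of the image and scale the other side to preserve the aspect ratio. The helper below is a minimal sketch of that arithmetic. Its name, the example sizes, and the output file name in the comment are hypothetical, and it assumes the tuple is ordered (width, height).

```python
def fit_raster_dimensions(width_px, height_px, max_side):
    """Scale (width_px, height_px) so the longer side is at most max_side.

    Hypothetical helper for illustration; not part of RasterFrames.
    The aspect ratio is preserved up to rounding to whole pixels.
    """
    if min(width_px, height_px, max_side) <= 0:
        raise ValueError("all arguments must be positive")
    longest = max(width_px, height_px)
    if longest <= max_side:
        # Already small enough: keep the full resolution.
        return width_px, height_px
    scale = max_side / longest
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# Assumed full-resolution size of 4800 x 2400 pixels, capped at 1200 pixels.
dims = fit_raster_dimensions(4800, 2400, 1200)
print(dims)

# The result could then be passed to the writer, for example:
# rf.write.geotiff('downsampled.tif', crs='EPSG:4326', raster_dimensions=dims)
```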
Since we used a lower resolution collection and limited our area of interest to North Carolina, we can write out the contents at full resolution.
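A quick back-of-the-envelope estimate can help decide whether full resolution is feasible. The sketch below multiplies pixel count, band count, and bytes per cell to get a rough uncompressed size. The function name and every input value are assumptions chosen for illustration, and the real memory footprint of the writer may be higher.

```python
def estimate_raster_bytes(width_px, height_px, bands, bytes_per_cell):
    """Rough uncompressed size, in bytes, of a raster held fully in memory.

    Hypothetical helper for illustration; ignores any processing overhead.
    """
    return width_px * height_px * bands * bytes_per_cell


# Assumed values: 1700 x 650 pixels, 3 bands, 2 bytes per cell (16-bit).
coarse = estimate_raster_bytes(1700, 650, 3, 2)

# The same area at a resolution 50 times finer in each direction
# (for example, 10 m cells instead of 500 m cells).
fine = estimate_raster_bytes(1700 * 50, 650 * 50, 3, 2)

print(f"coarse: {coarse / 1024**2:.1f} MiB")
print(f"fine:   {fine / 1024**3:.1f} GiB")
```

Because the size grows with the square of the resolution factor, a collection that is 50 times finer needs about 2,500 times as much memory for the same area.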
The RasterFrame writer will interpret a RasterFrame with three or four tile columns as red, green, blue, and optionally alpha bands. You can select any combination of three or four tile columns to create your intended color composite. Any other number of tile columns will result in a greyscale interpretation.
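The interpretation rule described above can be summarized in a few lines of Python. This is a sketch of the rule as stated here, not the writer's actual implementation: the function name is hypothetical, and it assumes that tile columns are assigned to channels in the order they are selected.

```python
def interpret_tile_columns(column_names):
    """Return (mode, channel mapping) for a list of tile column names.

    Hypothetical helper mirroring the rule in the text:
    3 columns -> RGB, 4 columns -> RGBA, anything else -> greyscale.
    """
    channels = ("red", "green", "blue", "alpha")
    n = len(column_names)
    if n == 3:
        return "rgb", dict(zip(channels[:3], column_names))
    if n == 4:
        return "rgba", dict(zip(channels, column_names))
    # No channel mapping is modeled for the greyscale case.
    return "greyscale", {}


# Natural color: the columns map directly to their own channels.
print(interpret_tile_columns(["red", "green", "blue"]))

# A different ordering gives a different composite; 'nir' is an assumed column name.
print(interpret_tile_columns(["nir", "red", "green"]))
```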
We select the red, green, and blue bands to write out a natural color image.
rf.select('red', 'green', 'blue').write.geotiff('geotiff-overview.tif', crs='EPSG:4326')
geotiff-overview.tif will be written to the directory where your notebook resides. You can right-click the file in the left menu and select Download to save it to your local machine for viewing in external GIS software.