This article describes how to read in image scenes from the EarthAI Catalog into a RasterFrame.
In a previous article, we discussed how to query imagery data using the EarthAI Catalog API. Now we will show how to read that catalog of imagery scenes into a RasterFrame using spark.read.raster.
Import Library
We import the EarthAI library as a first step.
from earthai.init import *
Query EarthAI Catalog
The code below queries the EarthAI Catalog for Landsat 8 imagery from August 2018 with a maximum cloud cover of 10%. We pass a single point, written as longitude then latitude, for a location within the Yellowstone region. For more information on querying the EarthAI Catalog, please refer to the previous article, where we discuss this in detail.
cat = earth_ondemand.read_catalog(geo='POINT(-110.0 44.5)', start_datetime='2018-08-01', end_datetime='2018-08-31', max_cloud_cover=10, collections='landsat8_c2l1t1')
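As a rough sketch of how the query arguments above fit together, the snippet below builds the same values in plain Python. The helpers wkt_point and month_range are hypothetical names introduced here for illustration and are not part of the EarthAI library; only the keyword names and values come from the query above. Note that the WKT point lists longitude first, then latitude.

```python
import calendar
from datetime import date


def wkt_point(lon, lat):
    """Hypothetical helper: format a WKT point (longitude first, then latitude)."""
    return f"POINT({lon} {lat})"


def month_range(year, month):
    """Hypothetical helper: first and last day of a month as ISO date strings."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1).isoformat(), date(year, month, last_day).isoformat()


start, end = month_range(2018, 8)

# Same keyword arguments as the read_catalog call in this article.
query = dict(
    geo=wkt_point(-110.0, 44.5),
    start_datetime=start,
    end_datetime=end,
    max_cloud_cover=10,
    collections='landsat8_c2l1t1',
)
print(query)
```

In an EarthAI notebook, the dictionary could then be unpacked into the call, for example earth_ondemand.read_catalog(**query).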
Read Imagery into a RasterFrame
cat contains a catalog of Landsat 8 imagery scenes. It holds references to the imagery files, but not the imagery itself. To read in the imagery, pass cat to spark.read.raster, which reads these scenes into a RasterFrame.
When passing a catalog to spark.read.raster, you must also provide a list of the bands you wish to read in the catalog_col_names parameter. To view a list of available Landsat 8 bands, you can run earth_ondemand.item_assets('landsat8_c2l1t1'). This function provides information on the available bands, spatial resolution, and band details for each collection in the EarthAI Catalog.
The band_name field is what should be passed to spark.read.raster.
earth_ondemand.item_assets('landsat8_c2l1t1').sort_values('asset_name')
In this example, we will use spark.read.raster to read in B4, B3, and B2, which correspond to the red, green, and blue bands. The band names are passed in as a list.
rf = spark.read.raster(cat, catalog_col_names = ['B4', 'B3', 'B2'])
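If you prefer to think in terms of colors rather than band numbers, a small lookup can build the band list for you. This is only a sketch: the dictionary and the band_names helper are hypothetical, and the mapping covers just the three bands named in this article.

```python
# Hypothetical lookup for the three Landsat 8 bands used in this article.
LANDSAT8_VISIBLE_BANDS = {
    'red': 'B4',
    'green': 'B3',
    'blue': 'B2',
}


def band_names(colors):
    """Hypothetical helper: translate color names into band names, keeping order."""
    return [LANDSAT8_VISIBLE_BANDS[color] for color in colors]


catalog_col_names = band_names(['red', 'green', 'blue'])
print(catalog_col_names)
```

The resulting list has the same contents as the one passed to catalog_col_names above.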
To view tile samples, select the band columns. The filter below keeps only tiles whose maximum B4 value is greater than 0.
rf.filter(rf_tile_max('B4') > 0).select('B4', 'B3', 'B2')
spark.read.raster breaks imagery scenes up into a gridded set of tiles. By default, each tile is 256 by 256 pixels. If you want a different size, you can pass a tuple of dimensions, e.g. (512, 512), to the tile_dimensions parameter.
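To get a feel for how tile size affects the number of tiles, the sketch below estimates the grid for a single scene. It assumes the scene is cut on a regular grid starting at the top-left corner, with partial tiles along the right and bottom edges; the tile_grid helper and the scene size of 7700 by 7800 pixels are illustrative assumptions, not values taken from the EarthAI library or from a specific scene.

```python
import math


def tile_grid(scene_cols, scene_rows, tile_dimensions=(256, 256)):
    """Hypothetical helper: number of tiles across and down for one scene.

    Assumes a regular grid in which partial edge tiles still count as tiles.
    """
    tile_cols, tile_rows = tile_dimensions
    return math.ceil(scene_cols / tile_cols), math.ceil(scene_rows / tile_rows)


# Illustrative scene size in pixels (an assumption, not a measured value).
scene_cols, scene_rows = 7700, 7800

for dims in [(256, 256), (512, 512)]:
    across, down = tile_grid(scene_cols, scene_rows, dims)
    print(dims, '->', across, 'x', down, '=', across * down, 'tiles')
```

Under these assumptions, doubling the tile size in each direction cuts the tile count to roughly a quarter, which means fewer but larger rows in the RasterFrame.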
You can run ?spark.read.raster in a cell to get more information about the parameters that spark.read.raster can take.