This article describes how to read in image scenes from the EarthAI Catalog into Numpy arrays using Rasterio.
In a previous article, we discussed how to query imagery data using the EarthAI Catalog API, and in another article, we discussed how to read that catalog of imagery scenes into a RasterFrame using spark.read.raster. Now we will show another way to read image scenes using Rasterio.
Import Library
As a first step, we import the EarthAI library along with Rasterio and Matplotlib.
from earthai.init import *
import rasterio
import rasterio.env
import matplotlib.pyplot as plt
Query EarthAI Catalog
The code below queries the EarthAI Catalog for Landsat 8 imagery from August 2018 with a maximum cloud cover of 10%. We pass a single point (longitude -110.0, latitude 44.5) located within the Yellowstone region. For more information on querying the EarthAI Catalog, please refer to the previous article, where we discuss this in detail.
cat = earth_ondemand.read_catalog(
    geo='POINT(-110.0 44.5)',
    start_datetime='2018-08-01',
    end_datetime='2018-08-31',
    max_cloud_cover=10,
    collections='landsat8_c2l1t1',
)
Read Imagery into Numpy Arrays
The EarthAI catalog query returns a catalog of Landsat 8 imagery scenes. It contains references to the imagery files, but not actual imagery.
The first step in reading the imagery is to determine which bands you want to read. To view a list of available Landsat 8 bands, you can run earth_ondemand.item_assets('landsat8_l1tp'). This function provides information on the available bands, spatial resolution, and band details for each collection in the EarthAI Catalog.
earth_ondemand.item_assets('landsat8_l1tp').sort_values('asset_name')
In this example, we read in B4, B3, and B2 bands, which correspond to the red, green, and blue bands, respectively.
The rasterio.env.Env context manager sets up the environment, including the AWS credentials, that Rasterio needs to pull the imagery from its location in Amazon S3. The CURL_CA_BUNDLE option points to the CA certificate bundle used for HTTPS requests. The rasterio.open function creates a DatasetReader object that contains important metadata about the image file, as shown below.
with rasterio.env.Env(CURL_CA_BUNDLE='/etc/ssl/certs/ca-certificates.crt'):
    with rasterio.open(cat.iloc[0].B4) as src:
        display(src.meta)
To read the actual imagery into Numpy arrays, you have to call the read method on the DatasetReader object. In the code below, we loop through each scene, read the red, green, and blue bands into Numpy arrays, and store them in three lists. Since each image file contains only a single band, we pass 1 as a parameter to read, which tells Rasterio to read only the first band.
red_bands = []
grn_bands = []
blu_bands = []
with rasterio.env.Env(CURL_CA_BUNDLE='/etc/ssl/certs/ca-certificates.crt'):
    for idx, scene in cat.iterrows():
        # Open each band file in a with block so the dataset is closed after reading
        with rasterio.open(scene.B4) as src:
            red_bands.append(src.read(1))
        with rasterio.open(scene.B3) as src:
            grn_bands.append(src.read(1))
        with rasterio.open(scene.B2) as src:
            blu_bands.append(src.read(1))
To view one of your image scenes, pass a Numpy array to plt.imshow.
plt.imshow(red_bands[0])
plt.show()
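Since the red, green, and blue bands are read into separate arrays, one possible next step is to combine them into a single color image. The sketch below is one way to do that, not part of the EarthAI library: the helper name to_rgb and the 2nd to 98th percentile stretch are assumptions made for illustration, and random arrays stand in for real bands so the example runs on its own.

```python
import numpy as np

def to_rgb(red, grn, blu, low=2, high=98):
    """Stack three 2-D bands into a (rows, cols, 3) float array scaled to [0, 1].

    Hypothetical helper: each band is stretched independently between its
    `low` and `high` percentiles, then clipped to the [0, 1] range.
    """
    channels = []
    for band in (red, grn, blu):
        band = band.astype("float64")
        lo, hi = np.percentile(band, [low, high])
        if hi == lo:
            # Constant band: avoid dividing by zero.
            scaled = np.zeros_like(band)
        else:
            scaled = np.clip((band - lo) / (hi - lo), 0.0, 1.0)
        channels.append(scaled)
    return np.dstack(channels)

# Random uint16 arrays stand in for red_bands[0], grn_bands[0], and blu_bands[0].
rng = np.random.default_rng(0)
red, grn, blu = (
    rng.integers(5000, 20000, size=(4, 5)).astype("uint16") for _ in range(3)
)
rgb = to_rgb(red, grn, blu)
print(rgb.shape, rgb.dtype)
```

With real data, to_rgb(red_bands[0], grn_bands[0], blu_bands[0]) could be passed to plt.imshow in the same way as a single band. Fill pixels around the edge of a scene can skew the percentiles, so you may want to mask them out first.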