This example demonstrates how to mask the MODIS Vegetation Indices data collection. There are 2 datasets in this collection. MOD13A1 is from the Terra satellite and MYD13A1 is from the Aqua satellite.
In this example, we run through the steps of 1) acquiring imagery scenes from the EarthAI Catalog, and 2) using RasterFrames to read and mask imagery.
Masking is a very common operation in remote sensing data processing to mark certain pixels as null values. It is useful for excluding cloudy or low quality pixels from your analysis.
We will start by importing all of the Python libraries used in this example.
from earthai.init import *
There are 2 quality indicator bands for the MOD13A1/MYD13A1 collections in the EarthAI catalog: PR and VIQ. The pixel reliability band (PR) contains a simplified categorization of the data that describe overall pixel quality. The PR values are described in the table below.
|NoData||Fill/No Data||Not Processed|
|0||Good Data||Use with confidence|
|1||Marginal data||Useful, but look at other QA information|
|2||Snow/Ice||Target covered with snow/ice|
|3||Cloudy||Target not visible, covered with cloud|
The VI quality assessment band (VIQ) contains detailed pixel quality descriptions. The VIQ values are described in the table below.
|0-1||VI Quality||00||VI produced with good quality|
|01||VI produced, but check other QA|
|10||Pixel produced, but most probably cloudy|
|11||Pixel not produced due to other reasons than clouds|
|2-5||VI Usefulness||0000||Highest quality|
|1101||Quality so low that it is not useful|
|1110||L1B data faulty|
|1111||Not useful for any other reason/not processed|
|8||Adjacent cloud detected||0||No|
|9||Atmosphere BRDF Correction||0||No|
|11-13||Land/Water Mask||000||Shallow ocean|
|001||Land (Nothing else but land)|
|010||Ocean coastlines and lake shorelines|
|011||Shallow inland water|
|101||Deep inland water|
|110||Moderate or continental ocean|
More information about MOD13A1/MYD13A1 quality indicators can be found here.
Query EarthAI Catalog
We pass a lat-long point near Dar es Salaam, Tanzania to the EarthAI catalog and query it for MOD13A1 image scenes from September 2019 with a maximum cloud cover of 10%.
cat = earth_ondemand.read_catalog(geo='Point(39.20 -6.80)', start_datetime='2019-09-01', end_datetime='2019-09-30', collections='mod13a1', max_cloud_cover=10 )
Read Image Bands
We pass cat to
spark.read.raster to read in the NDVI band.
To view all of the available bands for the MOD13A1 and MYD13A1 collection, you can run
rf = spark.read.raster(cat, catalog_col_names=['NDVI', 'PR', 'VIQ'])
As a first step, we use the PR band to mask out snow, ice, and cloud covered pixels using the
rf_mask_by_values function. This function replaces values in a tile column with NoData where the mask tile column is equal to one of the mask values. In the PR band, 2 represents snow/ice and 3 represents clouds, so we pass these values in a list to mask out these pixels in the NDVI band.
rf = rf.withColumn('NDVI_masked', rf_mask_by_values('NDVI', 'PR', [2, 3]))
We look at a sample of NDVI tiles before and after masking in the code below.
After masking out the snow, ice, and cloud covered pixels using the PR band, we are left with good and marginal data. With marginal data, it is suggested to look at other quality assessment information, so we use the VIQ band to look at detailed pixel quality information.
If we only want to keep high quality pixels, we can filter out all of the decreasing and lowest quality pixels using the vegetation index (VI) usefulness indicator. The VIQ indicator values are represented in binary digits as shown in the VIQ table above. The RasterFrames function,
rf_mask_by_bits, handles masking of binary digits. This function replaces values in a tile column with NoData where the mask tile column is equal to one of the mask values at each bit location.
We need to pass this function a starting bit, number of bits, and list of mask values. If you refer to the VIQ table above, you can see that the starting bit for VI usefulness is 2 and the number of bits in each mask value is 4. To get the mask values, we transform the binary digits to decimal for the decreasing and lowest quality pixels.
We pass all of this information to the
rf_mask_by_bits function as well as the NDVI_masked and VIQ bands.
rf = rf.withColumn('NDVI_masked', rf_mask_by_bits('NDVI_masked', 'VIQ', 2, 4, [2, 4, 8, 9, 10, 12, 13, 14, 15]))
We look at a sample of NDVI tiles before and after the additional masking in the code below.