In this example, we will read a shapefile as a Spark DataFrame, using The Nature Conservancy's Terrestrial Ecoregions spatial data layer.
In [1]:
from earthai.init import *
import requests
import zipfile
import os
Out [1]:
Importing EarthAI libraries.
EarthAI version 1.6.0; RasterFrames version 0.9.0; PySpark version 2.4.7
Creating SparkSession...
Download the Shapefile
In [2]:
! wget https://astraea.box.com/v/TerrestrialEcosystems
Out [2]:
wget: /opt/conda/envs/earthai/lib/libuuid.so.1: no version information available (required by wget)
--2021-12-29 19:57:47-- https://astraea.box.com/v/TerrestrialEcosystems
Resolving astraea.box.com (astraea.box.com)... 107.152.26.197
Connecting to astraea.box.com (astraea.box.com)|107.152.26.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://astraea.app.box.com/v/TerrestrialEcosystems [following]
--2021-12-29 19:57:48-- https://astraea.app.box.com/v/TerrestrialEcosystems
Resolving astraea.app.box.com (astraea.app.box.com)... 107.152.26.201
Connecting to astraea.app.box.com (astraea.app.box.com)|107.152.26.201|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘TerrestrialEcosystems’
TerrestrialEcosyste [ <=> ] 11.84K 47.7KB/s in 0.2s
2021-12-29 19:57:48 (47.7 KB/s) - ‘TerrestrialEcosystems’ saved [12128]
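The wget output reports a content type of text/html and a saved size of about 12 KB, which suggests the saved file may be a web page rather than the shapefile archive itself. A minimal sketch of a sanity check is below, using only the standard library's zipfile module. The helper name list_shapefiles is hypothetical and is not part of EarthAI.

```python
import zipfile


def list_shapefiles(path):
    """Return the names of the .shp members inside a zip archive.

    Hypothetical helper (not part of EarthAI). Raises ValueError when
    `path` is not a valid zip file, for example a saved HTML page.
    """
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a zip archive")
    with zipfile.ZipFile(path) as archive:
        # Keep only the members with a .shp extension (case-insensitive).
        return [name for name in archive.namelist()
                if name.lower().endswith(".shp")]
```

Calling list_shapefiles('Terrestrial_Ecoregions.zip') before reading the archive would be expected to return at least one name if the download is a usable shapefile archive.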
In the call to read.shapefile, pass the full path to either the .shp or .zip file.
In [3]:
df = spark.read.shapefile(os.path.abspath('Terrestrial_Ecoregions.zip'))
df.select('__fid__', 'geometry', 'ECO_NAME')
Out [3]:
__fid__ | geometry | ECO_NAME |
Terrestrial_Ecoregions.1 | MULTIPOLYGON (((16396046.3195 -286864.58... | Admiralty Islands Lowland Rain Forests |
Terrestrial_Ecoregions.2 | MULTIPOLYGON (((14273952.844999999 -9179... | Banda Sea Islands Moist Deciduous Forests |
Terrestrial_Ecoregions.3 | MULTIPOLYGON (((15163437.869599998 -1415... | Biak-Numfoor Rain Forests |
Terrestrial_Ecoregions.4 | MULTIPOLYGON (((14161520.147600003 -4262... | Buru Rain Forests |
Terrestrial_Ecoregions.5 | MULTIPOLYGON (((15253961.773900002 -3333... | Central Range Montane Rain Forests |
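Because read.shapefile expects a full path to a .shp or .zip file, it can help to validate and resolve the path before the call. The following is a rough sketch under that assumption. The helper name shapefile_path is hypothetical, and it checks only the file extension, not the file contents.

```python
import os


def shapefile_path(path):
    """Return the absolute form of `path` if it ends in .shp or .zip.

    Hypothetical helper (not part of EarthAI). It checks only the file
    extension and does not verify that the file exists.
    """
    extension = os.path.splitext(path)[1].lower()
    if extension not in (".shp", ".zip"):
        raise ValueError(f"expected a .shp or .zip file, got {path!r}")
    return os.path.abspath(path)
```

With this helper, the read in cell [3] could be written as spark.read.shapefile(shapefile_path('Terrestrial_Ecoregions.zip')).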
We can work with this DataFrame as shown in the RasterFrames documentation and in querying raster data with vectors.