What is a CRS or projection?
A coordinate refrence system is a set of mathematical constructs used to translate locations on the three-dimensional surface of the Earth to the two dimensional plane. Knowing what CRS spatial data is expressed in is important because it tells us where on the Earth the data is. Comparisons between spatial data can only be done when both data are in the same CRS.
Sometimes also called a projection, the translation of data from one CRS to another is often called reprojection.
Common Projections
There are several common projections that you will encounter as you work with raster and vector data.
- WGS84 - Basically the GPS coordinates of the location. If the CRS is not specified, many vector sources can be assumed to be in WGS84. This is also sometimes referred to as EPSG:4326 or CRS84.
- EPSG:3857 - The most common projection for web mapping services. Also known as web Mercator.
- Universal transverse mercator projections:
- Those using the WGS84 model of the Earth:
- EPSG:326XX for northern hemisphere, for example zone 15N is EPSG:32615
- EPSG:327XX for southern hemisphere, for example zone 1S is EPSG:32701
- Those using the WGS84 model of the Earth:
Convenience Functions
Let's get started. First, import convenience functions from earthai.geo
for reprojecting [Shapely][Shapely] vector geometries in Python.
from earthai.init import * from earthai.geo import reproject_from_wgs84, reproject_to_wgs84, reproject_on_the_fly, reproject from shapely.geometry import Point from shapely.wkt import loads as loads_wkt import geopandas import folium
doha_wgs84 = Point(51.533333, 25.286667) crs84 = 'OGC:CRS84'
Reproject to the Qatar national grid. Many countries define one or more CRSs to be used in official maps.
reprojected = reproject_from_wgs84(doha_wgs84, 'epsg:28600') reprojected.x, reprojected.y
Reproject it back to longitude and latitude. There is a small numeric error.
reproject(reprojected, 'epsg:28600', crs84 ).wkt
wyoming_wgs84 = loads_wkt('POLYGON((-111.04226573835854 45.03030926505925,-104.05496105085854 45.03030926505925,-104.05496105085854 41.01665169983997,-111.04226573835854 41.01665169983997,-111.04226573835854 45.03030926505925))') wyoming_wgs84.bounds
The reproject_on_the_fly
function will reproject to the appropriate UTM zone for the geometry, based on its center. It returns both the geometry and the projection that it is in.
g, p = reproject_on_the_fly(wyoming_wgs84) g.bounds, p
We reproject back to WGS84.
reproject_to_wgs84(g, p).bounds
Reprojecting GeoPandas Objects
The official GeoPandas docs covers both setting the CRS for and reprojecting GeoDataFrames and GeoSeries.
Here we'll present a quick example.
gs = geopandas.GeoSeries([Point(-77, 40), Point(-80, 42)]) gs.crs = crs84 gs
Reproject a GeoPandas object using the to_crs
method.
gs.to_crs('epsg:32617')
Reprojecting in a Spark DataFrame
In EarthAI Notebook, we have convenient functions for working with columns of vector geometries.
We will start by creating a Spark DataFrame from scratch. The geom
column will contain geometry objects.
from pyspark.sql import Row spark_geo_df = spark.createDataFrame([Row(geom=g) for g in gs]) spark_geo_df
To reproject a column of vector geometries, use the st_reproject
function. The arguments are as our reproject
function for Shapely geometries: geometry, origin CRS, and destination CRS. However they can each be columns on the Spark DataFrame.
For the CRS columns, you can pass a string lit
{ open=new}, rf_mk_crs
, st_wgs84_crs
, or rf_crs
.
spark_geo_df.withColumn('geo_3005', st_reproject('geom', st_wgs84_crs(), lit('epsg:3005')))
Working with Raster Extents
Most tile columns in RasterFrames have embedded spatial information: the CRS and extent of the tile. The extent can be interepreted as a vector geometry. There are some convenience functions for considering raster extents. The code below shows:
- querying Earth OnDemand and reading rasters
- reprojecting the tile extent from the native CRS of the imagery to WGS84
- converting the WGS84 extent to a GeoPandas GeoDataFrame
- viewing the result in a web map
raster_df = spark.read.raster( earth_ondemand.read_catalog(Point(-112, 33), start_datetime='2019-01-01', end_datetime='2019-05-31', cloud_cover=2, collections='landsat8_l1tp'), ['B2'] )
rf_geometry
lets us extract the native extent as a Polygon object.
rf_crs
extracts the CRS of the imagery. It is formatted as a proj
string
raster_df.select(rf_geometry('B2'), rf_crs('B2'))
Combine them together to get tile coverage in our desired CRS.
extents = raster_df.select( st_reproject(rf_geometry('B2'), rf_crs('B2'), st_wgs84_crs() ).alias('tile_geo_wgs84') ) extents
Finally, we will convert a subset of the DataFrame to GeoPandas and make a map using folium
.
extents_gdf = geopandas.GeoDataFrame( extents.limit(200).toPandas(), geometry='tile_geo_wgs84', crs=crs84 ) folium.Map([33, -112], zoom_start=8).add_child(folium.GeoJson(extents_gdf))
Comments
0 comments
Article is closed for comments.