pacman::p_load(sf, tidyverse)
Hands-on Exercise 1: Geospatial Data Wrangling
Overview
In this hands-on exercise, the class learnt how to import and wrangle geospatial data using the appropriate R packages.
Learning Outcomes:
- Install and load the sf and tidyverse packages in R.
- Import geospatial and non-geospatial data using appropriate functions.
- Explore and manipulate data frames using Base R and sf functions.
- Assign or transform coordinate systems using sf functions.
- Convert data into an sf data frame using the sf package.
- Perform geospatial operations using sf functions.
- Conduct data wrangling tasks using the dplyr package.
- Perform Exploratory Data Analysis (EDA) using ggplot2 functions.
Getting started
The pacman::p_load() call installs (if they are not yet installed) and loads the sf and tidyverse packages into the R environment.
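For readers without pacman, the install-then-load behaviour can be approximated in base R. This is only a rough sketch of the idea, not pacman's actual implementation; the helper name load_or_install is hypothetical.

```r
# Hypothetical helper: install each package only if it is missing,
# then attach it. A rough base-R stand-in for pacman::p_load().
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)  # needs a configured CRAN mirror
    }
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Intended use for this exercise (commented out so nothing is installed
# unless you choose to run it):
# load_or_install(c("sf", "tidyverse"))
```

The character.only = TRUE argument is needed because the package name is held in a variable rather than typed literally.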
Importing Geospatial Data:
The data used for the exercise are as follows:
- Master Plan 2014 Subzone Boundary (Web) from data.gov.sg
- Pre-Schools Location from data.gov.sg
- Cycling Path from LTA DataMall
- Latest version of Singapore Airbnb listing data from Inside Airbnb
Importing feature data
The st_read() function from the sf package is used to import the data. It reads many geospatial formats, including ESRI shapefiles, whose component files carry extensions such as .shp, .dbf, .prj, and .shx. The function takes the following parameters:
- The dsn parameter specifies the data source, that is, the folder or file path where the data is stored.
- The layer parameter names the specific layer to read from that data source.
- Lastly, file extensions such as .shp, .dbf, .prj and .shx do not need to be included in the layer name.
mpsz <- st_read(dsn = "data/geospatial", layer = "MP14_SUBZONE_WEB_PL")
Reading layer `MP14_SUBZONE_WEB_PL' from data source
`C:\Zackkoh94\ISSS624\Hands-on_Ex1\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
Shapefiles are a common format for storing geospatial vector data such as points, lines, and polygons. “MP14_SUBZONE_WEB_PL” is the shapefile layer read here; as the output shows, it contains 323 multipolygon features with 15 fields in the SVY21 projected coordinate system. Each polygon represents a subzone from the Master Plan 2014 Subzone Boundary (Web) data.
The boundaries come from the Master Plan 2014, a forward-looking plan for Singapore’s development over the next 10 to 15 years. Subzones are typically centred on a focal point, such as a neighborhood center or activity node, and a Planning Area can consist of more than 10 subzones. The data is sourced from the Singapore Government.
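The roles of dsn and layer can be tried out without the exercise files. The sketch below uses the nc shapefile that ships with the sf package as a stand-in (it is not part of this exercise's data) and reads it in two ways: by folder plus layer name, and by full file path.

```r
library(sf)

# Folder inside the installed sf package that holds sample shapefiles.
shape_dir <- system.file("shape", package = "sf")

# List the layers GDAL can see in that folder.
st_layers(shape_dir)

# Style 1: dsn is the folder, layer is the file name without extension.
nc_a <- st_read(dsn = shape_dir, layer = "nc", quiet = TRUE)

# Style 2: dsn is the full path to the .shp file; layer is inferred.
nc_b <- st_read(dsn = file.path(shape_dir, "nc.shp"), quiet = TRUE)

# Both calls return the same number of features.
nrow(nc_a) == nrow(nc_b)
```

The quiet = TRUE argument only suppresses the "Reading layer ..." summary; drop it to see output like that shown for mpsz.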
cyclingpath <- st_read(dsn = "/Zackkoh94/ISSS624/Hands-on_Ex1/data/geospatial", layer = "CyclingPathGazette")
Reading layer `CyclingPathGazette' from data source
`C:\Zackkoh94\ISSS624\Hands-on_Ex1\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 2558 features and 2 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: 11854.32 ymin: 28347.98 xmax: 42626.09 ymax: 48948.15
Projected CRS: SVY21
This code imports polyline data from a shapefile. Polylines depict linear features such as roads, rivers, or cycling paths as sequences of connected straight-line segments. In this instance, the data represents cycling paths within Singapore, excluding park connectors. The source of this data is the Land Transport Authority.
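To make the idea of a polyline concrete, the sketch below builds a tiny line from three made-up vertices (not taken from the cycling path data) and measures it. With no coordinate reference system attached, the length is simply in the units of the coordinates.

```r
library(sf)

# Three illustrative vertices: (0,0) -> (3,4) -> (3,10).
vertices <- rbind(c(0, 0), c(3, 4), c(3, 10))

# A LINESTRING is an ordered set of vertices joined by straight segments.
line <- st_sfc(st_linestring(vertices))

# Total length is the sum of the segment lengths: 5 + 6.
total_length <- as.numeric(st_length(line))
total_length
```

A MULTILINESTRING, as reported for cyclingpath, groups several such lines into a single feature.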
preschool <- st_read(dsn = "/Zackkoh94/ISSS624/Hands-on_Ex1/data/geospatial/PreSchoolsLocation.kml")
Reading layer `PRESCHOOLS_LOCATION' from data source
`C:\Zackkoh94\ISSS624\Hands-on_Ex1\data\geospatial\PreSchoolsLocation.kml'
using driver `KML'
Simple feature collection with 2290 features and 2 fields
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 103.6878 ymin: 1.247759 xmax: 103.9897 ymax: 1.462134
z_range: zmin: 0 zmax: 0
Geodetic CRS: WGS 84
This code imports geospatial data in KML format, which is commonly used for annotating and visualizing geographic information on maps and Earth browsers. Specifically, it imports data about the locations of pre-schools in Singapore from a KML file. The source of this data is the Singapore Government.
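The output above reports a geodetic CRS (WGS 84) and an XYZ dimension, whereas the two shapefile layers are in the projected SVY21 system. The sketch below shows, on a single made-up point rather than the preschool data, how such differences can be inspected and reconciled. It assumes EPSG:3414 (SVY21 / Singapore TM) is the appropriate target; check that against your own layers.

```r
library(sf)

# One illustrative point with a zero Z value, in WGS 84 (EPSG:4326).
pt_wgs84 <- st_sfc(st_point(c(103.85, 1.29, 0)), crs = 4326)

# Longitude/latitude coordinates are geodetic, not projected.
st_is_longlat(pt_wgs84)

# Drop the unused Z dimension so the point is plain XY.
pt_xy <- st_zm(pt_wgs84, drop = TRUE)

# Reproject to SVY21 / Singapore TM (assumed target: EPSG 3414).
pt_svy21 <- st_transform(pt_xy, crs = 3414)
st_crs(pt_svy21)$epsg
st_is_longlat(pt_svy21)
```

st_transform() recomputes the coordinates, which is different from st_set_crs(), which only relabels them.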
Checking the Content of A Simple Feature Data Frame
The following code retrieves information about the content of a simple feature data frame:
Working with st_geometry()
st_geometry(mpsz)
Geometry set for 323 features
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
First 5 geometries:
MULTIPOLYGON (((31495.56 30140.01, 31980.96 296...
MULTIPOLYGON (((29092.28 30021.89, 29119.64 300...
MULTIPOLYGON (((29932.33 29879.12, 29947.32 298...
MULTIPOLYGON (((27131.28 30059.73, 27088.33 297...
MULTIPOLYGON (((26451.03 30396.46, 26440.47 303...
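st_geometry() returns only the geometry list-column, without the attribute fields. A minimal sketch of that relationship, again using the nc sample data bundled with sf in place of mpsz:

```r
library(sf)

# Sample polygons shipped with sf (stand-in for mpsz).
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# The geometry column on its own is an "sfc" object.
geom <- st_geometry(nc)
class(geom)

# One geometry per feature (row) of the sf data frame.
length(geom) == nrow(nc)

# The complementary view: attributes only, as a plain data frame.
attrs <- st_drop_geometry(nc)
class(attrs)
```

Printing geom gives the same kind of summary as shown above for mpsz: feature count, geometry type, dimension, bounding box, CRS, and the first five geometries.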