Dataset information
Available languages
German
Keywords
landsat-7, landsat-5, opendata, land, bodenbedeckung, global-settlement-growth, urbanization, inspireidentifiziert
Dataset description
The World Settlement Footprint (WSF) 2019 is a 10 m resolution binary mask outlining the extent of human settlements globally, derived from 2019 multitemporal Sentinel-1 (S1) and Sentinel-2 (S2) imagery. Based on the hypothesis that settlements generally show more stable behavior over time than most other land-cover classes, temporal statistics are calculated for both S1- and S2-based indices. In particular, a comprehensive analysis has been performed using a number of reference building outlines to identify the most suitable set of temporal features (ultimately including 6 from S1 and 25 from S2). Training points for the settlement and non-settlement classes are then generated by thresholding specific features, which vary depending on the 30 climate types of the well-established Köppen-Geiger scheme. Binary classification based on random forest is applied and, finally, dedicated post-processing is performed in which ancillary datasets are employed to further reduce omission and commission errors. The whole classification process has been carried out entirely within the Google Earth Engine platform. To assess the high accuracy and reliability of the WSF2019, two independent crowdsourcing-based validation exercises have been carried out with the support of Google and MapSwipe, respectively, in which 1M reference labels overall have been collected based on photointerpretation of very high-resolution optical imagery.
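The stability hypothesis can be illustrated with a minimal Python sketch. The description does not list the exact temporal statistics, so the four used here (mean, minimum, maximum, standard deviation), the function name temporal_features and the sample values are assumptions made purely for illustration.

```python
from statistics import mean, pstdev


def temporal_features(series):
    """Summarise one pixel's index time series with simple temporal statistics.

    The choice of statistics is an assumption; the actual feature set
    (6 from S1, 25 from S2) is not spelled out in the description.
    """
    return {
        "mean": mean(series),
        "min": min(series),
        "max": max(series),
        "std": pstdev(series),
    }


# Hypothetical vegetation-index values over one year for two pixels.
built_up = [0.12, 0.15, 0.13, 0.14, 0.12, 0.13]
cropland = [0.10, 0.35, 0.70, 0.80, 0.40, 0.15]

stable = temporal_features(built_up)
variable = temporal_features(cropland)

# A settlement pixel is expected to vary less over time than cropland.
print(stable["std"] < variable["std"])
```

In this toy case the built-up pixel has a much smaller temporal standard deviation than the cropland pixel, which is the kind of contrast the temporal features are meant to capture.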
Starting backwards from the year 2015 (for which the WSF2015 is used as a reference), settlement and non-settlement training samples for the given target year t are iteratively extracted by applying morphological filtering to the settlement mask derived for the year t+1, and by discarding possibly mislabeled samples through adaptive thresholding of the temporal mean NDBI, MNDWI and NDVI. Finally, binary random forest classification is performed.
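The sample-extraction step can be sketched in Python as follows. The three indices use their standard normalized-difference definitions. The 3x3 erosion stands in for the unspecified morphological filter, and the fixed cutoff ndvi_max=0.4 stands in for the adaptive thresholds, so both are assumptions, as are all function names.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total else 0.0


def spectral_indices(green, red, nir, swir1):
    """Standard index definitions computed from reflectance values."""
    return {
        "NDVI": normalized_difference(nir, red),
        "NDBI": normalized_difference(swir1, nir),
        "MNDWI": normalized_difference(green, swir1),
    }


def erode(mask):
    """3x3 binary erosion; border pixels are always set to 0."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if all(mask[r + dr][c + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                out[r][c] = 1
    return out


def candidate_settlement_samples(mask_next_year, mean_ndvi, ndvi_max=0.4):
    """Core settlement pixels of year t+1 that do not look like vegetation.

    ndvi_max is a hypothetical fixed cutoff; the product uses adaptive
    thresholds on NDBI, MNDWI and NDVI that are not given here.
    """
    core = erode(mask_next_year)
    return [
        (r, c)
        for r, row in enumerate(core)
        for c, keep in enumerate(row)
        if keep and mean_ndvi[r][c] <= ndvi_max
    ]


# Toy 5x5 example: everything is settlement in year t+1,
# but the centre pixel has a vegetation-like temporal mean NDVI.
mask_t_plus_1 = [[1] * 5 for _ in range(5)]
mean_ndvi = [[0.2] * 5 for _ in range(5)]
mean_ndvi[2][2] = 0.7

samples = candidate_settlement_samples(mask_t_plus_1, mean_ndvi)
indices = spectral_indices(green=0.1, red=0.1, nir=0.3, swir1=0.2)
```

Erosion keeps only pixels well inside the year t+1 settlement mask, and the index test then drops those whose spectral behavior contradicts the settlement label. In the full procedure this would be repeated for each earlier target year, using the mask just derived as the new reference.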
To quantitatively assess the high accuracy and reliability of the dataset, an extensive campaign based on crowdsourced photointerpretation of very high-resolution airborne and satellite historical imagery has been performed with the support of Google. In particular, for each of the years 1990, 1995, 2000, 2005, 2010 and 2015, ~200K reference cells of 30x30 m size distributed over 100 sites around the world have been labelled, summing up to ~1.2M validation samples overall.
It is worth noting that past Landsat-5/7 availability varies considerably across the world and over time. Regardless of the implemented approach, this might result in a lower quality of the final product where few or no scenes have been collected. Accordingly, to provide users with a suitable and intuitive measure of the quality of the available Landsat imagery, we conceived the Input Data Consistency (IDC) score, which ranges from 6 to 1 with: 6) Very good; 5) Good; 4) Fair; 3) Moderate; 2) Low; 1) Very low. The IDC score is available on a yearly basis between 1985 and 2015 and supports a proper interpretation of the WSF evolution product.
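As a minimal Python sketch, the IDC classes can be held in a lookup table and used to flag years with weaker input data. The function names and the cutoff of 4 ("Fair") are hypothetical choices for illustration, not a recommendation from the dataset authors.

```python
# IDC score classes as listed in the description.
IDC_LABELS = {
    6: "Very good",
    5: "Good",
    4: "Fair",
    3: "Moderate",
    2: "Low",
    1: "Very low",
}


def idc_label(score):
    """Return the class name for an IDC score, or 'Unknown' if out of range."""
    return IDC_LABELS.get(score, "Unknown")


def reliable_years(idc_by_year, minimum=4):
    """Years whose IDC score is at least `minimum` (hypothetical cutoff)."""
    return sorted(year for year, score in idc_by_year.items() if score >= minimum)


# Hypothetical yearly IDC scores for one location.
idc_by_year = {1985: 1, 1990: 3, 2000: 5, 2015: 6}
print(idc_label(idc_by_year[1990]))
print(reliable_years(idc_by_year))
```

Settlement growth detected in years with a low IDC score may reflect sparse imagery and not actual change, so it is worth reading the two layers together.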
The WSF evolution and IDC score datasets are organised in 5138 GeoTIFF files (EPSG:4326 projection), each covering a tile of 2x2 degree size (~222x222 km) on the ground. WSF evolution values range between 1985 and 2015 and correspond to the estimated year of settlement detection, whereas 0 denotes no data.
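The value encoding can be illustrated with a short Python sketch that turns WSF evolution pixel values into a settlement extent for a given year. The tile_origin helper assumes that tiles are aligned to even degrees, which the description does not state. The quoted ~222 km follows from 2 degrees at roughly 111 km per degree, which holds near the equator.

```python
import math

NO_DATA = 0
FIRST_YEAR, LAST_YEAR = 1985, 2015


def settled_by(pixel_value, year):
    """True if the pixel was detected as settlement in or before `year`."""
    return pixel_value != NO_DATA and FIRST_YEAR <= pixel_value <= year


def settlement_pixel_count(values, year):
    """Number of pixels already flagged as settlement by `year`."""
    return sum(settled_by(v, year) for v in values)


def tile_origin(lon, lat, size=2):
    """Lower-left corner of the tile containing (lon, lat).

    Assumes tiles are aligned to multiples of `size` degrees; this
    alignment is a guess and is not confirmed by the description.
    """
    return (math.floor(lon / size) * size, math.floor(lat / size) * size)


# Hypothetical pixel values read from one WSF evolution tile.
values = [0, 1985, 1990, 2001, 2015, 0]

print(settlement_pixel_count(values, 2000))
print(tile_origin(11.3, 48.1))
```

Because each pixel stores the first detection year, the extent for any year between 1985 and 2015 is obtained by a single comparison, without needing one mask per year.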
A comprehensive publication with all technical details and accuracy figures is currently being finalised. For the time being, please refer to Marconcini et al. 2021.