Glossary of Building Attributes Provided by OBI
-
Building Type Source
Building type source indicates how the type classification of the building was obtained.
- Buildings with an area of less than 20 square meters are considered Residential, and their Building Type Source is "area".
- Buildings associated with OSM buildings are classified using the available OSM information, in which case the source is "OSM Derived".
- For all remaining buildings, the type is determined by the model, in which case the Building Type Source is classification_model and the model's confidence in its choice is also provided.
-
Coordinates
The coordinates of the center of the building, given as latitude and longitude.
-
Roof Area
Roof area refers to the area of the building when viewed from above, given in square meters.
-
Building Faces
The number of outer walls (faces) of the building.
-
Perimeter
The perimeter of the building, i.e., the total length of its outer walls, given in meters.
-
Data Source
Data source is the source from which the building footprint was obtained. Valid values are Google, Microsoft, and OSM.
-
GHS-SMOD
GHS-SMOD refers to the degree of urbanization as defined and categorized by the Global Human Settlement Layer.
-
Height
The height of the building, given in meters and as a number of floors.
-
Gross Floor Area
Gross floor area is the sum of the footprint areas of all floors of the building. It is computed as the footprint area times the number of floors and is given in square meters.
-
Electricity Access*
The estimated mean likelihood that a building of interest is connected to the electric grid.
-
Electricity Consumption*
A mean-point estimate of a modeled distribution curve of monthly electricity consumption for a building, given in kWh.
* Currently only available in Kenya
We would like to acknowledge Dr. Stephen Lee (Open Energy Maps) and Associate Professor Jay Taneja (University of Massachusetts Amherst) for providing building-level electricity access and consumption estimates and for supporting us with model evaluation metrics for our results in Kenya.
Creating the data set
The building footprints obtained from Google-Microsoft Open Buildings (combined and published by VIDA) are merged with selected building footprints from OpenStreetMap (OSM). The building footprint catalog is further enriched with data from other public sources, e.g., the degree of urbanization of the neighborhood, and with data from closed sources, e.g., estimated height information at the building level.
The following chart outlines the steps leading to the creation of the enriched building footprint catalogue.
The basic building catalog is created by ingesting Google-Microsoft Open Buildings data (combined and published by VIDA). The VIDA-derived data set provides no additional information about the buildings, therefore ingesting it creates a “blank template” to be enriched by later steps.
The building catalogue is stored in a relational structure, i.e., as a table. For each building, the table contains its polygon given in GPS coordinates, its surface area in square meters, its perimeter in meters, the number of outer walls, and the GPS coordinates of the center of the building, called the centroid. The centroid is used to identify buildings in what follows.
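The geometric attributes listed above can all be derived from a footprint polygon. The following is a minimal sketch, not the actual OBI implementation. It assumes that the polygon vertices have already been projected from GPS coordinates to a planar coordinate system in meters, and the function name footprint_attributes and the output keys are hypothetical.

```python
import math


def footprint_attributes(vertices):
    """Return area, perimeter, face count, and centroid of a simple polygon.

    `vertices` is a list of (x, y) tuples in meters (assumed to be already
    projected), listed in order around the outline without repeating the
    first vertex at the end.
    """
    n = len(vertices)
    signed_area2 = 0.0  # twice the signed area (shoelace formula)
    cx = cy = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        signed_area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)
    if signed_area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return {
        "area_m2": abs(signed_area2) / 2.0,
        "perimeter_m": perimeter,
        # Assumption: every polygon edge counts as one outer wall.
        "faces": n,
        "centroid": (cx / (3.0 * signed_area2), cy / (3.0 * signed_area2)),
    }


# A 10 m x 10 m square footprint.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
attrs = footprint_attributes(square)
print(attrs)
```

Counting every edge as a wall overcounts faces when a footprint contains collinear vertices, so a production pipeline would probably simplify the polygon first.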
In addition, building footprints are cross-referenced with the available OSM area, building, and node information to obtain additional building types (School, Hospital, Airport Facility, etc.).
Sentinel-2 images are used to obtain a roof image for every building. Sentinel-2 images are provided as tiles covering 110x110 km of the Earth’s surface, in which clouds appear as white pixels. To show only the Earth’s surface and remove the clouds, several images of the same region are downloaded from a pre-defined time range and merged into a single image so that cloud-covered areas are filled in.
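The merging of several acquisitions can be illustrated with a small per-pixel compositing sketch. This is only one possible approach under stated assumptions: images are plain 2-D lists of grayscale values, a pixel is treated as cloud when its value reaches an assumed threshold, and the function name merge_cloud_free is hypothetical. The source does not specify how clouds are detected or how images are combined.

```python
def merge_cloud_free(images, cloud_threshold=250):
    """Merge same-sized grayscale images, taking the first cloud-free value.

    `images` is a list of 2-D lists with pixel values in 0..255, ordered by
    preference. A pixel counts as cloud when its value is >= cloud_threshold
    (an assumed rule; real pipelines typically use a dedicated cloud mask).
    """
    rows, cols = len(images[0]), len(images[0][0])
    merged = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for image in images:
                value = image[r][c]
                if value < cloud_threshold:
                    merged[r][c] = value
                    break
            else:
                # Every acquisition is cloudy here; keep the first value.
                merged[r][c] = images[0][r][c]
    return merged


first = [[255, 40], [35, 255]]
second = [[60, 255], [255, 255]]
composite = merge_cloud_free([first, second])
print(composite)
```

In this toy example the bottom-right pixel stays white because it is cloudy in both acquisitions, which is why a longer time range with more images helps.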
Once suitable satellite images are obtained, the roof image of each building is cropped. For each building, the satellite image tile that contains it is determined from the footprint polygon defined in the building catalogue. The roof image is cropped from that tile and stored for later use, and additional metadata about the satellite image are collected to enrich the building data set.
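Tile lookup and cropping can be sketched as follows. The example assumes that the footprint and the tile extents share one projected coordinate system in meters, that each tile is a dictionary with hypothetical keys bounds, pixel_size, and pixels, and that the crop is the bounding box of the footprint. None of these details are confirmed by the source.

```python
import math


def bounding_box(polygon):
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def find_tile(polygon, tiles):
    """Return the first tile whose extent fully contains the footprint."""
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    for tile in tiles:
        t_min_x, t_min_y, t_max_x, t_max_y = tile["bounds"]
        if (t_min_x <= min_x and max_x <= t_max_x
                and t_min_y <= min_y and max_y <= t_max_y):
            return tile
    return None


def crop_roof(polygon, tile):
    """Crop the pixel window that covers the footprint's bounding box."""
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    t_min_x, _, _, t_max_y = tile["bounds"]
    size = tile["pixel_size"]
    # Row 0 is the northern edge of the tile, so rows grow as y decreases.
    col_start = int((min_x - t_min_x) // size)
    col_stop = math.ceil((max_x - t_min_x) / size)
    row_start = int((t_max_y - max_y) // size)
    row_stop = math.ceil((t_max_y - min_y) / size)
    return [row[col_start:col_stop] for row in tile["pixels"][row_start:row_stop]]


# A toy 4x4 tile covering 40 m x 40 m with 10 m pixels.
tile = {
    "bounds": (0.0, 0.0, 40.0, 40.0),
    "pixel_size": 10.0,
    "pixels": [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]],
}
footprint = [(12.0, 12.0), (28.0, 12.0), (28.0, 18.0), (12.0, 18.0)]
roof = crop_roof(footprint, find_tile(footprint, [tile]))
print(roof)
```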
The Global Human Settlement Layer (GHSL) provides a publicly available data layer named the Settlement Model (SMOD) grid, which is used to classify buildings into Urban/Suburban/Rural categories based on their location. This layer, represented as a black-and-white image of a country, encodes the category of each 1x1 km grid cell as one pixel. GHSL-SMOD assigns each grid cell to one of nine categories, which are simplified for our purposes into Urban, Suburban, and Rural grid cells. Each building is classified according to the grid cell it belongs to.
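The grid-cell lookup can be sketched as below. The class codes and, in particular, their grouping into Urban, Suburban, and Rural are assumptions made for illustration, since the source does not list the grouping used. The function name classify_building and the grid layout (top-left origin, coordinates in the grid's projected system in meters) are likewise hypothetical.

```python
# Assumed SMOD class codes and an illustrative grouping (not from the source).
SMOD_TO_CATEGORY = {
    30: "Urban",
    23: "Suburban",
    22: "Suburban",
    21: "Suburban",
    13: "Rural",
    12: "Rural",
    11: "Rural",
    10: "Rural",
}


def classify_building(x, y, grid, origin_x, origin_y, cell_size=1000.0):
    """Return the simplified category of the grid cell containing (x, y).

    (origin_x, origin_y) is the top-left corner of the grid; x, y and the
    origin are in the grid's projected coordinate system, in meters.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        return None  # building lies outside the grid
    return SMOD_TO_CATEGORY.get(grid[row][col])


# A toy 2x2 grid of 1x1 km cells whose top-left corner is at (0, 2000).
grid = [[30, 21], [12, 11]]
print(classify_building(1500.0, 1500.0, grid, 0.0, 2000.0))
print(classify_building(200.0, 300.0, grid, 0.0, 2000.0))
```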
The WSF3DV3 data layer is provided by DLR and is used to calculate the height of buildings. This data layer is provided in the form of a map in which an additional black-and-white scale layer indicates the height of buildings. A single pixel of the map represents approximately ten meters. Using the footprint of each building, its median height is extracted from the map and provided as part of the visualization.
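Extracting a median height from the pixels under a footprint can be sketched as follows. This is an illustration under assumptions, not the actual procedure: the height layer is a 2-D list of values in meters, a pixel belongs to a building when its center lies inside the footprint polygon, and the names point_in_polygon and median_height are hypothetical.

```python
import statistics


def point_in_polygon(x, y, polygon):
    """Ray-casting test for a point against a simple polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def median_height(polygon, heights, origin_x, origin_y, pixel_size=10.0):
    """Median of the height pixels whose centers fall inside the footprint.

    (origin_x, origin_y) is the top-left corner of the height layer, in the
    same projected coordinates (meters) as the polygon.
    """
    samples = []
    for row, line in enumerate(heights):
        for col, value in enumerate(line):
            center_x = origin_x + (col + 0.5) * pixel_size
            center_y = origin_y - (row + 0.5) * pixel_size
            if point_in_polygon(center_x, center_y, polygon):
                samples.append(value)
    return statistics.median(samples) if samples else None


# A toy 3x3 height layer with 10 m pixels; the footprint covers four pixels.
heights = [[0, 0, 0], [6, 9, 0], [7, 12, 0]]
footprint = [(0.0, 0.0), (20.0, 0.0), (20.0, 20.0), (0.0, 20.0)]
height_m = median_height(footprint, heights, 0.0, 30.0)
print(height_m)
```

Footprints smaller than one pixel may contain no pixel center at all, in which case this sketch returns None. A real pipeline would need a fallback such as sampling the pixel under the centroid.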
Building height is estimated in meters and as a number of floors, from which the gross floor area of each building is computed in square meters.
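As a worked example of this computation, the sketch below converts a height to a floor count and then to a gross floor area. The floor height of 3.0 meters is an assumed value chosen for illustration, as the source does not state how floors are derived from height. The function names are hypothetical.

```python
def estimate_floors(height_m, floor_height_m=3.0):
    """Estimate the number of floors from a height in meters.

    floor_height_m is an assumed storey height, not a value from the source.
    Every building is given at least one floor.
    """
    return max(1, round(height_m / floor_height_m))


def gross_floor_area(footprint_area_m2, floors):
    """Gross floor area = footprint area times the number of floors."""
    return footprint_area_m2 * floors


floors = estimate_floors(9.4)
gfa_m2 = gross_floor_area(120.0, floors)
print(floors, gfa_m2)
```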
Open Energy Maps provides estimates of electricity access and electricity consumption for most of the buildings in our footprint catalog in Kenya. The building footprint catalogs used by Open Energy Maps and Open Building Insights are not identical, so a matching algorithm is needed to pair the footprints from the two sources that describe the same building.
The electricity access and consumption estimates are ingested into the data set and provided as attributes for most buildings in Kenya.
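The source does not describe the matching algorithm. One simple possibility, shown here purely as an illustration, is to pair each building with the nearest centroid from the other catalog within a distance limit. The function names, the 10 meter limit, and the sample identifiers and coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_by_centroid(obi_buildings, oem_buildings, max_distance_m=10.0):
    """Map each OBI building id to the nearest OEM building id, or None.

    Both inputs map an id to a (latitude, longitude) centroid. The distance
    limit is an assumed value.
    """
    matches = {}
    for obi_id, (lat, lon) in obi_buildings.items():
        best_id, best_dist = None, max_distance_m
        for oem_id, (o_lat, o_lon) in oem_buildings.items():
            dist = haversine_m(lat, lon, o_lat, o_lon)
            if dist <= best_dist:
                best_id, best_dist = oem_id, dist
        matches[obi_id] = best_id
    return matches


obi = {"obi-1": (-1.28640, 36.81720), "obi-2": (-1.30000, 36.80000)}
oem = {"oem-a": (-1.28642, 36.81721), "oem-b": (-1.29000, 36.82000)}
matches = match_by_centroid(obi, oem)
print(matches)
```

This brute-force loop compares every pair and can assign one Open Energy Maps building to several OBI buildings. At country scale a spatial index and a polygon-overlap criterion would likely be more appropriate.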
During this step, each building is categorized as residential or non-residential based on several criteria.
First, each building with an area smaller than 20 square meters is classified as residential. Its type_source is set to area and its confidence is left undefined, because confidence refers to the machine learning model, which is not used in this part of the inferencing process.
Next, the custom classification model evaluates the remaining buildings. For each evaluated building, the fields type and type_source are updated to res / non-res and classification_model, respectively. In addition, the model information (model_info) and the model's confidence in its classification of the given building, a value between 0 and 1 (confidence), are provided.
For more information about this model please see the Methodology page.
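The two classification steps described above can be sketched as a small cascade. The field names type, type_source, confidence, and model_info come from the text. Everything else is an assumption for illustration: the area_m2 key, the stub_model function that stands in for the real classifier, the model_info string, and the use of None for undefined values.

```python
AREA_THRESHOLD_M2 = 20.0


def classify(building, model):
    """Return a copy of the building record with classification fields set.

    `model` is any callable returning (label, confidence) for a record.
    """
    record = dict(building)
    if record["area_m2"] < AREA_THRESHOLD_M2:
        # Small buildings are residential by rule; the model is not used,
        # so confidence stays undefined (represented here as None).
        record.update(type="res", type_source="area",
                      confidence=None, model_info=None)
    else:
        label, confidence = model(record)
        record.update(type=label, type_source="classification_model",
                      confidence=confidence, model_info="stub-model-v0")
    return record


def stub_model(record):
    """Hypothetical placeholder for the real classification model."""
    return ("non-res", 0.9) if record["area_m2"] > 500 else ("res", 0.7)


shed = classify({"area_m2": 12.0}, stub_model)
hall = classify({"area_m2": 800.0}, stub_model)
print(shed)
print(hall)
```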
Data Sources
-
Sentinel-2 Cloud-Optimized GeoTIFFs
Sentinel-2 satellite images are downloaded from the public S3 bucket Sentinel-2 Cloud-Optimized GeoTIFFs containing satellite images of the Earth’s surface divided into pre-defined tiles.
-
German Aerospace Center (DLR)
Provides the WSF3DV3 data layer that is used for the building height calculation process.
-
Google-Microsoft Open Buildings (combined and published by VIDA)
Publicly available data containing a catalogue of buildings with specific coordinates and polygons (i.e., shapes of the buildings) for any given country or region.
-
GHS Settlement Model Grid
Publicly available data downloaded as GeoTIFF and used to categorize buildings into Urban, Suburban, or Rural categories.
-
OpenStreetMap (OSM)
Publicly available data containing a catalogue of buildings with specific coordinates and polygons (i.e., shapes of the buildings). Data are downloaded as shapefiles (.shp) from geofabrik.de.
-
Ookla’s Open Data
Open data sets available on a complimentary basis to help people make informed decisions around internet connectivity and internet speed.
-
Overture Maps
Publicly available data containing a catalogue of buildings with specific coordinates and polygons (i.e., shapes of the buildings).
-
Open Energy Maps
Provides building-level electricity access and consumption estimates for Kenya.
References
Multimodal Data Fusion for Estimating Electricity Access and Demand by Stephen J. Lee, 2023.
World Settlement Footprint 3D - A first three-dimensional survey of the global building stock by Esch, Brzoska, Dech, Leutner, Palacios-Lopez, Metz-Marconcini, Marconcini, Roth and Zeidler, 2022.