The tool downloads OpenStreetMap QA Tile information and satellite imagery tiles and saves them as an .npz file for use in Machine Learning training.
satellite imagery from Mapbox and Digital Globe
pip install label_maker
Note that this library requires tippecanoe as a "peer dependency"; the tippecanoe command should be available from your command line before you run label-maker.
Before running any commands, it is necessary to create a config.json file to specify inputs to the data preparation process:
{
"country": "togo",
"bounding_box": [1.09725, 6.05520, 1.34582, 6.30915],
"zoom": 12,
"classes": [
{ "name": "Roads", "filter": ["has", "highway"] },
{ "name": "Buildings", "filter": ["has", "building"] }
],
"imagery": "http://a.tiles.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token=ACCESS_TOKEN",
"background_ratio": 1,
"ml_type": "classification"
}
- country: The OSM QA Tile extract to download. The value should be a country string matching a value found in label_maker/countries.txt.
- bounding_box: The bounding box to create images from. This should be given in the form [xmin, ymin, xmax, ymax] as longitude and latitude values between [-180, 180] and [-90, 90] respectively. Values should use the WGS84 datum, with longitude and latitude units of decimal degrees.
- zoom: The zoom level to create images at. This functions as a rough proxy for resolution. Values should be given as integers.
- classes: An array of classes for machine learning training. Each class is defined as an object with two required properties:
  - name: The class name.
  - filter: A Mapbox GL Filter that defines which vector features match this class. Filters are applied with the standalone featureFilter from Mapbox GL JS.
- imagery: A template string for a tiled imagery service. Note that you will generally need an API key to obtain images and there may be associated costs. The above example requires a Mapbox access token.
- background_ratio: For single-class classification problems, we need to download images with no matching class. We will download background_ratio times the number of images matching the one class.
- ml_type: One of "classification", "object-detection", or "segmentation". For the final label numpy arrays (y_train and y_test), we will produce a different label depending upon the ml_type:
  - "classification": An array of the same length as classes. Each array value will be either 1 or 0 based on whether it matches the class at the same index.
  - "object-detection": An array of bounding boxes of the form [xmin, ymin, width, height, class_index]. In this case, the values are not latitude and longitude values but pixel values measured from the upper left-hand corner. Each feature is tested against each class, so if a feature matches two or more classes, it will have the corresponding number of bounding boxes created.
  - "segmentation": An array of shape (256, 256) with values matching the class_index label at that position. The classes are applied sequentially according to config.json, so later classes will be written over earlier class labels.
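Because every command reads the same config.json, it can be worth sanity-checking the file before starting a long download. The following is a minimal sketch that checks only the constraints stated above; `check_config` and `ML_TYPES` are hypothetical names for illustration and are not part of label-maker, and the requirement that xmin < xmax and ymin < ymax is an assumption inferred from the field names. It does not check country against label_maker/countries.txt.

```python
import json

# The three ml_type values listed in the parameter descriptions.
ML_TYPES = {"classification", "object-detection", "segmentation"}


def check_config(config):
    """Return a list of problems found in a label-maker style config dict.

    Hypothetical helper for illustration; not part of label-maker.
    """
    problems = []

    bbox = config.get("bounding_box")
    if not isinstance(bbox, list) or len(bbox) != 4:
        problems.append("bounding_box must be [xmin, ymin, xmax, ymax]")
    else:
        xmin, ymin, xmax, ymax = bbox
        # Assumption: min values must be strictly smaller than max values.
        if not -180 <= xmin < xmax <= 180:
            problems.append("longitudes must satisfy -180 <= xmin < xmax <= 180")
        if not -90 <= ymin < ymax <= 90:
            problems.append("latitudes must satisfy -90 <= ymin < ymax <= 90")

    zoom = config.get("zoom")
    if isinstance(zoom, bool) or not isinstance(zoom, int):
        problems.append("zoom must be an integer")

    if config.get("ml_type") not in ML_TYPES:
        problems.append("ml_type must be one of " + ", ".join(sorted(ML_TYPES)))

    classes = config.get("classes")
    if not isinstance(classes, list) or not classes:
        problems.append("classes must be a non-empty array")
    else:
        for index, cls in enumerate(classes):
            if not isinstance(cls, dict) or "name" not in cls or "filter" not in cls:
                problems.append(f"classes[{index}] needs both name and filter")

    return problems


# The example configuration from this README, parsed from a JSON string.
example = json.loads("""
{
  "country": "togo",
  "bounding_box": [1.09725, 6.05520, 1.34582, 6.30915],
  "zoom": 12,
  "classes": [
    { "name": "Roads", "filter": ["has", "highway"] },
    { "name": "Buildings", "filter": ["has", "building"] }
  ],
  "imagery": "http://a.tiles.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token=ACCESS_TOKEN",
  "background_ratio": 1,
  "ml_type": "classification"
}
""")

for problem in check_config(example):
    print(problem)
```

An empty result means none of the checked constraints were violated; it does not guarantee that label-maker will accept the file.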
label-maker is most easily used as a command line tool. There are five commands documented below. All commands accept two flags:
- -d or --dest: string directory for storing output files. (default: 'data')
- -c or --config: string location of config.json file. (default: 'config.json')
Example:
$ label-maker download --dest flood-monitoring-project --config flood.json
Download and unzip OSM QA tiles.
$ label-maker download
Saving QA tiles to data/ghana.mbtiles
100% 18.6 MiB 1.8 MiB/s 0:00:00 ETA
Retiles the OSM data to the desired zoom level, creates label data (labels.npz), calculates class statistics, and creates visual label files (either GeoJSON or PNG files depending upon ml_type). Requires the OSM QA tiles from the previous step.
$ label-maker labels
Determining labels for each tile
---
Residential: 638 tiles
Total tiles: 1189
Write out labels to data/labels.npz
Downloads example satellite images for each class. Requires the labels.npz file from the previous step. Accepts an additional flag:
- -n or --number: integer number of example images to create per class. (default: 5)
$ label-maker preview -n 10
Writing example images to data/examples
Downloading 10 tiles for class Residential
Downloads all imagery tiles needed for training. Requires the labels.npz file from the labels step.
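To make the imagery template concrete, here is a minimal sketch of how a {z}/{x}/{y} template can be filled in for a single tile. It assumes the standard Web Mercator XYZ tiling scheme and plain Python string formatting; `lonlat_to_tile` and `tile_url` are hypothetical helpers for illustration and do not describe label-maker's internal implementation.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return (x, y) of the XYZ tile containing a WGS84 point.

    Uses the standard Web Mercator tile formula (an assumption about the
    imagery service; this helper is not part of label-maker).
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still map to a valid tile index.
    return min(x, n - 1), min(y, n - 1)


def tile_url(template, x, y, z):
    """Substitute tile indices into a {z}/{x}/{y} template string."""
    return template.format(x=x, y=y, z=z)


# Template from the example config; ACCESS_TOKEN is a placeholder.
template = (
    "http://a.tiles.mapbox.com/v4/mapbox.satellite/"
    "{z}/{x}/{y}.jpg?access_token=ACCESS_TOKEN"
)

# Upper-left corner of the example bounding box, at the example zoom.
x, y = lonlat_to_tile(1.09725, 6.30915, 12)
print(tile_url(template, x, y, 12))
```

Each additional zoom level quadruples the number of tiles covering the same bounding box, which is why zoom acts as a rough proxy for resolution and also drives download volume.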
$ label-maker images
Downloading 1189 tiles to data/tiles
Bundles the satellite images and labels to create a final data.npz file. Requires the labels.npz file from the labels step and downloaded image tiles from the images step.
$ label-maker package
Saving packaged file to data/data.npz
Once you have a packaged data.npz file, you can use numpy.load to load it. As an example, here is how you can supply the created data to a Keras Model:
import numpy as np
from keras.models import Sequential

# the data, shuffled and split between train and test sets
# (adjust the path if needed; the package step writes to data/data.npz by default)
npz = np.load('data.npz')
x_train = npz['x_train']
y_train = npz['y_train']
x_test = npz['x_test']
y_test = npz['y_test']
# define your model here, example usage in Keras
model = Sequential()
# ...
model.compile(...)
# train
model.fit(x_train, y_train, batch_size=16, epochs=50)
model.evaluate(x_test, y_test, batch_size=16)
For more detailed walkthroughs, check out the examples page.
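Before training, it can help to confirm what the packaged file actually contains. The sketch below lists each array's shape and dtype using only numpy; `summarize_npz` is a hypothetical helper, and the file it reads here is a small synthetic stand-in whose shapes (256x256 RGB tiles, two classes) are assumptions for illustration rather than guaranteed label-maker output.

```python
import os
import tempfile

import numpy as np


def summarize_npz(path):
    """Map each array name in an .npz archive to its (shape, dtype)."""
    with np.load(path) as npz:
        return {
            name: (npz[name].shape, str(npz[name].dtype))
            for name in npz.files
        }


# Synthetic stand-in for data/data.npz. The shapes below are assumptions
# for illustration, not values produced by label-maker.
demo_path = os.path.join(tempfile.mkdtemp(), "data.npz")
np.savez(
    demo_path,
    x_train=np.zeros((8, 256, 256, 3), dtype=np.uint8),
    y_train=np.zeros((8, 2), dtype=np.int64),
    x_test=np.zeros((2, 256, 256, 3), dtype=np.uint8),
    y_test=np.zeros((2, 2), dtype=np.int64),
)

summary = summarize_npz(demo_path)
for name, (shape, dtype) in sorted(summary.items()):
    print(name, shape, dtype)
```

Pointing `summarize_npz` at a real data.npz shows whether the label arrays match the ml_type you configured, for example one column per class for "classification".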
This library builds on the concepts of skynet-data. It wouldn't be possible without the excellent data from OpenStreetMap and Mapbox under the following licenses:
- OSM QA tile data copyright OpenStreetMap contributors and licensed under ODbL
- Mapbox Satellite data can be traced for noncommercial purposes.