UNICEF and Mapbox have released the first results on the positive impacts of Giga, the initiative by UNICEF and the International Telecommunication Union (ITU) to connect every school in the world to the internet. The data shows that connectivity has a clear association with economic advantage, and that there are promising places to invest to ensure equitable internet access for the greatest number of people possible.
The analysis used the Mapbox Isochrone API to answer the question: “How many people can reach these educational facilities within a given travel time?” Read on to learn how to perform a similar accessibility analysis for other places and facilities using the Isochrone API, your own custom data, and sample code to get your project started.
Giga is working to connect every school in the world to the internet, and its first step is to map the location of every school and its current level of connectivity. Researchers have found that connected schools not only create educational experiences but also lead to increased economic growth in surrounding communities, because schools serve as internet connectivity hubs for people of all ages. In partnership with UNICEF, Mapbox set out to assess the population and economic conditions of people within a reasonable traveling distance of connected schools.
Why use Isochrones?
A common approach to accessibility analysis is to use as-the-crow-flies distance to estimate how far someone can travel from any given point by drawing a circular buffer around each facility, and then identifying the communities that lie within that area.
As you can see from the image above, while the circular buffer approach is quick to calculate, it is not a reliable representation of accessibility. In fact, it can lead you to assume that a portion of the population is covered when it is not. Isochrones take the road network into account to assess the true travel time from a given location, including factors like:
- Quality of road network: unpaved, slower speed roads will yield longer travel time
- Availability of roads: walking to nearest road will yield longer travel time
- Topography: natural barriers such as mountains or rivers and roads with steep ascent/descent will influence the distance and travel time
Popular tools such as AccessMod are powerful, but they require users to load their own road network data or a friction surface. In contrast, by using the Mapbox Isochrone API, we can quickly generate geometries representing the area reachable within a certain time period, and take advantage of Mapbox’s living global road network data.
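To make the request shape concrete, here is a minimal sketch of how a single Isochrone API URL can be assembled for one facility. The endpoint path and the `contours_minutes`, `polygons`, `generalize`, and `access_token` query parameters follow the public Isochrone API; the helper name `build_isochrone_url`, the sample coordinates, and the placeholder token are illustrative assumptions, not part of the published scripts.

```python
from urllib.parse import urlencode

ISOCHRONE_BASE = "https://api.mapbox.com/isochrone/v1/mapbox"


def build_isochrone_url(lon, lat, token, profile="driving", minutes=30, generalize=0):
    """Return an Isochrone API request URL for one point (hypothetical helper)."""
    if profile not in ("driving", "walking", "cycling"):
        raise ValueError(f"unsupported profile: {profile}")
    query = urlencode({
        "contours_minutes": minutes,  # travel time of the contour, in minutes
        "polygons": "true",           # ask for polygons rather than linestrings
        "generalize": generalize,     # simplification tolerance in meters
        "access_token": token,
    })
    # The API expects coordinates as "longitude,latitude".
    return f"{ISOCHRONE_BASE}/{profile}/{lon},{lat}?{query}"


# Sample coordinates and token are placeholders for illustration only.
url = build_isochrone_url(71.43, 51.17, token="YOUR_MAPBOX_TOKEN")
print(url)
```

Fetching that URL with any HTTP client returns a GeoJSON FeatureCollection containing the isochrone for the point.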
School connectivity in Kazakhstan
With UNICEF, we analyzed internet access across the 7,437 schools and 3.67 million students in Kazakhstan, sorting them into High, Medium, and Low connectivity zones.
We found that 14.78 million people have access to schools with High connectivity (defined as a school with a connection faster than 10 Mbps). The rest of the country is made up of 932,000 people with Medium connectivity (between 1 and 10 Mbps) and 40,500 people with Low connectivity (less than 1 Mbps). This leaves 3 million people without any school connectivity at all. We’ve illustrated this data with interactive maps and data layers, which UNICEF, government agencies, and their partners can use to prioritize their investments as they establish and improve connectivity in the areas that need it most.
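For a sense of proportion, the figures above can be turned into rough shares of the covered population. This is a back-of-the-envelope sketch that assumes the four groups are mutually exclusive and together make up the whole population analyzed; the 3 million figure is rounded in the text, so the shares are approximate.

```python
# People reached, by school connectivity zone, as quoted in the text.
reach = {
    "High (>10 Mbps)": 14_780_000,
    "Medium (1-10 Mbps)": 932_000,
    "Low (<1 Mbps)": 40_500,
    "No school connectivity": 3_000_000,  # rounded figure in the source
}

total = sum(reach.values())
shares = {zone: 100 * people / total for zone, people in reach.items()}

for zone, pct in shares.items():
    print(f"{zone}: {pct:.1f}% of {total:,} people")
```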
Run your own accessibility analysis
Accessibility analysis is a useful technique for everything from choosing a retail site location to organizing a vaccination campaign. To use the Isochrone API to calculate accessibility from any set of points, start with these Python scripts. The scripts can be used as is or they can be modified; they can also serve as inspiration for integration with new or existing models.
Requirements to run the scripts are Python 3, the scripts, and two input data sets:
- Input points: The facility sites for which you are calculating accessibility.
- Population raster: A .tiff file with each cell representing the number of people. Common sources for this data include SEDAC, Facebook HD Population, GRID3, and WorldPop.
To run the model, we recommend setting up a Python 3 environment using conda and the supplied environment.yml file. Using the conda package manager ensures that underlying dependencies, such as the GEOS geometry engine used by both the Shapely and GeoPandas libraries for spatial analysis, are installed cleanly.
Once you have configured the environment, use the run.sh script to run the Python scripts using the default parameters. Then, you can edit the script to modify the model parameters, or run each script individually to achieve a suitable model for your use case and region. For example, for critical health care services, people may be more willing to travel longer distances, so we would increase the ‘travel time’ parameter to 90 minutes from the default value of 30 minutes.
Configure the script using the following parameters, which map to those offered by the Isochrone API:
- input: A .geojson or .csv file containing the input points (required)
- token: A Mapbox API token (required)
- output: The filename to use for saving the output geometry. Default is inputfile_travelprofile_traveltime.json
- minutes: The time in minutes to travel from the point. Default is 30
- profile: The mode of transport: driving, walking, or cycling. Default is driving
- generalize: A tolerance, in meters, to use for simplifying the output geometries with the Douglas-Peucker algorithm. Use 0 (the default) to return the full isochrones and to avoid occasional self-intersecting geometry errors
- limit: If provided, the script will only read in the first n features of your input data. Useful for testing
- force: Overwrite any existing output files with the same name. Otherwise, the script will error if a file exists.
python isochrones.py --help # for detailed usage info
python isochrones.py --profile=driving --minutes=30 \
    --generalize=0 --token=$MAPBOX_ACCESS_TOKEN sample_data/points.geojson
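The per-point workflow behind a script like this can be sketched as a loop that requests one isochrone per input point and merges the results into a single collection. The sketch below is not the code in isochrones.py: the function name `collect_isochrones`, the `source_index` property, and the `fake_fetch` stand-in (used here so the example runs without a token or network access) are all hypothetical.

```python
def collect_isochrones(points, fetch_isochrone, limit=None):
    """Merge per-point isochrone responses into one FeatureCollection.

    points: GeoJSON FeatureCollection of Point features.
    fetch_isochrone: callable (lon, lat) -> FeatureCollection for that point.
    limit: if given, only the first `limit` input features are processed.
    """
    merged = []
    # Slicing with [:None] keeps every feature, so limit=None means "no limit".
    for index, feature in enumerate(points["features"][:limit]):
        lon, lat = feature["geometry"]["coordinates"][:2]
        response = fetch_isochrone(lon, lat)
        for isochrone in response["features"]:
            # Copy the properties and record which input point produced the shape.
            properties = dict(isochrone.get("properties") or {})
            properties["source_index"] = index
            merged.append({**isochrone, "properties": properties})
    return {"type": "FeatureCollection", "features": merged}


def fake_fetch(lon, lat):
    """Stand-in for a real HTTP request: returns a small square around the point."""
    d = 0.01
    ring = [[lon - d, lat - d], [lon + d, lat - d], [lon + d, lat + d],
            [lon - d, lat + d], [lon - d, lat - d]]
    feature = {"type": "Feature", "properties": {"contour": 30},
               "geometry": {"type": "Polygon", "coordinates": [ring]}}
    return {"type": "FeatureCollection", "features": [feature]}


points = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [71.43, 51.17]}},
    {"type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [76.89, 43.24]}},
]}

first_only = collect_isochrones(points, fake_fetch, limit=1)
print(len(first_only["features"]))
```

Passing the fetch function in as an argument keeps the merging logic testable on its own; in a real run it would wrap an HTTP call to the Isochrone API.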
The output is a GeoJSON collection of all the isochrones for all points in the data sets. The next step is to run analyze.py, which takes those isochrones and the original points, and calculates the total population within that area. Configure the script using the following parameters:
- input: A .geojson file containing the isochrones generated in the previous step
- pop_tiff: The path to a population raster file (required)
- buffer_distance: The distance in meters used to buffer the input isochrones to reduce their geometric complexity and account for walking time to communities along the road. Default: 1
- limit: If provided, only load a limited number of features, useful for testing small portions of a larger dataset
- output: The file name to use for saving the output geometry. Specify a name ending in .gpkg to save as GeoPackage (useful for visualizing in a GIS or transferring to a database) or .geojson to save as GeoJSON (useful for tiling or visualizing with Mapbox GL JS)
- points: Input facility points, used for buffering to incorporate those living within direct walking distance and to compensate for missing roads
- points_buffer_distance: The distance in meters to buffer the facility `points`. Default: 4000
python analyze.py --pop_tiff=sample_data/pop.tiff
Buffer the accessibility areas
Isochrones show travel time from any point along the road network, but in many parts of the world, people live in areas that are adjacent to roads, but not in concentrated settlements. To include these populations in the accessibility area, the script includes a buffering step that expands the shapes to include those who may start their journey with a walk to the nearest road.
Before buffering, the geometry must be projected from the geographic coordinate system into an appropriate projected coordinate system to avoid creating highly distorted shapes that are not equidistant around a point. For this purpose, the pyproj library calculates the nearest UTM zone, which provides a ready-made projection with a reasonable tradeoff of shape, area, and orientation distortions. Once projected, the data can be buffered accurately. The result is then reprojected back to the WGS84 geographic coordinate system used by Mapbox and the GeoJSON standard.
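As an illustration of what "nearest UTM zone" means, the zone for a location can be derived from its longitude alone, and the corresponding WGS84 UTM projection has a predictable EPSG code. The sketch below uses that standard formula rather than pyproj, ignores the special-case zones around Norway and Svalbard, and the function name `utm_epsg` is hypothetical.

```python
import math


def utm_epsg(lon, lat):
    """Return the EPSG code of the WGS84 UTM zone containing (lon, lat)."""
    if not -80.0 <= lat <= 84.0:
        raise ValueError("UTM is only defined between 80°S and 84°N")
    # Zones are 6 degrees wide, numbered 1..60 eastward from 180°W.
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    return (32600 if lat >= 0 else 32700) + zone


# Sample coordinates in Kazakhstan, chosen for illustration only.
print(utm_epsg(71.43, 51.17))
```

The resulting code can be handed to a projection library to transform the isochrones into meters before buffering, and back to WGS84 afterwards.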
In addition to generating buffers around the isochrones, we generate circular buffers around the input points. In rural areas, there may be direct pedestrian routes used by people who live within walking distance of a facility. This optional geodesic buffering step extends the coverage to ensure we count people living in those areas.
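A geodesic buffer around a point can be approximated without any projection by stepping a fixed distance along evenly spaced bearings. The sketch below does this on a spherical Earth, which is an assumption of the example rather than the method used in analyze.py; the function names are hypothetical, and the 4000 m radius mirrors the documented default for points_buffer_distance.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def destination(lon, lat, bearing_deg, distance_m):
    """Point reached from (lon, lat) after distance_m along bearing_deg on a sphere."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(phi1),
                             math.cos(delta) - math.sin(phi1) * math.sin(phi2))
    lon2 = (math.degrees(lam2) + 540.0) % 360.0 - 180.0  # normalize to [-180, 180)
    return lon2, math.degrees(phi2)


def circle_polygon(lon, lat, radius_m=4000, segments=64):
    """GeoJSON Polygon approximating a circle of radius_m around (lon, lat)."""
    # Decreasing bearings trace the ring counterclockwise, as GeoJSON recommends.
    ring = [list(destination(lon, lat, -360.0 * i / segments, radius_m))
            for i in range(segments)]
    ring.append(ring[0])  # close the ring
    return {"type": "Polygon", "coordinates": [ring]}


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi, dlam = phi2 - phi1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


buffer_shape = circle_polygon(71.43, 51.17)  # sample point for illustration
print(len(buffer_shape["coordinates"][0]))
```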
Next, the script dissolves the buffered isochrones and buffered points together into one accessibility “surface” representing all the area reachable from our input points. To save the surface area shape for visualization purposes later, use the --output parameter.
The final analytical step is to calculate the population covered by that shape. The `rasterstats` package provides a zonal_stats function to do just that.
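To show what a zonal sum does conceptually, here is a simplified pure-Python stand-in for that step. It is not rasterstats: it treats the raster as a small in-memory grid, counts a cell when its center falls inside the polygon (an assumption of this sketch), and uses hypothetical function names and made-up population values.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the closed ring of vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_sum(grid, x_min, y_max, cell_size, ring):
    """Sum the grid cells whose centers fall inside the polygon ring.

    grid: rows of values, first row at the top; None marks a no-data cell.
    """
    total = 0.0
    for row, values in enumerate(grid):
        center_y = y_max - (row + 0.5) * cell_size
        for col, value in enumerate(values):
            center_x = x_min + (col + 0.5) * cell_size
            if value is not None and point_in_polygon(center_x, center_y, ring):
                total += value
    return total


# Made-up 4x4 population grid with 1-unit cells; the top-left corner is at (0, 4).
population = [
    [5, 10, 0, 2],
    [8, 20, 4, 1],
    [3, 7, 6, 0],
    [1, 2, 9, None],
]
# A square covering the upper-left quarter of the grid.
accessible_area = [(0, 2), (2, 2), (2, 4), (0, 4), (0, 2)]

covered = zonal_sum(population, x_min=0, y_max=4, cell_size=1, ring=accessible_area)
print(covered)
```

In the real analysis the grid is the population raster and the polygon is the dissolved accessibility surface, so the sum is the number of people within reach of the facilities.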
Iterate and visualize
While building the configuration for an accessibility analysis, it is helpful to run quick iterations of analyze.py and view the intermediate results. Use the --output file.gpkg option to save the resulting accessibility surface in GeoPackage format, then view it in a GIS tool such as QGIS. Once the input model parameters are finalized, run the model with --output accessibility.geojson to generate output as GeoJSON for tiling.
To view the results in an interactive map, either upload GeoJSON directly to Mapbox Studio or use the Mapbox Tiling Service to quickly generate vector tiles from the accessibility shapes. For a large country like Kazakhstan, the resulting geometry is quite large; vector tiles make it possible to visualize these layers performantly in ways that GeoJSON cannot. See the README for an example workflow.
Let’s work together
We’d love to hear about your travel time accessibility questions. Let us show you how an isochrones-based approach might help. Get in touch to discuss with our team.