Think of real world features like streets, rivers, houses and cities. Each feature on OpenStreetMap stems from a real world observation. Observations can be made in ground surveys, through sensors like GPS, from satellite imagery, from photos or they can be copied from third party data. Selecting the appropriate source and translating it correctly into OpenStreetMap is immensely important - it’s the definition of great mapping.
When mapping, it’s essential to make sure that the information we add to the map is properly sourced - no matter whether the source is traced, like imagery, or imported. There’s no use in adding information to the map that we cannot appropriately source. If you’re not sure whether a source is fine to use, don’t use it.
Here are examples of sources that are appropriate to use for OpenStreetMap:
- Your firsthand local knowledge, for instance from a ground survey
- Satellite imagery explicitly available to OpenStreetMap for tracing - for example Bing or Mapbox Satellite
- GPS tracks explicitly available for OpenStreetMap - for example GPS tracks you collected yourself, or GPS tracks uploaded and available from OpenStreetMap.org
- Mapillary photos
- Original information on web sites such as a store’s address on the store’s web site
Do NOT use these sources:
- Proprietary maps - most paper maps or for example Google Maps, including Google Street View
- Any source data without clear permissive terms
- Any source you’re not sure is appropriate to use, or that may be incorrect
A lot of OpenStreetMap data is original - collected by community members first hand. This data is mostly gathered in surveys and then in a separate step entered into OpenStreetMap with the same editors we use for remote mapping. Here are typical survey techniques:
- collecting GPS tracks
- collecting data with print maps
- using geolocated photos, for instance through Mapillary
While the easiest way to get started is to trace remotely off satellite imagery, ground surveyed data is highly valuable: it’s a firsthand account from the mapped place, and it takes the longest to collect.
Mapping party with Ônibus Hacker in Rio de Janeiro, 2012.
The availability of high resolution satellite imagery has allowed OpenStreetMap’s data volume to explode. It’s easy to go into OpenStreetMap and trace features from imagery, and tracing is much faster than a ground survey. However, not all information can be traced from imagery - road names, place names and points of interest like schools and cafés aren’t visible from imagery. In fact, a lot of original data creation in OpenStreetMap today is a combination of tracing and surveying - as much information as possible is traced from imagery before subsequent ground surveys add detail.
Read the state of satellite imagery from the World Bank for a primer on how imagery is used for various analyses.
Third party data
There is a large volume of third party data available out there, mostly from governments but also from other sources. If the data is licensed in a compatible way and of high enough quality, it can be used to improve the map. Imports are usually complex as they involve data cleaning and conflation with existing OpenStreetMap data. While ground survey data and satellite traced data can be added to OpenStreetMap without coordinating with other community members, imports do require a formal proposal and a community peer review. Examples of imports are the New York City building and address import that we’ve led and the US Census Bureau TIGER data that has been imported in the United States. Here’s an animation of the progress importing buildings in New York City:
GPS probe data
We’ve already mentioned collecting GPS tracks as a ground surveying technique. Large bodies of GPS data, collected systematically from cars for instance, can be used for more than that. They can be bundled and processed to detect missing roads, misaligned roads and wrong one-ways, and to estimate average speeds and speed limits. This is a fairly advanced technique, but it should be mentioned here because it’s not a direct data import but rather an indirect process of creating a derived data set that can be used to improve OpenStreetMap.
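To make the idea of a derived data set a little more concrete, here is a minimal sketch in Python of one small building block: estimating the average speed along a single GPS track. The function names and the `(lat, lon, unix_seconds)` tuple layout are assumptions made for this illustration, not part of any OpenStreetMap tool. A real probe data pipeline would also need to match tracks to roads and aggregate many tracks per road segment, which this sketch leaves out.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def average_speed_kmh(track):
    """Average speed in km/h over a track.

    `track` is a hypothetical input format: a list of
    (lat, lon, unix_seconds) tuples sorted by time.
    """
    if len(track) < 2:
        raise ValueError("track needs at least two points")
    elapsed_s = track[-1][2] - track[0][2]
    if elapsed_s <= 0:
        raise ValueError("timestamps must increase along the track")
    # Sum the distances between consecutive points.
    distance_m = sum(
        haversine_m(a[0], a[1], b[0], b[1])
        for a, b in zip(track, track[1:])
    )
    return distance_m / elapsed_s * 3.6  # m/s to km/h


# Example: three made-up points along the equator, ten seconds apart.
example_track = [(0.0, 0.0, 0), (0.0, 0.001, 10), (0.0, 0.002, 20)]
print(round(average_speed_kmh(example_track), 1))
```

Comparing estimates like this across many tracks on the same road is one plausible way to flag a speed limit tag that looks wrong, but the result is only a hint that still needs to be verified against a usable source.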