Our geocoder combines address points and interpolation lines from hundreds of open and proprietary datasets around the world. When searching for an address, we first look for a matching address point, accurate to the rooftop or parcel level. These are the most accurate results the geocoder can return. But when a building is very new, the address is wrong or unofficial, or data isn’t captured well in the area, we might not have a point for it. When there isn’t an address point, we use the street network to interpolate where the address should be.
Interpolation is essential for increasing the number of successful results we return. Each block has a start and end house number, and our geocoder uses that range to make a “best guess” at where a house should be along the street. This lets us return a result that is geographically close even when no address point is available.
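To make the “best guess” concrete, here is a minimal sketch of linear interpolation along a block. It is not our production code: the function name `interpolate_address` is hypothetical, and it assumes planar coordinates (a real geocoder would measure distance on the globe) and house numbers spaced evenly along the street.

```python
from math import hypot

def interpolate_address(line, start_number, end_number, house_number):
    """Estimate an (x, y) position for house_number along a street polyline.

    line is a list of (x, y) vertices in planar units (an assumption).
    start_number and end_number are the house numbers at the two ends of the block.
    """
    if start_number == end_number:
        fraction = 0.5  # a single-number block: fall back to the midpoint
    else:
        fraction = (house_number - start_number) / (end_number - start_number)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("house number is outside the block's range")

    segments = list(zip(line, line[1:]))
    lengths = [hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in segments]
    target = fraction * sum(lengths)  # distance to travel from the start

    # Walk the segments until the remaining distance falls inside one of them.
    for ((x1, y1), (x2, y2)), seg_len in zip(segments, lengths):
        if seg_len > 0 and target <= seg_len:
            t = target / seg_len
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        target -= seg_len
    return line[-1]  # guards against floating-point drift at the far end
```

For a straight block running from (0, 0) to (100, 0) with the range 100 to 200, number 150 lands at the midpoint, (50.0, 0.0).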
Example of OpenAddresses point data
Unfortunately, open interpolation data can be hard to come by outside of North America. So we are building an open source method to produce an interpolation layer from a street network and a set of address points (check out our code on GitHub).
Address points like the ones above are grouped together by street name and then matched with a street using a probabilistic model that combines geographic proximity and textual similarity. Next, we normalize the winding order of the street network and determine the parity of the points on the left and right sides of the street. Then, using some of the same techniques that power interpolation in our open source geocoder Carmen, we determine the points closest to the beginning and the end of the street. Finally, we add this data to the street network and discard the points.
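As a rough illustration of the matching step, the sketch below scores a cluster of address points against candidate streets by multiplying a text-similarity term by a distance-decay term. This is a stand-in, not the model we actually use: the names `match_score` and `best_street`, the `scale` parameter, the use of `difflib.SequenceMatcher`, and the representation of a street by a single midpoint are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from math import exp, hypot

def match_score(cluster_name, cluster_centroid, street_name, street_midpoint, scale=100.0):
    """Combine textual similarity and geographic proximity into one score in [0, 1]."""
    # Text term: 1.0 for identical names, lower as the names diverge.
    text = SequenceMatcher(None, cluster_name.lower(), street_name.lower()).ratio()
    # Proximity term: decays exponentially with planar distance (scale is hypothetical).
    dist = hypot(cluster_centroid[0] - street_midpoint[0],
                 cluster_centroid[1] - street_midpoint[1])
    proximity = exp(-dist / scale)
    return text * proximity

def best_street(cluster_name, cluster_centroid, streets):
    """Pick the candidate street with the highest combined score."""
    return max(
        streets,
        key=lambda s: match_score(cluster_name, cluster_centroid, s["name"], s["midpoint"]),
    )

streets = [
    {"name": "Main Street", "midpoint": (0, 0)},
    {"name": "Maine Avenue", "midpoint": (10, 0)},
    {"name": "Main Street", "midpoint": (5000, 0)},  # same name, different town
]
match = best_street("main street", (20, 0), streets)
```

Multiplying the two terms means a street has to be both nearby and similarly named to win, so the distant "Main Street" and the nearby "Maine Avenue" both lose to the first candidate.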
Here you can see addresses being separated into odd and even groups and matched to the street network. The start and end nodes are then determined (green for odd, red for even). Finally, the start and end information is combined with the street network to create the interpolation line.
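The odd/even split and the start/end selection can be sketched as follows. This is a simplified illustration under stated assumptions, not the Carmen implementation: the function name `block_ranges` is hypothetical, the street is treated as a single straight segment in planar coordinates, and each parity group is assigned to a side of the street by the sign of a cross product.

```python
def block_ranges(points, line_start, line_end):
    """Derive odd and even house-number ranges for one straight street segment.

    points is a list of (house_number, (x, y)) tuples already matched to the street.
    Returns a dict keyed by "odd" / "even" with the start number, end number, and side.
    """
    (x1, y1), (x2, y2) = line_start, line_end
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy

    ranges = {}
    for parity, label in ((1, "odd"), (0, "even")):
        group = []
        for number, (x, y) in points:
            if number % 2 != parity:
                continue
            # Position along the street: 0.0 at line_start, 1.0 at line_end.
            along = ((x - x1) * dx + (y - y1) * dy) / length_sq
            # Positive cross product means the point lies left of the street direction.
            cross = dx * (y - y1) - dy * (x - x1)
            group.append((number, along, cross))
        if not group:
            continue
        ranges[label] = {
            "start": min(group, key=lambda g: g[1])[0],  # point nearest the start node
            "end": max(group, key=lambda g: g[1])[0],    # point nearest the end node
            "side": "left" if sum(g[2] for g in group) > 0 else "right",
        }
    return ranges

points = [
    (101, (5, 3)), (103, (50, 3)), (105, (95, 3)),   # odd numbers, north of the street
    (102, (10, -3)), (108, (90, -3)),                # even numbers, south of the street
]
ranges = block_ranges(points, (0, 0), (100, 0))
```

Once the ranges are stored on the street segment, the individual points are no longer needed, which is why they can be discarded at the end of the pipeline.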
This isn’t the only exciting thing happening to our geocoder. We are looking for someone who can help us find data, expand coverage, run benchmarks and launch features. Reach out to me @nickingalls or email@example.com with any questions!