The definitive Map API buying guide
This section thoroughly explains the Geocoding API features you should pay attention to before choosing an appropriate vendor.
A Geocoding service converts addresses into geographic coordinates.
A Reverse Geocoding service does the opposite: it converts geographic coordinates into a human-readable address.
Forward geocoding is typically more complicated than reverse geocoding because it involves non-trivial elements such as contextualization, localization, and search results ranking.
These are the essential characteristics of a good Geocoding service and its variants:
Single line text search is the conditio sine qua non.
The one-line free text search is the industry standard for Forward Geocoding. It's the most important feature you should seek in a Geocoding API because free text search matches how real-world users think and behave. Anything else, e.g., a search split into separate country, city, and address fields, looks very odd to the end user.
However, contextualizing a real-world user's search phrase is anything but a trivial task. The geocoding engine needs to guess whether a specific word at a given position is a city name, a street name, a postal code, or something else.
Country-specific addressing schemes complicate the geocoding process even further. For example, in many continental European countries, such as Germany, the house number comes after the street name, whereas in the USA it comes before it. Search phrases may also include different addressing formats, such as postal codes in the UK or Makani numbers in the UAE. These formats need to be contextualized as well.
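To make the house-number problem concrete, here is a minimal Python sketch that splits a street-level phrase into street name and house number according to the country's usual ordering. The `HOUSE_NUMBER_FIRST` table and the `split_street` function are hypothetical names introduced only for illustration, and the country list is a small assumption, not a complete rule set. A production engine would need far richer rules or a statistical parser.

```python
import re
from typing import Optional, Tuple

# Assumption for illustration: a tiny table of countries where the house
# number is conventionally written before the street name.
HOUSE_NUMBER_FIRST = {"US", "GB", "FR"}

# A house number here is digits with an optional one-letter suffix, e.g. "12b".
_NUMBER = r"\d+[a-zA-Z]?"


def split_street(phrase: str, country: str) -> Tuple[str, Optional[str]]:
    """Return (street, house_number) using the country's usual ordering.

    house_number is None when no number is found in the expected position.
    """
    phrase = phrase.strip()
    if country in HOUSE_NUMBER_FIRST:
        # Number first: "1600 Pennsylvania Avenue"
        match = re.fullmatch(rf"({_NUMBER})\s+(.+)", phrase)
        if match:
            return match.group(2), match.group(1)
    else:
        # Number last: "Hauptstraße 12b"
        match = re.fullmatch(rf"(.+?)\s+({_NUMBER})", phrase)
        if match:
            return match.group(1), match.group(2)
    return phrase, None


print(split_street("1600 Pennsylvania Avenue", "US"))
print(split_street("Hauptstraße 12b", "DE"))
```

Note that the same phrase parsed with the wrong country convention yields no house number at all, which is exactly why the engine has to guess the addressing scheme before it can interpret the words.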
I’m feeling lucky is the expected behavior. The very first result must be the exact match of the search phrase.
Good search ranking is the second most important feature of the Geocoding engine. Not only must the best-fit results appear first, but the most relevant ones must as well.
A search for generic objects, e.g., parking, or for street names should return the best-fit place closest to your current location first. However, a combination of a street with a specific city should return that exact place, regardless of your location. A search for a well-known place, such as "Burj Khalifa", should return the Burj Khalifa building as the first result, and other places with Burj or Khalifa in their name located in your vicinity should come after.
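The ranking rule described here can be sketched in a few lines of Python: exact name matches come first, and ties are broken by distance from the user. The `Candidate` class, the `rank` function, and the "Khalifa Pharmacy" entry are hypothetical and exist only for illustration. Real engines combine many more signals, such as place prominence and match quality.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rank(query: str, candidates: List[Candidate],
         user_lat: float, user_lon: float) -> List[Candidate]:
    """Sort candidates: exact name matches first, then nearest to the user."""
    wanted = query.casefold().strip()

    def sort_key(candidate: Candidate):
        is_exact = candidate.name.casefold() == wanted
        distance = haversine_km(user_lat, user_lon, candidate.lat, candidate.lon)
        return (0 if is_exact else 1, distance)

    return sorted(candidates, key=sort_key)


# Illustrative data; "Khalifa Pharmacy" is a made-up place near the user.
places = [
    Candidate("Khalifa Pharmacy", 25.081, 55.141),
    Candidate("Burj Al Arab", 25.1412, 55.1853),
    Candidate("Burj Khalifa", 25.1972, 55.2744),
]
for place in rank("Burj Khalifa", places, user_lat=25.08, user_lon=55.14):
    print(place.name)
```

With this key, a search for a generic term where every candidate matches equally well degrades gracefully into a pure nearest-first ordering.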
The administrative hierarchy is an equally essential ingredient in geocoding and reverse geocoding variants.
The administrative hierarchy provides the connection between a street and its corresponding suburb, city, area, and country, and many use cases rely on this information. For example, from the end user's perspective, real-estate ads seemingly require only a "plain map display". Under the hood, however, filtering and displaying ads within a certain suburb or city requires that the connections between the levels of the administrative hierarchy be present.
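As a rough illustration of the real-estate example, the hierarchy can be modeled as a chain of parent links that is walked upwards from the street. The place names, the `PARENT` table, and both functions below are hypothetical; a real geocoder would return these links as part of its response.

```python
from typing import Dict, List

# Hypothetical parent links: street -> suburb -> city -> country.
PARENT: Dict[str, str] = {
    "Main Street": "Old Town",
    "Old Town": "Springfield",
    "Lake Road": "Lakeside",
    "Lakeside": "Springfield",
    "Springfield": "Exampleland",
    "Harbour Lane": "Docklands",
    "Docklands": "Shelbyville",
    "Shelbyville": "Exampleland",
}


def ancestors(place: str) -> List[str]:
    """Return all enclosing areas of a place, from nearest to farthest."""
    chain = []
    while place in PARENT:
        place = PARENT[place]
        chain.append(place)
    return chain


def ads_within(ads: List[dict], area: str) -> List[dict]:
    """Keep only the ads whose street lies inside the given area."""
    return [ad for ad in ads if area in ancestors(ad["street"])]


ads = [
    {"id": 1, "street": "Main Street"},
    {"id": 2, "street": "Lake Road"},
    {"id": 3, "street": "Harbour Lane"},
]
print([ad["id"] for ad in ads_within(ads, "Springfield")])  # [1, 2]
```

Without the parent links, the same filter would need a geometric test against boundary polygons for every ad, which is slower and only as good as the boundaries themselves.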
OSM data in particular lacks the administrative hierarchy. The reason is that OSM does not require each way or point-like object to have a parent area. This brings many challenges for OSM-based Map API vendors, and acquiring quality administrative boundaries is the first step. Even mediocre administrative boundaries will result in roads assigned to the wrong areas, from suburbs all the way up. Naive approaches based on the Nominatim geocoding engine typically result in slow performance.
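To show why boundary quality matters, here is a simplified sketch of how a road might be assigned to an area when the data carries no parent link: test a representative point of the road against each boundary polygon. This uses the standard ray-casting point-in-polygon test on flat coordinates. The function names and the square "suburbs" are assumptions for illustration, and this is not how any particular vendor or Nominatim does it.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_area(road_point: Point, areas: Dict[str, List[Point]]) -> Optional[str]:
    """Return the first area whose boundary contains the road's point."""
    for name, boundary in areas.items():
        if point_in_polygon(road_point[0], road_point[1], boundary):
            return name
    return None


# Two hypothetical square suburbs side by side.
suburbs = {
    "West End": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "East End": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
print(assign_area((1.0, 1.0), suburbs))  # West End
print(assign_area((3.0, 1.0), suburbs))  # East End
```

If the shared border between the two squares were drawn slightly off, every road near it would silently land in the wrong suburb, and the error would propagate to every level above.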
The search engine should support search phrases in different writing systems and not be limited to Latin input only. Due to this complexity, many Geocoding algorithms do not support East Asian and Indian scripts. To give you a glimpse of the difficulty: the same letter in Arabic can take different forms depending on its position in the word. Therefore, a Geocoding algorithm must take this into consideration when faced with a partial phrase.
Indian reverse geocoding: compared to the rest of the world, Indian addressing formats are incredibly unstructured. Indian addresses are based on neighborhood names and nearby points of interest (relevant nearby buildings), frequently combined with sketchy directions, for example, "Raheja Atlantis Sector 31 Gurgaon near Cafe O2". This format is convenient for locals but a nightmare for a Geocoding algorithm to parse. On top of that, Reverse Geocoding algorithms should provide the formatted address in the same format.
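As a small taste of what parsing such an address involves, the sketch below separates the landmark hint from the rest of the phrase using a handful of cue words. The cue list and the `split_landmark` function are assumptions made for illustration. Real Indian addresses use many more cues, abbreviations, and languages than a single regular expression can cover.

```python
import re
from typing import Optional, Tuple

# Assumption: a deliberately short list of English landmark cue words.
_CUES = r"near|opposite|behind"
_PATTERN = re.compile(
    rf"^(?P<address>.+?)\s+(?:{_CUES})\s+(?P<landmark>.+)$",
    re.IGNORECASE,
)


def split_landmark(phrase: str) -> Tuple[str, Optional[str]]:
    """Split 'ADDRESS near LANDMARK' into (address, landmark).

    Returns (phrase, None) when no cue word is present.
    """
    phrase = phrase.strip()
    match = _PATTERN.match(phrase)
    if not match:
        return phrase, None
    return match.group("address"), match.group("landmark")


print(split_landmark("Raheja Atlantis Sector 31 Gurgaon near Cafe O2"))
```

The address part and the landmark can then be geocoded separately, with the landmark used to disambiguate or refine the result.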
Handling partial search phrases is another important feature of a Geocoding API. Real-world users typically enter search terms in some sort of text box, and the goal is to provide the best possible suggestion as fast as possible, meaning with as few keystrokes as possible. With partial search phrases, contextualization becomes even more difficult.
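A minimal way to serve suggestions for a partial phrase is a sorted list of names searched with binary search, sketched below. The `PrefixIndex` class is hypothetical and handles only a single-token, exact prefix. A real autocomplete engine also has to rank the suggestions and deal with multi-word phrases.

```python
import bisect
from typing import Iterable, List


class PrefixIndex:
    """Case-insensitive prefix lookup over a sorted list of place names."""

    def __init__(self, names: Iterable[str]) -> None:
        # Sort by the case-folded key so binary search works on prefixes.
        self._entries = sorted((name.casefold(), name) for name in names)
        self._keys = [key for key, _ in self._entries]

    def suggest(self, prefix: str, limit: int = 5) -> List[str]:
        wanted = prefix.casefold()
        if not wanted:
            return []
        # First position whose key is >= the prefix; matches are contiguous.
        start = bisect.bisect_left(self._keys, wanted)
        results = []
        for key, name in self._entries[start:]:
            if not key.startswith(wanted):
                break
            results.append(name)
            if len(results) == limit:
                break
        return results


index = PrefixIndex(["Berlin", "Bern", "Bergen", "Paris"])
print(index.suggest("ber"))   # ['Bergen', 'Berlin', 'Bern']
print(index.suggest("berl"))  # ['Berlin']
```

Each additional keystroke narrows the contiguous range of matches, so the lookup stays cheap even on large name lists.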
Real-world users will enter misspelled phrases, so the geocoding engine needs to implement some form of fuzzy matching that tolerates spelling errors and still returns prefix matches.
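One common building block for tolerating typos is the Levenshtein edit distance. The sketch below compares the typed phrase with the equally long beginning of each candidate name and accepts it if at most `max_errors` edits are needed. The `fuzzy_prefix_matches` function is a hypothetical, simplified illustration: it scans every name linearly, whereas production engines use dedicated index structures.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_prefix_matches(query: str, names: Iterable[str],
                         max_errors: int = 1) -> List[str]:
    """Return names whose beginning is within max_errors edits of the query."""
    wanted = query.casefold()
    hits = []
    for name in names:
        # Simplification: compare against a prefix of the same length.
        head = name.casefold()[:len(wanted)]
        if edit_distance(wanted, head) <= max_errors:
            hits.append(name)
    return hits


cities = ["Amsterdam", "Amstelveen", "Rotterdam"]
print(fuzzy_prefix_matches("amstr", cities))  # ['Amsterdam', 'Amstelveen']
```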
An address by itself is sometimes just not enough, and a nearby landmark (POI) is an excellent way to enrich forward and reverse geocoding results for the end user. Note that a POI database changes faster than the road network and is harder to maintain; therefore, the results vary significantly between map data sources.
All of the above features inevitably complicate geocoding algorithms and decrease performance. So, it's always desirable to test the APIs' speed and quality upfront.
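A speed test does not need to be elaborate. The sketch below times any geocoding callable over a list of sample queries and reports the median and 95th-percentile latency. The `measure_latency` function and the `fake_geocode` stub are hypothetical. In a real test you would replace the stub with a call to the vendor's API and use queries typical of your own users.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(geocode: Callable[[str], object], queries: List[str],
                    repeats: int = 3) -> Dict[str, float]:
    """Time every query `repeats` times and summarize latency in milliseconds."""
    samples_ms = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            geocode(query)
            samples_ms.append((time.perf_counter() - start) * 1000.0)
    samples_ms.sort()
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(samples_ms)) - 1)
    return {
        "count": len(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "max_ms": samples_ms[-1],
    }


def fake_geocode(query: str):
    """Stand-in for a real API call; always returns the same coordinates."""
    return (0.0, 0.0)


report = measure_latency(fake_geocode, ["Burj Khalifa", "parking", "Main Street 5"])
print(report)
```

Looking at the 95th percentile rather than only the average matters here, because the slow outliers are what users notice while typing.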
If the goal is an on-premises solution, the hardware requirements should be minimal: either commodity hardware or modest cloud-based virtual machines.
For example, due to its design, Nominatim's geocoding and reverse geocoding are infamous for their poor performance, both during setup and in production. Nominatim's prerequisites for the entire world database are 64 GB of RAM and 900 GB of NVMe disk space, and the import process alone takes around 2 days (7-8 days on traditional spinning disks).
Check how Compact Maps ticks the boxes of your Map API buying checklist!