Toponym resolution

In geographic information systems, toponym resolution is the process of relating a toponym, i.e. the mention of a place, to an unambiguous spatial footprint of the same place.
The same geographic names have historically been used by emigrant settlers to denote their new homes, leading to referential ambiguity of place names. Sometimes, the original name gets modified. In many cases, a name is reused without modification. To map a set of place names or toponyms that occur in a document to their corresponding latitude/longitude coordinates, a polygon, or any other spatial footprint, a disambiguation step is necessary. A toponym resolution algorithm is an automatic method that performs a mapping from a toponym to a spatial footprint.
Most methods for toponym resolution employ a gazetteer of possible mappings between names and spatial footprints.

Resolution process

The "unambiguous spatial footprint of the same place" in the definition may be truly unambiguous or only approximately so. The resolution process can occur in several contexts of uncertainty:

When the evidence is geographical and carries no uncertainty. For example, obtaining the country name for the place where a photo was taken, when the place is given as a GPS position 1000 km from any country border.
When the evidence is geographical but carries considerable uncertainty. Imagine a similar scenario in which the GPS error is 100 meters and the place is about 100 meters from a country border.
When the evidence is only textual. Imagine a letter in which a tourist recounts a trip after returning from vacation. The only evidence is textual, in the narrative.
Mixed sources of evidence: more than one piece of evidence, none of them precise.
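The difference between the first two cases can be sketched in a few lines of Python. This is only an illustration: the function name `classify_evidence` and the simple rule (compare the GPS error radius with the distance to the nearest border) are assumptions made for this example, not part of any standard method.

```python
def classify_evidence(distance_to_border_m: float, gps_error_m: float) -> str:
    """Label a GPS fix as certain or uncertain with respect to a country border.

    Hypothetical helper: both arguments are distances in meters.
    """
    # If the error radius cannot reach the border, the country is unambiguous.
    if distance_to_border_m > gps_error_m:
        return "certain"
    # Otherwise the true position could lie on either side of the border.
    return "uncertain"


print(classify_evidence(1_000_000, 100))  # fix 1000 km from the border
print(classify_evidence(100, 100))        # fix about 100 m from the border
```

The first call corresponds to the photo taken far from any border, the second to the fix whose error radius reaches the border.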

From geographical evidence

Toponym resolution is sometimes a simple conversion from a name to an abbreviation, especially when the abbreviation is used as a standard geocode. For example, converting the official country name Afghanistan into its ISO country code, AF.
When annotating media and metadata, conversion using a map and the geographical evidence is the most common approach to obtaining a toponym, or a geocode that represents the toponym.
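The name-to-code conversion can be sketched as a table lookup. The table below is a tiny hand-written excerpt of ISO 3166-1 alpha-2 codes, and the names `ISO_3166_ALPHA2` and `country_to_code` are hypothetical; a real system would load the full code list from an authoritative source.

```python
from typing import Optional

# Tiny illustrative excerpt of ISO 3166-1 alpha-2 codes (not the full standard).
ISO_3166_ALPHA2 = {
    "afghanistan": "AF",
    "canada": "CA",
    "united kingdom": "GB",
}


def country_to_code(name: str) -> Optional[str]:
    """Return the alpha-2 code for a country name, or None if it is not in the table."""
    # Normalize case and surrounding whitespace before the lookup.
    return ISO_3166_ALPHA2.get(name.strip().lower())


print(country_to_code("Afghanistan"))
```

A lookup of this kind only works when the name is already unambiguous; it performs no disambiguation of its own.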

From textual evidence

In contrast to geocoding of postal addresses, which are typically stored in structured database records, toponym resolution is typically applied to large unstructured text document collections to associate the locations mentioned in them with maps.
The process of annotating media using spatial footprints is known as geotagging. To geotag a text document automatically, two steps are usually undertaken: toponym recognition and toponym resolution.
Toponym recognition can be considered a special case of named-entity recognition in which the objective is merely to extract location entities. The result of named-entity recognition can be further improved using hand-crafted or statistical rules.
To obtain location interpretations, resolution models tend to leverage gazetteers such as GeoNames and OpenStreetMap. A naive approach to resolving toponyms is to pick the most populated interpretation from the list of candidates. This approach seems viable in a text where the toponyms Toronto and London refer to their most common interpretations, located in Canada and Britain respectively. In a news article where London refers to the city in Ontario, Canada, however, it fails to pinpoint the correct interpretation. Hence, selecting the highest population does not work well for toponyms in a localized context.
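The population baseline can be sketched as follows. The gazetteer here is a hand-made toy with rounded, approximate coordinates and population figures chosen for illustration; it is not drawn from GeoNames or OpenStreetMap, and the names `Candidate`, `GAZETTEER` and `resolve_by_population` are assumptions of this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    admin: str
    lat: float
    lon: float
    population: int  # rough, illustrative figure


# Toy gazetteer: toponym -> candidate interpretations (approximate values).
GAZETTEER = {
    "London": [
        Candidate("London", "England, United Kingdom", 51.5074, -0.1278, 8_900_000),
        Candidate("London", "Ontario, Canada", 42.9849, -81.2453, 400_000),
    ],
    "Toronto": [
        Candidate("Toronto", "Ontario, Canada", 43.6532, -79.3832, 2_800_000),
    ],
}


def resolve_by_population(toponym: str) -> Optional[Candidate]:
    """Naive baseline: pick the most populated candidate, ignoring all context."""
    candidates = GAZETTEER.get(toponym, [])
    return max(candidates, key=lambda c: c.population, default=None)


choice = resolve_by_population("London")
print(choice.admin if choice else None)
```

Because the function never looks at the surrounding text, it returns the English city for every mention of London, including mentions in an article about Ontario.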
Additionally, toponym resolution does not address metonymy in general. Nonetheless, a resolution technique can still disambiguate a metonymic reference as long as it is identified as a toponym in the recognition phase. For instance, in a sentence where Canada is used metonymically to refer to "the government of Canada", a generic named-entity recognizer can still identify it as a location, and a toponym resolver is thus able to disambiguate it.

Approaches

Toponym resolution methods can be broadly divided into supervised and unsupervised models. Supervised methods typically cast the problem as a learning task in which the model first extracts contextual and non-contextual features and a classifier is then trained on a labelled dataset. The Adaptive model is one of the prominent models proposed for resolving toponyms. For each interpretation of a toponym, the model derives context-sensitive features based on geographical proximity and sibling relationships with other interpretations. In addition to context-related features, the model benefits from context-free features, including population and audience location. Unsupervised models, on the other hand, do not require annotated data. They are preferable to supervised models when the annotated corpus is not sufficiently large and supervised models may therefore not generalize well.
Unsupervised models tend to better exploit the interplay of toponyms mentioned in a document. The Context-Hierarchy Fusion model estimates the geographic scope of documents and leverages the connections between nearby place names as evidence to resolve toponyms. By mapping the problem to a conflict-free set cover problem, this model achieves a coherent and robust resolution.
Furthermore, adopting Wikipedia and knowledge bases has been shown to be effective in toponym resolution. TopoCluster models the geographical senses of words by incorporating Wikipedia pages of locations and disambiguates toponyms using the spatial senses of the words in the text.
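The idea of using nearby place names as evidence can be illustrated with a much simpler heuristic than the Context-Hierarchy Fusion model: choose, for all toponyms in a document at once, the combination of interpretations with the smallest total pairwise distance. The sketch below is not the set-cover formulation described above; the toy gazetteer (approximate coordinates) and the names `haversine_km` and `resolve_jointly` are assumptions of this example.

```python
import math
from itertools import combinations, product

# Toy gazetteer: toponym -> list of (admin area, latitude, longitude), approximate.
GAZETTEER = {
    "London": [
        ("England, United Kingdom", 51.5074, -0.1278),
        ("Ontario, Canada", 42.9849, -81.2453),
    ],
    "Toronto": [
        ("Ontario, Canada", 43.6532, -79.3832),
    ],
}


def haversine_km(a, b):
    """Great-circle distance in km between two (admin, lat, lon) candidates."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[1], a[2], b[1], b[2]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def resolve_jointly(toponyms):
    """Pick one candidate per toponym so that the total pairwise distance is minimal."""
    options = [GAZETTEER[t] for t in toponyms]  # raises KeyError for unknown names
    best = min(
        product(*options),
        key=lambda combo: sum(haversine_km(a, b) for a, b in combinations(combo, 2)),
    )
    return dict(zip(toponyms, best))


print(resolve_jointly(["London", "Toronto"])["London"][0])
```

With Toronto in the same document, the Ontario interpretation of London yields the smaller total distance and is selected. The exhaustive search grows exponentially with the number of ambiguous toponyms, so it is suitable only for short documents.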

Geoparsing

Geoparsing is a special toponym resolution process that converts free-text descriptions of places into unambiguous geographic identifiers, such as geographic coordinates expressed as latitude and longitude. One can also geoparse location references from other forms of media, for example audio content in which a speaker mentions a place. With geographic coordinates, the features can be mapped and entered into geographic information systems. Two primary uses of the geographic coordinates derived from unstructured content are to plot portions of the content on maps and to search the content using a map as a filter.
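Both uses can be sketched with a resolved place represented as a GeoJSON Feature, which stores point coordinates in longitude, latitude order. The helper names `to_geojson_feature` and `in_bbox`, the example coordinates, and the bounding boxes are assumptions made for illustration.

```python
import json


def to_geojson_feature(name: str, lat: float, lon: float) -> dict:
    """Wrap a resolved place as a GeoJSON Point Feature (coordinates are [lon, lat])."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }


def in_bbox(feature: dict, west: float, south: float, east: float, north: float) -> bool:
    """Map-as-filter: test whether a point feature lies inside a bounding box.

    Assumes the box does not cross the antimeridian.
    """
    lon, lat = feature["geometry"]["coordinates"]
    return west <= lon <= east and south <= lat <= north


feature = to_geojson_feature("London, Ontario", 42.9849, -81.2453)
print(json.dumps(feature))                  # ready to plot on a map
print(in_bbox(feature, -95, 41, -74, 57))   # rough box around Ontario
```

The serialized feature can be handed to any map viewer that reads GeoJSON, while `in_bbox` shows the kind of test a map-based search would apply to each geoparsed document.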
Geoparsing goes beyond geocoding. Geocoding analyzes unambiguous structured location references, such as postal addresses and rigorously formatted numerical coordinates. Geoparsing handles ambiguous references in unstructured discourse, such as "Al Hamra," which is the name of several places, including towns in both Syria and Yemen.
A geoparser is a piece of software or a service that helps in this process. Some examples:

automated georeferencing
– Semi-automatic georeferencing
– Freely available GIS information for areas outside of the U.S.A. and Antarctica, updated monthly by the National Geospatial-Intelligence Agency and the U.S. Board on Geographic Names
– Freely available database containing information on almost 2 million physical features, places, and landmarks in the U.S.A.
– CLAVIN is an open source software package for document geotagging and geoparsing that employs context-based geographic entity resolution.
– Geoparser.io is a web service that identifies places mentioned in text, disambiguates those places, and returns GeoJSON with detailed metadata about the places found in the text.
– Geocode.xyz is a web service that identifies both place names and street addresses mentioned in text.
– geoparsepy is a free Python geoparsing library supporting free-text location identification and disambiguation using the OpenStreetMap database.
