Many people want to put natural language text such as news, blogs and tweets on a map, because doing so improves our understanding of the spatial context of the text. This is especially critical for emergency information media, because instant and automatic mapping of the information streaming in from many sources allows us to respond quickly to a situation.
This mapping is relatively easy if we have machine-readable location metadata in addition to the text, but such metadata usually cannot be expected, because natural language text can be readable by humans without any structured information. Although the geocoding of structured addresses has become practical, geo-tagging of natural language text remains a difficult task.
Hence, the purpose of this project is to build a geo-tagging system that assigns location metadata to natural language text using geographic information systems (GIS) and natural language processing (NLP). Moreover, to establish an ecosystem that supports sustainable development of the system, we also focus on developing dictionaries of geographic named entities through collaboration with linked open data initiatives and participatory / voluntary systems, and on developing libraries that can be used by other frameworks for web development.
Copyright 2011, Asanobu KITAMOTO, National Institute of Informatics.