Playful Technology Limited

Ontologies for Named Entity Recognition

Semantic relationships make an ontology useful for Named Entity Recognition

I once had two projects in succession where I was trying to identify named entities in free text. One was successful, the other wasn't, and the reasons why are interesting.

The first project was for True 212, and I used the WikiData ontology. The second was for a pharmaceutical company, and used the MeSH ontology. In each case, a search of the ontology database would return several false positives - for example, searching for "Africa" might return the continent or the Roman province, whereas "Lagos" could be the largest city in Nigeria or a railway station in Portugal. However, WikiData doesn't just store entities; it makes claims about them - that is, it encodes semantic relationships between them. Therefore, if a document mentions both "Lagos" and "Africa", a Named Entity Recognition system based on WikiData can use the fact that Lagos is a city in Africa to determine which Lagos you mean and which Africa you mean.
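
The disambiguation idea above can be sketched in a few lines of Python. This is a minimal illustration, not the system built for either project: the entity identifiers, the relation names, and the tiny in-memory set of claims are all made up for the example and are not real WikiData identifiers or properties. The sketch tries every combination of candidate entities for the mentions in a document and keeps the combination whose members are most closely related to each other.

```python
from itertools import product

# Hypothetical candidate entities for each surface form found in the text.
# These IDs are illustrative only, not real WikiData identifiers.
CANDIDATES = {
    "Lagos": ["lagos_nigeria", "lagos_station_portugal"],
    "Africa": ["africa_continent", "africa_roman_province"],
}

# Hypothetical claims as (subject, relation, object) triples.
CLAIMS = {
    ("lagos_nigeria", "located_in", "nigeria"),
    ("nigeria", "located_in", "africa_continent"),
    ("lagos_station_portugal", "located_in", "portugal"),
    ("portugal", "located_in", "europe"),
    ("africa_roman_province", "part_of", "roman_empire"),
}


def neighbours(entity):
    """Entities that share a claim with `entity`, in either direction."""
    linked = set()
    for subject, _relation, obj in CLAIMS:
        if subject == entity:
            linked.add(obj)
        elif obj == entity:
            linked.add(subject)
    return linked


def related(a, b):
    """True if a and b are linked directly or through one shared neighbour."""
    near_a = neighbours(a)
    return b in near_a or bool(near_a & neighbours(b))


def disambiguate(mentions):
    """Pick the candidate combination with the most related pairs."""
    best, best_score = None, -1
    for combo in product(*(CANDIDATES[m] for m in mentions)):
        score = sum(
            related(a, b)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if score > best_score:
            best, best_score = combo, score
    return dict(zip(mentions, best))


print(disambiguate(["Lagos", "Africa"]))
```

With these toy claims, the only related pair is the Nigerian Lagos and the continent, linked through Nigeria, so that combination wins. Without the claims, every combination would score zero and there would be nothing to choose between them.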

Unfortunately, that wasn't the case with MeSH. It didn't encode relationships between the medical terms it documents in any useful way, so it wasn't possible to perform the same sort of disambiguation as with WikiData. The key insights from this are that relationships are meaning, and that before working with an ontology, it's vital to know not just what entities it contains, but also what relationships between them it encodes. An ontology of entities can be used for manual tagging, but for analysis, you need an ontology of relationships.

By @Dr Peter J Bleackley in
Tags: #NLP, #Named Entity Recognition, #Ontology, #WikiData, #MeSH