Part-of-speech (POS) tagging for Maltese is carried out using TnT, a statistical part-of-speech tagger implemented by Thorsten Brants. The model is trained on manually annotated texts and reaches an accuracy of 96%. Below is a list of the tags used, each with a description.
The part-of-speech tagger can be used in two ways: online through a graphical user interface, or integrated into other applications as a web-service.
Online graphical user interface
A graphical user interface is available here. It offers different levels of tagging which can be applied to a given text.
The POS tagger is also available as a web-service. The WSDL link is http://metanet4u.research.um.edu.mt/services/MtPOS?wsdl.
The service has two methods which can be invoked:
- String tagOneWordReturn(String text)
- String tagParagraphReturn(String text)
Both methods take the text to be tagged as a string and return the tagged text as another string. They differ only in how the output is laid out: tagOneWordReturn returns one word per line, while tagParagraphReturn returns the text as tagged paragraphs (if the input string contained any).
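As an illustration of how either method might be invoked from another application, here is a minimal Python sketch that builds a SOAP request by hand using only the standard library. Several details are assumptions and should be checked against the WSDL: the endpoint URL (taken to be the WSDL link without "?wsdl"), the target namespace (a placeholder here), the argument element name "text", and the use of SOAP 1.1 with an empty SOAPAction header. The helper names build_envelope and call_tagger are hypothetical.

```python
import urllib.request
from xml.sax.saxutils import escape

# Assumption: the endpoint is the WSDL link without the "?wsdl" suffix.
ENDPOINT = "http://metanet4u.research.um.edu.mt/services/MtPOS"
# Placeholder namespace; replace it with the targetNamespace given in the WSDL.
SERVICE_NS = "http://example.org/MtPOS"


def build_envelope(method: str, text: str, ns: str = SERVICE_NS) -> str:
    """Build a SOAP 1.1 envelope for a method taking one string argument."""
    # Assumption: the argument element is named "text", as in the signatures.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<soapenv:Envelope '
        'xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" '
        f'xmlns:svc="{ns}">'
        "<soapenv:Body>"
        f"<svc:{method}><svc:text>{escape(text)}</svc:text></svc:{method}>"
        "</soapenv:Body>"
        "</soapenv:Envelope>"
    )


def call_tagger(method: str, text: str, endpoint: str = ENDPOINT) -> str:
    """POST the envelope and return the raw XML response as a string."""
    request = urllib.request.Request(
        endpoint,
        data=build_envelope(method, text).encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '""',  # assumption: no specific action is required
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

A call such as call_tagger("tagOneWordReturn", "Il-kelb qed jiġri.") would then return the raw SOAP response, from which the tagged string still has to be extracted. Building the envelope in a separate function keeps the request format easy to inspect and adjust once the real namespace and element names are known.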
The format of the output is as follows: