Core of the "dunning" identification method.

Package Specification

The package covers both profile creation and the identification of text documents using these profiles. Both processes are encapsulated in the sk.fiit.nazou.nalit.nalit_interface.RunDunning class.

Profile creation proceeds as follows. The CreateHash class loads the input (training) text file, parses Markov chains from it, and stores them in a MultiHashtable, where the occurrences of each Markov chain are stored as integers. Next, the integer values in the MultiHashtable are converted to Doubles; this step creates a new hashtable of class ProbabilityHashtable. Finally, the ProbabilityHashtable (= profile) is saved to a file as an Object using the sk.fiit.nazou.nalit.convert_encoding.LoadOrSaveObject class.
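The counting and conversion steps can be illustrated with a minimal sketch. It does not use the package's own classes: ProfileSketch, countChains and toProbabilities are hypothetical names, plain java.util.HashMap stands in for MultiHashtable and ProbabilityHashtable, and the conversion from counts to Doubles is assumed to be a conditional relative frequency (count of a chain divided by the count of all chains sharing the same leading characters). The actual conversion used by the package may differ, for example by applying smoothing.

```java
import java.util.HashMap;
import java.util.Map;

public class ProfileSketch {

    /** Counts every character chain of length order + 1 found in the text. */
    public static Map<String, Integer> countChains(String text, int order) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + order < text.length(); i++) {
            String chain = text.substring(i, i + order + 1);
            counts.merge(chain, 1, Integer::sum);
        }
        return counts;
    }

    /**
     * Converts integer counts to Doubles. Assumption: each value becomes
     * P(last character | preceding characters), estimated as a relative frequency.
     */
    public static Map<String, Double> toProbabilities(Map<String, Integer> counts) {
        // Total number of chains that start with each context (chain minus last char).
        Map<String, Integer> contextTotals = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String context = e.getKey().substring(0, e.getKey().length() - 1);
            contextTotals.merge(context, e.getValue(), Integer::sum);
        }
        Map<String, Double> profile = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            String context = e.getKey().substring(0, e.getKey().length() - 1);
            profile.put(e.getKey(), e.getValue() / (double) contextTotals.get(context));
        }
        return profile;
    }

    public static void main(String[] args) {
        // Order 1: each chain is a pair of adjacent characters.
        Map<String, Integer> counts = countChains("abac", 1);
        Map<String, Double> profile = toProbabilities(counts);
        System.out.println(counts);
        System.out.println(profile);
    }
}
```

For the training text "abac" with order 1, the chains "ab", "ba" and "ac" each occur once; "ab" and "ac" share the context "a", so each receives probability 0.5, while "ba" receives 1.0.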

Language identification of a text file is done by the ReadTextAndGetProbability class: first, the selected profiles are loaded from files, and then, for each input text file, the class computes which profile (= language) fits best. Alternatively, the text to be identified can be passed directly as a String.
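The "which profile fits best" step can be sketched as follows. This is an illustration under stated assumptions, not the code of ReadTextAndGetProbability: IdentifySketch, train, logProbability and bestProfile are hypothetical names, profiles are built in memory rather than loaded from files, the fit is assumed to be the sum of log probabilities of the chains in the input text, and chains missing from a profile get an assumed floor probability (UNSEEN) so that the logarithm stays finite.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IdentifySketch {

    /** Assumed floor probability for chains that a profile has never seen. */
    static final double UNSEEN = 1e-6;

    /** Builds a profile: chain -> relative frequency of its last char given its context. */
    public static Map<String, Double> train(String text, int order) {
        Map<String, Integer> chains = new HashMap<>();
        Map<String, Integer> contexts = new HashMap<>();
        for (int i = 0; i + order < text.length(); i++) {
            chains.merge(text.substring(i, i + order + 1), 1, Integer::sum);
            contexts.merge(text.substring(i, i + order), 1, Integer::sum);
        }
        Map<String, Double> profile = new HashMap<>();
        for (Map.Entry<String, Integer> e : chains.entrySet()) {
            String context = e.getKey().substring(0, order);
            profile.put(e.getKey(), e.getValue() / (double) contexts.get(context));
        }
        return profile;
    }

    /** Sums the log probabilities of all chains in the text under one profile. */
    public static double logProbability(String text, Map<String, Double> profile, int order) {
        double sum = 0.0;
        for (int i = 0; i + order < text.length(); i++) {
            String chain = text.substring(i, i + order + 1);
            sum += Math.log(profile.getOrDefault(chain, UNSEEN));
        }
        return sum;
    }

    /** Returns the name of the profile with the highest log probability. */
    public static String bestProfile(String text,
                                     Map<String, Map<String, Double>> profiles,
                                     int order) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> e : profiles.entrySet()) {
            double score = logProbability(text, e.getValue(), order);
            if (best == null || score > bestScore) {
                best = e.getKey();
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Tiny illustrative training texts; real profiles need far more data.
        Map<String, Map<String, Double>> profiles = new LinkedHashMap<>();
        profiles.put("en", train("the quick brown fox jumps over the lazy dog", 2));
        profiles.put("sk", train("rychla hneda liska skace cez leniveho psa", 2));
        System.out.println(bestProfile("the dog jumps", profiles, 2));
    }
}
```

Working with sums of logarithms instead of products of probabilities avoids numeric underflow on long texts. If the input is shorter than a single chain, every profile scores 0 and the sketch returns the first profile, so a real implementation would need to handle that case explicitly.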