A natural language (represented by texts generated by native speakers) is considered as a complex system, and the type thereof to which natural languages belong is ascertained. Namely, the authors hypothesize that a language is a self-organized critical system and that the texts of a language are “avalanches” flowing down its word cooccurrence graph. The respective statistical characteristics for distributions of the number of words in the texts of English and Russian languages are calculated; the samples were constructed on the basis of corpora of literary texts and of a set of social media messages (as a substitution to the oral speech). The analysis found that the number of words in the texts obeys power-law distribution.
from #AlexandrosSfakianakis via Alexandros G.Sfakianakis on Inoreader http://ift.tt/2AdZWHG
via IFTTT
Δεν υπάρχουν σχόλια:
Δημοσίευση σχολίου