| Issue | EPL, Volume 113, Number 1, January 2016 |
|---|---|
| Article Number | 18002 |
| Number of page(s) | 6 |
| Section | Interdisciplinary Physics and Related Areas of Science and Technology |
| DOI | https://doi.org/10.1209/0295-5075/113/18002 |
| Published online | 28 January 2016 |
Scaling laws and model of words organization in spoken and written language
1 School of Electronic Science and Engineering, Ministry of Education Key Laboratory of Modern Acoustics, Institute for Biomedical Electronics Engineering, Nanjing University - Nanjing 210093, China
2 Department of Physics, Boston University - Boston, MA 02215, USA
3 Department of Neurology and Program in Neuroscience, Harvard Medical School, Beth Israel Deaconess Medical Center - Boston, MA 02215, USA
4 College of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications - Nanjing 210003, China
5 Harvard Medical School and Division of Sleep Medicine, Brigham and Women's Hospital - Boston, MA 02115, USA
6 Institute of Solid State Physics, Bulgarian Academy of Sciences - Sofia 1784, Bulgaria
Received: 3 December 2015
Accepted: 6 January 2016
A broad range of complex physical and biological systems exhibits scaling laws. The human language is a complex system of words organization. Studies of written texts have revealed intriguing scaling laws that characterize the frequency of words occurrence, rank of words, and growth in the number of distinct words with text length. While studies have predominantly focused on the language system in its written form, such as books, little attention is given to the structure of spoken language. Here we investigate a database of spoken language transcripts and written texts, and we uncover that words organization in both spoken language and written texts exhibits scaling laws, although with different crossover regimes and scaling exponents. We propose a model that provides insight into words organization in spoken language and written texts, and successfully accounts for all scaling laws empirically observed in both language forms.
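The abstract names three measurable quantities (word frequency, word rank, and the growth in the number of distinct words with text length) without giving formulas. As a rough illustration only, the Python sketch below shows one way such quantities could be extracted from a transcript or text. The function names (`tokenize`, `rank_frequency`, `vocabulary_growth`, `loglog_slope`), the regular-expression tokenizer, and the single least-squares fit are all assumptions made for this example; they are not the authors' method, and a single slope cannot capture the crossover regimes the abstract mentions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def rank_frequency(tokens):
    """Return (rank, word, count) tuples, with rank 1 for the most frequent word."""
    counts = Counter(tokens)
    return [(rank, word, freq)
            for rank, (word, freq) in enumerate(counts.most_common(), start=1)]


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each successive token."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x): a crude scaling-exponent estimate."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    return sxy / sxx


# Toy input for illustration; a real analysis needs a large corpus.
sample = "the cat sat on the mat and the dog sat on the log"
tokens = tokenize(sample)
ranked = rank_frequency(tokens)
growth = vocabulary_growth(tokens)
```

On a real corpus, one might pass the ranks and counts from `rank_frequency` to `loglog_slope` to estimate a rank-frequency exponent, and the token positions and values from `vocabulary_growth` to estimate a vocabulary-growth exponent. Fitting separate ranges would be needed wherever the scaling changes between regimes.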
PACS: 89.20.-a – Interdisciplinary applications of physics / 89.65.Ef – Social organizations; anthropology / 89.65.Gh – Economics; econophysics, financial markets, business and management
© EPLA, 2016