International School on Semiotic Dynamics, Language and Complexity
Directors: Vittorio Loreto and Luc Steels
Ettore Majorana Foundation and Center For Scientific Culture
Erice, 12-15 December 2005


Themes


Here you can find the four major themes and the invited speakers of the school.

Issues and Challenges / Semiotic Dynamics on the Web / Human Language Dynamics / Complex Systems Models

Issues and Challenges


Tecumseh Fitch (Saint Andrews Edinburgh - Max Planck EVA Leipzig)

Critical Transitions in the Evolution of Language


Human language stands out from the communication systems of other species in several distinct ways. First, unlike calls in many other species, and unlike laughter, screams or crying in our own species, linguistic signals are learned. Although learning of complex vocalizations is unique to humans among primates, it is shared with more distant relatives including seals, birds and whales. However, language differs from whale or bird song in that we use these complex signals to convey equally complex compositional meanings. In contrast, the meanings of songs in other species are relatively simple and holistic (concerning species identity, mate attraction and territory defense). This semanticity is the aspect of language that makes it unique, and the selective forces that could lead to intentional encoding and sharing of information in this way remain a subject of debate. Based on the comparative data, I suggest that these two critical characteristics of language require different selective regimes to evolve. The first transition, the acquisition of vocal learning, is easier to explain. Vocal imitation has evolved repeatedly in nature, typically in the context of sexual selection. This renders plausible Darwin's hypothesis that the early stages of pre-linguistic hominid communication were songlike. In sharp contrast, the intentional encoding of information in signals, so conspicuously absent in song, is typically found in nature in kin-communication situations. Thus, alarm calls in many birds and vertebrates are preferentially directed towards kin, and are often omitted entirely if only non-kin are present ("audience effects"). A more interesting example is the honey-bee dance "language", the only nonhuman communication system that can relay detailed information about the spatial location of distant objects. This system clearly evolved in a kin-communication context, as honeybee dancers are communicating with their sisters, who are their closest kin.
These data suggest that kin selection via kin communication was a crucial but neglected element in the evolution of human language, and that a two stage model of language evolution can parsimoniously account for these facts.

Stefano Nolfi (CNR Rome)

Behaviour as a Complex Adaptive System: Coordinated Behaviours and the Emergence of Simple Communication Forms


In this presentation we will illustrate the complex and adaptive system nature of behaviour. The complex system nature of behaviour derives from the fact that behaviour and behavioural properties are phenomena that occur at a given time scale and result from several non-linear interactions occurring at a smaller time scale. Moreover, it is due to the fact that behaviour typically displays a hierarchical organization in which interactions between low-level properties, themselves arising from finer-grained interactions, lead to higher-level properties, and to the fact that high-level properties in turn affect the lower-level ones. These aspects will be illustrated by analysing two concrete examples in which a population of mobile robots, evolved for the ability to solve a cooperative problem, develops an ability to display a coherent behaviour and a simple communication system.

Luciano Pietronero (University of Rome)

Luc Steels (Sony CSL Paris - University of Brussels, VUB)

What is Semiotic Dynamics and Why is it a Hot Topic Today?


Semiotic Dynamics is a new research field studying how semiotic systems emerge and evolve in groups of humans or artificial agents. Semiotic systems (such as human natural languages or collaborative tagging systems on the web) relate symbols, meanings and the objects referred to by the symbols through meanings. Semiotic Dynamics uses the techniques of complex systems science to develop theoretical insights into evolving semiotic systems and the techniques of AI to simulate the emergence of semiotic systems in artificial agents (including robots) and support human semiotic dynamics through collaborative web tools. In this talk I trace the history of semiotic dynamics, review some early results, and survey open issues and challenges.

Semiotic Dynamics on the Web


Ciro Cattuto (Centro Enrico Fermi Rome)

Semiotic Dynamics and Collaborative Tagging

- PDF
A new paradigm has been quickly gaining ground on the World-Wide Web: Collaborative Tagging. In web applications like Del.icio.us, Flickr, CiteULike, Connotea, users manage their personal collection of online resources by enriching them with semantically meaningful information in the form of freely chosen tags. Despite the anarchic and selfish nature of users' behavior, the global dynamics of these systems exhibits collaborative aspects that lead to a self-organised categorization ("folksonomy") of a large and evolving body of online resources. Here we approach collaborative tagging in terms of semiotic dynamics, that is we regard tags as basic dynamical entities and focus on the evolution and emergence of tagging patterns. We collect data from a popular online system and select a semantic context by extracting all the resources associated with a given tag. On studying the frequency-rank distribution of tags co-occurring with the selected one, we find a heavy-tailed behavior which is the mark of human activity and observe properties that point to an emergent hierarchy of tags. We introduce a stochastic model embodying two main aspects of collaborative tagging: (i) a fundamental multiplicative character closely related to the idea that users are exposed to each other's tagging activity; (ii) a notion of memory - or aging of resources - in the form of a heavy-tailed access to the past state of the system. Remarkably, our simple modelling is able to account quantitatively for the measured frequency-rank properties of tag association, with a surprisingly high accuracy. This is a clear indication that collaborative tagging is able to recruit the uncoordinated actions of web users to create a predictable and coherent semiotic dynamics at the emergent level.
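The two ingredients of the stochastic model named in this abstract (copying of earlier tags, and a heavy-tailed memory of the past) can be sketched in a few lines. The code below is only an illustration in the spirit of a Yule-Simon process with memory: the function names, the parameters `p_new` and `tau`, and the specific kernel 1 / (age + tau) are assumptions made here, not details taken from the talk.

```python
import random
from collections import Counter


def simulate_tag_stream(n_steps, p_new=0.05, tau=10.0, seed=0):
    """Generate a stream of integer tag ids (illustrative model only).

    With probability p_new a brand-new tag is invented. Otherwise an
    earlier tag occurrence is copied, chosen with weight 1 / (age + tau),
    where age is the number of steps since that occurrence was added.
    The kernel and both parameters are assumptions for this sketch.
    """
    rng = random.Random(seed)
    stream = [0]   # start with a single tag
    next_tag = 1
    for _ in range(1, n_steps):
        if rng.random() < p_new:
            # invent a new tag
            stream.append(next_tag)
            next_tag += 1
        else:
            # copy an old occurrence, favouring recent ones (heavy-tailed)
            t = len(stream)
            weights = [1.0 / ((t - i) + tau) for i in range(t)]
            stream.append(rng.choices(stream, weights=weights, k=1)[0])
    return stream


def tag_rank_frequency(stream):
    """Return tag frequencies sorted from most to least frequent."""
    return sorted(Counter(stream).values(), reverse=True)
```

Calling `tag_rank_frequency(simulate_tag_stream(2000))` gives a frequency-rank list that can be plotted on log-log axes. The loop is quadratic in `n_steps`, which is acceptable for a small demonstration but not for fitting real data.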

Steffen Staab (University of Koblenz-Landau)

Learning Categorizations from Text

- Power Point
We present two models for learning categorizations from text. The first is based on formal concept analysis (FCA): we parse texts in a shallow way in order to derive a description of terms based on the syntactic context in which they appear. FCA analyzes the correlation between a concept's intensional descriptions and its extent in order to come up with a minimal lattice of concepts. We exploit the contextual specification of terms to derive a lattice of categorizations. Second, we consider the Web as a large knowledge resource that contains explicit knowledge about categorizations in the form of key phrases. An example key phrase in a particular context is "a car is a vehicle", leading to the corresponding formal representation. We show empirical results in order to determine their current quality and to outline potential for improvements.
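To make the FCA step concrete, the sketch below enumerates all formal concepts (extent, intent pairs) of a tiny hand-made context. It is a brute-force illustration, not the method used in the talk: the function name `formal_concepts` and the toy terms and attributes are hypothetical, and the enumeration over all attribute subsets is exponential, so it only suits very small contexts.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate formal concepts of a context {object: set_of_attributes}.

    Returns a set of (extent, intent) pairs of frozensets. Brute force:
    every attribute subset is closed to a concept.
    """
    objects = set(context)
    attributes = set().union(*context.values()) if context else set()

    def extent(attrs):
        # all objects that carry every attribute in attrs
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # all attributes shared by every object in objs
        if not objs:
            return frozenset(attributes)
        return frozenset(set.intersection(*(set(context[o]) for o in objs)))

    concepts = set()
    attr_list = sorted(attributes)
    for r in range(len(attr_list) + 1):
        for combo in combinations(attr_list, r):
            e = extent(set(combo))
            concepts.add((e, intent(e)))
    return concepts


# Hypothetical toy context: terms described by contexts they occur in.
toy_context = {
    "car": {"drivable", "rentable"},
    "bike": {"rideable", "rentable"},
    "flat": {"rentable"},
}
toy_concepts = formal_concepts(toy_context)
```

For this toy context the result contains four concepts, including the most general one whose extent is all three terms and whose intent is `{"rentable"}`. Ordering the concepts by extent inclusion yields the lattice.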

Luis von Ahn (Carnegie Mellon University)

ESP: Labeling Images with a Computer Game

- Power Point
The ESP Game is a seductive online game - many people play over 40 hours a week! - and when people play they help determine the contents of images on the Web by providing meaningful labels for them (e.g., an image of a house with a tree gets the labels "house" and "tree"; an image of Britney Spears gets the labels "britney spears" as well as "woman", "girl", "blonde", "singer", etc.). The labels collected by the game are extremely accurate and are guaranteed to be correct even if the people who play the game don't want them to be so. In a few months, the ESP Game has collected millions of image labels. If the game is deployed at a popular site, most images on the Web can be labeled in a matter of weeks. Attaching proper labels to images on the Web would allow for more accurate image search engines, would improve the accessibility of websites (by providing descriptions of images to visually impaired individuals), would help Web browsers block pornography, and would provide valuable training data for computer vision algorithms. This approach to labeling images is simple but novel: rather than using computer vision techniques that don't work well enough, we encourage people to do the work for us by taking advantage of their desire to be entertained.

Human Language Dynamics


Simon Garrod (University of Glasgow)

Interactive alignment and routinization as mechanisms for language change

- Power Point
Although it is apparent that languages constantly undergo change (Labov, 2000), it is still not clear by what mechanism this occurs. In this talk I propose such a mechanism based on Pickering and Garrod's (2004) interactive alignment account of language processing in dialogue. First, I will discuss interactive alignment and the related process of routinization by which interlocutors fix local interpretations for expressions in dialogue (Pickering & Garrod, 2005). I will then consider experimental evidence for routinization and show how it may account for how communities of conversationalists converge on particular ways of using the language in particular contexts. Finally, I consider how routinization may serve as a general mechanism responsible for both language acquisition and language change.


Anatol Stefanowitsch (University of Bremen)

Quantitative Corpus Linguistics and Language Change


Quantitative corpus linguistics is a newly emerging empirical framework that combines a firm commitment to rigorous statistical methods with a linguistically sophisticated perspective on language structure and use. One of the central aims of this empirical framework is to detect patterns in language use that escape the naked eye (and even traditional, more qualitatively oriented text-based methods).
These patterns are especially difficult to detect in large and complex data sets. Language change can be seen as a dynamic succession of highly complex linguistic situations and thus language change data at any given point in time constitute precisely the kind of objects that call for statistical analysis (cf. Mair 2005; Stefanowitsch, in prep. a, b).
One particularly intriguing challenge for diachronic linguistics is the identification of incipient grammaticalization processes, i.e., processes of semantico-grammatical change waiting to happen. After discussing some recent quantitative studies of language change (e.g. Krug 2003, Koops 2004, Mair 2005, Stefanowitsch, in prep. a), I will present an in-depth attempt to identify an incipient grammaticalization process in Present-Day English (cf. Stefanowitsch, in prep. b).
Using the framework of collostructional analysis (Stefanowitsch and Gries 2003, 2005; Gries and Stefanowitsch 2004a, b), I will look at one grammatical context that has led to the grammaticalization of progressive aspect in a number of languages: the expression (be) in the middle of (see, e.g., Koops 2004). I will show that this expression shows signs of incipient grammaticalization that can only be uncovered by the statistical analysis of large amounts of data but that seem to constitute the beginning of a grammaticalization path towards a new aspectual system for English.


Bernard Victorri (Lattice-CNRS, ENS Paris)

Polysemy, Construction of Meaning and Language Change


The construction of meaning in language is a complex process, which cannot be reduced to a simple bottom-up computation. Most linguistic units (lexical words and grammatical markers) are polysemous: they have several distinct related meanings. Their meaning in a given sentence depends upon the meaning of the other linguistic units within the sentence (the so-called 'co-text'). The global meaning of the sentence is therefore the result of a dynamical process, in which linguistic units interact, leading to a more or less precise meaning for each unit as well as a global meaning for the whole sentence. The model presented here is designed to capture the dynamical nature of this process by using the mathematical framework of dynamical systems (Victorri & Fuchs 1996). Using an electronic dictionary of synonyms and a large corpus, we can compute the different meanings of a polysemous unit in different sentences (Jacquet et al. 2005). The same framework can be used for modelling semantic lexical change, stressing the role of polysemy and partial synonymy in natural language efficiency (Victorri 2004).



Complex Systems Models


Lada Adamic (University of Michigan - Ann Arbor)

The Dynamics of Viral Marketing

- PDF
We present an analysis of a person-to-person recommendation network, consisting of 4 million people who made 16 million recommendations on half a million products. We observed the propagation of recommendations and the cascade sizes, which can be explained by a stochastic model. We then established how the recommendation network grows over time and how effective it is from the viewpoint of the sender and receiver of the recommendations. While on average recommendations are not very effective at inducing purchases and do not spread very far, there are product and pricing categories for which viral marketing seems to be very effective.

This is joint work with Jure Leskovec at CMU and Bernardo Huberman at HP Labs.

Alain Barrat (University of Paris)

Introduction to complex networks

- Power Point
In this talk I will review some aspects of the recent research activity on complex networks. Empirical evidence has shown the inadequacy of the usual homogeneous random graph paradigm for describing many real-world networks, in contexts ranging from social sciences to biology or computer science. New modelling paradigms have emerged, and a growing body of work is concerned with the consequences that complex topology has on the dynamics taking place on networks, from resilience problems to epidemiology or the evolution of opinion dynamics.

Jim Crutchfield (University of California - Santa Fe Institute)

Structure, Meaning and Function: A Dynamical Systems Perspective

- PDF


I will outline a theory of measurement semantics for a communication channel in which the receiver adaptively builds models of the source. The main question answered by this is, What semantic content does a particular measurement at a particular time contain? A by-product is that one comes to see how the structure of information sources, on the one hand, is central to the semantic content that can be attributed by an observer, on the other. I will give some examples of dynamical systems in which emergent structure, semantic information processing, and embedded function can be delineated.


Ramon Ferrer i Cancho (University of Rome)

Some Insights into the Complexity of Human Language and Other Communication Systems from Information and Network Theory


Word frequencies are known to follow Zipf's law. The law is an apparently universal property of world languages. Zipf's law for word frequencies does not have a good reputation among scientists because it can be reproduced by very simple mechanisms. It has been argued that those simple models make Zipf's law absolutely useless for understanding human language and the communication systems of other species. We show how simple models fail to explain Zipf's law in human language. We review recent models of Zipf's law for word frequencies suggesting that human language operates at a critical balance between maximizing the information transfer and minimizing the cost of word use. We review the evidence of Zipf's law in animal communication systems and indicate how Zipf's law can provide clues about the principles underlying the communication systems of humans and other species.
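For readers unfamiliar with the law: Zipf's law states that the frequency of a word falls off roughly as a power of its rank, f(r) ∝ r^(-α) with α close to 1. The sketch below computes a frequency-rank list from tokens and estimates α by a least-squares fit on log-log axes. The function names are chosen here for illustration, and the log-log regression is a crude estimator used only to keep the example short.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return word frequencies sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_exponent(freqs):
    """Estimate alpha in f(r) ~ r**(-alpha) from a frequency-rank list.

    Fits a straight line to log f against log r by least squares and
    returns minus its slope. Needs at least two ranks.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

On idealised frequencies proportional to 1/r the estimate is 1 up to rounding error; on real corpora the fit depends on which range of ranks is included.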

Vittorio Loreto (University of Rome)

Complex Systems Approach to Semiotic Dynamics

- Power Point
What processes can explain how very large populations are able to converge on the use of a particular word or grammatical construction without global coordination? Answering this question helps to understand why new language constructs usually propagate with a rather sudden transition towards global agreement. In this talk I will discuss a class of microscopic models of communicating autonomous agents performing language games without any central control. These systems undergo a disorder/order transition, going through a sharp symmetry breaking process to reach a shared set of conventions. Before the transition, the system builds up non-trivial scale-invariant correlations, for instance in the distribution of competing synonyms, which make the system ready for the transition towards shared conventions. If observed on the time-scale of collective behaviors, the transition becomes sharper and sharper with system size. These results not only help explain why human language can scale up to very large populations but also suggest ways to optimize artificial semiotic dynamics and design new technologies that support or orchestrate self-organizing communication systems.
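A language game without central control can be illustrated with a minimal naming game, in which agents negotiate a name for a single object through pairwise interactions. The rules below follow a common textbook formulation and are an assumption here; the abstract does not specify which models are discussed, and the function name and parameters are hypothetical.

```python
import random


def naming_game(n_agents=50, max_games=200_000, seed=0):
    """Run a minimal naming game on a fully mixed population.

    Each game picks a random speaker and hearer. The speaker utters a
    word from its inventory (inventing one if the inventory is empty).
    On success (hearer knows the word) both keep only that word; on
    failure the hearer adds it. Returns (games_played, consensus_word),
    or None if no consensus is reached within max_games.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    next_word = 0
    for game in range(1, max_games + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            inventories[speaker].add(next_word)  # invent a new word
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # success: both agents collapse to the agreed word
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:
            # failure: the hearer learns the word
            inventories[hearer].add(word)
        # consensus: every agent holds exactly the same single word
        if all(len(inv) == 1 for inv in inventories) and \
                len(set.union(*inventories)) == 1:
            return game, word
    return None
```

Recording the number of distinct words in circulation after each game, and repeating the run for increasing `n_agents`, is one way to look at how the approach to agreement changes with population size.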

Giorgio Parisi (University of Rome)

The Complexity From a Point of View of a Physicist


In this talk I will try to express my viewpoint on why complexity is interesting for physics and why the physical approach is interesting to complexity. A few examples will be presented, stressing both the accomplished results and the open problems. The importance of interdisciplinary research in this context will be stressed.