Over the last few years, collaborative tagging systems such as Flickr, Del.icio.us and Bibsonomy have become increasingly popular because they allow users to easily upload resources such as photos, bookmarked URLs and BibTeX entries and to share them with other users. In addition, users can organize their resources by assigning tags or keywords to them. Over time, one can observe the emergence of a loose categorization system, frequently called a folksonomy, which can be used for retrieving specific resources and for navigating through the large set of resources.
Folksonomies thus constitute intriguing dynamic systems constructed through the collaboration and interaction of their users. They offer new possibilities for finding resources. At the same time, they pose a challenge for existing models of categorization and retrieval of resources, because the usage of tags at the micro-level of the individual user and at the macro-level of groups of users and of the complete user community is not yet understood, nor have the two levels been related to each other.
Recent research has brought forward an interesting temporal perspective on the understanding of folksonomies by viewing them as dynamic stochastic systems with memory. However, this perspective abstracts away the background knowledge common to folksonomy users and puts too much emphasis on the imitation of other users and the random generation of vocabulary. We advocate the hypothesis that both components, i.e. the background knowledge and the imitation, are needed for explaining and understanding the tagging behavior of users. We describe our proposal in the technical report below. The proposed model approximates the behavior found in actual tagging systems more closely and thus gives us more meaningful insights into the tagging process. For example, it helps us to distinguish between effects in the tagging system that are caused by the natural language behavior of users and effects that are specific to the user interface of tagging systems.
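To make the hypothesis more concrete, the following is a minimal sketch of a tag stream generator that mixes the two components. It is not the model from the technical report and not the Java software provided here. The function name `simulate_tag_stream`, the parameter `p_background`, the example vocabulary and the Zipf-like weights are all assumptions chosen purely for illustration.

```python
import random


def simulate_tag_stream(vocabulary, weights, n_assignments,
                        p_background=0.3, seed=0):
    """Generate an artificial tag stream (illustrative sketch only).

    With probability p_background the next tag is drawn from the
    background vocabulary according to `weights` (a stand-in for the
    users' shared background knowledge). Otherwise a tag is copied
    uniformly at random from the stream generated so far (a stand-in
    for imitation of earlier tag assignments).
    """
    if not 0.0 <= p_background <= 1.0:
        raise ValueError("p_background must be in [0, 1]")
    rng = random.Random(seed)
    stream = []
    for _ in range(n_assignments):
        # The very first tag cannot be imitated, so it always comes
        # from the background vocabulary.
        if not stream or rng.random() < p_background:
            tag = rng.choices(vocabulary, weights=weights, k=1)[0]
        else:
            tag = rng.choice(stream)
        stream.append(tag)
    return stream


# Hypothetical vocabulary with Zipf-like weights (1/rank); assumed values.
vocabulary = ["photo", "travel", "berlin", "sunset", "family"]
weights = [1.0 / rank for rank in range(1, len(vocabulary) + 1)]
stream = simulate_tag_stream(vocabulary, weights, n_assignments=1000)
```

Setting `p_background` to 0 reduces the sketch to pure imitation, in which every tag after the first is a copy of an earlier one, while setting it to 1 yields independent draws from the background vocabulary.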
In the following, we provide three files for each of the co-occurrence streams from the technical report:
|Tag||Tag Assignments||Users||Tags||Resources||Stream||URLs||Web Corpus|
Finally, we provide the Java software that was used to run the simulations described in the technical report, together with the generated artificial tag streams: