Using social network analysis to enhance information retrieval systems
Type
conference paper
Date Issued
2008-09-12
Author(s)
Abstract (De)
People increasingly reveal very personal information on social network sites in particular and on the World Wide Web in general. As this information becomes publicly available, the social relationships between people can be identified, which in turn enables the automatic extraction of social networks. This trend is further driven and reinforced by recent initiatives such as Facebook Connect, MySpace Data Availability and Google Friend Connect, which make their social network data available to anyone.
Furthermore, the current development of the World Wide Web, termed "Web 2.0" by O'Reilly, enables more and more people to publish information without profound technical knowledge. Blogs, for example, have gained a lot of attention in recent years. The blogosphere as a whole, comprising more than 70 million blogs, forms a substantial body of information and knowledge. Additionally, hypertext links between blogs have been described as expressing conversation, affiliation, or readership, implying a form of implicit social structure. Publicly available information is thus increasingly annotated with author information, which also allows the extraction of social networks.
These developments, together with increasing computing power and a growing amount of freely available scientific publication data in diverse databases, have led to a dramatic growth of interest in social network analysis (SNA) and in network analysis in general. However, little attention has been paid to the application of SNA in information retrieval systems. Recent studies suggest that a person's social network has a significant impact on his or her information acquisition. Additionally, SNA offers methods for identifying important persons within social networks, who could have a significant influence on the importance of certain information.
Therefore, the paper proposes the application of available social network data in the context of information retrieval systems. It outlines a research design for exploring meaningful sources for social network extraction and for assessing the impact of SNA methods and measures on information retrieval systems. These methods and measures are evaluated on ScientificCommons.org, a search platform for open access publications with more than 21 million publications, 8.5 million extracted authors and their co-authorship network.
The contribution of this paper consists of an analysis of online information sources in terms of their usability for the extraction of social networks, and a research framework for the analysis and application of social network methods to information retrieval systems. The research framework was applied to the co-authorship network of scientific publications. The co-authorship network was used to compute different centrality measures of the authors, which in turn were used to refine the relevance ranking of publications within information retrieval systems. The performance of the rankings based on the different centrality measures was evaluated by measuring the click-through performance in the search results.
Funding(s)
Language
German
HSG Classification
contribution to scientific community
Refereed
Yes
Event Title
5th Applications of Social Network Analysis (ASNA)
Event Location
Zurich
Event Date
12-13 September 2008
Subject(s)
Eprints ID
46444
File(s)
open.access
Name
2008_07_Using_social_network_analysis_to_enhance_information_retrieval_systems.pdf
Size
278.4 KB
Format
Adobe PDF
Checksum (MD5)
23e3b7cda11e0423e33be2e5139dd4ff