Symposium on Bias and Diversity in IR (DIV-2011)

This symposium addresses the diversity and bias of content contributed by a multitude of users and information sources on the Web. In this context, we will discuss methodologies and applications for diversifying search results as well as for combining and extracting information from multiple sources. To this end, we will present and discuss innovative methods in NLP, multimedia analysis, search technology, and opinion mining that enable bias-aware and diversity-aware information access technology for the Web. The symposium is intended for a broad range of students, researchers, and IR practitioners who are interested in or work with the diversity of content found in Web search results, on the Social Web, or in blogs.

The Web, and especially the Social Web, thrives on the multitude of actors involved in content creation. It has democratized content production in a way that potentially gives everybody a voice. However, today's search technology fails to reflect this variety in an explicit and structured way. For controversial topics such as "global warming", the content available on the Web reflects the full variety of positions and their evolution over time. However, because of the sheer mass of available content and the way current search technology ranks it (mainly by relevance and popularity), it is difficult to get a structured overview of this diversity. In addition, some content is strongly biased without making the underlying intention explicit. A better overview of existing opinions, together with support for discovering bias and analysing the underlying diversity (driven by differences in cultural background, schools of thought, temporal context, etc.), clearly helps readers form their own opinions in a well-informed way and reflect on and contextualize their own positions. We will present leading-edge technology that can help make diversity a real and tangible asset of the Web.



After a general introduction to the topic and the main research challenges in the area, the symposium day will be structured into three sessions, each covering a different aspect of dealing with bias and diversity in Web search:

  • Discovering facts and opinions and their evolution over time
  • Detecting bias and diversity in multimedia content
  • A testbed for diversification in search

Organizationally, the day will combine tutorial-style parts given by experts in the field, a high level of interactivity with the audience, and hands-on parts that give the audience the opportunity to gain deep technical knowledge and first-hand experience in building bias- and diversity-aware technology for next-generation services on the Web.

Discovering Facts and Opinions and their Evolution over Time

This session will be split into two parts. The first part will deal with methods and technologies for discovering opinions in different types of Web content. The second part will deal with discovering entity evolution and temporal facts.

Discovering opinions articulated in Web content is one of the cornerstones of dealing with diversity on the Web in a systematic way. The first part of the session will give an introduction to state-of-the-art technology in opinion mining and sentiment analysis, and will also look into more in-depth analysis methods for opinion mining.

The constantly evolving Web reflects the evolution of society in cyberspace. Web collections contain knowledge about entities (people, companies, political parties, etc.) as of a certain point in time. In the second part, we will present approaches for discovering how entities evolve and for associating each entity with the appropriate temporal fact(s). To this end, we will first introduce and compare approaches for extracting temporal facts from factual knowledge sources such as Wikipedia. After that, we will present strategies for resolving ambiguous names and mapping them onto the right entities in order to trace their evolution.

Detecting Bias and Diversity in Multimedia Content

Documents on the Web increasingly use images and other non-textual materials to convey additional information or to enhance the content in some way. The tutorial will explore how multimedia content analysis, using recent tools from research in image processing and computer vision, can be coupled with textual information extraction and used to enhance Web search in a variety of ways. The use of these tools to support diversity in both querying and result presentation will be covered, together with approaches to sentiment analysis in images and other emerging multimedia techniques.

A Testbed for Diversification in Search

The goal of this practical session is to put the theory discussed in the previous sessions into action. We will start with an introduction to the testbed developed in the EU FET Project LivingKnowledge, a development platform supporting bias- and diversity-aware search. The testbed integrates the technologies developed in the project and provides a framework for adding and evaluating new bias- and diversity-aware components, as well as for using the integrated components to support bias- and diversity-aware applications. In the hands-on session, we will use the testbed to create a baseline application. We will then walk through a real example of integrating an opinion extractor into the application: adding the extractor to a document analysis pipeline, evaluating the performance of the extractor, indexing and searching on the output of the extractor, and finally integrating the output into the baseline application. Time permitting, we will perform the same steps for an image analysis tool.


DIV-2011 is supported and organized by the EU FET Project LivingKnowledge.