ABSTRACT

Contents

11.1 Introduction
11.2 Long-Term Personalization Scenario
11.3 Requirements for a User Profile
    11.3.1 Capturing and Segmenting News Broadcasts
    11.3.2 Exploiting External Knowledge
    11.3.3 Categorizing News Stories
11.4 Tackling User Profiling Problems
    11.4.1 User Profile Model
    11.4.2 Capturing Evolving Interest
        11.4.2.1 Constant Weighting
        11.4.2.2 Exponential Weighting
        11.4.2.3 Linear Weighting
        11.4.2.4 Inverse Exponential Weighting
    11.4.3 Capturing Different Aspects of Interest
11.5 Summary
Acknowledgment
References

11.1 Introduction

With the growing capabilities and falling prices of current hardware systems, there are ever-increasing possibilities to store and manipulate videos in a digital format. With ever-increasing broadband capabilities, it is also now possible to view video online as easily as text-based pages were viewed when the web first appeared. People are now producing their own digital libraries from material created with digital cameras and camcorders, and use a number of systems to place this material on the web as well as to store it in their own individual collections [9]. An interesting research problem is how to assist users in dealing with such large and swiftly increasing volumes of video, i.e., how to help them satisfy their information needs by finding videos they are interested in. For example, a user who enjoys sitcoms might benefit from a personalized video retrieval system that automatically identifies this interest and, further, informs the user about other sitcoms he or she is not aware of.

An important question that needs to be answered in this context is how users’ personal information needs can be identified. A promising method is to employ relevance feedback (RF) techniques. RF can be split into two main paradigms: explicit and implicit RF. With explicit RF, users are asked to judge the relevance of videos. Unfortunately, users tend not to provide continuous feedback, which is problematic when feedback is required to identify users’ interests over a longer period of time. Implicit feedback techniques, by contrast, do not ask the user to rate results; they help to learn users’ interests unobtrusively. The main advantage is that this approach relieves the user from providing explicit feedback. As a large quantity of implicit data can be gathered without disturbing the users’ workflow, the implicit approach is an attractive alternative.
In order to study these research challenges, we focus in this work on news videos. News broadcasts consist of many short, independent news items that users can be interested in. Thus, news bulletins allow for the development of user profiling and recommendation techniques that rely on documents with similar features. In the context of this chapter, we hence assume that users’ interests in certain news topics can be identified by finding those news items that users interacted with most. Further, we assume that users stay interested in certain news topics over a longer time period and thus might provide implicit RF over multiple interaction sessions.
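
The assumption that interest can be read off from the items a user interacted with most, accumulated over several sessions, can be illustrated with a minimal Python sketch. It is not the profiling model developed later in the chapter: the log format, the story identifiers, the topic labels, and the function names `topic_interest` and `top_topics` are all hypothetical, and every interaction is simply counted once.

```python
from collections import Counter

# Hypothetical implicit-feedback log: one list of (story_id, topic)
# interaction events per session. All values are invented for illustration.
sessions = [
    [("s1", "politics"), ("s2", "sports"), ("s1", "politics")],
    [("s3", "politics"), ("s4", "weather")],
    [("s5", "sports"), ("s6", "politics")],
]


def topic_interest(sessions):
    """Count interactions per topic across all sessions (equal weight each)."""
    counts = Counter()
    for events in sessions:
        for _story_id, topic in events:
            counts[topic] += 1
    return counts


def top_topics(sessions, k=2):
    """Return the k topics the user interacted with most."""
    return [topic for topic, _count in topic_interest(sessions).most_common(k)]


profile = topic_interest(sessions)
print(top_topics(sessions))
```

Counting every event equally ignores when the interaction happened; a longer-term profile would presumably also need to account for the age of the feedback, which this sketch deliberately leaves out.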