Capturing Long-Term User Interests in Online Television News Programs

doi:10.1201/b11723-19

ABSTRACT

Contents

11.1 Introduction
11.2 Long-Term Personalization Scenario
11.3 Requirements for a User Profile
11.3.1 Capturing and Segmenting News Broadcasts
11.3.2 Exploiting External Knowledge
11.3.3 Categorizing News Stories
11.4 Tackling User Profiling Problems
11.4.1 User Profile Model
11.4.2 Capturing Evolving Interest
11.4.2.1 Constant Weighting
11.4.2.2 Exponential Weighting
11.4.2.3 Linear Weighting
11.4.2.4 Inverse Exponential Weighting
11.4.3 Capturing Different Aspects of Interest
11.5 Summary
Acknowledgment
References

11.1 Introduction

With the growing capabilities and falling prices of current hardware systems, there are ever-increasing possibilities to store and manipulate videos in a digital format. With ever-increasing broadband capabilities, it is also now possible to view video online as easily as text-based pages were viewed when the web first appeared. People are now producing their own digital libraries from material created with digital cameras and camcorders, and they use a number of systems to place this material on the web as well as to store it in their own individual collections [9]. An interesting research problem is how to assist users in dealing with such large and swiftly increasing volumes of video, i.e., how to help them satisfy their information needs by finding videos they are interested in. For example, a user who enjoys sitcoms might benefit from a personalized video retrieval system that automatically identifies this interest and, further, informs the user about other sitcoms he or she is not aware of.

An important question that needs to be answered in this context is how users’ personal information needs can be identified. A promising method is to employ relevance feedback (RF) techniques. RF can be split into two main paradigms: explicit and implicit RF. With explicit RF, users are asked to judge the relevance of videos. Unfortunately, users tend not to provide feedback consistently, which is problematic when feedback is required to identify users’ interests over a longer period of time. Implicit feedback techniques, in contrast, do not ask the user to rate results; they help learn users’ interests unobtrusively. The main advantage is that this approach relieves the user from providing explicit feedback. As a large quantity of implicit data can be gathered without disturbing the users’ workflow, the implicit approach is an attractive alternative.

In order to study these research challenges, we focus in this work on news videos. News broadcasts consist of many short, independent news items that users can be interested in. Thus, news bulletins allow for the development of user profiling and recommendation techniques that rely on documents with similar features. In the context of this chapter, we hence assume that users’ interests in certain news topics can be identified by finding the news items that users interacted with most. Further, we assume that users stay interested in certain news topics over a longer time period and thus might provide implicit RF over that period, i.e., over multiple interaction sessions.
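
The assumption that interest can be read from the items a user interacted with most, accumulated over several sessions, can be illustrated with a minimal Python sketch. It is not the chapter's profiling model: the event names, their weights, the topic labels, and the functions aggregate_interest and top_topics are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical weights for implicit-feedback events (illustrative
# assumptions, not values taken from the chapter).
EVENT_WEIGHTS = {"click": 1.0, "play": 2.0, "play_to_end": 3.0}


def aggregate_interest(sessions):
    """Sum implicit-feedback evidence per news topic over all sessions."""
    scores = Counter()
    for session in sessions:
        for topic, event in session:
            # Unknown event types contribute no evidence.
            scores[topic] += EVENT_WEIGHTS.get(event, 0.0)
    return scores


def top_topics(sessions, k=3):
    """Return the k topics with the most accumulated evidence."""
    return [topic for topic, _ in aggregate_interest(sessions).most_common(k)]


# Hypothetical log: one list of (topic, event) pairs per interaction session.
sessions = [
    [("politics", "click"), ("sports", "play")],
    [("sports", "play_to_end"), ("weather", "click")],
    [("sports", "click"), ("politics", "play")],
]

print(top_topics(sessions, k=2))
```

Here every session contributes equally to the accumulated score, regardless of how long ago it took place.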