ABSTRACT

This chapter analyzes feature engineering for Twitter-based applications. It first explains how Twitter data can be downloaded through the Twitter Application Programming Interface (API) and describes the types of data contained in the resulting Twitter JSON objects: the tweet text, information about Twitter users, and other metadata. It then discusses the textual, image and video, metadata-related, and network features that can be extracted from this data, and explains why certain features perform well on informal short text messages such as tweets. Five Twitter-based applications illustrate how the different feature types are used and highlight the features that perform well in each application setting. The chapter concludes by discussing Twitris, a real-time semantic social web analytics platform, and its use of Twitter features.