ABSTRACT

In this chapter we conclude the research presented in the book and offer guidelines for future work. In our research, we have identified several challenges in learning from task heterogeneity in real-world applications, and we have proposed algorithms and models, along with theoretically backed solutions, to address them. We also discuss a few limitations of the proposed models and algorithms for learning from task heterogeneity in social media. Specifically, we discuss: (1) the impact of concept drift on the proposed models, (2) model bias and machine learning fairness, (3) model robustness and negative transfer, (4) ethical concerns about using machine learning models in the healthcare domain, and (5) misinformation and disinformation in social media data.

The major challenges in effectively and efficiently mining social media data to build functional applications include: (1) Data reliability and acceptance: most social media data (especially healthcare-related social media data) is unregulated, and the benefits of healthcare-specific social media have been little studied; (2) Data heterogeneity: social media data is generated by users who are diverse both demographically and geographically; (3) Model transparency and trustworthiness: most existing machine learning models for addressing heterogeneity are regarded as black boxes, and few provide explanations of their behavior that would allow users to trust them.

In response to these challenges, this thesis investigates three main research directions: (1) Analyzing social media influence on healthcare: studying the real-world impact of social media as a source for offering or seeking support for patients with chronic health conditions; (2) Learning from task heterogeneity: proposing models and algorithms that are adaptable to new social media platforms and robust to dynamic social media data, specifically for modeling user behaviors, identifying similar actors across platforms, and adapting black box models to a specific learning scenario; (3) Explaining heterogeneous models: interpreting predictive models in the presence of task heterogeneity. Novel algorithms are proposed, with theoretical analysis from various aspects (e.g., time complexity, convergence properties). The effectiveness and efficiency of the proposed algorithms are demonstrated through comparison with state-of-the-art methods and relevant case studies.