Abstract
Big data is ubiquitous and can only become bigger, which challenges traditional data mining and machine learning methods. Social media is a new source of data that is significantly different from conventional ones. Social media data are mostly user-generated, and are big, linked, and heterogeneous. We present the good, the bad, and the ugly of multi-faceted social media data and illustrate the importance of several original problems with real-world examples. We discuss bias in social media data, the evaluation dilemma, data reduction, inferring invisible information, and the big-data paradox. We illuminate new opportunities for developing novel algorithms and tools for data science. In our endeavor to employ the good to tame the bad with the help of the ugly, we deepen the understanding of ever-growing and continuously evolving data and create innovative solutions through interdisciplinary and collaborative data science research.
Original language | English (US) |
---|---|
Pages (from-to) | 137-143 |
Number of pages | 7 |
Journal | International Journal of Data Science and Analytics |
Volume | 1 |
Issue number | 3-4 |
State | Published - Nov 1 2016 |
Keywords
- Big-data paradox
- Data analytics
- Data mining
- Evaluation
- Social media
ASJC Scopus subject areas
- Information Systems
- Modeling and Simulation
- Computer Science Applications
- Computational Theory and Mathematics
- Applied Mathematics