TY - GEN
T1 - Predicting Linguistically Sophisticated Social Determinants of Health Disparities with Neural Networks
T2 - 2023 IEEE International Conference on Big Data, BigData 2023
AU - Cascalheira, Cory J.
AU - Chapagain, Santosh
AU - Flinn, Ryan E.
AU - Zhao, Yuxuan
AU - Boubrahimi, Soukaina Filali
AU - Klooster, Dannie
AU - Gonzalez, Alejandra
AU - Lund, Emily M.
AU - Laprade, Danica
AU - Scheer, Jillian R.
AU - Hamdi, Shah Muhammad
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - LGBTQ+ minority stress is a pervasive form of anti-LGBTQ+ adverse events and psychological strain that drives health inequities among LGBTQ+ people. Minority stress is also linguistically sophisticated (e.g., composed of cultural idioms, psycholinguistic permutations, and lexical density). Because minority stress is a linguistically sophisticated social determinant of health disparities, it is challenging to detect using natural language processing (NLP). Using 5,789 human-annotated Reddit posts from the LGBTQ+ Minority Stress on Social Media (MiSSoM+) Dataset, we investigated and compared the performance of four neural networks and two traditional machine learning architectures in modeling minority stress at both the factor (i.e., separate components of minority stress) and composite level. A novel hybrid model combining Bidirectional Encoder Representations from Transformers and convolutional neural network (BERT-CNN) improved the prediction of composite minority stress (F1 = 0.84). Our experiments on separate factors of minority stress are the first to demonstrate that hybrid neural network models can detect semantically complex expressions of prejudiced events (F1 = 0.87), expected rejection (F1 = 0.92), internalized stigma (F1 = 0.91), identity concealment (F1 = 0.92), and minority coping (F1 = 0.84). We also substantially improved the prediction of gender dysphoria (F1 = 0.94) - a conceptually new candidate component of minority stress. Big data analytics may not be a panacea for the problem of minority stress, but our work joins a growing literature base to show that deep learning models are remarkable in detecting linguistically sophisticated social determinants of health disparities in big data, thus providing evidence in support of the potential benefit from the innovative use of such technology in eliminating group-specific health inequities.
KW - bidirectional encoder representations from transformers (BERT)
KW - convolutional neural network (CNN)
KW - deep learning
KW - multi-label text classification
KW - sexual and gender minority
KW - word embedding
UR - http://www.scopus.com/inward/record.url?scp=85184981556&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85184981556&partnerID=8YFLogxK
U2 - 10.1109/BigData59044.2023.10386882
DO - 10.1109/BigData59044.2023.10386882
M3 - Conference contribution
AN - SCOPUS:85184981556
T3 - Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023
SP - 1314
EP - 1321
BT - Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023
A2 - He, Jingrui
A2 - Palpanas, Themis
A2 - Hu, Xiaohua
A2 - Cuzzocrea, Alfredo
A2 - Dou, Dejing
A2 - Slezak, Dominik
A2 - Wang, Wei
A2 - Gruca, Aleksandra
A2 - Lin, Jerry Chun-Wei
A2 - Agrawal, Rakesh
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 15 December 2023 through 18 December 2023
ER -