TY - JOUR
T1 - Using artificial intelligence to identify administrative errors in unemployment insurance
AU - Young, Matthew M.
AU - Himmelreich, Johannes
AU - Honcharov, Danylo
AU - Soundarajan, Sucheta
N1 - Funding Information:
The authors would like to thank the editorial team and anonymous reviewers for their input and assistance. The authors would also like to thank Sajiah Naqib for her help in identifying and correcting mistakes in our experimental setup. Any remaining errors are our own.
Publisher Copyright:
© 2022 The Authors
PY - 2022
Y1 - 2022
N2 - Administrative errors in unemployment insurance (UI) decisions give rise to a public values conflict between efficiency and efficacy. We analyze whether artificial intelligence (AI) – in particular, methods in machine learning (ML) – can be used to detect administrative errors in UI claims decisions, in terms of both accuracy and normative tradeoffs. We use 16 years of US Department of Labor audit and policy data on UI claims to analyze the accuracy of 7 different random forest and deep learning models. We further test weighting schemas and synthetic data approaches to correcting imbalances in the training data. A random forest model using gradient descent boosting is more accurate, along several measures, than every deep learning model tested, and is preferable in terms of public values. Adjusting model weights produces significant recall improvements for low-n outcomes, at the expense of precision. Synthetic data produces attenuated improvements and drawbacks relative to weights.
AB - Administrative errors in unemployment insurance (UI) decisions give rise to a public values conflict between efficiency and efficacy. We analyze whether artificial intelligence (AI) – in particular, methods in machine learning (ML) – can be used to detect administrative errors in UI claims decisions, in terms of both accuracy and normative tradeoffs. We use 16 years of US Department of Labor audit and policy data on UI claims to analyze the accuracy of 7 different random forest and deep learning models. We further test weighting schemas and synthetic data approaches to correcting imbalances in the training data. A random forest model using gradient descent boosting is more accurate, along several measures, than every deep learning model tested, and is preferable in terms of public values. Adjusting model weights produces significant recall improvements for low-n outcomes, at the expense of precision. Synthetic data produces attenuated improvements and drawbacks relative to weights.
KW - Administrative errors
KW - Artificial intelligence
KW - Machine learning
KW - Social policy
KW - Unemployment insurance
UR - http://www.scopus.com/inward/record.url?scp=85137414032&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137414032&partnerID=8YFLogxK
U2 - 10.1016/j.giq.2022.101758
DO - 10.1016/j.giq.2022.101758
M3 - Article
AN - SCOPUS:85137414032
SN - 0740-624X
VL - 39
JO - Government Information Quarterly
JF - Government Information Quarterly
IS - 4
M1 - 101758
ER -