Tag Archives: deep learning

Learning Faces to Predict Matching Probability in an Online Dating Market

Kwon, Soonjae, Sung-Hyuk Park, Gene Moo Lee, Dongwon Lee (2021) “Learning Faces to Predict Matching Probability in an Online Dating Market”. Work-in-progress.

  • Under review for a conference presentation
  • Based on an industry collaboration

With the increasing use of online matching markets, predicting the matching probability among users is crucial for better market design. Although previous studies have constructed visual features to predict the matching probability, facial features extracted by deep learning have not been widely used. By predicting user attractiveness in an online dating market, we find that deep learning-enabled facial features can significantly enhance prediction accuracy. We also predict attractiveness for various evaluator groups and explain their different preferences based on the theory of evolutionary psychology. Furthermore, we propose a novel method to visually interpret deep learning-enabled facial features using the latest deep learning-based generative model. Our work helps IS researchers extract facial features with deep learning and interpret them to investigate the underlying mechanisms in online matching markets. From a practical perspective, matching platforms can predict matching probability more accurately, which supports better market design and recommender systems that maximize matching outcomes.

Corporate Social Network Analysis: A Deep Learning Approach

Cao, Rui, Gene Moo Lee, Hasan Cavusoglu (2020) “Corporate Social Network Analysis: A Deep Learning Approach,” Working Paper.

Identifying inter-firm relationships is critical in understanding the industry landscape. However, due to the dynamic nature of such relationships, it is challenging to capture corporate social networks in a scalable and timely manner. To address this issue, this research develops a framework to build corporate social network representations by applying natural language processing (NLP) techniques to a corpus of 10-K filings, which describe the reporting firms’ perceived relationships with other firms. Our framework uses named-entity recognition (NER) to locate the corporate names in the text, topic modeling to identify the types of relationships included, and BERT to predict the type of relationship described in each sentence. To show the value of the network measures created by the proposed framework, we conduct two empirical analyses of their impact on firm performance. The first study shows that the competition relationship and in-degree measures on all relationship types have predictive power in estimating future earnings. The second study focuses on the difference between individual perspectives in an inter-firm social network. This difference is measured by the direction of mentions and is an indicator of a firm’s success in network governance. Receiving more mentions from other firms is a positive signal of network governance, and it shows a significant positive correlation with firm performance in the following year.
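As a rough illustration of the in-degree measure mentioned above, the sketch below counts how many distinct firms mention each firm, given relationship triples that an upstream NER and classification step is assumed to have already produced. The function name `mention_degrees`, the triple layout, and the toy firm names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def mention_degrees(mentions):
    """Count distinct in- and out-neighbours per firm.

    `mentions` is an iterable of (source, target, relation) triples, where
    `source` is the reporting firm and `target` is the firm it mentions.
    This triple format is an assumption made for illustration only.
    """
    in_neighbors = defaultdict(set)
    out_neighbors = defaultdict(set)
    for source, target, _relation in mentions:
        if source == target:
            continue  # ignore self-mentions
        out_neighbors[source].add(target)
        in_neighbors[target].add(source)
    firms = set(in_neighbors) | set(out_neighbors)
    return {
        firm: {"in": len(in_neighbors[firm]), "out": len(out_neighbors[firm])}
        for firm in firms
    }


# Hypothetical toy data: each triple is one mention found in a filing.
mentions = [
    ("FirmA", "FirmB", "competition"),
    ("FirmC", "FirmB", "supplier"),
    ("FirmB", "FirmA", "competition"),
    ("FirmC", "FirmA", "customer"),
]
degrees = mention_degrees(mentions)
for firm in sorted(degrees):
    print(firm, degrees[firm])
```

Comparing a firm's "in" and "out" counts is one simple way to express the direction-of-mentions difference described in the second study; the paper's actual measures may be defined differently.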

Targeting Pre-Roll Ads using Video Analytics

Park, Sungho, Gene Moo Lee, Donghyuk Shin, Sang-Pil Han. “Targeting Pre-Roll Ads using Video Analytics”, Reject and Resubmit, Management Science. [Submitted: April 25, 2021]

  • Funded by Sauder Exploratory Research Grant 2020
  • Presented at Southern Methodist University (2020), University of Washington (2020), INFORMS (2020), WITS (2020), HKUST (2021), Maryland (2021), American University (2021)
  • Research assistants: Raymond Situ, Miguel Valarao

Pre-roll video ads continue to rise at an unparalleled pace, creating new opportunities and challenges. They are more immersive than conventional banner ads and must be viewed at least partially before the content video is played. On the other hand, the prevailing skippable format, which allows viewers to skip ads after five seconds, generates opportunity costs for advertisers and online platforms when the ad is skipped. Against this backdrop, we propose a novel video analytics method for improving pre-roll video ad performance. The method extracts multi-modal (audio, video, text) properties from both video ads and content videos using deep learning and signal processing techniques, and then analyzes their effect on video ad completion. The findings indicate that ad-content congruence in various modalities is essential in explaining viewers’ ad completion. Specifically, visual congruence (i.e., celebrity overlap in ad and content) and textual congruence (i.e., topic similarity of ad and content) play important roles, as viewers may form ex-ante expectations of the congruence based on visual cues (i.e., thumbnail images) and previous experience (i.e., watched content clips from the same program) before watching the content video. We also discover, through predictive analyses, that video ad completion can be reliably predicted by features derived from the proposed method. Surprisingly, there is no discernible loss of predictive power when analyzing only the first five seconds of ads and content videos rather than their entire length, resulting in significant cost savings when processing large video datasets.
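One common way to quantify topic similarity between an ad and a content video is the cosine similarity of their topic vectors. The sketch below assumes each video has already been summarized as a topic-weight vector (for example by a topic model run on transcripts); the abstract does not say which similarity measure the authors use, so cosine similarity and all vectors shown here are illustrative assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)


# Hypothetical topic weights over three topics (not data from the paper).
ad_topics = [0.7, 0.2, 0.1]
content_topics = [0.6, 0.3, 0.1]
unrelated_topics = [0.0, 0.1, 0.9]

print(cosine_similarity(ad_topics, content_topics))    # closely matched topics
print(cosine_similarity(ad_topics, unrelated_topics))  # weakly matched topics
```

A higher score for the ad and content pair than for the ad and unrelated pair is what a textual congruence feature of this kind would be expected to capture.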

Enhancing Social Media Analysis with Visual Data Analytics: A Deep Learning Approach (MISQ 2020)

Shin, Donghyuk, Shu He, Gene Moo Lee, Andrew B. Whinston, Suleyman Cetintas, Kuang-Chih Lee (2020) “Enhancing Social Media Analysis with Visual Data Analytics: A Deep Learning Approach,” MIS Quarterly, 44(4), pp. 1459-1492. [SSRN]

  • Based on an industry collaboration with Yahoo! Research
  • The first MISQ methods article based on machine learning
  • Presented at WeB (Fort Worth, TX 2015), WITS (Dallas, TX 2015), UT Arlington (2016), Texas FreshAIR (San Antonio, TX 2016), SKKU (2016), Korea Univ. (2016), Hanyang (2016), Kyung Hee (2016), Chung-Ang (2016), Yonsei (2016), Seoul National Univ. (2016), Kyungpook National Univ. (2016), UKC (Dallas, TX 2016), UBC (2016), INFORMS CIST (Nashville, TN 2016), DSI (Austin, TX 2016), Univ. of North Texas (2017), Arizona State (2018), Simon Fraser (2019), Saarland (2021), Kyung Hee (2021)

This research methods article proposes a visual data analytics framework to enhance social media research using deep learning models. Drawing on the literature of information systems and marketing, complemented with data-driven methods, we propose a number of visual and textual content features, including complexity, similarity, and consistency measures, that can play important roles in the persuasiveness of social media content. We then employ state-of-the-art machine learning approaches such as deep learning and text mining to operationalize these new content features in a scalable and systematic manner. We validate the newly developed features against human coders on Amazon Mechanical Turk. Furthermore, we conduct two case studies with a large social media dataset from Tumblr to show the effectiveness of the proposed content features. The first case study demonstrates that both theoretically motivated and data-driven features significantly improve the model’s power to predict the popularity of a post, and the second one highlights the relationships between content features and consumer evaluations of the corresponding posts. The proposed research framework illustrates how deep learning methods can enhance the analysis of unstructured visual and textual data for social media research.

Predicting Litigation Risk via Machine Learning

Lee, Gene Moo*, James Naughton*, Xin Zheng*, Dexin Zhou* (2020) “Predicting Litigation Risk via Machine Learning,” Working Paper. [SSRN] (* equal contribution)

This study examines whether and how machine learning techniques can improve the prediction of litigation risk relative to the traditional logistic regression model. The existing litigation literature has no consensus on a predictive model. Additionally, the evaluation of litigation model performance is ad hoc. We use five popular machine learning techniques to predict litigation risk and benchmark their performance against the logistic regression model in Kim and Skinner (2012). Our results show that machine learning techniques can significantly improve the predictability of litigation risk. We identify the two best-performing methods (random forest and convolutional neural networks) and rank the importance of predictors. Additionally, we show that models using economically motivated ratio variables perform better than models using raw variables. Overall, our results suggest that the joint consideration of economically meaningful predictors and machine learning techniques maximizes the improvement of predictive litigation models.
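Benchmarking a candidate model against a baseline requires a shared evaluation metric. The sketch below uses the area under the ROC curve (AUC), computed by direct pairwise comparison, as one plausible choice. The abstract does not state which metric the authors use, so AUC is an assumption here, and the labels and scores are made-up toy values rather than results from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = firm was sued, 0 = firm was not sued.
labels = [1, 0, 1, 0, 0, 1]
baseline_scores = [0.6, 0.5, 0.4, 0.3, 0.7, 0.8]   # e.g. a baseline model
candidate_scores = [0.9, 0.2, 0.7, 0.4, 0.3, 0.8]  # e.g. a candidate model

print("baseline AUC:", roc_auc(labels, baseline_scores))
print("candidate AUC:", roc_auc(labels, candidate_scores))
```

The pairwise form runs in quadratic time, which is fine for a small illustration; a rank-based implementation would be preferable for large samples.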