Author Archives: gene lee

Matching Mobile Applications for Cross Promotion (ISR 2020)

Lee, Gene Moo, Shu He, Joowon Lee, Andrew B. Whinston (2020) Matching Mobile Applications for Cross-Promotion. Information Systems Research 31(3), pp. 865-891.

  • Based on an industry collaboration with IGAWorks
  • Presented in Chicago Marketing Analytics (Chicago, IL 2013), WeB (Auckland, New Zealand 2014), Notre Dame (2015), Temple (2015), UC Irvine (2015), Indiana (2015), UT Dallas (2015), Minnesota (2015), UT Arlington (2015), Michigan State (2016), Korea Univ (2021)
  • Dissertation Paper #3
  • Research assistant: Raymond Situ

The mobile applications (apps) market is one of the most successful software markets. As the platform grows rapidly, with millions of apps and billions of users, search costs are increasing tremendously. The challenge is how app developers can target the right users with their apps and how consumers can find the apps that fit their needs. Cross-promotion, advertising a mobile app (target app) in another app (source app), is introduced as a new app-promotion framework to alleviate the issue of search costs. In this paper, we model source app user behaviors (downloads and postdownload usages) with respect to different target apps in cross-promotion campaigns. We construct a novel app similarity measure using latent Dirichlet allocation topic modeling on apps’ production descriptions and then analyze how the similarity between the source and target apps influences users’ app download and usage decisions. To estimate the model, we use a unique data set from a large-scale random matching experiment conducted by a major mobile advertising company in Korea. The empirical results show that consumers prefer more diversified apps when they are making download decisions compared with their usage decisions, which is supported by the psychology literature on people’s variety-seeking behavior. Lastly, we propose an app-matching system based on machine-learning models (on app download and usage prediction) and generalized deferred acceptance algorithms. The simulation results show that app analytics capability is essential in building accurate prediction models and in increasing ad effectiveness of cross-promotion campaigns and that, at the expense of privacy, individual user data can further improve the matching performance. This paper has implications on the trade-off between utility and privacy in the growing mobile economy.

Development of Topic Trend Analysis Model for Industrial Intelligence using Public Data (J. Technology Innovation 2018)

Park, S., Lee, G. M., Kim, Y.-E., Seo, J. (2018). Development of Topic Trend Analysis Model for Industrial Intelligence using Public Data (in Korean)Journal of Technology Innovation, 26(4), 199-232.

  • Funded by the Korea Institute of Science and Technology Information (KISTI)
  • Demo website: https://misr.sauder.ubc.ca/edgar_dashboard/
  • Presented at UKC (2017), KISTI (2017), WITS (2017), Rutgers Business School (2018)

There are increasing needs for understanding and fathoming of the business management environment through big data analysis at the industrial and corporative level. The research using the company disclosure information, which is comprehensively covering the business performance and the future plan of the company, is getting attention. However, there is limited research on developing applicable analytical models leveraging such corporate disclosure data due to its unstructured nature. This study proposes a text-mining-based analytical model for industrial and firm-level analyses using publicly available company disclosure data. Specifically, we apply LDA topic model and word2vec word embedding model on the U.S. SEC data from the publicly listed firms and analyze the trends of business topics at the industrial and corporate levels.

Using LDA topic modeling based on SEC EDGAR 10-K document, whole industrial management topics are figured out. For comparison of different pattern of industries’ topic trend, software and hardware industries are compared in recent 20 years. Also, the changes in management subject at the firm level are observed with a comparison of two companies in the software industry. The changes in topic trends provide a lens for identifying decreasing and growing management subjects at industrial and firm-level. Mapping companies and products(or services) based on dimension reduction after using word2vec word embedding model and principal component analysis of 10-K document at the firm level in the software industry, companies and products(services) that have similar management subjects are identified and also their changes in decades.

For suggesting a methodology to develop an analytical model based on public management data at the industrial and corporate level, there may be contributions in terms of making the ground of practical methodology to identifying changes of management subjects. However, there are required further researches to provide a microscopic analytical model with regard to the relation of technology management strategy between management performance in case of related to the various pattern of management topics as of frequent changes of management subject or their momentum. Also, more studies are needed for developing competitive context analysis model with product(service)-portfolios between firms.

Developing Cyber Risk Assessment Framework for Cyber Insurance: A Big Data Approach (KIRI Research Report 2018)

Lee, G. M. (2018). Developing Cyber Risk Assessment Framework for Cyber Insurance: A Big Data Approach (in Korean)KIRI Research Report 2018-15.

As our society is heavily dependent on information and communication technology, the associated risk has also significantly increased. Cyber insurance has been emerged as a possible means to better manage such cyber risk. However, the cyber insurance market is still in a premature stage due to the lack of data sharing and standards on cyber risk and cyber insurance. To address this issue, this research proposes a data-driven framework to assess cyber risk using externally observable cyber attack data sources such as outbound spam and phishing websites. We show that the feasibility of such an approach by building cyber risk assessment reports for Korean organizations. Then, by conducting a large-scale randomized field experiment, we measure the causal effect of cyber risk disclosure on organizational security levels. Finally, we develop machine-learning models to predict data breach incidents, as a case of cyber incidents, using the developed cyber risk assessment data. We believe that the proposed data-driven methods can be a stepping-stone to enable information transparency in the cyber insurance market.

Predicting Litigation Risk via Machine Learning

Lee, Gene Moo*, James Naughton*, Xin Zheng*, Dexin Zhou* (2020) “Predicting Litigation Risk via Machine Learning,” Working Paper. [SSRN] (* equal contribution)

This study examines whether and how machine learning techniques can improve the prediction of litigation risk relative to the traditional logistic regression model. Existing litigation literature has no consensus on a predictive model. Additionally, the evaluation of litigation model performance is ad hoc. We use five popular machine learning techniques to predict litigation risk and benchmark their performance against the logistic regression model in Kim and Skinner (2012). Our results show that machine learning techniques can significantly improve the predictability of litigation risk. We identify two best-performing methods (random forest and convolutional neural networks) and rank the importance of predictors. Additionally, we show that models using economically-motivated ratio variables perform better than models using raw variables. Overall, our results suggest that the joint consideration of economically-meaningful predictors and machine learning techniques maximize the improvement of predictive litigation models.

Security Defense against Long-term and Stealthy Cyberattacks (DSS 2023)

Kookyoung Han, Choi, Jin Hyuk, Yun-Sik Choi, Gene Moo Lee, Andrew B. Whinston (2023) Security Defense against Long-term and Stealthy Cyberattacks. Decision Support Systems, 166: 113912.

  • Funded by NSF (Award #1718600) and UNIST
  • Best Paper Award at KrAIS 2017
  • Presented at UT Austin (2017), UNIST (2017), INFORMS (Houston, TX 2017), CIST (Houston, TX 2017), WITS (Seoul, Korea 2017), and KrAIS (Seoul, Korea 2017)
  • Previous titles: “Misinformation and Optimal Time to Detect”, “Optimal Stopping and Strategic Espionage”, “To Disconnect or Not: A Cybersecurity Game”

Modern cyberattacks such as advanced persistent threats have become sophisticated. Hackers can stay undetected for an extended time and defenders do not have sufficient countermeasures to prevent advanced cyberattacks. Reflecting on this phenomenon, we propose a game-theoretic model to analyze strategic decisions made by a hacker and a defender in equilibrium. In our game model, the hacker launches stealthy cyberattacks for a long time and the defender decides when to disable a suspicious user based on noisy observations of the user’s activities. Damages caused by the hacker can be enormous if the defender does not immediately ban a suspicious user under certain circumstances, which can explain the emerging sophisticated cyberattacks with detrimental consequences. Our model also predicts that the hacker may opt to be behavioral to avoid worst cases. This is because behavioral cyberattacks are less threatening and the defender decides not to immediately block a suspicious user to reduce cost of false detection.

    On the Spillover Effects of Online Product Reviews on Purchases: Evidence from Clickstream Data (ISR 2021)

    Kwark, Young*, Gene Moo Lee*, Paul A. Pavlou*, Liangfei Qiu* (2021) On the Spillover Effects of Online Product Reviews on Purchases: Evidence from Clickstream Data. Information Systems Research 32(3): 895-913. (* equal contribution)

    • Data awarded by Wharton Consumer Analytics Initiative
    • Presented in WCBI (Snowbird, UT 2015), KMIS (Busan, Korea 2016), Minnesota (2016), ICIS (Dublin, Ireland 2016), Boston Univ. (2017), HEC Paris (2017), and Korea Univ. (2018)
    • An earlier version was published in ICIS 2016
    • Research assistants: Bolat Khojayev, Raymond Situ

    We study the spillover effects of the online reviews of other covisited products on the purchases of a focal product using clickstream data from a large retailer. The proposed spillover effects are moderated by (a) whether the related (covisited) products are complementary or substitutive, (b) the choice of media channel (mobile or personal computer (PC)) used, (c) whether the related products are from the same or a different brand, (d) consumer experience, and (e) the variance of the review ratings. To identify complementary and substitutive products, we develop supervised machine-learning models based on product characteristics, such as product category and brand, and novel text-based similarity measures. We train and validate the machine-learning models using product pair labels from Amazon Mechanical Turk. Our results show that the mean rating of substitutive (complementary) products has a negative (positive) effect on purchasing of the focal product. Interestingly, the magnitude of the spillover effects of the mean ratings of covisited (substitutive and complementary) products is significantly larger than the effects on the focal product, especially for complementary products. The spillover effect of ratings is stronger for consumers who use mobile devices versus PCs. We find the negative effect of the mean ratings of substitutive products across different brands on purchasing of a focal product to be significantly higher than within the same brand. Lastly, the effect of the mean ratings is stronger for less experienced consumers and for ratings with lower variance. We discuss implications on leveraging the spillover effect of the online product reviews of related products to encourage online purchases.

    Does Deceptive Marketing Pay? The Evolution of Consumer Sentiment Surrounding a Pseudo-Product-Harm Crisis (J. Business Ethics 2019)

    Song, Reo, Ho Kim, Gene Moo Lee, and Sungha Jang (2019) Does Deceptive Marketing Pay? The Evolution of Consumer Sentiment Surrounding a Pseudo-Product-Harm CrisisJournal of Business Ethics, 158(3), pp. 743-761.

    The slandering of a firm’s products by competing firms poses significant threats to the victim firm, with the resulting damage often being as harmful as that from product-harm crises. In contrast to a true product-harm crisis, however, this disparagement is based on a false claim or fake news; thus, we call it a pseudo-product-harm crisis. Using a pseudo-product-harm crisis event that involved two competing firms, this research examines how consumer sentiments about the two firms evolved in response to the crisis. Our analyses show that while both firms suffered, the damage to the offending firm (which spread fake news to cause the crisis) was more detrimental, in terms of advertising effectiveness and negative news publicity, than that to the victim firm (which suffered from the false claim). Our study indicates that, even apart from ethical concerns, the false claim about the victim firm was not an effective business strategy to increase the offending firm’s performance.

    Note for prospective students, postdoc researchers, and letter-requesting students

    • For current UBC students (Ph.D., Master, Undergraduate): Thank you for your interest in working with me. There could be research opportunities for exceptional cases: students should have a working knowledge of Python programming and Linux environment. I recommend interested students to take these online courses: Data Science, Machine Learning, Linux, SQL, and NoSQL. Before you contact me, please read my research group’s publications and working papers, then describe the specific research topics you want to pursue. In addition, please send me your transcript, resume, and code samples (e.g., GitHub repository if available).
    • For non-UBC students (Ph.D. and Master’s program applicants): Thank you for your interest in working with me. Unfortunately, I don’t have the capacity to work with non-UBC students. For possible collaboration, you first need to go through the formal admission process for UBC and Sauder School of Business before I can consider you as a potential collaborator. You do not need to have a commitment from an advisor prior to admission. Please check the related information on the Ph.D. program and the M.Sc. program. Also, please note that admission decisions are collectively made by Sauder school, not by individual faculty members.
    • For recommendation letters: I write reference letters for outstanding students (A- above) who have taken my class. To apply for a reference letter, please send me your transcript, resume, and personal statement (for graduate schools). Cheers!
    • Regarding postdoc positions or visiting students: Thank you for your interest in working with me as a postdoc or a visiting student. Unfortunately, I don’t have the capacity to host more researchers at the moment. You can find UBC faculty who are taking visiting scholars at https://www.grad.ubc.ca/virs/hosts.
    • Regarding the Visiting Associate Program at Centre for Korean Research, Institute of Asian Research: https://ckr.iar.ubc.ca/ckr-visiting-scholar-program/

    A Friend Like Me: Modeling Network Formation in a Location-Based Social Network (JMIS 2016)

    Lee, Gene Moo*, Liangfei Qiu*, Andrew B. Whinston* (2016) A Friend Like Me: Modeling Network Formation in a Location-Based Social Network, Journal of Management Information Systems 33(4), pp. 1008-1033. (* equal contribution)

    • Best Paper Nomination at HICSS 2016
    • Presented in WITS (Auckland, New Zealand 2014), and WISE (Auckland, New Zeland 2014), HICSS (Kauai, HI 2016)
    • Dissertation Paper #2

    This article studies the strategic network formation in a location-based social network. We build an empirical model of social link creation that incorporates individual characteristics and pairwise user similarities. Specifically, we define four user proximity measures from biography, geography, mobility, and short messages. To construct proximity from unstructured text information, we build topic models using Latent Dirichlet Allocation. Using Gowalla data with 385,306 users, 3 million locations, and 35 million check-in records, we empirically estimate the model to find evidence on the homophily effect on network formation. To cope with possible endogeneity issues, we use exogenous weather shocks as our instrumental variables and find the empirical results are robust: network formation decisions are significantly affected by our proximity measures.

    How would information disclosure influence organizations’ outbound spam volume? Evidence from a field experiment (J. Cybersecurity 2016)

    He, Shu*, Gene Moo Lee*, Sukjin Han, Andrew B. Whinston (2016) How Would Information Disclosure Influence Organizations’ Outbound Spam Volume? Evidence from a Field ExperimentJournal of Cybersecurity 2(1), pp. 99-118. (* equal contribution)

    Cyber-insecurity is a serious threat in the digital world. In the present paper, we argue that a suboptimal cybersecurity environment is partly due to organizations’ underinvestment on security and a lack of suitable policies. The motivation for this paper stems from a related policy question: how to design policies for governments and other organizations that can ensure a sufficient level of cybersecurity. We address the question by exploring a policy devised to alleviate information asymmetry and to achieve transparency in cybersecurity information sharing practice. We propose a cybersecurity evaluation agency along with regulations on information disclosure. To empirically evaluate the effectiveness of such an institution, we conduct a large-scale randomized field experiment on 7919 US organizations. Specifically, we generate organizations’ security reports based on their outbound spam relative to the industry peers, then share the reports with the subjects in either private or public ways. Using models for heterogeneous treatment effects and machine learning techniques, we find evidence from this experiment that the security information sharing combined with publicity treatment has significant effects on spam reduction for original large spammers. Moreover, significant peer effects are observed among industry peers after the experiment.