Tag Archives: compustat

Corporate Social Network Analysis: A Deep Learning Approach

Cao, Rui, Gene Moo Lee, Hasan Cavusoglu. “Corporate Social Network Analysis: A Deep Learning Approach,” Working Paper.

Identifying inter-firm relationships is critical to understanding the industry landscape. However, because such relationships are dynamic, it is challenging to capture corporate social networks in a scalable and timely manner. To address this issue, this research develops a framework that builds corporate social network representations by applying natural language processing (NLP) techniques to a corpus of 10-K filings, which describe the reporting firms’ perceived relationships with other firms. Our framework uses named-entity recognition (NER) to locate corporate names in the text, topic modeling to identify the types of relationships present, and BERT to predict the type of relationship described in each sentence. To show the value of the network measures created by the proposed framework, we conduct two empirical analyses of their impact on firm performance. The first study shows that competition relationships and in-degree measures across all relationship types have predictive power in estimating future earnings. The second study focuses on the difference between individual perspectives in an inter-firm social network. This difference is measured by the direction of mentions and is an indicator of a firm’s success in network governance. Receiving more mentions from other firms is a positive signal of network governance, and it shows a significant positive correlation with firm performance in the following year.
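As a rough illustration of the direction-of-mentions idea, the sketch below computes in-degree (mentions received) and out-degree (mentions made) from a list of directed, typed mentions. The firm names, the relationship labels, the `degree_measures` helper, and the choice to count each distinct (source, target, type) triple once are all assumptions made for this example; the paper's actual network measures may be defined differently.

```python
from collections import defaultdict

# Hypothetical mentions extracted from filings:
# (reporting firm, mentioned firm, predicted relationship type)
mentions = [
    ("FirmA", "FirmB", "competition"),
    ("FirmC", "FirmB", "competition"),
    ("FirmB", "FirmA", "supplier"),
    ("FirmC", "FirmA", "competition"),
    ("FirmA", "FirmB", "competition"),  # repeated mention, counted once
]


def degree_measures(mentions):
    """Return in-degree, out-degree, and their difference for every firm."""
    edges = set(mentions)  # assumption: duplicate mentions collapse to one edge
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    for source, target, _relationship in edges:
        out_deg[source] += 1  # the reporting firm mentions another firm
        in_deg[target] += 1   # the mentioned firm receives a mention
    firms = sorted(set(in_deg) | set(out_deg))
    return {
        firm: {
            "in": in_deg[firm],
            "out": out_deg[firm],
            "net": in_deg[firm] - out_deg[firm],
        }
        for firm in firms
    }


if __name__ == "__main__":
    for firm, measures in degree_measures(mentions).items():
        print(firm, measures)
```

In this toy network, a firm with a positive `net` value receives more mentions than it makes, which is the direction the abstract associates with successful network governance.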

A Structural Hole Theory-Guided Computational Framework for Opportunity Measurement: A Case of IPO Success

Lee, Myunghwan, Gene Moo Lee, Hasan Cavusoglu, Marc-David L. Seidel. “A Structural Hole Theory-Guided Computational Framework for Opportunity Measurement: A Case of IPO Success,” [Latest version: March 2026]

  • Previous title: Strategic Competitive Positioning: Unsupervised Operationalization of a Structural Hole-based Firm-specific Construct
  • doc2vec model of 10-K reports: Link
  • Presented at UBC MIS Seminar 2018, CIST 2019 (Seattle, WA), KrAIS 2019 (Munich, Germany), DS 2021 (online), KrAIS 2021 (Austin, TX), UT Dallas 2022, KAIST 2022, Korea Univ 2022, INFORMS 2022 (Indianapolis, IN)
  • Funded by Sauder Exploratory Grant 2019
  • Research assistants: Raymond Situ, Sahil Jain

Although opportunities play a central role in firm innovation and performance, prior research lacks a scalable, theory-grounded approach to measuring them. Existing measures are either context-specific or detached from explicit relational mechanisms, limiting their generalizability and interpretability. To address this gap, we propose a structural hole theory-guided computational design framework that enables fine-grained strategic opportunity measures: hole-opening, hole-entering, and non-hole positions. We demonstrate the effectiveness of this framework through a systematic analysis of IPO outcomes using panel data on U.S. public firms. We find that hole-opening positions are associated with higher post-IPO valuations, but a lower likelihood of M&A exits, whereas hole-entering and non-hole positions are linked to lower IPO valuations but higher probabilities of M&A outcomes. These patterns highlight distinct opportunity roles embedded in firms’ structural positions. We conclude the paper by discussing the broad applicability of the theory-guided computational framework for opportunity measurement in various IS research contexts.

Predicting Litigation Risk via Machine Learning

Lee, Gene Moo*, James Naughton*, Xin Zheng*, Dexin Zhou* (2020) “Predicting Litigation Risk via Machine Learning,” Working Paper. [SSRN] (* equal contribution)

This study examines whether and how machine learning techniques can improve the prediction of litigation risk relative to the traditional logistic regression model. The existing litigation literature has no consensus on a predictive model, and the evaluation of litigation model performance is ad hoc. We use five popular machine learning techniques to predict litigation risk and benchmark their performance against the logistic regression model in Kim and Skinner (2012). Our results show that machine learning techniques can significantly improve the predictability of litigation risk. We identify the two best-performing methods (random forest and convolutional neural networks) and rank the importance of predictors. Additionally, we show that models using economically motivated ratio variables perform better than models using raw variables. Overall, our results suggest that the joint consideration of economically meaningful predictors and machine learning techniques maximizes the improvement of litigation prediction models.
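Benchmarking one classifier against another requires a common scoring rule. The sketch below shows one widely used option, the area under the ROC curve (AUC), computed directly from its pairwise definition. The abstract does not say which metric the paper uses, so AUC is an assumption here, and the labels and scores are made-up numbers standing in for the output of a baseline model and a candidate model.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive case has the higher score,
    0.5 when the scores tie, and 0 otherwise.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = firm was sued in the following period, 0 = not sued.
labels = [1, 0, 0, 1, 0, 0]
baseline_scores = [0.6, 0.5, 0.7, 0.4, 0.2, 0.3]   # e.g. a logistic regression
candidate_scores = [0.9, 0.3, 0.4, 0.7, 0.1, 0.2]  # e.g. a random forest

if __name__ == "__main__":
    print("baseline AUC:", roc_auc(labels, baseline_scores))
    print("candidate AUC:", roc_auc(labels, candidate_scores))
```

The pairwise loop is quadratic in the sample size, which is fine for illustration; on real panels a rank-based implementation from an established library would be the practical choice.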