Identifying inter-firm relationships is critical to understanding the industry landscape. However, because such relationships are dynamic, it is challenging to capture corporate social networks in a scalable and timely manner. To address this issue, this research develops a framework that builds corporate social network representations by applying natural language processing (NLP) techniques to a corpus of 10-K filings, which describe the reporting firms’ perceived relationships with other firms. Our framework uses named-entity recognition (NER) to locate corporate names in the text, topic modeling to identify the types of relationships included, and BERT to predict the type of relationship described in each sentence. To show the value of the network measures created by the proposed framework, we conduct two empirical analyses of their impact on firm performance. The first study shows that competition relationships and in-degree measures across all relationship types have predictive power for future earnings. The second study focuses on the difference between individual perspectives in an inter-firm social network. This difference is measured by the direction of mentions and is an indicator of a firm’s success in network governance. Receiving more mentions from other firms is a positive signal of network governance and shows a significant positive correlation with firm performance in the following year.
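As a rough illustration of the direction-of-mentions idea, the sketch below computes in-degree and out-degree for each firm from directed mention edges. The firm names, the triples, and the function name degree_measures are hypothetical; in the framework described above, such triples would come from the NER and relationship classification steps, which are not reproduced here.

```python
from collections import defaultdict

# Hypothetical (reporting_firm, mentioned_firm, relationship_type) triples,
# standing in for the sentence-level output of the NER and BERT steps.
mentions = [
    ("FirmA", "FirmB", "competition"),
    ("FirmA", "FirmC", "supplier"),
    ("FirmB", "FirmC", "competition"),
    ("FirmD", "FirmC", "customer"),
    ("FirmC", "FirmA", "competition"),
]

def degree_measures(edges):
    """Return per-firm in-degree and out-degree over distinct directed edges."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    # Count each (source, target, type) edge once, even if it is repeated.
    for source, target, _rel in set(edges):
        out_deg[source] += 1   # the reporting firm mentions another firm
        in_deg[target] += 1    # the mentioned firm receives a mention
    firms = set(in_deg) | set(out_deg)
    return {f: {"in": in_deg[f], "out": out_deg[f]} for f in sorted(firms)}

measures = degree_measures(mentions)
for firm, m in measures.items():
    print(firm, m)
```

In this toy data, FirmC receives the most mentions, which is the kind of signal the second study associates with network governance. Counting distinct edges rather than raw sentence mentions is an assumption of this sketch, not a detail taken from the study.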
The theory of network opportunity emergence holds that as the overall industry network structure becomes centralized, opportunities emerge for new entrants. However, new entrants must position themselves strategically in the market to be properly valued. This creates tension for entrepreneurial ventures considering going public: they must craft a strategic posture that takes advantage of differentiating opportunities in the market structure while remaining familiar enough to customers and investors. In this paper, we propose a theory of IPO strategic posture to unpack these dynamics. We empirically test our theory using a machine learning approach called doc2vec to create a similarity matrix of all existing U.S. publicly traded companies based on the business descriptions they provide in their 10-K annual reports. This enables us to measure the similarity of existing companies’ strategic postures and identify where industry-level structural holes emerge. We then use these structural hole signatures of potential market entry opportunities to predict how new companies strategically posture an IPO. We then follow the trajectories of those newly listed companies to see how their strategic posture affects growth and ultimate survival. We conclude with a discussion of how the institutional pressures of the venture capital industry push ventures to present their IPO strategic postures as too distinct for their own long-term survival.
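The similarity matrix can be sketched in a few lines. Note that this example swaps in plain term-count vectors with cosine similarity as a stand-in for doc2vec embeddings, so it only illustrates the shape of the computation. The firm names and descriptions are hypothetical.

```python
import math
from collections import Counter

# Hypothetical business descriptions; real inputs would be 10-K text.
descriptions = {
    "FirmA": "cloud software for enterprise data analytics",
    "FirmB": "enterprise software and cloud data services",
    "FirmC": "retail stores selling consumer apparel",
}

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Term counts stand in for the doc2vec embeddings (an assumption of this sketch).
vectors = {name: Counter(text.lower().split()) for name, text in descriptions.items()}
names = sorted(vectors)

# Pairwise similarity matrix stored as a nested dictionary.
similarity = {a: {b: cosine(vectors[a], vectors[b]) for b in names} for a in names}

for a in names:
    print(a, [round(similarity[a][b], 2) for b in names])
```

Firms with low similarity to every existing firm would sit in sparsely occupied regions of this matrix. How structural holes are actually derived from the matrix is not specified in the abstract and is not attempted here.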
As firms increasingly depend on Information Technology (IT) in their business strategies and value creation activities, risks associated with IT have become one of the top concerns for managers and investors. This study examines the relation between IT-related risk factor information in Item 1A of the 10-K annual reports and a firm’s stock price crash risk, a firm-specific propensity to stock price crashes. Using the text-mining approach of Latent Dirichlet Allocation topic modeling to identify IT-related risk factors, we find that IT risk emerges as one of the firms’ key risk factors and that IT risk is positively associated with a firm’s future stock price crash risk. We further separate IT risk factors into cybersecurity risk that potentially leads to a loss or leak of data, and IT value risk that relates to a firm’s reliance on IT for its competitive advantage and value creation activities. We find that cybersecurity risk continues to affect crash risk, but IT value risk does not, consistent with their different risk natures. We also find that the readability, novelty, and the order of appearance of the IT risk factor information, specifically cybersecurity risk, in Item 1A enhance the information content of risk factors and strengthen their relation with stock price crash risk.
Presented at UKC (2017), KISTI (2017), WITS (2017), Rutgers Business School (2018)
There is a growing need to understand the business management environment through big data analysis at the industry and firm levels. Research using company disclosure information, which comprehensively covers a company’s business performance and future plans, is attracting attention. However, research on applicable analytical models that leverage such corporate disclosure data remains limited because the data are unstructured. This study proposes a text-mining-based analytical model for industry- and firm-level analyses using publicly available company disclosure data. Specifically, we apply an LDA topic model and a word2vec word embedding model to U.S. SEC data from publicly listed firms and analyze trends in business topics at the industry and firm levels.
Using LDA topic modeling on SEC EDGAR 10-K documents, we identify management topics across all industries. To compare topic trends across industries, we contrast the software and hardware industries over the most recent 20 years. We also observe changes in management subjects at the firm level by comparing two companies in the software industry. These changes in topic trends provide a lens for identifying decreasing and growing management subjects at the industry and firm levels. Finally, we map companies and products (or services) in the software industry by applying a word2vec word embedding model to firm-level 10-K documents and reducing dimensions with principal component analysis; this identifies companies and products (or services) with similar management subjects and how they have changed over the decades.
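One simple way to label topics as growing or decreasing, given yearly topic shares, is to fit a least-squares slope per topic. The sketch below assumes the shares have already been produced by a topic model. The topic names, the numbers, and the helpers trend_slope and label are hypothetical, and the study may have used a different trend measure.

```python
# Hypothetical yearly topic shares for one industry (illustrative values only).
topic_shares = {
    2015: {"cloud": 0.20, "licensing": 0.50, "security": 0.30},
    2016: {"cloud": 0.30, "licensing": 0.40, "security": 0.30},
    2017: {"cloud": 0.40, "licensing": 0.30, "security": 0.30},
}

def trend_slope(years, values):
    """Ordinary least-squares slope of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx

def label(slope, tol=1e-9):
    """Classify a slope as growing, decreasing, or stable within a tolerance."""
    if slope > tol:
        return "growing"
    if slope < -tol:
        return "decreasing"
    return "stable"

years = sorted(topic_shares)
topics = sorted(topic_shares[years[0]])
slopes = {t: trend_slope(years, [topic_shares[y][t] for y in years]) for t in topics}
labels = {t: label(s) for t, s in slopes.items()}

for t in topics:
    print(t, round(slopes[t], 3), labels[t])
```

Running the same computation separately for two industries, or for two firms, gives comparable slopes per topic, which is one way to set their trends side by side.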
By suggesting a methodology for developing an analytical model based on public management data at the industry and firm levels, this study contributes a foundation for a practical methodology to identify changes in management subjects. However, further research is required to provide a finer-grained analytical model of the relationship between technology management strategy and management performance under varying patterns of management topics, such as frequent changes in management subjects or their momentum. More studies are also needed to develop a model for analyzing competitive context based on product (service) portfolios across firms.