Tag Archives: big data

Learning Faces to Predict Matching Probability in an Online Dating Market

Kwon, Soonjae, Sung-Hyuk Park, Gene Moo Lee, Dongwon Lee (2021) “Learning Faces to Predict Matching Probability in an Online Dating Market”. Work-in-progress.

  • Under review for a conference presentation.
  • Based on an industry collaboration

With the increasing use of online matching markets, predicting the matching probability among users is crucial for better market design. Although previous studies have constructed visual features to predict the matching probability, facial features extracted by deep learning have not been widely used. By predicting user attractiveness in an online dating market, we find that deep learning-enabled facial features can significantly enhance prediction accuracy. We also predict attractiveness for various evaluator groups and explain their different preferences based on the theory of evolutionary psychology. Furthermore, we propose a novel method to visually interpret deep learning-enabled facial features using the latest deep learning-based generative model. Our work helps IS researchers extract facial features with deep learning and interpret them to investigate underlying mechanisms in online matching markets. From a practical perspective, matching platforms can predict matching probability more accurately, which supports better market design and recommender systems that maximize matching outcomes.

My thoughts on AI, Big Data, and IS Research

Last update: June 10th, 2021

Recently, I had a chance to share my thoughts on how Big Data Analytics and AI will impact Information Systems (IS) research. Thanks to ever-growing datasets (public and proprietary) and powerful computational resources (cloud API, open-source projects), AI and Big Data will be important in IS research in the foreseeable future. If you are an aspiring IS researcher, I believe that you should embrace this trend and take advantage of it.

First, AI and Big Data are powerful “tools” for IS research. It could be intimidating to see all the fancy new AI techniques. But they are just tools to analyze your data. You don’t need to reinvent the wheel to use them. There are many open-source projects in Python and R that you can use to analyze your data. Also, many cloud services (e.g., Amazon Rekognition, Google Cloud ML, Microsoft Azure ML) allow you to use pre-trained AI models at a modest cost (that your professors can afford). What you need is some working knowledge of programming languages like Python and R, and a high-level understanding of the ideas behind the algorithms.

Don’t shy away from hands-on programming. Using AI and Big Data tools may not be a competitive advantage in the long run because of the democratization of AI tools. However, I believe it will be the new baseline. So you need to have it in your research toolbox. Specifically, I believe that IS researchers should have a working knowledge of Python/R programming and the Linux environment. I recommend these online courses: Data Science, Machine Learning, Linux, SQL, and NoSQL.

Second, AI and Big Data Analytics are creating a lot of interesting new “phenomena” in personal lives, firms, and societies. How will AI and robots be adopted in the workplace, and how will that affect the labor market? Are we losing our jobs? Or can we improve our productivity with AI tools? How will experts use AI in professional services? What are the unintended consequences (such as biases, security, privacy, misinformation) of AI adoption in organizations and society? And how can we mitigate such issues? There are so many new and interesting research questions.

To conduct relevant research, I think that IS researchers should closely follow the emerging technologies. Again, it could be hard to keep up with all the advances. I try to keep up to date by reading industry reports (from McKinsey and Deloitte) and listening to many podcasts (e.g., Freakonomics Radio, a16z Podcast by Andreessen Horowitz, Lex Fridman Podcast, Stanford’s Entrepreneurial Thought Leaders, HBR’s Exponential View by Azeem Azhar).

I hope this post may help new IS researchers shape their research strategies. I will try to keep updating this post. Cheers!



Trustworthy Face? The Effect and Drivers of Comprehensive Trust in Online Job Market Platform

Kwon, Jun Bum, Donghyuk Shin, Gene Moo Lee, Jake An, Sam Hwang (2020) “Trustworthy Face? The Effect and Drivers of Comprehensive Trust in Online Job Market Platform”. Work-in-progress.

The abstract will appear here.

Robots Serve Humans: Does AI Robot Adoption Enhance Operational Efficiency and Customer Experience?

Lee, Myunghwan, Gene Moo Lee, Donghyuk Shin, Sang-Pil Han (2020) “Robots Serve Humans: Does AI Robot Adoption Enhance Operational Efficiency and Customer Experience?”. Working Paper.

  • Presented at WITS (2020), KrAIS (2020), UBC (2021)
  • Research assistants: Raymond Situ, Gallant Tang

Service providers have been adopting various robotics technologies to improve operational efficiency and increase customer satisfaction. Robotics technologies bring new restaurant experiences to customers by taking orders, cooking, and serving. While the impact of industrial robots has been well documented in the literature, little is known about the impact of customer-facing service robot adoption. To fill this gap, this work-in-progress study aims to analyze the impact of service robot adoption on restaurant service quality using 4,612 restaurants and their online customer reviews. We analyzed the treatment effect of robot adoption using a difference-in-differences approach with propensity score and exact matching. Estimation results show that restaurant robot adoption has a positive impact on customer satisfaction, specifically on perceived food quality and perceived value. This study provides both academic and practical implications for the emerging AI robotics techniques.
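To make the difference-in-differences idea concrete, here is a minimal sketch in plain Python. It is not the paper's estimation code: the ratings are made-up toy numbers, the names `ratings`, `group_mean`, and `did_estimate` are hypothetical, and it assumes the matching step has already produced comparable treated and control groups.

```python
from statistics import mean

# Hypothetical toy data: (group, period, average review rating).
# "treated" = restaurants that adopted a robot; "control" = matched non-adopters.
ratings = [
    ("treated", "before", 3.8), ("treated", "before", 4.0),
    ("treated", "after", 4.3), ("treated", "after", 4.5),
    ("control", "before", 3.9), ("control", "before", 4.1),
    ("control", "after", 4.0), ("control", "after", 4.2),
]

def group_mean(rows, group, period):
    """Average outcome for one group in one period."""
    return mean(r for g, p, r in rows if g == group and p == period)

def did_estimate(rows):
    """(treated after - treated before) - (control after - control before)."""
    treated_change = group_mean(rows, "treated", "after") - group_mean(rows, "treated", "before")
    control_change = group_mean(rows, "control", "after") - group_mean(rows, "control", "before")
    return treated_change - control_change

effect = did_estimate(ratings)
print(round(effect, 2))
```

Subtracting the control group's change removes trends that affect all restaurants, so what remains is attributed to robot adoption, provided the two groups would otherwise have moved in parallel.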

A Scaling Perspective in AI Startups

Schulte-Althoff, Matthias, Daniel Fuerstenau, Gene Moo Lee, Hannes Rothe, Robert Kauffman (2021) “A Scaling Perspective in AI Startups”. Working Paper. [ResearchGate]

  • Presented at HICSS 2021 (SITES mini-track)

Digital startups’ use of AI technologies has significantly increased in recent years, bringing to the fore specific barriers to deployment, use, and extraction of business value from AI. Utilizing a quantitative framework regarding the themes of startup growth and scaling, we examine the scaling behavior of AI, platform, and service startups. We find evidence of a sublinear scaling ratio of revenue to age-discounted employment count. The results suggest that the revenue-employee growth pattern of AI startups is close to that of service startups, and less so to that of platform startups. Furthermore, we find a superlinear growth pattern of acquired funding in relation to the employment size that is largest for AI startups, possibly suggesting hype tendencies around AI startups. We discuss implications in the light of new economies of scale and the scope of AI startups related to decision-making and prediction.
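As a rough illustration of what a sublinear scaling ratio means, the sketch below fits the exponent beta in revenue = c * employees^beta with a log-log least-squares regression. This is an assumption about the general technique, not the paper's exact framework (it ignores age discounting), and the data and the function name `scaling_exponent` are hypothetical.

```python
import math

def scaling_exponent(xs, ys):
    """OLS slope of log(y) on log(x), i.e. beta in y = c * x**beta."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var = sum((a - mx) ** 2 for a in lx)
    return cov / var

# Hypothetical startups: revenue is generated as 50 * employees**0.8.
employees = [5, 10, 20, 40, 80]
revenue = [50 * e ** 0.8 for e in employees]

beta = scaling_exponent(employees, revenue)
print(round(beta, 3))
```

An exponent below 1 is sublinear (revenue grows more slowly than headcount), and an exponent above 1 is superlinear.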

Corporate Social Network Analysis: A Deep Learning Approach

Cao, Rui, Gene Moo Lee, Hasan Cavusoglu (2020) “Corporate Social Network Analysis: A Deep Learning Approach,” Working Paper.

Identifying inter-firm relationships is critical in understanding the industry landscape. However, due to the dynamic nature of such relationships, it is challenging to capture corporate social networks in a scalable and timely manner. To address this issue, this research develops a framework to build corporate social network representations by applying natural language processing (NLP) techniques to a corpus of 10-K filings, which describe the reporting firms’ perceived relationships with other firms. Our framework uses named-entity recognition (NER) to locate the corporate names in the text, topic modeling to identify types of relationships included, and BERT to predict the type of relationship described in each sentence. To show the value of the network measures created by the proposed framework, we conduct two empirical analyses to see their impacts on firm performance. The first study shows that competition relationships and in-degree measures across all relationship types have predictive power for future earnings. The second study focuses on the difference between individual perspectives in an inter-firm social network. Such a difference is measured by the direction of mentions and is an indicator of a firm’s success in network governance. Receiving more mentions from other firms is a positive signal of network governance and shows a significant positive correlation with firm performance in the following year.
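The in-degree measure mentioned above can be sketched in a few lines of Python. This assumes the NER and relationship-classification steps have already produced (mentioning firm, mentioned firm, relationship type) triples; the firm names, the triples, and the `in_degree` helper are hypothetical, and in-degree is defined here as the number of distinct firms that mention a given firm.

```python
from collections import Counter

# Hypothetical output of the text-mining steps:
# (mentioning_firm, mentioned_firm, relationship_type)
mentions = [
    ("FirmA", "FirmB", "competition"),
    ("FirmC", "FirmB", "supply"),
    ("FirmB", "FirmA", "competition"),
    ("FirmC", "FirmA", "competition"),
    ("FirmA", "FirmB", "competition"),  # repeated mention, counted once
]

def in_degree(edges, relationship=None):
    """Count distinct firms mentioning each firm, optionally for one relationship type."""
    unique_pairs = {
        (source, target)
        for source, target, rel in edges
        if relationship is None or rel == relationship
    }
    return Counter(target for _, target in unique_pairs)

print(in_degree(mentions))
print(in_degree(mentions, "competition"))
```

Because the edges are directed, a firm that mentions many others but is rarely mentioned itself has high out-degree and low in-degree, which is the asymmetry the second study looks at.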

IS Papers on Big Data, Analytics, and AI

Last update: Sept 30, 2021

My research involves Big Data Analytics and AI in Information Systems literature. This post tries to keep track of the editorial and seminal articles on the topic of Big Data, Data Science, Analytics, and AI in the Information Systems and Management literature. The papers are listed in chronological order:

  1. Bapna, Goes, Gopal, Marsden (2006) Moving from Data-Constrained to Data-Enabled Research: Experiences and Challenges in Collecting, Validating and Analyzing Large-Scale e-Commerce Data, Statistical Science 21(2): 116-130.
  2. Shmueli and Koppius (2011) Predictive Analytics in Information Systems Research, MIS Quarterly 35(3): 553-572
  3. Chen, Chiang, Storey (2012) Business Intelligence and Analytics: From Big Data to Big Impact, MIS Quarterly 36(4): 1164-1188
  4. Lin, Lucas Jr., Shmueli (2013) Research Commentary: Too Big to Fail: Large Samples and the p-Value Problem, Information Systems Research 24(4): 906-917.
  5. Agarwal, Dhar (2014) Editorial – Big Data, Data Science, and Analytics: The Opportunity and Challenge for IS Research, Information Systems Research 25(3): 443-448
  6. Varian (2014) Big Data: New Tricks for Econometrics, Journal of Economic Perspectives 28(2): 3-28
  7. Goes (2014) Editor’s Comments: Big Data and IS Research, MIS Quarterly 38(3): iii-viii
  8. Saar-Tsechansky (2015) Editors’ Comments: The Business of Business Data Science in IS Journals, MIS Quarterly 39(4): iii-vi
  9. AMJ Editors (2016) From the Editors: Big Data and Data Science Methods for Management Research, Academy of Management Journal 59(5): 1493-1507
  10. Abbasi, Sarker, Chiang (2016) Big Data Research in Information Systems: Toward an Inclusive Research Agenda, Journal of the Association for Information Systems 17(2): i-xxxii
  11. Rai (2016) Editor’s Comments: Synergies Between Big Data and Theory, MIS Quarterly 40(2): iii-ix
  12. Baesens, Bapna, Marsden, Vanthienen, Zhao (2016) Transformational Issues of Big Data and Analytics in Networked Business, MIS Quarterly 40(4): 807-818
  13. Athey (2017) Beyond Prediction: Using Big Data for Policy Problems, Science 355(6324): 483-485
  14. Chiang, Grover, Liang, Zhang (2018) Special Issue: Strategic Value of Big Data and Business Analytics, Journal of Management Information Systems 35(2): 383-387
  15. Delen, Ram (2018) Research challenges and opportunities in business analytics, Journal of Business Analytics 1(1): 2-12.
  16. Maass, Parsons, Purao, Storey, Woo (2018) Data-Driven Meets Theory-Driven Research in the Era of Big Data: Opportunities and Challenges for Information Systems Research, Journal of the Association for Information Systems 19(12): 1253-1273
  17. Yang, Adomavicius, Burtch, Ren (2018) Mind the Gap: Accounting for Measurement Error and Misclassification in Variables Generated via Data Mining, Information Systems Research 29(1): 4-24.
  18. Berente, Seidel, Safadi (2019) Research Commentary: Data-Driven Computationally Intensive Theory Development, Information Systems Research 30(1): 50-64.
  19. Johnson, Gray, Sarker (2019) Revisiting IS Research Practice in the Era of Big Data, Information and Organization 29(1): 41-56
  20. Grover, Lindberg, Benbasat, Lyytinen (2020) The Perils and Promises of Big Data Research in Information Systems, Journal of the Association for Information Systems 21(2): 268-291.
  21. Shmueli (2021) INFORMS Journal on Data Science (IJDS) Editorial #1: What is an IJDS paper?, INFORMS Journal on Data Science.
  22. Burton-Jones, Boh, Oborn, Padmanabhan (2021) Editor’s Comments: Advancing Research Transparency at MIS Quarterly: A Pluralistic Approach, MIS Quarterly 45(2): iii-xviii.
  23. Berente, Gu, Recker, Santhanam (2021) Special Issue Editor’s Comments: Managing Artificial Intelligence, MIS Quarterly 45(3): 1433-1450.
  24. Jain, Padmanabhan, Pavlou, Raghu (2021) Editorial for the Special Section on Humans, Algorithms, and Augmented Intelligence: The Future of Work, Organizations, and Society, Information Systems Research 32(3): 675-687.


Targeting Pre-Roll Ads using Video Analytics

Park, Sungho, Gene Moo Lee, Donghyuk Shin, Sang-Pil Han. “Targeting Pre-Roll Ads using Video Analytics”, Under Reject and Resubmit, Management Science. [Submitted: April 25, 2021]

  • Funded by Sauder Exploratory Research Grant 2020
  • Presented at Southern Methodist University (2020), University of Washington (2020), INFORMS (2020), WITS (2020), HKUST (2021), Maryland (2021), American University (2021)
  • Research assistants: Raymond Situ, Miguel Valarao

Pre-roll video ads continue to rise at an unparalleled pace, creating new opportunities and challenges. They are more immersive than conventional banner ads and must be viewed at least partially before the content video is played. On the other hand, the prevailing skippable format of pre-roll video ads that allows viewers to skip ads after five seconds generates opportunity costs for advertisers and online platforms when the ad is skipped. Against this backdrop, we propose a novel video analytics method for improving pre-roll video ad performance by extracting multi-modal (audio, video, text) properties from both video ads and content videos using deep learning and signal processing techniques, and then analyzing their effect on video ad completion. The findings indicate that the ad-content congruence in various modalities is essential in explaining viewers’ ad completion. Specifically, visual congruence (i.e., celebrity overlap in ad and content) and textual congruence (i.e., topic similarity of ad and content) play important roles as viewers may shape ex-ante expectations of the congruence based on visual cues (i.e., thumbnail images) and previous experience (i.e., watched content clips from the same program) before watching the content video. We also discover, through predictive analyses, that video ad completion can be reliably predicted by features derived from the proposed method. Surprisingly, there is no discernible loss of predictive power when analyzing only the first five seconds of ads and content videos rather than their entire length, resulting in significant cost savings when processing large video datasets.

Price Competition and Inactive Search

Koh, Yumi, Gea M. Lee, Gene Moo Lee (2021) “Price Competition and Inactive Search”. Working Paper. [Latest version: April 28, 2021] [SSRN]

We propose a model of price competition in which firms select prices conditional on privately-observed production costs and a subset of consumers can choose to search sequentially given price dispersion. We investigate how competition affects the consumers’ choice of whether to purchase immediately from a randomly-selected first firm or engage in sequential search. We establish two types of equilibria, random equilibrium and searching equilibrium, based on the consumers’ search decision in equilibrium. We show that sequential search can be completely or at least partially inactivated in the market with a sufficiently large number of competing firms.

Structural Hole-based Measures of Firm’s Strategic Competitive Positioning

Lee, Myunghwan, Gene Moo Lee, Hasan Cavusoglu, Marc-David L. Seidel. “Structural Hole-based Measures of Firm’s Strategic Competitive Positioning”, Working Paper.

The theory of network opportunity emergence holds that as the overall industry network structure becomes centralized, opportunities emerge for new entrants. However, new entrants must position themselves strategically in the market to be properly valued. This creates tension for entrepreneurial ventures considering going public: they must craft a strategic posture that takes advantage of differentiating opportunities in the market structure while remaining familiar enough to customers and investors. In this paper, we propose a theory of IPO strategic posture to unpack these dynamics. We empirically test our theory using a machine learning approach called doc2vec to create a similarity matrix of all existing U.S. publicly traded companies based upon self-provided business descriptions in their 10-K annual reports. This enables us to measure existing companies’ similarities of strategic postures and identify where industry-level structural holes emerge. We then use these structural hole signatures of potential market entry opportunities to predict how new companies strategically posture an IPO. We then follow the trajectories of those newly listed companies to see how their strategic posture impacts growth and ultimate survival. We conclude with a discussion of how the institutional pressures of the venture capital industry create pressure for ventures to self-present their IPO strategic postures as too distinct for their own long-term survival.
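To give a feel for the similarity matrix, here is a small sketch that compares business descriptions with cosine similarity. Note the swap: it uses simple word-count vectors as a stand-in for doc2vec embeddings, so it only illustrates the matrix-building step, not the paper's model or the structural hole measures. The company names and descriptions are made up.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical business descriptions (stand-ins for 10-K text).
descriptions = {
    "CloudCo": "cloud software platform for enterprise data",
    "DataCo": "enterprise data analytics software",
    "FoodCo": "restaurant chain serving fresh food",
}

# Word counts stand in for doc2vec embeddings in this sketch.
vectors = {name: Counter(text.split()) for name, text in descriptions.items()}
names = sorted(vectors)
similarity = {(a, b): cosine(vectors[a], vectors[b]) for a in names for b in names}

for a in names:
    print(a, [round(similarity[(a, b)], 2) for b in names])
```

In this toy example, CloudCo and DataCo share vocabulary and score high, while FoodCo shares none and scores zero. Regions of the matrix where few firms resemble each other are where one would start looking for open positions in the market.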