Tag Archives: big data

AI Voice in Online Video Platforms: A Multimodal Perspective on Content Creation and Consumption

Zhang, Xiaoke, Mi Zhou, Gene Moo Lee “AI Voice in Online Video Platforms: A Multimodal Perspective on Content Creation and Consumption,” Working Paper.

Previous title: How Does AI-Generated Voice Affect Online Video Creation? Evidence from TikTok
Presentations: INFORMS DS (2022), UBC (2022), WITS (2022), Yonsei (2023), POSTECH (2023), ISMS MKSC (2023), CSWIM (2023), KrAIS Summer (2023), Dalhousie (2023), CIST (2023), Temple (2024), Santa Clara U (2024), Wisconsin Milwaukee (2024)
Best Student Paper Nomination at CIST 2023; Best Paper Runner-Up Award at KrAIS Summer Workshop 2023
Media coverage: [UBC News] [Global News]
API sponsored by Ensemble Data
SSRN version: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4676705

Major user-generated content (UGC) platforms like TikTok have introduced AI-generated voice to assist creators in complex multimodal video creation. AI voice in videos represents a novel form of partial AI assistance, where AI augments one specific modality (audio), whereas creators maintain control over other modalities (text and visuals). This study theorizes and empirically investigates the impacts of AI voice adoption on the creation, content characteristics, and consumption of videos on a video UGC platform. Using a unique dataset of 554,252 TikTok videos, we conduct multimodal analyses to detect AI voice adoption and quantify theoretically important video characteristics in different modalities. Using a stacked difference-in-differences model with propensity score matching, we find that AI voice adoption increases creators’ video production by 21.8%. While reducing audio novelty, it enhances textual and visual novelty by freeing creators’ cognitive resources. Moreover, the heterogeneity analysis reveals that AI voice boosts engagement for less-experienced creators but reduces it for experienced creators and those with established identities. We conduct additional analyses and online randomized experiments to demonstrate two key mechanisms underlying these effects: partial AI process augmentation and partial AI content substitution. This study contributes to the UGC and human-AI collaboration literature and provides practical insights for video creators and UGC platforms.
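To make the difference-in-differences logic concrete, here is a minimal sketch in Python. It is only the textbook 2x2 version, not the stacked model with propensity score matching used in the paper. The function name `did_estimate` and all numbers are hypothetical and purely illustrative.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Textbook 2x2 difference-in-differences.

    Returns (change in the treated group) minus (change in the control group).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly video counts per creator (made-up numbers).
adopters_pre = [4, 5, 6]    # AI voice adopters, before adoption
adopters_post = [6, 7, 8]   # AI voice adopters, after adoption
matched_pre = [4, 5, 6]     # matched non-adopters, same "before" window
matched_post = [5, 6, 7]    # matched non-adopters, same "after" window

effect = did_estimate(adopters_pre, adopters_post, matched_pre, matched_post)
print(effect)
```

The control group's change stands in for what would have happened to adopters without AI voice, so subtracting it removes time trends shared by both groups.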

Ideas are Easy but Execution is Everything: Measuring the Impact of Stated AI Strategies and Capability on Firm Innovation Performance

Lee, Myunghwan, Gene Moo Lee (2022) “Ideas are Easy but Execution is Everything: Measuring the Impact of Stated AI Strategies and Capability on Firm Innovation Performance”, Work-in-Progress.

Presented at UBC (2022), INFORMS DS (2022)
AI classification scheme: https://misr.sauder.ubc.ca/robotics/
Research assistant: Raymond Situ

Contrary to the promise that AI will transform various industries, there are conflicting views on the impact of AI on firm performance. We argue that existing AI capability measures have two major limitations that constrain our understanding of the impact of AI in business. First, existing measures of AI capability do not distinguish between stated strategies and actual AI implementations. To distinguish stated AI strategy from actual AI capability, we collect data from various AI-related sources, including AI conferences (e.g., NeurIPS, ICML, ICLR), patent filings (USPTO), inter-firm transactions related to AI adoption (FactSet), and AI strategies stated in 10-K annual reports. Second, while prior studies identified successful AI implementation factors (e.g., data integrity and intelligence augmentation) in a general context, little is known about the relationship between AI capabilities and in-depth innovation performance. We draw on neo-institutional theory to articulate firm-level AI strategies and construct a fine-grained AI capability measure that captures the unique characteristics of AI strategy. Using our newly proposed AI capability measure and a novel dataset, we will study the impact of AI on firm innovation, contributing to the nascent literature on managing AI.

Seeing the Unseen: The Effects of Implicit Representation in an Online Dating Platform

Kwon, Soonjae, Gene Moo Lee, Dongwon Lee, Sung-Hyuk Park (2024) “Seeing the Unseen: The Effects of Implicit Representation in an Online Dating Platform,” Working Paper.

Previous title: Learning Faces to Predict Matching Probability in an Online Dating Market
Presentations: DS (2021), AIMLBA (2021), WITS (2021), ICIS (2022)
Preliminary version in ICIS 2022 Proceedings
Based on an industry collaboration

This study investigates the effects of implicit preference-based representation on user engagement and matching outcomes in two-sided platforms, focusing on an online dating context. We develop a novel approach using explainable AI and generative AI to create personalized representations that reflect users’ implicit preferences. Through extensive matching simulations, we demonstrate that implicit representation significantly enhances both user engagement and matching outcomes across various recommendation algorithms. Our findings reveal heterogeneous effects driven by positive cross-side and same-side network effects, which vary depending on the gender distribution within the platform. This research contributes to understanding implicit representation in two-sided platforms and offers insights into the transformative potential of generative AI in digital ecosystems.

My thoughts on AI, Big Data, and IS Research

Last update: May 31, 2024

Back in 2021, I had a chance to share my thoughts on how Big Data Analytics and AI will impact Information Systems (IS) research. Thanks to ever-growing datasets (public and proprietary) and powerful computational resources (cloud API, open-source projects), AI and Big Data will be important in IS research in the foreseeable future. If you are an aspiring IS researcher, I believe that you should embrace this and take advantage of it.

First, AI and Big Data are powerful “tools” for IS research. It could be intimidating to see all the fancy new AI techniques. But they are just tools to analyze your data. You don’t need to reinvent the wheel to use them. There are many open-source projects in Python and R that you can use to analyze your data. Also, many cloud services (e.g., Amazon Rekognition, Google Cloud ML, Microsoft Azure ML) allow you to use pre-trained AI models at a modest cost (that your professors can afford). What you need is some working knowledge of programming languages like Python and R and a high-level understanding of the ideas behind the algorithms.

Don’t shy away from hands-on programming. Using AI and Big Data tools may not be a competitive advantage in the long run because of the democratization of AI tools. However, I believe it will be the new baseline. So you need to have it in your research toolbox. Specifically, I believe that IS researchers should have a working knowledge of Python/R programming and the Linux environment. I recommend these online courses: AI Fundamentals, Data Science, Machine Learning, Linux, SQL, and NoSQL.

Second, AI and Big Data Analytics are creating a lot of interesting new “phenomena” in personal lives, firms, and societies. How will AI and robots be adopted in the workplace, and how will that affect the labor market? Are we losing our jobs? Or can we improve our productivity with AI tools? How will experts use AI in professional services? What are the unintended consequences (such as biases, security, privacy, and misinformation) of AI adoption in organizations and society? And how can we mitigate such issues? There are so many new and interesting research questions.

To stay relevant, I think that IS researchers should closely follow emerging technologies. Again, it could be hard to keep up with all the advances. I try to keep up to date by reading industry reports (from McKinsey and Deloitte) and listening to many podcasts (e.g., Freakonomics Radio, the a16z Podcast by Andreessen Horowitz, Lex Fridman Podcast, Stanford’s Entrepreneurial Thought Leaders, HBR’s Exponential View by Azeem Azhar).

For UBC current and prospective students, here are some resources:

Student clubs: UBC BizTech, BOLT UBC
Degree programs: BCOM Analytics Concentration, MBAN, MBA TAL Track

For educators, I shared my experience of teaching with AI in May 2024. You can find the slide deck here.

I hope this post may help people shape their research, teaching, and career strategies. I will try to keep updating this post. Cheers!

Trustworthy Face? The Effect and Drivers of Comprehensive Trust in Online Job Market Platform

Kwon, Jun Bum, Donghyuk Shin, Gene Moo Lee, Jake An, Sam Hwang (2020) “Trustworthy Face? The Effect and Drivers of Comprehensive Trust in Online Job Market Platform”. Work-in-progress.

To present at the 2020 Conference on Artificial Intelligence, Machine Learning, and Business Analytics
Funded by UBC Centre for Innovative Data in Economics Research (CIDER)

The abstract will appear here.

Service Robots and Workforce Transformation: Evidence from Restaurant Operations

Lee, Myunghwan, Gene Moo Lee, Donghyuk Shin, Wooje Cho, Sang-Pil Han (2025) “Service Robots and Workforce Transformation: Evidence from Restaurant Operations”, Working Paper.

Presented at WITS (2020), KrAIS (2020), UBC (2021), DS (2022)
Research assistants: Raymond Situ, Gallant Tang

The introduction of AI-powered service robots, those capable of order taking, table delivery, and busser support, is significantly altering the workflow dynamics within the restaurant industry, fundamentally reshaping operations. Although these robots hold considerable promise for enhancing customer experiences and operational efficiency, their integration can introduce complex and potentially unintended consequences. Successful integration demands a careful balance among customer acceptance, automation efficiency, and worker adaptation. Yet critical questions remain insufficiently explored, particularly how the adoption of robots affects workforce structures. This study addresses this gap by theorizing and empirically examining the impact of robotic integration on the composition of labor, with emphasis on part-time workers, who represent a significant portion of the restaurant workforce. Increased automation may reduce the number of part-time positions, but among those who remain, service robots may augment their roles by supporting or replacing routine tasks, allowing workers to focus on higher-touch interactions. This dual effect (numerical displacement alongside functional augmentation) illustrates a nuanced form of inequality in which the benefits of automation accrue unevenly even within the same labor group. Such shifts could either exacerbate labor inequalities or create opportunities for workforce adaptation and upskilling. From a systematic analysis of operational and customer review data from 3,636 restaurants, our results uncover asymmetric and unintended consequences of robotic integration on labor costs, workforce distribution, and overall restaurant performance. By shedding light on the intersection of automation, workforce restructuring, and customer reception, our findings contribute to the nascent discourse on the digital transformation of retail operations. The insights offered have important implications for managers and policymakers navigating the evolving landscape of AI-driven automation in customer-facing industries.

What Fuels Growth? A Comparative Analysis of the Scaling Intensity of AI Start-ups

Schulte-Althoff, Matthias, Daniel Fuerstenau, Gene Moo Lee, Hannes Rothe, Robert Kauffman. “What Fuels Growth? A Comparative Analysis of the Scaling Intensity of AI Start-ups”. Working Paper. [ResearchGate]

Previous title: “A Scaling Perspective on AI startup”
Presented at HICSS 2021 (SITES mini-track), Copenhagen Business School 2021, FU Berlin 2021, University of Cologne 2021, University of Bremen 2021, Humboldt Institute for Internet and Society 2021, WITS 2022

We examine how firm revenue scales with labor, in terms of revenue per employee (RPE), and how this scaling is moderated by firm-level AI investment. We compare AI start-ups, in which AI provides a competitive advantage, with digital platforms and service start-ups. We use propensity score matching to explain the scaling of start-ups and find evidence for sublinear scaling intensity for revenue as a function of labor. Our study suggests similar scaling intensities between AI and service start-ups, while platform start-ups produce higher scaling intensities. We show that an increase in employee counts is associated with major revenue increases for platform start-ups, while increases are modest for service and AI start-ups.
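To illustrate what “sublinear scaling” means, one common way to express it is a power law, revenue ≈ c · employees^β, where β < 1 indicates that revenue grows less than proportionally with headcount. The sketch below estimates β as the slope of a log-log regression. This is an assumption for illustration: the abstract does not state the paper's actual specification, and the function name `scaling_exponent` and the data are made up.

```python
import math


def scaling_exponent(employees, revenue):
    """OLS slope of log(revenue) on log(employees).

    Under revenue = c * employees**beta, this slope equals beta.
    """
    xs = [math.log(e) for e in employees]
    ys = [math.log(r) for r in revenue]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var = sum((x - x_bar) ** 2 for x in xs)
    return cov / var


# Synthetic start-ups generated with a known exponent of 0.8 (illustrative only).
employees = [10, 20, 40, 80]
revenue = [2.0 * n ** 0.8 for n in employees]

beta = scaling_exponent(employees, revenue)
print("sublinear" if beta < 1 else "linear or superlinear")
```

With real data, the slope would be estimated separately for each start-up type (AI, platform, service) and compared across groups.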

Corporate Social Network Analysis: A Deep Learning Approach

Cao, Rui, Gene Moo Lee, Hasan Cavusoglu. “Corporate Social Network Analysis: A Deep Learning Approach,” Working Paper.

Presented at UBC (2020), WITS (2020), DS (2021)
Based on Rui Cao’s Master’s Thesis
Research assistants: Anthony Chiodo, Daniel Lin, Miliban Keyim, and Sara Watts.
Corporate Social Network Visualization: https://misr.sauder.ubc.ca/corporate_network/index_full.html

Identifying inter-firm relationships is critical in understanding the industry landscape. However, due to the dynamic nature of such relationships, it is challenging to capture corporate social networks in a scalable and timely manner. To address this issue, this research develops a framework to build corporate social network representations by applying natural language processing (NLP) techniques to a corpus of 10-K filings, which describe the reporting firms’ perceived relationships with other firms. Our framework uses named-entity recognition (NER) to locate the corporate names in the text, topic modeling to identify the types of relationships included, and BERT to predict the type of relationship described in each sentence. To show the value of the network measures created by the proposed framework, we conduct two empirical analyses to see their impacts on firm performance. The first study shows that competition relationships and in-degree measures on all relationship types have predictive power in estimating future earnings. The second study focuses on the difference between individual perspectives in an inter-firm social network. Such a difference is measured by the direction of mentions and is an indicator of a firm’s success in network governance. Receiving more mentions from other firms is a positive signal of network governance, and it shows a significant positive correlation with firm performance in the following year.
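As a rough illustration of the in-degree measure, the sketch below counts how many distinct firms mention each firm, given a list of directed (reporting firm, mentioned firm) pairs. The firm names are placeholders, and deduplicating repeated mentions is an assumption made here; the abstract does not say whether the paper counts or weights repeated mentions.

```python
from collections import Counter


def in_degree(mentions):
    """Count, for each firm, how many distinct firms mention it.

    `mentions` is an iterable of (source_firm, target_firm) pairs, where the
    source is the reporting firm and the target is the firm it mentions.
    """
    unique_edges = set(mentions)  # drop repeated mentions of the same pair
    return Counter(target for _source, target in unique_edges)


# Placeholder mention pairs, e.g. as extracted from 10-K sentences.
mentions = [
    ("FirmA", "FirmB"),
    ("FirmC", "FirmB"),
    ("FirmA", "FirmB"),  # repeated mention, counted once
    ("FirmB", "FirmA"),
]

degrees = in_degree(mentions)
print(degrees["FirmB"], degrees["FirmA"], degrees["FirmC"])
```

Because the edges are directed, a firm that mentions many others but is rarely mentioned itself has a low in-degree, which is the asymmetry the second study examines.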

IS Papers on Big Data, Analytics, and AI

Last update: Jan 22, 2025.

My research involves Big Data Analytics and AI in Information Systems literature. This post tries to keep track of the editorial and seminal articles on the topic of Big Data, Data Science, Analytics, and AI in the Information Systems and Management literature. The papers are listed in chronological order:

Bapna, Goes, Gopal, Marsden (2006) Moving from Data-Constrained to Data-Enabled Research: Experiences and Challenges in Collecting, Validating and Analyzing Large-Scale e-Commerce Data, Statistical Science 21(2): 116-130.
Shmueli and Koppius (2011) Predictive Analytics in Information Systems Research, MIS Quarterly 35(3): 553-572
Chen, Chiang, Storey (2012) Business Intelligence and Analytics: From Big Data to Big Impact, MIS Quarterly 36(4): 1165-1188
Lin, Lucas Jr., Shmueli (2013) Research Commentary: Too Big to Fail: Large Samples and the p-Value Problem, Information Systems Research 24(4): 906-917.
Agarwal, Dhar (2014) Editorial – Big Data, Data Science, and Analytics: The Opportunity and Challenge for IS Research, Information Systems Research 25(3): 443-448
Varian (2014) Big Data: New Tricks for Econometrics, Journal of Economic Perspectives 28(2): 3-28
Goes (2014) Editor’s Comments: Big Data and IS Research, MIS Quarterly 38(3): iii-viii
AMJ Editors (2016) From the Editors: Big Data and Data Science Methods for Management Research, Academy of Management Journal 59(5): 1493-1507
Abbasi, Sarker, Chiang (2016) Big Data Research in Information Systems: Toward an Inclusive Research Agenda, Journal of the Association for Information Systems 17(2): i-xxxii
Rai (2016) Editor’s Comments: Synergies Between Big Data and Theory, MIS Quarterly 40(2): iii-ix
Baesens, Bapna, Marsden, Vanthienen, Zhao (2016) Transformational Issues of Big Data and Analytics in Networked Business, MIS Quarterly 40(4): 807-818
Athey (2017) Beyond Prediction: Using Big Data for Policy Problems, Science 355(6324): 483-485
Chiang, Grover, Liang, Zhang (2018) Special Issue: Strategic Value of Big Data and Business Analytics, Journal of Management Information Systems 35(2): 383-387
Delen, Ram (2018) Research challenges and opportunities in business analytics, Journal of Business Analytics 1(1): 2-12.
Maass, Parsons, Purao, Storey, Woo (2018) Data-Driven Meets Theory-Driven Research in the Era of Big Data: Opportunities and Challenges for Information Systems Research, Journal of the Association for Information Systems 19(12): 1253-1273
Yang, Adomavicius, Burtch, Ren (2018) Mind the Gap: Accounting for Measurement Error and Misclassification in Variables Generated via Data Mining, Information Systems Research 29(1): 4-24.
Berente, Seidel, Safadi (2019) Research Commentary: Data-Driven Computationally Intensive Theory Development, Information Systems Research 30(1): 50-64.
Johnson, Gray, Sarker (2019) Revisiting IS Research Practice in the Era of Big Data, Information and Organization 29(1): 41-56
Grover, Lindberg, Benbasat, Lyytinen (2020) The Perils and Promises of Big Data Research in Information Systems, Journal of the Association for Information Systems 21(2): 268-291.
Shmueli (2021) INFORMS Journal on Data Science (IJDS) Editorial #1: What is an IJDS paper?, INFORMS Journal on Data Science.
Ram, Goes (2021) Focusing on Programmatic High Impact Information Systems Research, not Theory, to Address Grand Challenges, MIS Quarterly 45(1): 479-483.
Burton-Jones, Boh, Oborn, Padmanabhan (2021) Editor’s Comments: Advancing Research Transparency at MIS Quarterly: A Pluralistic Approach, MIS Quarterly 45(2): iii-xviii.
Berente, Gu, Recker, Santhanam (2021) Special Issue Editor’s Comments: Managing Artificial Intelligence, MIS Quarterly 45(3): 1433-1450.
Jain, Padmanabhan, Pavlou, Raghu (2021) Editorial for the Special Section on Humans, Algorithms, and Augmented Intelligence: The Future of Work, Organizations, and Society, Information Systems Research 32(3): 675-687.
Padmanabhan, Fang, Sahoo, Burton-Jones (2022) Editor’s Comments: Machine Learning in Information Systems Research, MIS Quarterly 46(1): iii-xix.
Abbasi, Parsons, Pant, Sheng, Sarker (2024) Pathways for Design Research on Artificial Intelligence, Information Systems Research 35(2): 441-459.

When Does Congruence Matter for Pre-roll Video Ads? The Effect of Multimodal, Ad-Content Congruence on the Ad Completion

Park, Sungho, Gene Moo Lee, Donghyuk Shin, Sang-Pil Han. “When Does Congruence Matter for Pre-roll Video Ads? The Effect of Multimodal, Ad-Content Congruence on the Ad Completion”, Working Paper [Last update: Jan 29, 2023]

Previous title: Targeting Pre-Roll Ads using Video Analytics
Funded by Sauder Exploratory Research Grant 2020
Presented at Southern Methodist University (2020), University of Washington (2020), INFORMS (2020), AIMLBA (2020), WITS (2020), HKUST (2021), Maryland (2021), American University (2021), National University of Singapore (2021), Arizona (2022), George Mason (2022), KAIST (2022), Hanyang (2022), Kyung Hee (2022), McGill (2022)
Research assistants: Raymond Situ, Miguel Valarao

Pre-roll video ads are gaining industry traction because the audience may be willing to watch an ad for a few seconds, if not the entire ad, before the desired content video is shown. Conversely, a popular skippable type of pre-roll video ads, which enables viewers to skip an ad in a few seconds, creates opportunity costs for advertisers and online video platforms when the ad is skipped. Against this backdrop, we employ a video analytics framework to extract multimodal features from ad and content videos, including auditory signals and thematic visual information, and probe into the effect of ad-content congruence at each modality using a random matching experiment conducted by a major video advertising platform. The present study challenges the widely held view that ads that match content are more likely to be viewed than those that do not, and investigates the conditions under which congruence may or may not work. Our results indicate that non-thematic auditory signal congruence between the ad and content is essential in explaining viewers’ ad completion, while thematic visual congruence is only effective if the viewer has sufficient attentional and cognitive capacity to recognize such congruence. The findings suggest that thematic videos demand more cognitive processing power than auditory signals for viewers to perceive ad-content congruence, leading to decreased ad viewing. Overall, these findings have significant theoretical and practical implications for understanding whether and when viewers construct congruence in the context of pre-roll video ads and how advertisers might target their pre-roll video ads successfully.

Gene Moo Lee, Ph.D.

Associate Professor of Information Systems, UBC Sauder School of Business