Korean J Financ Stud > Volume 54(6); 2025 > Article
Do ETFs Follow Liquidity or Create It?: Empirical Evidence from the Korean Market*

Abstract

This study examines the causal relationship between exchange-traded fund (ETF) ownership and stock liquidity in the Korean equity market. Using a panel dataset of firms listed on the KOSPI and KOSDAQ, we analyze the effect of ETF ownership on several liquidity measures, including the Amihud illiquidity measure, Lesmond transaction cost measure, and high-low spread. The results indicate that increases in ETF ownership significantly improve stock liquidity even after controlling for firm characteristics and fixed effects. We further test for potential reverse causality—whether stock liquidity attracts ETF ownership—and find only limited evidence supporting this channel. Both the magnitude and statistical significance of the reverse effect are substantially weaker than those of the direct impact of ETF ownership on liquidity. Overall, these findings suggest that ETFs are not merely passive investors but play an active role in enhancing stock market liquidity. This study provides new evidence on the relationship between ETF ownership and stock liquidity in the Korean market and highlights the implications of ETF growth for market microstructure.


1. Introduction

Exchange-Traded Funds (ETFs) have experienced remarkable growth in global capital markets over the past two decades. Characterized by low costs, high transparency, and intraday tradability, ETFs have rapidly become a major investment vehicle for both institutional and retail investors. The Korean ETF market has followed a similar trajectory. Since the launch of the first KOSPI 200 ETF in 2002, the market has expanded at an exceptional pace and now plays a central role in domestic equity trading and asset allocation.
The growth trajectory of the Korean ETF market highlights its increasing economic significance. Total ETF net assets rose from 19.7 trillion KRW in 2014 to 162.9 trillion KRW in 2024, a more than eightfold increase. As of 2025, more than 1,000 ETFs are listed in Korea, and the average daily trading value reached approximately 3.4 trillion KRW in 2024, reflecting the rapidly expanding role of ETFs as a major liquidity-providing channel within the Korean equity market. The breadth of ETF offerings has also evolved substantially, ranging from broad-market index funds to thematic, strategic, leveraged/inverse, and derivative-linked ETFs, further integrating ETFs into the core functioning of the Korean stock market.
Korea’s institutional environment differs meaningfully from that of developed markets and may shape how ETF activity interacts with the underlying securities. The equity market operates under T+2 settlement rules, ±30% daily price limits, periodic short-selling restrictions, and intraday volatility-interruption (VI) mechanisms. These features influence trading incentives, arbitrage activity, and market microstructure. Compared with markets such as the United States—where price constraints are minimal and short-selling mechanisms are more flexible—these institutional differences may condition how effectively ETF-related arbitrage forces transmit to constituent stocks. Consequently, studying ETF ownership and liquidity in Korea provides valuable evidence from a distinct regulatory and microstructural setting, enriching the broader international literature.
Beyond market structure, the intrinsic design of ETFs further motivates examining their impact on liquidity. Unlike mutual funds, which execute trades only at end-of-day net asset values (NAV), ETFs are traded continuously on exchanges, and their creation-redemption mechanism enables authorized participants (APs) to arbitrage deviations between ETF prices and NAV. This structure can affect liquidity in two opposing ways. On one hand, arbitrage activity and market-making can narrow bid-ask spreads and increase depth, improving liquidity in the underlying securities. On the other hand, mechanical index-driven trading or noise transmitted from ETF flows may crowd out information-based trading, impair price efficiency, or amplify funding shocks. Identifying which mechanism dominates in practice is therefore an empirical question.
Prior studies offer mixed evidence. Some document that ETF ownership improves market quality by enhancing liquidity and price efficiency (Glosten et al., 2021; Pan & Zeng, 2019), while others show that ETF ownership increases non-fundamental volatility or weakens liquidity by displacing informed trading (Ben-David et al., 2018; Israeli et al., 2017). However, most of these studies focus on the U.S. and other developed markets. Little is known about whether the same mechanisms apply in Korea, where retail investor participation is substantial and the institutional environment differs significantly from global counterparts.
Moreover, recent research highlights the importance of reverse causality. ETFs often select stocks that already exhibit high liquidity due to index-design rules and fund mandates emphasizing investability. As a result, the observed positive relationship between ETF ownership and liquidity may reflect ex ante selection rather than ETF-induced liquidity changes. This issue is particularly salient in the Korean market, where index construction methodologies incorporate strict liquidity and tradability screens. Distinguishing whether ETF ownership improves liquidity (ETF → Liquidity) or whether ETFs simply select already liquid stocks (Liquidity → ETF) is essential for understanding the true economic impact of ETF growth.
This study addresses these gaps by examining the causal relationship between ETF ownership and stock liquidity in the Korean market. We test both the forward causal channel (ETF → Liquidity) and the reverse channel (Liquidity → ETF) using panel Granger-causality methods. To mitigate endogeneity concerns, we further employ an instrumental variable (IV) strategy based on industry-level shocks to ETF ownership. By incorporating multiple liquidity measures, causal inference techniques, and robustness analyses, we provide comprehensive evidence on whether ETF ownership merely reflects pre-existing liquidity characteristics or actively shapes liquidity dynamics in constituent stocks.
Our findings contribute to the literature by offering one of the first detailed causal analyses of ETF ownership and liquidity within the unique institutional and microstructural context of the Korean equity market. The results offer valuable implications for asset managers, market makers, index providers, and regulators seeking to understand the broader market effects of Korea’s rapidly expanding ETF sector.

2. Literature Review

The impact of ETF ownership on the liquidity of individual stocks has emerged as a central topic in recent capital market research. Because ETFs track indices, they can exert systematic influence on the underlying asset market through periodic rebalancing and mechanical buying and selling. These effects are not limited to individual stock liquidity but may extend to price efficiency, the speed of information incorporation, and overall trading behavior, which underscores the growing economic significance of the topic.
Empirical findings in the existing literature are mixed. Several studies argue that higher ETF ownership enhances market quality by promoting price efficiency and improving liquidity measures through arbitrage between ETF prices and their underlying net asset values (NAV). Glosten et al. (2021) show that in the U.S. market, ETF-underlying arbitrage narrows bid-ask spreads, reduces transaction costs, and deepens market liquidity, particularly for stocks with higher ETF ownership. Pan and Zeng (2019) present both theoretical and empirical evidence that ETF arbitrage accelerates price adjustment and increases order book depth. Koch and Ruenzi (2022), using international equity data, find that ETF inflows accelerate information incorporation and mitigate mispricing, suggesting that ETF ownership can contribute to market efficiency.
Conversely, other studies contend that ETF ownership may deteriorate market quality. Ben-David, Franzoni, and Moussawi (2018) document that the growth of ETFs amplifies stock price movements unrelated to fundamentals, thereby increasing non-fundamental volatility. Israeli, Lee, and Sridharan (2017) find that higher ETF ownership is associated with higher trading costs and weaker incentives for information-based trading, ultimately impairing price efficiency. In the Korean market, Choi and Choi (2021) report that ETF inflows lead to wider spreads and higher transaction costs, indicating a potential deterioration in liquidity.
Recent research further emphasizes that the relationship between ETF ownership and liquidity may not be unidirectional, highlighting the possibility of reverse causality. In other words, rather than ETFs altering liquidity, it may be that already liquid stocks are more likely to be selected into ETF portfolios. Madhavan and Sobczyk (2016) show that ETF managers tend to include large-cap, high-turnover, and narrow-spread stocks to minimize transaction costs and market impact. Israeli et al. (2017) and Huang et al. (2022) argue that much of the positive correlation between ETF ownership and liquidity arises from such ex ante selection effects. In particular, Huang et al. (2022) provide empirical evidence that stocks receiving ETF inflows already exhibit superior trading volume and order book depth prior to inclusion. Petajisto (2017) and Glosten et al. (2021) similarly note that ETF constituent stocks often possess higher liquidity even before increases in ETF ownership, underscoring the need to test both causal directions—ETF → Liquidity and Liquidity → ETF—in empirical analyses.
Complementary evidence from Asian markets further illustrates that ETF effects may not be uniform or one-directional. Using Chinese market data, Chen et al. (2020) decompose ETF flows into information-based and non-information-based components and show that information-driven ETF trading improves price efficiency. Wang and Ma (2025) likewise document that ETF ownership significantly enhances stock liquidity in China, attributing the effect to market-making activities and arbitrage between the primary and secondary markets. In contrast, Chen et al. (2024), employing daily ETF ownership to construct ETF flow measures, find that increases in ETF holdings intensify ETF-underlying arbitrage but also transmit noise information to constituent stocks, thereby reducing price efficiency. These findings imply that ETFs in emerging Asian markets—particularly China—can simultaneously generate liquidity-enhancing effects and information-distorting effects, suggesting that ETF influence is multifaceted rather than strictly directional.
This evidence is particularly relevant for Korea, where retail investor participation is high and institutional features differ from those of developed markets. Despite the rapid growth of the Korean ETF market, academic research examining ETF ownership and liquidity remains relatively limited. This study aims to fill this gap by conducting a detailed empirical investigation of the causal relationship between ETF ownership and stock liquidity in the Korean equity market. To this end, we employ multiple liquidity proxies, address endogeneity concerns using an instrumental variable (IV) framework, and explicitly test for potential reverse causality. This comprehensive approach enables us to evaluate whether ETF ownership merely reflects pre-existing liquidity characteristics or actively shapes the liquidity dynamics of constituent stocks.

3. Research Methodology

3.1 Data Construction

This study examines all equity ETFs listed in the Korean market during the period from January 1, 2016, to December 31, 2024. To eliminate survivorship bias, the sample includes ETFs that were delisted during the observation period. Data on ETFs and individual stocks used in this study are obtained from FnGuide, a leading financial database in Korea.
Table 1 summarizes the number of ETFs included in the analysis at each year-end from 2016 to 2024, along with the descriptive statistics of the market capitalization of the constituent stocks held by those ETFs. During the sample period, the number of domestic equity ETFs increased approximately threefold, from 122 to 355, reflecting the rapid expansion of the market. Meanwhile, the average market capitalization of ETF-held stocks gradually declined, suggesting a growing inclusion of mid- and small-cap stocks rather than a concentration in large-cap stocks. Furthermore, the maximum values and standard deviations indicate persistent heterogeneity in the market capitalization of ETF constituents.
<Table 1>
Descriptive Statistics of ETFs
This table presents the descriptive statistics of the ETFs included in the analysis at each year-end, along with the market capitalization of the individual constituent stocks held by these ETFs.
Year Number Mean Std Min 25% 50% 75% Max
2016 122 2,728 8,007 22 86 233 1,276 48,976
2017 155 3,118 10,370 21 107 275 1,305 72,048
2018 200 2,551 8,416 23 92 283 1,299 71,614
2019 223 2,793 10,316 24 89 358 1,383 93,311
2020 238 1,985 6,218 34 98 262 1,203 56,464
2021 266 1,831 6,132 38 92 200 1,039 61,216
2022 282 1,538 5,189 35 69 133 1,043 51,832
2023 315 1,716 6,231 30 77 171 1,163 65,612
2024 355 1,603 5,541 74 72 172 935 54,912
Building on prior research on the role of ETFs in information efficiency, this study uses the aggregate ETF ownership in each stock as a proxy for ETF activity. This measure directly captures the net level of ETF engagement across multiple ETFs for a given stock (Glosten et al., 2021; Israeli et al., 2017).
$$\text{ETF}_{i,t} = \frac{\sum_{j=1}^{n} \text{Shares}_{i,j,t}}{\text{Total Shares Outstanding}_{i,t}}$$
ETF ownership is calculated for each stock on a monthly basis. In the equation above, $i$ denotes an individual stock, $t$ the month, and $j$ each ETF, with $n$ the number of domestic equity ETFs. To compute ETF ownership, the end-of-month portfolio disclosure files (PDFs) of each ETF are used to determine the number of shares held in each constituent company. The total ETF-held shares for stock $i$ at time $t$ ($\sum_{j=1}^{n} \text{Shares}_{i,j,t}$) are obtained by summing the number of shares held across all domestic equity ETFs. Dividing this total by the stock's total number of shares outstanding ($\text{Total Shares Outstanding}_{i,t}$) yields the ETF ownership.
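The aggregation described above can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the record layout `(stock, month, etf, shares_held)`, the function name `etf_ownership`, and the sample numbers are all hypothetical and do not come from the FnGuide data used in the paper.

```python
from collections import defaultdict

def etf_ownership(holdings, shares_outstanding):
    """Aggregate ETF-held shares per (stock, month) and divide by shares outstanding.

    holdings: iterable of (stock, month, etf, shares_held) records (hypothetical layout)
    shares_outstanding: dict mapping (stock, month) -> total shares outstanding
    """
    held = defaultdict(float)
    for stock, month, _etf, shares in holdings:
        held[(stock, month)] += shares  # sum over all ETFs j holding stock i in month t
    return {
        key: held.get(key, 0.0) / total
        for key, total in shares_outstanding.items()
        if total > 0
    }

# Made-up example: stock A is held by two ETFs, stock B by one, stock C by none.
holdings = [
    ("A", "2024-01", "ETF1", 1_000),
    ("A", "2024-01", "ETF2", 500),
    ("B", "2024-01", "ETF1", 200),
]
shares_outstanding = {
    ("A", "2024-01"): 100_000,
    ("B", "2024-01"): 50_000,
    ("C", "2024-01"): 80_000,
}
ownership = etf_ownership(holdings, shares_outstanding)
print(ownership)
```

Stocks that appear in no disclosure file receive an ownership of zero in this sketch, which is one plausible convention rather than a documented choice of the study.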
This study employs three representative illiquidity measures to empirically analyze the relationship between ETF ownership and stock liquidity. In all measures, higher values indicate greater illiquidity, which corresponds to lower liquidity. Specifically, Cost represents the effective transaction cost estimated from daily return data using the limited dependent variable (LDV) model proposed by Lesmond et al. (1999). This measure captures the marginal trader’s effective bid-ask spread by estimating the threshold returns required to trigger a trade. The Amihud illiquidity measure (ALLIQ) captures the price impact of a given trading volume by dividing the absolute value of stock returns by the trading amount (Amihud, 2002). Lastly, the High-Low Spread (HLS) estimates transaction costs using the difference between daily high and low prices—larger values imply higher trading costs and lower liquidity (Corwin & Schultz, 2012). Detailed definitions of these variables are provided in Appendix A.
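Of the three measures, the Amihud ratio is the simplest to illustrate. The sketch below uses the generic Amihud (2002) form, the average over trading days of absolute return divided by trading value. The function name, the monthly averaging window, the handling of zero-volume days, and the absence of any scaling factor are assumptions; the exact definition used in the paper is the one in its Appendix A.

```python
def amihud_illiquidity(daily_returns, daily_values):
    """Average of |return| / trading value over the days with positive trading value.

    daily_returns: daily stock returns within one month (decimal, e.g. 0.01 = 1%)
    daily_values: daily trading values in the same order (any currency unit)
    Returns None when no day has positive trading value.
    """
    ratios = [abs(r) / v for r, v in zip(daily_returns, daily_values) if v > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios)

# Made-up three-day month; the zero-volume fourth day is skipped.
returns = [0.01, -0.02, 0.0, 0.03]
values = [100.0, 200.0, 50.0, 0.0]
illiq = amihud_illiquidity(returns, values)
print(illiq)
```

A larger value means that a given amount of trading moves the price more, which is why the measure rises with illiquidity.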
Following prior studies, several control variables are included in the analysis. First, we control for firm size ($\text{lnMVE}_{i,t-1}$), measured as the natural logarithm of firm $i$'s market capitalization at the end of month $t-1$, since larger firms typically exhibit greater price efficiency due to better information environments and broader analyst coverage. Second, we include institutional ownership ($\text{INST}_{i,t-1}$), representing the ownership share held by institutional investors with disclosure obligations for holdings exceeding 5% in the Korean market. Institutional investors' monitoring and information advantages can influence both price formation and liquidity. Third, we control for return volatility ($\text{RV}_{i,t-1}$) and stock turnover ($\text{TURN}_{i,t-1}$), reflecting that higher volatility tends to worsen price efficiency, while greater turnover improves it. Volatility is measured as the annualized standard deviation of daily returns for firm $i$ during month $t-1$, and turnover is calculated as the average monthly share turnover over months $t-2$ to $t-1$. Fourth, we include the book-to-market ratio ($\text{BTM}_{i,t-1}$), computed as the book value of equity divided by market value at the end of month $t-1$, to control for the effects of financial distress and growth opportunities on stock mispricing (Fama & French, 1992; Lakonishok et al., 1994). Finally, to account for time trends and unobservable firm-specific characteristics, we incorporate year-month fixed effects and firm fixed effects in all regression models.
Table 2 presents the summary statistics and correlations of ETF ownership and key variables. To mitigate the influence of outliers, all continuous variables are winsorized at the 1st and 99th percentiles cross-sectionally within each month. Panel A shows that all illiquidity measures (ALLIQ, HLS, and Cost) exhibit right-skewed distributions, indicating that extreme illiquidity conditions occur only occasionally. ETF ownership is also concentrated near zero with a long right tail, indicating that ETF holdings tend to be concentrated in a limited number of stocks. Panel B shows that ETF ownership is negatively correlated with all three illiquidity measures, although the correlation with HLS (-0.0115) is much weaker than those with ALLIQ (-0.2424) and Cost (-0.2856). This pattern suggests that higher ETF ownership is associated with improved liquidity (lower illiquidity). The following analysis further investigates the relationship between ETF ownership and liquidity.
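Cross-sectional winsorization within each month can be sketched as follows. The row layout `(month, firm, value)`, the helper names, and the linear-interpolation percentile rule are assumptions made for illustration; the paper does not state which percentile convention it uses.

```python
from collections import defaultdict

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, with q in [0, 1]."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def winsorize_by_month(rows, lower=0.01, upper=0.99):
    """Clip each value to the [lower, upper] percentiles of its own month."""
    by_month = defaultdict(list)
    for month, _firm, value in rows:
        by_month[month].append(value)
    bounds = {}
    for month, vals in by_month.items():
        ordered = sorted(vals)
        bounds[month] = (percentile(ordered, lower), percentile(ordered, upper))
    return [
        (month, firm, min(max(value, bounds[month][0]), bounds[month][1]))
        for month, firm, value in rows
    ]

# Made-up cross-section: 101 firms with values 0, 1, ..., 100 in a single month.
rows = [("2024-01", firm, float(firm)) for firm in range(101)]
clipped = winsorize_by_month(rows)
print(clipped[0], clipped[50], clipped[100])
```

Because the bounds are recomputed for every month, an observation is judged extreme relative to its contemporaneous cross-section rather than to the pooled sample.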
<Table 2>
Descriptive Statistics and Correlation Matrix of Key Variables
This table reports the descriptive statistics and correlation coefficients of ETF ownership and the main variables. The sample period spans 108 months, from January 2016 to December 2024.
Panel A: Summary Statistics

Variable Mean Std Min 25% 50% 75% Max
ALLIQ 0.0008 0.0019 0.0000 0.0001 0.0002 0.0007 0.0317
Cost 0.5198 0.2839 0.0605 0.3163 0.4581 0.6573 1.8693
HLS 0.0196 0.0227 0.0000 0.0020 0.0140 0.0280 0.2084
ETF Ownership 0.0053 0.0071 0.0000 0.0004 0.0016 0.0087 0.0432
INST 0.0155 0.0612 0.0000 0.0000 0.0000 0.0000 0.3907
lnMVE 12.7104 1.3571 10.0054 11.7592 12.4699 13.4090 17.1598
BTM 1.4235 1.2640 0.0480 0.5412 1.0522 1.9038 9.7642
RV 0.1342 0.0610 0.0344 0.0926 0.1219 0.1607 0.4378
TURN 0.0139 0.0242 0.0001 0.0024 0.0056 0.0136 0.2403

Panel B: Correlation

ALLIQ Cost HLS ETF Ownership INST lnMVE BTM RV TURN

ALLIQ 1
Cost 0.0424 1
HLS -0.0387 0.1130 1
ETF Ownership -0.2424 -0.2856 -0.0115 1
INST 0.0277 -0.0675 -0.0268 0.0714 1
lnMVE -0.3549 -0.3461 -0.0489 0.6633 0.0565 1
BTM 0.2089 -0.0059 -0.0801 -0.1521 -0.0102 -0.1164 1
RV -0.1608 0.2067 0.1461 -0.0523 -0.1255 -0.1198 -0.3285 1
TURN -0.1527 0.2455 0.1417 -0.0825 -0.0885 -0.1236 -0.2169 0.382 1

3.2 Granger Causality Test

In analyzing the relationship between ETF ownership and individual stock liquidity, simple regression analysis alone is insufficient to clearly identify the temporal order between variables. In particular, to distinguish whether ETF ownership affects liquidity (forward causality) or whether liquidity drives changes in ETF ownership (reverse causality), it is necessary to test the predictive power between the two variables over time. To this end, this study applies the Dumitrescu-Hurlin (2012) panel Granger causality test.
Before estimating the Granger causality and panel VAR models, we examine whether the variables exhibit unit roots. We perform two panel unit root tests, the Im, Pesaran, and Shin (IPS) test and the Fisher-type ADF test. Both tests reject the null hypothesis of a unit root for all variables at the 1% or 5% significance level (see Section 4.1), indicating that the series are sufficiently stationary for causality analysis.
The null hypothesis ($H_0$) of the Dumitrescu-Hurlin test states that the independent variable $x$ does not Granger-cause the dependent variable $y$ for all cross-sectional units $i$, while the alternative hypothesis ($H_1$) posits that Granger causality exists for at least some units. The optimal lag length ($p$) for each firm is determined using the Akaike Information Criterion (AIC). The model used for the test is specified as follows:
$$y_{i,t} = \alpha_i + \sum_{k=1}^{p} \gamma_i^{(k)} y_{i,t-k} + \sum_{k=1}^{p} \beta_i^{(k)} x_{i,t-k} + \varepsilon_{i,t}$$
Here, $H_0: \beta_i^{(k)} = 0$ for all $i$ and $k$, which means that $x$ does not Granger-cause $y$ for any cross-sectional unit. The panel test statistic is based on the cross-sectional average of the firm-level statistics $W_i$:
$$\bar{W} = \frac{1}{N}\sum_{i=1}^{N} W_i$$
The Dumitrescu-Hurlin test proceeds as follows:
  • (1) compute the Granger test F-statistic $W_i$ for each cross-sectional unit $i$;
  • (2) calculate the average of these statistics across all units, $\bar{W}$;
  • (3) transform the average into a standardized normal test statistic $\bar{Z}$, accounting for the asymptotic distribution properties determined by the sample size ($T$), number of cross-sectional units ($N$), and lag length ($p$); and
  • (4) determine whether to reject the null hypothesis based on the p-value of $\bar{Z}$.
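Steps (1) and (2) above can be sketched with NumPy on simulated data. This is a minimal illustration, not the authors' implementation: the function names, the common lag length `p = 1` (the paper selects lags per firm by AIC), and the simulated panel in which `x` leads `y` are all assumptions. The statistic is returned in Wald form, which equals `p` times the F-statistic and therefore coincides with the F-statistic when `p = 1`.

```python
import numpy as np

def granger_wald(y, x, p=1):
    """Wald statistic for H0: the p lags of x do not help predict y."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    T = len(y)
    target = y[p:]
    # Lag matrices: column k holds the series lagged by k periods.
    y_lags = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)])
    x_lags = np.column_stack([x[p - k:T - k] for k in range(1, p + 1)])
    const = np.ones((T - p, 1))

    def rss(X):
        coef = np.linalg.lstsq(X, target, rcond=None)[0]
        resid = target - X @ coef
        return float(resid @ resid)

    rss_r = rss(np.hstack([const, y_lags]))          # restricted: own lags only
    rss_u = rss(np.hstack([const, y_lags, x_lags]))  # unrestricted: adds lags of x
    dof = (T - p) - (2 * p + 1)
    f_stat = ((rss_r - rss_u) / p) / (rss_u / dof)
    return p * f_stat  # Wald form; identical to F when p = 1

def simulate_firm(rng, T=100, beta=0.5):
    """Hypothetical firm where x leads y by one period and y does not lead x."""
    x = rng.normal(size=T)
    y = rng.normal(size=T)
    y[1:] += beta * x[:-1]
    return y, x

rng = np.random.default_rng(42)
panel = [simulate_firm(rng) for _ in range(50)]
w_forward = [granger_wald(y, x) for y, x in panel]  # step (1): does x Granger-cause y?
w_reverse = [granger_wald(x, y) for y, x in panel]  # step (1): does y Granger-cause x?
w_bar_forward = float(np.mean(w_forward))           # step (2): cross-sectional average
w_bar_reverse = float(np.mean(w_reverse))
print(w_bar_forward, w_bar_reverse)
```

In this simulated panel the forward average is far above the reverse one, which stays near the value expected under the null. Steps (3) and (4) then standardize the average and compare it with the standard normal distribution.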

While the panel-level test is effective in capturing the average causal tendency across the sample, it has the limitation of not directly reflecting heterogeneity among individual units. Therefore, as a complementary approach, this study also performs firm-level Granger causality tests for each company separately. For these individual tests, the number and proportion of firms rejecting the null hypothesis at the 10%, 5%, and 1% significance levels are reported.
By jointly presenting the panel-level and firm-level results, this study captures both the overall average pattern of causality and firm-specific variation. The firm-level tests identify which stocks exhibit Granger causality and can reveal asymmetric relationships under specific firm characteristics or conditions, while the panel-level test confirms the general direction and statistical significance of causality across the market. Combining the two makes it possible to assess not only whether the relationship between ETF ownership and liquidity is significant in the overall sample but also whether it is concentrated among certain stocks or broadly distributed across the market. This integrated approach strengthens the interpretability and causal credibility of the coefficients estimated in the subsequent bidirectional panel regressions.

3.3 Panel Regression Models

Prior studies have reported mixed findings, suggesting that ETF ownership can either improve or deteriorate stock liquidity (Israeli et al., 2017; Ben-David et al., 2018; Pan and Zeng, 2019), while others have raised the possibility of reverse causality, where liquidity itself may influence ETF inclusion (Petajisto, 2017; Glosten et al., 2021). Accordingly, this study considers both causal directions, “ETF → Liquidity” and “Liquidity → ETF”, to examine the bidirectional relationship between ETF ownership and stock liquidity more rigorously.
To achieve this, two separate fixed-effects panel regression specifications are estimated, each corresponding to one directional hypothesis. The first model evaluates whether ETF ownership predicts subsequent changes in liquidity, whereas the second model assesses whether liquidity conditions predict changes in ETF ownership.
$$\text{Liquidity}_{i,t} = \alpha + \beta_1 \text{ETF}_{i,t-1} + \gamma' \text{Controls}_{i,t-1} + \mu_i + \tau_t + \delta_k + \varepsilon_{i,t}$$
$$\text{ETF}_{i,t} = \alpha + \beta_1 \text{Liquidity}_{i,t-1} + \gamma' \text{Controls}_{i,t-1} + \mu_i + \tau_t + \delta_k + \varepsilon_{i,t}$$
All explanatory variables are lagged by one month to mitigate potential simultaneity bias. The dependent variable is the monthly liquidity measure ($\text{Liquidity}_{i,t}$) in the first model and ETF ownership ($\text{ETF}_{i,t}$) in the second model. Liquidity measures include the Amihud illiquidity measure (ALLIQ), trading cost proxy (Cost), and High-Low spread (HLS), all defined such that higher values indicate greater illiquidity. Control variables include firm size (lnMVE), book-to-market ratio (BTM), stock turnover (TURN), return volatility (RV), and institutional ownership (INST). All continuous variables are winsorized at the 1st and 99th percentiles to mitigate the influence of outliers.
$\mu_i$ denotes firm fixed effects (Firm FE), which control for time-invariant firm-specific characteristics (e.g., long-term structure, information environment). $\tau_t$ represents year-month fixed effects (Time FE), controlling for macroeconomic, policy, or market-wide shocks that affect all firms simultaneously. $\delta_k$ indicates fixed effects (Industry FE) based on firm $i$'s industry affiliation, accounting for average differences across industries.
The separate estimation of these two specifications allows direct comparison of the strength, sign, and economic relevance of forward versus reverse predictive relationships. This structure aligns with the preceding Dumitrescu-Hurlin panel causality results and facilitates interpretation by isolating each directional mechanism rather than jointly modeling the system.
In the first model, a negative coefficient on $\text{ETF}_{i,t-1}$ indicates that higher ETF ownership is associated with lower transaction costs and narrower spreads, whereas a positive coefficient implies deterioration in liquidity. Thus, $\beta_1 < 0$ is consistent with ETFs contributing to liquidity provision, while $\beta_1 > 0$ aligns with non-fundamental demand and arbitrage activity exerting trading pressure.
In the second model, a negative coefficient on $\text{Liquidity}_{i,t-1}$ suggests that ETF allocations tend to increase for stocks with better trading conditions, whereas a positive coefficient implies ETF accumulation in relatively illiquid constituents. This direction provides insight into whether ETF portfolio construction is driven by liquidity considerations (i.e., arbitrage feasibility and turnover efficiency) or benchmark replication dynamics.
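The first specification can be illustrated on simulated data with a two-way within transformation, which removes firm and year-month effects from a balanced panel before running OLS. Everything here is an assumption for illustration: the data are simulated with a hypothetical true coefficient of -2, control variables and clustered standard errors are omitted, and industry effects are left out because time-invariant industry dummies would be absorbed by the firm effects in this setup.

```python
import numpy as np

def two_way_within(a):
    """Remove firm (row) and month (column) means from a balanced N x T array."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

def fe_slope(y, x):
    """OLS slope of y on a single regressor x after two-way demeaning."""
    y_w, x_w = two_way_within(y), two_way_within(x)
    return float((x_w * y_w).sum() / (x_w ** 2).sum())

rng = np.random.default_rng(0)
n_firms, n_months, beta_true = 200, 60, -2.0

firm_fe = rng.normal(size=(n_firms, 1))        # plays the role of mu_i
time_fe = rng.normal(size=(1, n_months + 1))   # plays the role of tau_t
etf = np.abs(rng.normal(scale=0.01, size=(n_firms, n_months + 1)))  # ownership share
noise = rng.normal(scale=0.01, size=(n_firms, n_months + 1))

illiq = firm_fe + time_fe + noise
illiq[:, 1:] += beta_true * etf[:, :-1]        # illiquidity responds to last month's ownership

# Regress illiquidity in month t on ETF ownership in month t-1.
beta_hat = fe_slope(illiq[:, 1:], etf[:, :-1])
print(beta_hat)
```

Swapping the roles of the two arrays gives the corresponding sketch for the second specification. With an unbalanced panel such as the one in the paper, simple double demeaning is no longer exact, and a dedicated fixed-effects estimator would be needed.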

4. Empirical Analysis

4.1 Causality Test

Prior to conducting the panel Granger causality analysis, we examine the stationarity properties of all variables using the IPS (Im-Pesaran-Shin) panel unit root test (Im, Pesaran, and Shin, 2003) and the Fisher-type ADF panel unit root test (Maddala and Wu, 1999). Both tests reject the null hypothesis of a unit root for all variables at the 1% or 5% significance levels, confirming that the panel series are stationary. The detailed unit root test results are reported in Appendix Table A1.
Given the stationarity of the variables, we proceed with the Dumitrescu-Hurlin (2012) panel Granger causality framework, which allows for heterogeneity in causal relationships across cross-sectional units and is well suited for large and unbalanced panel data.
Table 3 reports the Dumitrescu-Hurlin test results. The null hypothesis of no Granger causality is rejected in both directions for all three liquidity measures, with Z-bar statistics between 21.3 and 28.3 and p-values of 0.0000. The consistently large Z-bar statistics suggest that this predictability reflects an economically meaningful interaction rather than a noise-driven or incidental relationship. In other words, ETF trading and ownership adjustments influence transaction costs, spreads, and other market-friction components, while the existing liquidity environment simultaneously affects ETF tracking activity, primary and secondary market creation and redemption, and passive capital allocation behavior. This pattern points to a feedback mechanism.
Taken together, these results justify the bidirectional panel regression framework introduced later in the analysis and provide a foundation to investigate whether the economic magnitude and structural characteristics of the relationship differ across the two causal directions.
Table 4 presents the results of the Granger causality tests analyzing the relationship between ETF ownership and stock liquidity. The analysis is conducted in two directions—whether ETF ownership precedes liquidity (ETF → Liquidity) or liquidity precedes ETF ownership (Liquidity → ETF)—using three liquidity proxies: Cost, ALLIQ, and HLS. Panel A reports the results of the univariate Granger tests using only variable pairs, while Panel B presents the results of the multivariate VAR-based Granger tests that include key control variables such as market capitalization, book-to-market ratio (BTM), turnover, return volatility, and institutional ownership.
<Table 3>
Dumitrescu-Hurlin Panel Granger Causality Test Results
This table presents the results of the Dumitrescu-Hurlin (2012) panel Granger causality test. The test examines the direction of causality between changes in ETF ownership and various liquidity measures. The null hypothesis (H0) states that the independent variable does not Granger-cause the dependent variable for any cross-sectional unit. The analysis uses firm-level monthly panel data, and the optimal lag length for each firm is selected based on the Akaike Information Criterion (AIC). The table reports the Z-bar statistics and the corresponding p-values computed from the standard normal distribution.
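| Liquidity Variable | ETF → Liquidity: Mean F-stat | ETF → Liquidity: N Firms | ETF → Liquidity: Z-bar | ETF → Liquidity: p-value | Liquidity → ETF: Mean F-stat | Liquidity → ETF: N Firms | Liquidity → ETF: Z-bar | Liquidity → ETF: p-value |
|---|---|---|---|---|---|---|---|---|
| Cost | 1.9842 | 1385 | 25.9004 | 0.0000 | 2.0746 | 1384 | 28.2687 | 0.0000 |
| ALLIQ | 1.9637 | 1382 | 25.3329 | 0.0000 | 1.8163 | 1381 | 21.4494 | 0.0000 |
| HLS | 1.9777 | 1368 | 25.5700 | 0.0000 | 1.8134 | 1375 | 21.3273 | 0.0000 |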
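As a rough consistency check on Table 3, the large-sample standardization $\bar{Z} = \sqrt{N/(2p)}\,(\bar{W} - p)$ can be applied to the reported mean statistics. The common lag length `p = 1` is an assumption, since the paper selects lags firm by firm using AIC, and the one-sided upper-tail p-value is likewise an assumed convention. Under these assumptions the formula reproduces the reported Z-bar values to within rounding of the mean statistic.

```python
import math

def dh_z_bar(w_bar, n_units, p=1):
    """Asymptotic Z-bar: sqrt(N / (2p)) * (W-bar - p), assuming a common lag p."""
    return math.sqrt(n_units / (2.0 * p)) * (w_bar - p)

def upper_tail_p(z):
    """One-sided upper-tail p-value under the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# (mean statistic, number of firms, reported Z-bar) taken from Table 3.
table3 = {
    ("Cost", "ETF -> Liquidity"): (1.9842, 1385, 25.9004),
    ("Cost", "Liquidity -> ETF"): (2.0746, 1384, 28.2687),
    ("ALLIQ", "ETF -> Liquidity"): (1.9637, 1382, 25.3329),
    ("ALLIQ", "Liquidity -> ETF"): (1.8163, 1381, 21.4494),
    ("HLS", "ETF -> Liquidity"): (1.9777, 1368, 25.5700),
    ("HLS", "Liquidity -> ETF"): (1.8134, 1375, 21.3273),
}

recomputed = {key: dh_z_bar(w_bar, n) for key, (w_bar, n, _reported) in table3.items()}
for key, (w_bar, n, reported) in table3.items():
    print(key, round(recomputed[key], 4), reported, upper_tail_p(recomputed[key]))
```

The standardization makes clear why the statistics are so large: with roughly 1,380 firms, an average statistic only about one unit above its null expectation is already more than 20 standard deviations away.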
<Table 4>
Granger Causality Test: Directional Analysis Between ETF Ownership and Liquidity
This table summarizes the results of the Granger causality tests examining the directional relationship between ETF ownership and stock liquidity. The analysis is conducted in two directions—forward (ETF ownership → Liquidity) and reverse (Liquidity → ETF ownership)—using three liquidity proxies: Cost, ALLIQ, and HLS. Panel A reports the results of the univariate Granger tests without control variables, while Panel B presents the results of the multivariate VAR-based Granger tests, controlling for firm size, book-to-market ratio (BTM), turnover, return volatility, and institutional ownership. Each cell shows the number and proportion of firms (relative to the total sample) for which the null hypothesis of no causality is rejected at the 10%, 5%, and 1% significance levels.
Panel A: Univariate Granger Causality Test

| Liquidity Variable | ETF → Liquidity: 10% | ETF → Liquidity: 5% | ETF → Liquidity: 1% | Liquidity → ETF: 10% | Liquidity → ETF: 5% | Liquidity → ETF: 1% |
|---|---|---|---|---|---|---|
| Cost | 328 / 1442 (22.75%) | 192 / 1442 (13.31%) | 62 / 1442 (4.30%) | 298 / 1442 (20.67%) | 205 / 1442 (14.22%) | 77 / 1442 (5.34%) |
| ALLIQ | 305 / 1442 (21.15%) | 194 / 1442 (13.45%) | 72 / 1442 (4.99%) | 254 / 1442 (17.61%) | 151 / 1442 (10.47%) | 52 / 1442 (3.61%) |
| HLS | 294 / 1442 (20.39%) | 193 / 1442 (13.38%) | 80 / 1442 (5.55%) | 203 / 1442 (14.08%) | 115 / 1442 (7.98%) | 50 / 1442 (3.47%) |

Panel B: Multivariate VAR-based Granger Causality Test

| Liquidity Variable | ETF → Liquidity: 10% | ETF → Liquidity: 5% | ETF → Liquidity: 1% | Liquidity → ETF: 10% | Liquidity → ETF: 5% | Liquidity → ETF: 1% |
|---|---|---|---|---|---|---|
| Cost | 306 / 1297 (23.59%) | 200 / 1297 (15.42%) | 91 / 1297 (7.02%) | 283 / 1297 (21.82%) | 184 / 1297 (14.19%) | 77 / 1297 (5.94%) |
| ALLIQ | 250 / 1297 (19.28%) | 172 / 1297 (13.26%) | 80 / 1297 (6.17%) | 239 / 1297 (18.43%) | 146 / 1297 (11.26%) | 61 / 1297 (4.70%) |
| HLS | 278 / 1297 (21.43%) | 190 / 1297 (14.65%) | 70 / 1297 (5.40%) | 223 / 1297 (17.19%) | 147 / 1297 (11.33%) | 58 / 1297 (4.47%) |
In Panel A, the proportion of firms showing significant causality in the forward direction is generally higher than in the reverse direction. For example, in the case of the Cost measure, at the 10% significance level, the forward direction accounts for 22.75% (328 out of 1,442 firms), compared to 20.67% (298 out of 1,442 firms) in the reverse direction—a difference of about 2 percentage points. Similarly, for ALLIQ and HLS, the forward direction shows consistently higher ratios: 21.15% vs. 17.61% and 20.39% vs. 14.08%, respectively.
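The shares quoted above follow directly from the rejection counts in Panel A of Table 4. The short sketch below recomputes the 10%-level shares and the forward-minus-reverse gap for each measure; the variable and function names are illustrative only.

```python
# Rejection counts at the 10% level from Table 4, Panel A: (forward, reverse).
panel_a_10pct = {"Cost": (328, 298), "ALLIQ": (305, 254), "HLS": (294, 203)}
n_firms = 1442

def share(count, total):
    """Rejection share in percent."""
    return 100.0 * count / total

gaps = {}
for measure, (forward, reverse) in panel_a_10pct.items():
    fwd, rev = share(forward, n_firms), share(reverse, n_firms)
    gaps[measure] = fwd - rev  # gap in percentage points
    print(f"{measure}: forward {fwd:.2f}%, reverse {rev:.2f}%, gap {gaps[measure]:.2f} pp")
```

The gap is smallest for Cost and largest for HLS, so the forward advantage in the univariate tests is driven mainly by the spread-based measure.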
In Panel B, this pattern remains robust after the control variables are added. For Cost, at the 10% level, the forward direction accounts for 23.59% (306 out of 1,297 firms), higher than 21.82% (283 out of 1,297 firms) in the reverse direction. For ALLIQ, the figures are 19.28% vs. 18.43%, and for HLS, 21.43% vs. 17.19%. In Panel B the forward direction shows the higher proportion for all three measures at every reported significance level.
Overall, these results indicate the presence of bidirectional causality between changes in ETF ownership and stock liquidity. In other words, changes in ETF ownership significantly precede subsequent changes in liquidity, while liquidity levels also help explain future changes in ETF ownership. However, the statistical significance of the reverse causality tends to be relatively weaker, with the forward direction (ETF → Liquidity) being more consistent and pronounced. This finding suggests that ETFs are not merely passive investment vehicles but play an active structural role in shaping market liquidity.
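The comparison of rejection shares can be reproduced with simple arithmetic. The sketch below is only an illustration: it copies the 10% rejection counts from the Granger-test table above and computes the forward and reverse shares and their gap in percentage points. The dictionary layout and function names are hypothetical, and the code does not rerun any Granger test.

```python
# Rejection counts at the 10% level, copied from the Granger-test table above.
# Tuple layout (an assumption of this sketch): (forward count, reverse count, firms tested).
COUNTS_10PCT = {
    "Panel A": {"Cost": (328, 298, 1442), "ALLIQ": (305, 254, 1442), "HLS": (294, 203, 1442)},
    "Panel B": {"Cost": (306, 283, 1297), "ALLIQ": (250, 239, 1297), "HLS": (278, 223, 1297)},
}


def rejection_share(count: int, total: int) -> float:
    """Share of firms (in percent) for which the no-causality null is rejected."""
    return 100.0 * count / total


def forward_minus_reverse(panel: str, measure: str) -> float:
    """Forward share minus reverse share, in percentage points."""
    forward, reverse, total = COUNTS_10PCT[panel][measure]
    return rejection_share(forward, total) - rejection_share(reverse, total)


if __name__ == "__main__":
    for panel, rows in COUNTS_10PCT.items():
        for measure in rows:
            gap = forward_minus_reverse(panel, measure)
            print(f"{panel} {measure}: forward - reverse = {gap:.2f} pp")
```

A positive gap means that more firms reject the null in the forward direction than in the reverse direction. The gap is a descriptive comparison of counts, not a formal test of the difference.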

4.2 Relationship Between ETF Ownership and Liquidity

To examine the impact of ETF ownership on the liquidity of individual stocks, this study estimates panel regressions with three liquidity measures as dependent variables: the Lesmond transaction cost measure (Cost), the Amihud (2002) illiquidity measure (ALLIQ), and the High-Low spread (HLS). All regression models include firm fixed effects, year-month (time) fixed effects, and industry fixed effects to control for unobserved firm-specific heterogeneity, common time-varying factors, and industry-level differences, respectively.
Table 5 presents the regression results. Standard errors are clustered at the firm level.
<Table 5>
The Impact of ETF Ownership on the Liquidity of the Underlying Stocks
This table presents the regression results analyzing the effect of lagged ETF ownership on stock liquidity. All regression models include year-month fixed effects, firm fixed effects, and industry fixed effects. Standard errors are clustered at the firm level. t-statistics are reported in parentheses, and statistical significance at the 1%, 5%, and 10% levels is denoted by ***, **, and *, respectively.
Dependent Variable: Liquidity Measure

| Variable | Cost | ALLIQ | HLS |
| --- | --- | --- | --- |
| Intercept | -0.3827** (-2.03) | -0.0221 (-0.22) | -0.1204** (-2.46) |
| ETFt-1 | -0.0444*** (-3.14) | -0.0156** (-2.19) | -0.0219*** (-4.62) |
| INSTt-1 | -0.0003 (-0.21) | 0.007*** (3.23) | -0.0006 (-1.13) |
| lnMVEt-1 | -0.3894*** (-10.01) | -0.3003*** (-7.26) | 0.0523*** (3.7) |
| BTMt-1 | -0.0304 (-1.57) | 0.1176*** (4.86) | -0.0509*** (-6.71) |
| TURNt-1 | 0.1341*** (14.66) | -0.0633*** (-10.55) | 0.0688*** (16.36) |
| RVt-1 | 0.1298*** (7.89) | -0.0781*** (-4.4) | 0.0327*** (5.86) |
| Firm FE | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes |
| Industry FE | Yes | Yes | Yes |
| Observations | 111,033 | 111,033 | 111,033 |
| R-squared | 0.0427 | 0.0592 | 0.0128 |
The regression results show that the lagged ETF ownership variable (ETFt-1) has a negative and statistically significant coefficient for all three liquidity measures. Higher ETF ownership is therefore associated with lower values of Cost, ALLIQ, and HLS, indicating an overall improvement in liquidity. In the HLS model, the coefficient on ETFt-1 is -0.0219 (t = -4.62), significant at the 1% level. The coefficients for Cost (-0.0444, t = -3.14) and ALLIQ (-0.0156, t = -2.19) are also significantly negative at the 1% and 5% levels, respectively. These results suggest that stocks with higher ETF ownership tend to exhibit narrower spreads, lower indirect transaction costs, and improved market trading efficiency.
This implies that ETFs are not merely passive portfolio-tracking instruments but can structurally influence the liquidity of their underlying assets through mechanisms such as creation and redemption processes and market makers’ inventory adjustments. The expansion of ETF ownership thus serves as a channel for alleviating liquidity constraints at the individual stock level, carrying important implications for the capital market in terms of reducing transaction costs and enhancing price discovery efficiency.
To verify that the impact of ETF ownership on stock liquidity reflects a causal rather than a mere correlational relationship, a reverse regression analysis was additionally conducted. In this analysis, liquidity measures are used as independent variables, while ETF ownership serves as the dependent variable, allowing for an examination of whether liquidity conditions induce changes in ETF ownership—that is, the possibility of reverse causality.
Table 6 reports the regression results examining whether stock liquidity affects ETF ownership. The results indicate that liquidity conditions are related to ETF ownership only in a limited and non-uniform manner. While all three liquidity measures (Cost, ALLIQ, and HLS) exhibit negative coefficients, statistical significance is observed only for Cost and HLS. Specifically, the coefficient on Cost is -0.0208 (t = -3.00), and the coefficient on HLS is -0.014 (t = -5.26), both significant at the 1% level. In contrast, the coefficient on ALLIQ is negative but not statistically significant (-0.0068, t = -1.64). These findings suggest that stocks with higher transaction costs and wider spreads tend to have lower ETF ownership, whereas price impact, as captured by ALLIQ, does not significantly predict ETF holdings once firm, time, and industry fixed effects are controlled for.
<Table 6>
The Impact of Liquidity on ETF Ownership of the Underlying Stocks
This table presents the regression results analyzing the effect of liquidity measures on ETF ownership. The dependent variable is ETF ownership, and the main independent variables are the three liquidity measures—Cost, ALLIQ, and HLS—each used separately. All regression models include firm fixed effects, time fixed effects, and industry fixed effects, and standard errors are clustered at the firm level. t-statistics are reported in parentheses, and statistical significance at the 1%, 5%, and 10% levels is denoted by ***, **, and *, respectively.
Dependent Variable: ETF Ownership

| Variable | Cost | ALLIQ | HLS |
| --- | --- | --- | --- |
| Intercept | -0.1626 (-1.06) | -0.1554 (-1.01) | -0.157 (-1.02) |
| Liquidity measure (column variable) | -0.0208*** (-3.0) | -0.0068 (-1.64) | -0.014*** (-5.26) |
| INSTt-1 | 0.0036* (1.87) | 0.0036* (1.89) | 0.0036* (1.87) |
| lnMVEt-1 | 0.6719*** (13.5) | 0.6784*** (13.5) | 0.6818*** (13.67) |
| BTMt-1 | 0.0098 (0.49) | 0.0114 (0.57) | 0.0096 (0.48) |
| TURNt-1 | -0.0481*** (-8.84) | -0.0517*** (-9.73) | -0.0493*** (-9.32) |
| RVt-1 | -0.0056 (-0.43) | -0.0094 (-0.72) | -0.0083 (-0.64) |
| Firm FE | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes |
| Industry FE | Yes | Yes | Yes |
| Observations | 111,033 | 111,033 | 111,033 |
| R-squared | 0.1165 | 0.1158 | 0.1161 |
Tables 5 and 6 jointly summarize the evidence on the bidirectional relationship between ETF ownership and stock liquidity. Table 5 focuses on the forward direction, examining the effect of ETF ownership on stock liquidity, while Table 6 analyzes the reverse direction, testing whether liquidity conditions influence ETF ownership. Since both sets of regressions employ standardized variables for the dependent and key independent variables, the estimated coefficients can be directly compared in terms of their sign and relative magnitude.
The forward regression results in Table 5 show that ETF ownership is consistently associated with improvements in stock liquidity across multiple measures, including HLS, Cost, and ALLIQ. Taken together, the results suggest that the effect of ETF ownership on stock liquidity is more pronounced and robust than the reverse effect of liquidity on ETF ownership. This asymmetry supports the interpretation that ETF ownership plays an active role in shaping the liquidity of underlying stocks rather than merely reflecting pre-existing liquidity conditions.
Figure 1, which visualizes the comparison between Tables 5 and 6, illustrates the relationship between ETF ownership and liquidity in both the forward direction (ETF → Liquidity) and the reverse direction (Liquidity → ETF). As shown in the figure, in the forward analysis, ETF ownership consistently exhibits negative coefficients across all key liquidity measures (Cost, ALLIQ, and HLS), suggesting that higher ETF ownership is associated with reduced transaction costs and narrower spreads—indicative of improved liquidity.
<Fig. 1>
Directional Effects: ETF ↔ Liquidity
This figure illustrates the standardized coefficients (with 95% confidence intervals) derived from the regression results in Tables 5 and 6, comparing the effects of ETF ownership on liquidity (ETF → Liquidity) and the reverse relationship (Liquidity → ETF) across the three liquidity measures—Cost, ALLIQ, and HLS.
In contrast, the reverse analysis shows smaller coefficient magnitudes for every measure (for example, -0.0208 versus -0.0444 for Cost) and an insignificant ALLIQ coefficient, implying that liquidity plays a relatively limited role in driving subsequent changes in ETF ownership.
These findings support the notion that the observed relationship is not merely the result of ETFs being concentrated in already liquid stocks. Rather, ETFs appear to function as a causal mechanism that enhances the liquidity of their underlying assets through creation-redemption processes and secondary market trading activities. In other words, ETFs act not only as passive tracking vehicles but also as liquidity providers and amplifiers within the broader market microstructure.

4.3 Endogeneity Test

When analyzing the relationship between ETF ownership and liquidity, there is a potential concern that ETF ownership may be endogenously determined by investor trading demand or firm-specific characteristics. To mitigate this endogeneity issue, this study employs an instrumental variable (IV) approach, using the industry-level average ETF ownership for other firms within the same industry during the same month (Indavg_ETFi,t-1) as the instrument.
This variable captures the overall ETF demand within the industry while remaining exogenous to the liquidity of individual firms, thereby providing valid exogenous variation for identifying the causal effect of ETF ownership on liquidity.
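$$\text{ETF}_{i,t-1} = \alpha + \lambda\,\text{Indavg\_ETF}_{i,t-1} + \gamma'\,\text{Controls}_{i,t-1} + \mu_i + \tau_{t-1} + \delta_k + \varepsilon_{i,t-1}$$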
In the first-stage regression, this instrumental variable is used to estimate the predicted values of individual firms’ ETF ownership. Specifically, the dependent variable is the monthly ETF ownership of each firm, while the independent variables include the instrumental variable (Indavg_ETFi,t-1) along with the same set of control variables used in the baseline OLS analysis—market capitalization, book-to-market ratio, turnover, return volatility, and institutional ownership.
$$\text{Liquidity}_{i,t} = \alpha + \beta_1\,\widehat{\text{ETF}}_{i,t-1} + \gamma'\,\text{Controls}_{i,t-1} + \mu_i + \tau_t + \delta_k + \varepsilon_{i,t}$$
In the second-stage regression, the predicted ETF ownership from the first stage ($\widehat{\text{ETF}}_{i,t-1}$) is used as the main independent variable, while the three liquidity measures (Cost, ALLIQ, and HLS) are used separately as dependent variables. The same control variables and fixed effects (firm, time, and industry) as in the baseline OLS regressions are included to ensure consistency of estimation.
Through this two-stage least squares (2SLS) estimation procedure, the analysis exploits exogenous variation in ETF ownership to distinguish causal effects from simple correlations, following the standard instrumental-variable framework in applied econometrics (Angrist and Pischke, 2009; Wooldridge, 2010). To assess the validity and strength of the instrumental variable used in the 2SLS estimation, we report a set of standard identification and diagnostic tests in Table 7.
<Table 7>
IV Diagnostic Tests
This table reports diagnostic statistics assessing the identification strength of the instrumental variable used in the 2SLS estimation. The Kleibergen-Paap rk LM statistic tests for under-identification, while the Cragg-Donald Wald F statistic and the heteroskedasticity-robust Kleibergen-Paap rk Wald F statistic evaluate instrument strength. Stock-Yogo (2005) weak-instrument critical values are reported for reference.

| Test Category | Statistic | Value (p-value) | Null Hypothesis | Interpretation |
| --- | --- | --- | --- | --- |
| Under-identification Test | Kleibergen-Paap rk LM statistic | 4056.40 (0.000) | Instrument is not relevant (model is under-identified) | Reject H0 → The model is identified and the instrument is relevant |
| Weak Instrument Tests | Cragg-Donald Wald F statistic | 190.98 | Instrument is weak | F ≫ Stock-Yogo threshold → No weak instrument concern |

Note: The Stock-Yogo (2005) weak-instrument critical value corresponding to a 10% maximal IV size is 16.38.

Across all specifications, the Kleibergen-Paap rk LM statistics strongly reject the null hypothesis of under-identification (p < 0.01), indicating that the models are properly identified. The Cragg-Donald Wald F statistics substantially exceed the Stock-Yogo weak-instrument critical values, suggesting that the instrument is sufficiently strong and not subject to weak-instrument concerns. Because the models are exactly identified with a single instrument, over-identification tests such as the Hansen J statistic are not applicable. Taken together, these diagnostic results confirm that the 2SLS specifications satisfy the standard relevance condition required for valid causal inference (Angrist and Pischke, 2009; Wooldridge, 2010). Table 8 reports the corresponding 2SLS estimation results, which quantify the magnitude and direction of the effects of ETF ownership across multiple liquidity measures.
<Table 8>
IV Regression
This table presents the results of the two-stage least squares (2SLS) regression analysis, which addresses potential endogeneity in the relationship between ETF ownership and liquidity by using an instrumental variable (Indavg_ETFt-1). In the first-stage regression, the average ETF ownership of other firms within the same industry is used to predict individual firms' ETF ownership. In the second-stage regression, the predicted ETF ownership ($\widehat{\text{ETF}}_{t-1}$) is used as the key independent variable, while the dependent variables are the three liquidity measures (Cost, ALLIQ, and HLS), respectively. All regression models include firm fixed effects, time fixed effects, and industry fixed effects, and standard errors are double-clustered by firm and time. t-statistics are reported in parentheses, and statistical significance at the 1%, 5%, and 10% levels is denoted by ***, **, and *, respectively.

| Variable | 1-stage: ETFt-1 | 2-stage: Cost | 2-stage: ALLIQ | 2-stage: HLS |
| --- | --- | --- | --- | --- |
| Intercept | -0.0432*** (-0.47) | 0.0973 (1.14) | -0.2603*** (-3.94) | -0.4717*** (-16.23) |
| Indavg_ETFt-1 | 0.5563*** (17.3) | | | |
| $\widehat{\text{ETF}}_{t-1}$ | | -0.0938** (-2.45) | -0.0456 (-1.26) | -0.0309** (-2.12) |
| INSTt-1 | 0.0057*** (2.86) | -0.0025 (-1.57) | 0.0021 (0.83) | 0.0005 (1.14) |
| lnMVEt-1 | 0.6121*** (33.57) | -0.2213*** (-8.0) | -0.3456*** (-11.07) | -0.0037 (-0.37) |
| BTMt-1 | -0.0638*** (-4.96) | 0.0048 (0.38) | 0.074*** (2.84) | -0.0461*** (-10.62) |
| TURNt-1 | -0.0236*** (-4.1) | 0.1216*** (11.93) | -0.1167*** (-14.97) | 0.0712*** (18.35) |
| RVt-1 | 0.0157 (1.4) | 0.1175*** (8.89) | -0.1023*** (-6.74) | 0.0931*** (20.68) |
| Firm FE | Yes | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes | Yes |
| Industry FE | Yes | Yes | Yes | Yes |
| Observations | 110,515 | 110,515 | 110,515 | 110,515 |
| R-squared | 0.5476 | 0.2825 | 0.2386 | 0.4728 |
Table 8 reports the results of the instrumental variable (IV) two-stage least squares (2SLS) estimation used to address potential endogeneity concerns in the relationship between ETF ownership and stock liquidity. Since ETF ownership may be jointly determined with liquidity through investor trading preferences, benchmark inclusion effects, or unobserved firm characteristics, simple OLS estimates may suffer from simultaneity or omitted-variable bias. To mitigate this concern, the industry-level average ETF ownership of other firms within the same industry (Indavg_ETFt-1) is used as an instrumental variable. This instrument captures industry-wide ETF demand while remaining exogenous to the liquidity of individual firms, allowing identification through exogenous variation.
The first-stage regression results confirm strong instrument relevance: Indavg_ETFt-1 exhibits a statistically significant and economically meaningful positive association with firm-level ETF ownership (coefficient = 0.5563, t = 17.30), indicating that the instrument strongly predicts variation in ETF exposure across firms. In the second stage, the predicted ETF ownership $\widehat{\text{ETF}}_{t-1}$ is used as the main explanatory variable, while the three liquidity measures (Cost, ALLIQ, and HLS) serve as dependent variables. Consistent with the baseline findings, the estimated coefficients for $\widehat{\text{ETF}}_{t-1}$ are negative across all specifications, suggesting that higher ETF ownership improves liquidity by reducing transaction frictions.
Specifically, $\widehat{\text{ETF}}_{t-1}$ is negatively associated with Cost (coefficient = -0.0938, t = -2.45) and HLS (coefficient = -0.0309, t = -2.12), and both effects are statistically significant at the 5% level. The coefficient for ALLIQ (-0.0456; t = -1.26) has the same sign but lacks statistical significance. The estimated signs for the control variables are broadly consistent with standard liquidity determinants: larger firms exhibit lower Cost and ALLIQ, higher turnover is associated with lower ALLIQ, and higher volatility is associated with higher Cost and HLS, that is, elevated transaction costs and wider spreads.
Overall, the 2SLS results reinforce the interpretation that the effect of ETF ownership on stock liquidity reflects a causal relationship rather than a mechanical correlation. The negative and significant estimates for liquidity-cost measures (Cost and HLS), coupled with the strong first-stage relationship and instrument diagnostics, indicate that increases in ETF ownership contribute to lower trading costs and improved liquidity conditions in the underlying securities.
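The logic of the two-stage procedure can be illustrated with a minimal simulation. The sketch below is not the estimation used in this study: it assumes one endogenous regressor, one instrument, no control variables, no fixed effects, and no clustered standard errors, and all data-generating parameters (the true effect of -0.1, the first-stage slope of 0.5, the confounder) are hypothetical. It shows only why a confounded OLS slope can differ from the 2SLS slope.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    return covariance(x, y) / covariance(x, x)


def two_stage_slope(z, x, y):
    """2SLS point estimate with a single instrument z for the regressor x."""
    first_stage_slope = ols_slope(z, x)            # stage 1: x on z
    mean_x, mean_z = mean(x), mean(z)
    x_hat = [mean_x + first_stage_slope * (z_i - mean_z) for z_i in z]
    return ols_slope(x_hat, y)                     # stage 2: y on fitted x


def simulate(n=20000, true_beta=-0.1, seed=0):
    """Hypothetical data: a confounder moves both x and y; z moves only x."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        z_i = rng.gauss(0.0, 1.0)
        confounder = rng.gauss(0.0, 1.0)
        x_i = 0.5 * z_i + confounder + rng.gauss(0.0, 1.0)
        y_i = true_beta * x_i + confounder + rng.gauss(0.0, 1.0)
        z.append(z_i)
        x.append(x_i)
        y.append(y_i)
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate()
    print("OLS slope :", round(ols_slope(x, y), 3))
    print("2SLS slope:", round(two_stage_slope(z, x, y), 3))
```

Running the two stages by hand yields the 2SLS point estimate, but the standard errors from a manual second stage are not valid. In applied work a dedicated IV routine is normally used for inference.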

4.4 ETFs Improve Liquidity of Constituent Stocks Through Arbitrage Channel

Because ETF prices in the secondary market are determined by supply and demand, deviations can occur between the ETF's market price and the net asset value (NAV) of its underlying assets. When such deviations exceed a certain threshold, investors engage in arbitrage trading. Specifically, when the ETF's secondary market price falls below the NAV of the underlying assets (discount condition), arbitrageurs buy the ETF in the secondary market, redeem it in the primary market, and sell the underlying assets they receive for a profit. Conversely, when the ETF price exceeds the NAV (premium condition), arbitrageurs purchase the underlying assets, deliver them in the primary market to create new ETF shares, and sell those shares in the secondary market to capture the arbitrage gain.
These immediate arbitrage activities push ETF prices upward (or downward), thereby aligning them with the value of the underlying assets over time. The arbitrage mechanism involves three essential steps: (1) trading of the underlying assets, (2) creation and redemption of ETF shares in the primary market, and (3) ETF trading in the secondary market. Through these processes, creation and redemption activities stimulate trading in the underlying securities, thereby enhancing their liquidity.
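The decision rule described above can be sketched as a small function. This is a stylized illustration only: the function name, the returned labels, and the threshold value are hypothetical, and real arbitrage decisions also depend on transaction costs, creation unit sizes, and timing, which are ignored here.

```python
def arbitrage_action(etf_price: float, nav: float, threshold: float) -> str:
    """Return the primary market action implied by the price-NAV deviation.

    threshold is a hypothetical minimum relative deviation (e.g. 0.005 = 0.5%)
    below which no arbitrage is assumed to take place.
    """
    deviation = (etf_price - nav) / nav
    if deviation > threshold:
        # Premium: buy the underlying basket, create ETF shares, sell the ETF.
        return "create"
    if deviation < -threshold:
        # Discount: buy the ETF, redeem it, sell the underlying basket.
        return "redeem"
    return "none"


if __name__ == "__main__":
    for price in (10100.0, 9900.0, 10010.0):
        print(price, arbitrage_action(price, nav=10000.0, threshold=0.005))
```

Both actions involve trading the underlying basket, which is why creation and redemption are expected to generate trading volume in the constituent stocks.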
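$$\text{Creation}_{i,t} = \alpha + \beta_1\,\text{ETF}_{i,t-1} + \gamma'\,\text{Controls}_{i,t-1} + \mu_i + \tau_t + \delta_k + \varepsilon_{i,t}$$
$$\text{Redemption}_{i,t} = \alpha + \beta_1\,\text{ETF}_{i,t-1} + \gamma'\,\text{Controls}_{i,t-1} + \mu_i + \tau_t + \delta_k + \varepsilon_{i,t}$$
$$\text{Net Creation}_{i,t} = \alpha + \beta_1\,\text{ETF}_{i,t-1} + \gamma'\,\text{Controls}_{i,t-1} + \mu_i + \tau_t + \delta_k + \varepsilon_{i,t}$$
To empirically test this mechanism, this study constructs the three regression models above. Considering data availability, the dependent variables $\text{Creation}_{i,t}$ and $\text{Redemption}_{i,t}$ represent the total number of ETF creations and redemptions for stock i during month t, respectively, while $\text{Net Creation}_{i,t}$ denotes the net number of creations. These three variables are calculated as follows.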
$$\text{Creation}_{i,t} = \sum_{j=1}^{J} \text{ETF Ownership}_{i,j,t} \times \text{creation}_{j,t}$$
$$\text{Redemption}_{i,t} = \sum_{j=1}^{J} \text{ETF Ownership}_{i,j,t} \times \text{redemption}_{j,t}$$
$$\text{Net Creation}_{i,t} = \text{Creation}_{i,t} - \text{Redemption}_{i,t}$$
Here, $\text{Creation}_{i,t}$ represents the total number of ETF creations involving stock i during month t, while $\text{creation}_{j,t}$ denotes the number of creations for ETF j during the same period. Similarly, $\text{Redemption}_{i,t}$ indicates the total number of ETF redemptions involving stock i during month t, and $\text{redemption}_{j,t}$ refers to the number of redemptions for ETF j over the same period.
The results in Table 9 show that stocks with higher ETF ownership in the previous month exhibit significantly greater creations, redemptions, and net creations in the following month (with positive and significant coefficients on ETFt-1 across all three models). This implies that for stocks with larger ETF ownership, authorized participants (APs) more frequently engage in basket purchases and sales, as well as primary market inventory adjustments, when price deviations occur. These findings are consistent with the mechanism of “ETF Ownership → Primary Market Activity → Liquidity Improvement.”
<Table 9>
ETFs Influence Equity Liquidity Through the Arbitrage Channel
This table presents the results of panel regressions of ETF primary market activity on lagged ETF ownership, using Creation, Redemption, and Net Creation as dependent variables. All regression models include firm fixed effects, time fixed effects, and industry fixed effects, and standard errors are double-clustered by firm and time. t-statistics are reported in parentheses, and statistical significance at the 1%, 5%, and 10% levels is denoted by ***, **, and *, respectively.

| Variable | Creation | Redemption | Net Creation |
| --- | --- | --- | --- |
| Intercept | -3.2034*** (-13.57) | 1.3535*** (7.76) | -1.8499*** (-6.38) |
| ETFt-1 | 0.0571*** (5.38) | 0.4178*** (33.09) | 0.4749*** (29.21) |
| INSTt-1 | 0.0008 (0.98) | -0.0007 (-0.93) | 0.0001 (0.11) |
| lnMVEt-1 | 0.2446*** (13.96) | -0.107*** (-8.12) | 0.1377*** (6.32) |
| BTMt-1 | 0.009 (1.3) | -0.0074 (-1.2) | 0.0016 (0.18) |
| TURNt-1 | 0.9183*** (4.8) | 0.8604*** (5.35) | 1.7787*** (7.13) |
| RVt-1 | 0.1049 (0.97) | -0.0006 (-0.01) | 0.1043 (0.77) |
| Firm FE | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes |
| Industry FE | Yes | Yes | Yes |
| Observations | 111,033 | 111,033 | 111,033 |
| R-squared | 0.0638 | 0.1254 | 0.1489 |
While the stock-level analysis in Table 9 provides evidence that ETF ownership is associated with greater primary market activity, the use of monthly aggregated creation and redemption data may limit the ability to precisely identify the timing and immediacy of arbitrage responses. In practice, authorized participants (APs) react to deviations between ETF market prices and net asset values (NAVs) at much higher frequencies, adjusting ETF inventories in response to short-term mispricing. Accordingly, it is important to verify whether primary market activity is directly triggered by ETF mispricing at the daily level, rather than reflecting mechanical scaling effects or slow-moving institutional demand.
To address this concern, we conduct an additional ETF-level analysis using daily data to examine the relationship between ETF mispricing and subsequent creation and redemption activity. Specifically, we analyze daily creation-redemption activity using two distinct measures derived from the raw creation/redemption units, denoted as $CR_{i,t}$. First, to examine the extensive margin, we define $D^{CR}_{i,t}$ as an indicator variable that equals one if any creation or redemption occurs on a given day (i.e., $CR_{i,t} \neq 0$), and zero otherwise. Second, to capture the intensive margin, we compute the natural logarithm of the absolute value of creation or redemption units conditional on non-zero activity (denoted as $Mag^{CR}_{i,t} = \ln\lvert CR_{i,t}\rvert$).
We regress these measures on lagged measures of the ETF premium and discount ($PD_{i,t}$), defined as the percentage deviation of the ETF's closing price from its NAV. To capture potential asymmetries in arbitrage behavior, we decompose mispricing into the absolute value of lagged mispricing interacted with indicators for premium ($Premium_{t-1}$, equal to one if $PD_{i,t-1} > 0$) and discount ($Discount_{t-1}$, equal to one if $PD_{i,t-1} < 0$) conditions. All regressions include ETF fixed effects and date fixed effects, which control for time-invariant ETF characteristics and common daily market conditions, respectively.
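The variable definitions above can be sketched in a few lines. The example is a hedged illustration: the function names are hypothetical, the deviation is expressed as a fraction of NAV (the scaling is an assumption, since the text does not state whether the measure is multiplied by 100), and no regression is estimated.

```python
import math


def premium_discount(close_price: float, nav: float) -> float:
    """Deviation of the ETF closing price from NAV, as a fraction of NAV (assumed scaling)."""
    return (close_price - nav) / nav


def activity_measures(cr_units: float):
    """Return (extensive-margin dummy, intensive-margin log magnitude or None)."""
    occurred = 1 if cr_units != 0 else 0
    magnitude = math.log(abs(cr_units)) if cr_units != 0 else None
    return occurred, magnitude


def mispricing_terms(pd_lag: float):
    """Split lagged mispricing into |PD| x Premium and |PD| x Discount regressors."""
    premium = 1 if pd_lag > 0 else 0
    discount = 1 if pd_lag < 0 else 0
    return abs(pd_lag) * premium, abs(pd_lag) * discount


if __name__ == "__main__":
    pd_lag = premium_discount(close_price=9800.0, nav=10000.0)  # hypothetical prices
    print(pd_lag, mispricing_terms(pd_lag), activity_measures(-50000.0))
```

Days with no primary market activity contribute to the extensive-margin regression only, which is why the intensive-margin sample is smaller than the full daily panel.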
Table 10 reports the results. We find clear evidence of asymmetric arbitrage responses to ETF mispricing. In particular, discount-driven mispricing is associated with significantly larger redemption activity, as indicated by the negative and statistically significant coefficient on the discount interaction term. In contrast, the estimated effects of premium-driven mispricing on creation activity are weaker and, in some specifications, statistically insignificant. This asymmetry is consistent with the ETF arbitrage literature, which suggests that redemption often serves as the dominant adjustment mechanism due to inventory constraints, transaction costs, and short-selling frictions faced by arbitrageurs.
<Table 10>
ETF Mispricing and Creation-Redemption Activity: Evidence from Daily Data
This table reports panel regressions of daily creation-redemption activity ($CR_{i,t}$) on lagged ETF mispricing. Panel A analyzes the extensive margin using a dummy variable ($D^{CR}_{i,t}$) indicating the occurrence of creation or redemption. Panel B examines the intensive margin using the log magnitude of non-zero activity ($Mag^{CR}_{i,t}$). All models include ETF and date fixed effects. Standard errors are clustered by ETF. t-statistics are in parentheses. ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively.

Panel A. Probability of Creation-Redemption Activity (Dependent Variable: $D^{CR}_{i,t}$)

| Variable | Model (1) | Model (2) |
| --- | --- | --- |
| $PD_{t-1}$ | 0.9622*** (2.83) | |
| $\lvert PD_{t-1}\rvert \times Premium_{t-1}$ | | −0.2976 (−1.10) |
| $\lvert PD_{t-1}\rvert \times Discount_{t-1}$ | | −2.2215*** (−3.29) |
| Intercept | 0.1592*** (446.20) | 0.1618*** (145.18) |
| ETF FE | Yes | Yes |
| Day FE | Yes | Yes |
| Observations | 593,039 | 593,039 |
| R-squared (Within) | 0.00005 | 0.00010 |

Panel B. Magnitude of Creation-Redemption Activity (Dependent Variable: $Mag^{CR}_{i,t}$)

| Variable | Model (3) | Model (4) |
| --- | --- | --- |
| $PD_{t-1}$ | 8.8506*** (3.12) | |
| $\lvert PD_{t-1}\rvert \times Premium_{t-1}$ | | 7.1822*** (2.70) |
| $\lvert PD_{t-1}\rvert \times Discount_{t-1}$ | | −10.520** (−2.09) |
| Intercept | 11.751*** (3451.30) | 11.754*** (1410.00) |
| ETF FE | Yes | Yes |
| Day FE | Yes | Yes |
| Observations | 93,797 | 93,797 |
| R-squared (Within) | 0.0010 | 0.0010 |
Our findings align with the arbitrage mechanism described in previous studies. As noted by Ryou and Kim (2024), market participants, such as liquidity providers, actively engage in arbitrage trading to exploit pricing discrepancies, thereby enhancing market efficiency. Our results empirically confirm this dynamic by showing a significant link between lagged mispricing and the magnitude of creation-redemption activity. These ETF-level results complement the stock-level evidence presented in Table 9. While Table 9 shows that stocks with higher ETF ownership are more exposed to creation and redemption activity in the aggregate, Table 10 demonstrates that such primary market activity is systematically linked to ETF mispricing and arbitrage incentives at the daily frequency. Taken together, the two sets of results provide coherent evidence that ETF ownership improves the liquidity of underlying stocks through an arbitrage channel operating via ETF creation and redemption.

4.5 Robustness and Additional Analyses

To further validate the main findings and address potential identification concerns raised in the literature, this section presents two sets of robustness analyses. First, we examine whether the liquidity-enhancing effect of ETF ownership could be confounded by broader institutional investor activity by incorporating an interaction term between ETF ownership and institutional ownership (INST). If ETF ownership merely proxies for the presence of large institutional investors, its effect on liquidity should weaken or disappear once this interaction is introduced. Second, we investigate whether the observed liquidity effects differ across ETF types. Because ETF products vary in their investment objectives, portfolio concentration, and trading patterns, the magnitude and persistence of ETF-induced liquidity improvements may differ between broad market index ETFs and non-index ETFs. These analyses evaluate the stability of the main results and help clarify the economic mechanisms through which ETFs influence underlying equity liquidity.

4.5.1 Institutional Ownership Interaction

A key concern is whether the estimated liquidity effects attributed to ETF ownership reflect a “pure ETF effect” or instead arise from the broader footprint of institutional investors. Since ETF ownership and institutional ownership often overlap within the same stocks, the ETF coefficient may partially capture institutional trading intensity, monitoring, or liquidity provision unrelated to the ETF arbitrage mechanism. To address this concern, we augment the baseline specification by interacting lagged ETF ownership with lagged institutional ownership and re-estimate the regression models. This interaction allows us to examine whether the liquidity effects of ETF ownership depend on the presence of institutional investors.
The purpose of this test is twofold. First, it assesses whether ETF-related liquidity improvements are primarily driven by institutional ownership. If institutional investors are the dominant source of liquidity provision, the interaction term should be positive and statistically significant, while the main effect of ETF ownership should attenuate. Second, the analysis evaluates whether ETF ownership continues to exert an independent effect on liquidity once its joint interaction with institutional ownership is taken into account. Consistency between these results and the baseline findings would indicate that the ETF effect is not mechanically driven by institutional characteristics, but instead reflects ETF-specific liquidity channels such as arbitrage activity, creation and redemption processes, and trading in the underlying basket of securities.
Table 11 reports the regression results incorporating the interaction between lagged ETF ownership (ETFt-1) and lagged institutional ownership (INSTt-1). The results reveal interesting heterogeneity across different dimensions of liquidity.
<Table 11>
Institutional Ownership Interaction
This table presents the regression results analyzing the effect of lagged ETF ownership on stock liquidity, including its interaction with lagged institutional ownership. All regression models include year-month fixed effects, firm fixed effects, and industry fixed effects. Standard errors are clustered at the firm level. t-statistics are reported in parentheses, and statistical significance at the 1%, 5%, and 10% levels is denoted by ***, **, and *, respectively.

| Variable | Cost | ALLIQ | HLS |
| --- | --- | --- | --- |
| Intercept | -0.3827** (-2.03) | -0.0219 (-0.21) | -0.1204** (-2.45) |
| ETFt-1 | -0.0426*** (-2.96) | -0.0055 (-0.72) | -0.0208*** (-4.31) |
| INSTt-1 | -0.0001 (-0.08) | 0.0081*** (3.45) | -0.0005 (-0.88) |
| ETFt-1 × INSTt-1 | -0.0009 (-0.78) | -0.0053*** (-3.71) | -0.0006 (-1.37) |
| Control Variables | Yes | Yes | Yes |
| Firm FE | Yes | Yes | Yes |
| Time FE | Yes | Yes | Yes |
| Industry FE | Yes | Yes | Yes |
| Observations | 111,033 | 111,033 | 111,033 |
| R-squared | 0.0427 | 0.0606 | 0.0129 |
First, for Cost and HLS, the coefficient on lagged ETF ownership remains negative and statistically significant at the 1% level (-0.0426 and -0.0208, respectively), while the interaction terms are not statistically significant. This indicates that for bid-ask spreads and spread-based measures, ETF ownership exerts a direct and independent liquidity-enhancing effect, regardless of the level of institutional ownership. This suggests that the arbitrage mechanism inherent in ETFs mechanically tightens spreads, operating effectively even in the absence of high institutional participation.
In contrast, for ALLIQ, the direct effect of ETF ownership is insignificant, whereas the interaction term (ETFt-1 × INSTt-1) is negative and statistically significant at the 1% level (-0.0053). This finding implies that, for price impact (ALLIQ), the benefit of ETF ownership is conditional on the presence of institutional investors. In other words, while ETF ownership alone may not be sufficient to reduce price impact, it complements institutional participation, deepening the market and reducing the transaction costs associated with large trades.
Overall, these results suggest that ETF ownership improves stock liquidity through two distinct channels: a direct channel driven by arbitrage activity and a complementary channel with institutional investors. Thus, the ETF effect is not merely a proxy for institutional ownership but plays a specific and multifaceted role in market microstructure.

4.5.2 ETF Type Heterogeneity: Market Index vs. Non-Market Index ETFs

While the main analysis treats ETF ownership as an aggregated measure, ETFs are heterogeneous investment vehicles whose liquidity implications may vary by product type. Broad market index ETFs typically engage in passive tracking and large-scale, systematic basket trading, whereas non-market index ETFs—such as sector, thematic, or strategy-based ETFs—often exhibit higher turnover, narrower underlying coverage, and different investor clientele. These structural differences raise the possibility that ETF-induced liquidity improvements may be concentrated in particular ETF segments rather than being uniform across the market.
To examine this heterogeneity, we decompose ETF ownership into two components: ownership derived from market index ETFs and ownership derived from non-market index ETFs. For each subsample, we estimate the baseline liquidity regression while maintaining identical fixed effects and control variables. This approach allows us to assess whether the observed liquidity improvements are primarily driven by large, broad-based index ETFs or whether non-market index ETFs also contribute meaningfully to underlying stock liquidity.
Specifically, an ETF is classified as a market index ETF if it tracks a broad-based equity benchmark that represents the overall investment opportunity set of the underlying market. These benchmarks include major Korean and global market indices such as the KOSPI 200, KOSDAQ 150, KRX 300, KRX 100, KOSPI 50, and MSCI Korea indices. These indices are constructed using mechanical, rules-based methodologies, typically based on market capitalization and liquidity, and do not impose sectoral, thematic, or factor-based tilts. Total return (TR), futures-based, and leveraged variants of these benchmarks are also classified as market index ETFs, as they preserve the same underlying market definition.
In contrast, ETFs are classified as non-market index ETFs if they track sector-specific, thematic, style-based, or strategy-oriented benchmarks, or if they focus on a subset of the market such as specific industries, investment themes, factor exposures, or customized portfolios. Importantly, ETFs whose fund names explicitly indicate active management (e.g., containing the term “Active”) are classified as non-market index ETFs even when their stated benchmark references a broad market index. This conservative classification ensures that the market index ETF group captures only passive, broad-based exposure to aggregate market liquidity, while excluding products with discretionary portfolio construction or narrow investment mandates.
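The classification rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' procedure: the function names, the exact-match rule after stripping variant markers, and the fund names in the usage example are all assumptions introduced here.

```python
import re

# Broad market benchmarks named in the text (lower-cased for matching).
MARKET_BENCHMARKS = {
    "kospi 200", "kosdaq 150", "krx 300", "krx 100", "kospi 50", "msci korea",
}

# Assumption: total-return, futures-based, and leveraged variants are detected
# by these markers; the generic word "index" is also stripped before matching.
_VARIANT = re.compile(
    r"\b(total return|tr|futures|leveraged|leverage|index)\b", re.IGNORECASE
)
_ACTIVE = re.compile(r"\bactive\b", re.IGNORECASE)


def normalize_benchmark(benchmark: str) -> str:
    """Remove variant markers and collapse whitespace."""
    stripped = _VARIANT.sub(" ", benchmark)
    return " ".join(stripped.lower().split())


def classify_etf(fund_name: str, benchmark: str) -> str:
    """Return 'market index' or 'non-market index' for one ETF."""
    # Funds whose name signals active management are always non-market index,
    # even if the stated benchmark is a broad market index.
    if _ACTIVE.search(fund_name):
        return "non-market index"
    # Exact match after normalization, so a sector sub-index such as
    # "KOSPI 200 IT" is not mistaken for the broad KOSPI 200.
    if normalize_benchmark(benchmark) in MARKET_BENCHMARKS:
        return "market index"
    return "non-market index"


if __name__ == "__main__":
    # Hypothetical fund names, for illustration only.
    print(classify_etf("Example 200 TR", "KOSPI 200 TR"))
    print(classify_etf("Example Active", "KOSPI 200"))
    print(classify_etf("Example IT", "KOSPI 200 IT"))
```

Requiring an exact match after normalization is a deliberately conservative choice, in the spirit of the classification described above: anything that cannot be mapped cleanly to a broad benchmark falls into the non-market index group.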
Compared to the baseline results reported earlier, Table 12 shows that the liquidity effects of ETF ownership differ across ETF types. While the baseline analysis documents statistically significant liquidity improvements when ETF ownership is treated in aggregate, the disaggregated results indicate that these effects are mainly driven by market index ETFs.
<Table 12>
ETF Type Heterogeneity
This table presents the regression results analyzing the effect of lagged ETF ownership on stock liquidity. All regression models include year-month fixed effects, firm fixed effects, and industry fixed effects. Standard errors are clustered at the firm level. t-statistics are reported in parentheses, and statistical significance at the 1%, 5%, and 10% levels is denoted by ***, **, and *, respectively.
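Panel A. Effect of ETF Ownership on Stock Liquidity
Dependent Variable: Liquidity Measure

| | Market index: Cost | Market index: ALLIQ | Market index: HLS | Non-market index: Cost | Non-market index: ALLIQ | Non-market index: HLS |
|---|---|---|---|---|---|---|
| ETF_{t-1} | -0.0984*** (-5.54) | -0.065*** (-5.29) | -0.0249*** (-3.85) | 0.0037 (0.31) | -0.0191*** (-2.79) | -0.0055 (-1.27) |
| R-squared | 0.0478 | 0.0542 | 0.0146 | 0.0335 | 0.1110 | 0.0112 |
| Observations | 88,467 | 88,467 | 88,467 | 92,379 | 92,379 | 92,379 |
| Control Variables | Yes | Yes | Yes | Yes | Yes | Yes |
| FE | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry |

Panel B. Effect of Liquidity on ETF Ownership
Dependent Variable: ETF Ownership

| | Market index: Cost | Market index: ALLIQ | Market index: HLS | Non-market index: Cost | Non-market index: ALLIQ | Non-market index: HLS |
|---|---|---|---|---|---|---|
| Liquidity_{t-1} | -0.0312*** (-5.08) | -0.0299*** (-7.33) | -0.0092*** (-4.01) | 0.0014 (0.14) | -0.0132* (-1.88) | -0.0081** (-2.11) |
| R-squared | 0.0523 | 0.0518 | 0.0499 | 0.0824 | 0.0826 | 0.0825 |
| Observations | 88,467 | 88,467 | 88,467 | 92,379 | 92,379 | 92,379 |
| Control Variables | Yes | Yes | Yes | Yes | Yes | Yes |
| FE | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry | Firm, Time, Industry |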
For market index ETFs, the estimated coefficients remain negative and statistically significant across all liquidity measures, consistent with the baseline findings. The size and stability of these coefficients indicate that the liquidity-improving role of ETF ownership is most pronounced when it reflects exposure through broad, market-representative index products. In contrast, for non-market index ETFs, the estimated coefficients are smaller and often statistically insignificant. Nevertheless, the coefficients generally retain negative signs across specifications, suggesting that the direction of the liquidity effect is aligned with the baseline results, albeit with weaker economic and statistical strength.
A similar pattern appears in the reverse regressions. Liquidity measures are negatively associated with ETF ownership for market index ETFs, although these effects are smaller than the forward effects from ETF ownership to liquidity. For non-market index ETFs, the reverse relationships are weaker: the Cost coefficient is insignificant, and the ALLIQ and HLS coefficients are significant only at the 10% and 5% levels, respectively, while retaining negative signs.
Overall, Table 12 indicates that ETF ownership is associated with improved stock liquidity, with the strongest and most consistent effects concentrated in market index ETFs. The reverse effect from liquidity to ETF ownership is comparatively limited, and the main directional relationship between ETF ownership and stock liquidity remains intact.

5. Conclusion

This study empirically examines the impact of ETF ownership on the liquidity of individual stocks in the Korean equity market using monthly data spanning January 2016 to December 2024. As the ETF market has expanded rapidly in both scale and trading activity, understanding how ETFs affect the microstructure of their underlying securities has become increasingly important. Against this backdrop, this study provides a comprehensive analysis of the relationship between ETF ownership and stock liquidity by employing multiple liquidity measures and complementary empirical approaches.
The main findings can be summarized as follows. First, increases in ETF ownership are consistently associated with statistically significant reductions in all three illiquidity proxies (Cost, ALLIQ, and HLS), indicating that higher ETF ownership improves stock liquidity by narrowing bid-ask spreads and lowering transaction costs. Importantly, this result remains robust when potential endogeneity concerns are addressed using a two-stage least squares (2SLS) framework with industry-level average ETF ownership as an instrumental variable. These IV results support a causal interpretation of the relationship rather than a simple correlation driven by omitted firm characteristics or investor demand. Second, panel Granger causality tests reveal a bidirectional relationship between ETF ownership and liquidity. While liquidity conditions also help predict future changes in ETF ownership, the forward channel from ETF ownership to subsequent liquidity improvements is stronger and more consistently observed across measures, reinforcing the interpretation that ETFs play an active role in shaping market liquidity. Third, analysis of primary market activity shows that stocks with higher ETF ownership experience significantly greater creation and redemption activity in the following period. This pattern is consistent with liquidity provision through ETF arbitrage, basket trading by authorized participants, and inventory management by market makers.
Taken together, these findings suggest that ETFs are not merely passive index-tracking instruments, but instead function as active conduits of liquidity transmission to the underlying stock market. By facilitating arbitrage, creation and redemption processes, and coordinated trading in underlying baskets, ETFs contribute to structurally lower trading costs and more stable execution conditions at the stock level. These results have important implications for market participants and regulators, as the growing footprint of ETFs may reshape liquidity provision mechanisms in modern equity markets.
Future research could extend this analysis by exploring finer heterogeneity across ETF types beyond the market index versus non-market index split examined here, as well as across market regimes and industry characteristics. In addition, examining the effects of ETF ownership on other dimensions of market quality, such as informational efficiency, return comovement, and price impact transmission, would further enhance our understanding of the broader role of ETFs in financial markets.

References

1. Amihud, Y., 2002, Illiquidity and stock returns: Cross-section and time-series effects, Journal of Financial Markets, Vol. 5 (1), pp. 31-56.
2. Angrist, J. D., and J. S. Pischke, 2009, Mostly Harmless Econometrics: An Empiricist's Companion, Princeton University Press.
3. Ben-David, I., F. Franzoni, and R. Moussawi, 2018, Do ETFs increase volatility?, The Journal of Finance, Vol. 73 (6), pp. 2471-2535.
4. Chen, J., M. Liu, and S. Yang, 2024, ETF arbitrage and stock pricing efficiency, Finance Research Letters, pp. 105234.
5. Chen, Z., Y. Li, and H. Zhang, 2020, ETF flows and market efficiency in China, Accounting & Finance, Vol. 60 (5), pp. 4505-4537.
6. Choi, J., and D. Choi, 2021, The impact of ETF flows on market liquidity: Evidence from Korea, Korean Journal of Financial Studies, Vol. 50 (2), pp. 183-217.
7. Corwin, S. A., and P. Schultz, 2012, A simple way to estimate bid-ask spreads from daily high and low prices, The Journal of Finance, Vol. 67 (2), pp. 719-760.
8. Cragg, J. G., and S. G. Donald, 1993, Testing identifiability and specification in instrumental variable models, Econometric Theory, Vol. 9 (2), pp. 222-240.
9. Dumitrescu, E. I., and C. Hurlin, 2012, Testing for Granger non-causality in heterogeneous panels, Economic Modelling, Vol. 29 (4), pp. 1450-1460.
10. Fama, E. F., and K. R. French, 1992, The cross-section of expected stock returns, The Journal of Finance, Vol. 47 (2), pp. 427-465.
11. Glosten, L., S. Nallareddy, and Y. Zou, 2021, ETF activity and informational efficiency of underlying securities, Management Science, Vol. 67 (1), pp. 22-47.
12. Huang, D., M. O'Hara, and Z. Zhong, 2022, Liquidity, ETFs, and market quality, Journal of Financial Economics, Vol. 143 (3), pp. 1113-1137.
13. Im, K. S., M. H. Pesaran, and Y. Shin, 2003, Testing for unit roots in heterogeneous panels, Journal of Econometrics, Vol. 115 (1), pp. 53-74.
14. Israeli, D., C. M. C. Lee, and S. A. Sridharan, 2017, Is there a dark side to exchange traded funds? An information perspective, Review of Accounting Studies, Vol. 22 (3), pp. 1048-1083.
15. Kleibergen, F., and R. Paap, 2006, Generalized reduced rank tests using the singular value decomposition, Journal of Econometrics, Vol. 133 (1), pp. 97-126.
16. Koch, A., and S. Ruenzi, 2022, ETFs and information efficiency, Journal of Financial Economics, Vol. 145 (2), pp. 398-420.
17. Lakonishok, J., A. Shleifer, and R. W. Vishny, 1994, Contrarian investment, extrapolation, and risk, The Journal of Finance, Vol. 49 (5), pp. 1541-1578.
18. Lesmond, D. A., J. P. Ogden, and C. A. Trzcinka, 1999, A new estimate of transaction costs, The Review of Financial Studies, Vol. 12 (5), pp. 1113-1141.
19. Maddala, G. S., and S. Wu, 1999, A comparative study of unit root tests with panel data and a new simple test, Oxford Bulletin of Economics and Statistics, Vol. 61 (S1), pp. 631-652.
20. Madhavan, A., and A. Sobczyk, 2016, Price efficiency and the cross section of ETF returns, Journal of Investment Management, Vol. 14 (2), pp. 1-15.
21. Pan, K., and Y. Zeng, 2019, ETF arbitrage and return predictability, Journal of Financial Economics, Vol. 134 (3), pp. 665-695.
22. Petajisto, A., 2017, Inefficiencies in the pricing of exchange-traded funds, Financial Analysts Journal, Vol. 73 (1), pp. 24-54.
23. Ryou, H.-Y., and E. C. Kim, 2024, Investment Strategies Utilizing ETF Price Deviations: Focusing on Korea Equity ETFs, Korean Journal of Financial Engineering, Vol. 23 (3), pp. 51-68.
24. Stock, J. H., and M. Yogo, 2005, Testing for weak instruments in linear IV regression, In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, Cambridge University Press.
25. Wang, Y., and L. Ma, 2025, ETF ownership and liquidity in China, Asia-Pacific Journal of Accounting & Economics, Forthcoming.
26. Wooldridge, J. M., 2010, Econometric Analysis of Cross Section and Panel Data, 2nd ed., MIT Press.

Appendices

Appendix A. Measures of Liquidity
(1) Amihud Illiquidity (ALLIQ)
Following Amihud (2002), this measure captures the price impact per unit of trading volume, indicating how much prices move in response to trading activity. We construct this measure at the monthly frequency using daily data within each month. It is defined as:
$$\mathrm{ALLIQ}_{i,t} = \frac{1}{D_{i,t}} \sum_{d=1}^{D_{i,t}} \frac{|R_{i,d,t}|}{\mathrm{Vol}_{i,d,t}}$$
where $D_{i,t}$ is the number of trading days for stock $i$ during month $t$, $R_{i,d,t}$ is the daily return of stock $i$ on day $d$ in month $t$, and $\mathrm{Vol}_{i,d,t}$ represents the daily trading value (price × volume). This measure is calculated as the monthly average of the absolute daily return divided by the daily trading value. A higher value indicates lower liquidity.
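As a minimal sketch of the monthly Amihud calculation described above, the function below averages the ratio of absolute daily return to daily trading value over one stock-month. The function name is hypothetical, and skipping zero-value days and applying no rescaling are assumptions made here rather than choices stated in the text.

```python
def amihud_illiquidity(daily_returns, daily_trading_values):
    """Monthly Amihud measure: mean of |return| / trading value over the
    days of one stock-month.

    Assumption: days with zero trading value are skipped, since the ratio
    is undefined there.
    """
    ratios = [
        abs(r) / v
        for r, v in zip(daily_returns, daily_trading_values)
        if v > 0
    ]
    if not ratios:
        raise ValueError("no trading days with positive trading value")
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Two illustrative trading days (returns in decimals, value in currency units).
    print(amihud_illiquidity([0.01, -0.02], [1_000_000.0, 2_000_000.0]))
```

Because raw values of this ratio are very small, empirical work often rescales it by a constant; any such scaling would leave the ranking of stocks unchanged.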
(2) Lesmond Transaction Cost (Cost)
Following Lesmond et al. (1999), we estimate the effective transaction cost using a limited dependent variable (LDV) model. This measure relies on the premise that marginal investors will only trade if the value of new information exceeds the transaction costs. The “true” but unobserved return is assumed to follow the market model:
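$$R^{*}_{i,t} = \beta_i R_{m,t} + \varepsilon_{i,t}$$
where $R_{m,t}$ is the market return and $\varepsilon_{i,t}$ is the error term capturing firm-specific shocks, which is assumed to be normally distributed:
$$\varepsilon_{i,t} \sim N(0, \sigma_i^2)$$
However, the observed daily return ($R_{i,t}$) is zero if the true return lies within the transaction cost bounds. Let $\alpha_{1,i}$ ($\alpha_{1,i} < 0$) denote the selling cost threshold and $\alpha_{2,i}$ ($\alpha_{2,i} > 0$) denote the buying cost threshold. The observed returns are defined as follows:
$$R_{i,t} = \begin{cases} R^{*}_{i,t} - \alpha_{1,i} & \text{if } R^{*}_{i,t} < \alpha_{1,i} \quad \text{(Sell)} \\ 0 & \text{if } \alpha_{1,i} \le R^{*}_{i,t} \le \alpha_{2,i} \quad \text{(No Trade)} \\ R^{*}_{i,t} - \alpha_{2,i} & \text{if } R^{*}_{i,t} > \alpha_{2,i} \quad \text{(Buy)} \end{cases}$$
The transaction cost proxy (Cost) for firm $i$ is calculated as the round-trip trading cost:
$$\mathrm{Cost}_i = \alpha_{2,i} - \alpha_{1,i}$$
We estimate the parameters $\alpha_{1,i}$, $\alpha_{2,i}$, $\beta_i$, and $\sigma_i$ by maximizing the log-likelihood function for each stock-month.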
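The observation rule and the likelihood it implies can be sketched as follows. This is an illustration under stated assumptions, not the authors' estimation code: the function names are hypothetical, and each day is assigned to the sell, no-trade, or buy region by the sign of the observed return, as implied by the rule written above. No optimizer is included; in practice the log-likelihood would be maximized numerically for each stock-month.

```python
import math


def observed_return(true_return, alpha1, alpha2):
    """Map the unobserved 'true' return to the observed return.

    alpha1 < 0 is the selling threshold, alpha2 > 0 the buying threshold.
    """
    if true_return < alpha1:
        return true_return - alpha1  # sell region
    if true_return > alpha2:
        return true_return - alpha2  # buy region
    return 0.0                       # no-trade region


def round_trip_cost(alpha1, alpha2):
    """Cost proxy: distance between the buy and sell thresholds."""
    return alpha2 - alpha1


def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _norm_logpdf(z):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z


def ldv_log_likelihood(returns, market_returns, alpha1, alpha2, beta, sigma):
    """Log-likelihood of observed daily returns under the LDV model.

    Assumption: regions are assigned by the sign of the observed return.
    """
    total = 0.0
    for r, rm in zip(returns, market_returns):
        if r < 0.0:
            # Sell: true return is r + alpha1, so the shock is r + alpha1 - beta*rm.
            z = (r + alpha1 - beta * rm) / sigma
            total += _norm_logpdf(z) - math.log(sigma)
        elif r > 0.0:
            # Buy: true return is r + alpha2.
            z = (r + alpha2 - beta * rm) / sigma
            total += _norm_logpdf(z) - math.log(sigma)
        else:
            # No trade: probability that the true return lies in [alpha1, alpha2].
            upper = _norm_cdf((alpha2 - beta * rm) / sigma)
            lower = _norm_cdf((alpha1 - beta * rm) / sigma)
            total += math.log(max(upper - lower, 1e-300))
    return total
```

Zero-return days contribute a probability mass rather than a density, which is why stocks with many zero-return days end up with wider estimated thresholds and a higher Cost.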
(3) High-Low Spread (HLS)
Following Corwin and Schultz (2012), we estimate the effective bid-ask spread based on the principle that the daily high prices are buyer-initiated and low prices are seller-initiated. This measure, referred to as the High-Low Spread, captures the spread component by isolating it from the stock price volatility. It relies on the fact that the variance of transaction prices increases proportionally with the time horizon, whereas the spread remains constant. The estimator is defined as:
$$\mathrm{HLS}_{i,t} = \frac{2\left(e^{\alpha_{i,t}} - 1\right)}{1 + e^{\alpha_{i,t}}}$$
where $\alpha_{i,t}$ is derived from the high-low price ranges:
$$\alpha_{i,t} = \frac{\sqrt{2\beta_{i,t}} - \sqrt{\beta_{i,t}}}{3 - 2\sqrt{2}} - \sqrt{\frac{\gamma_{i,t}}{3 - 2\sqrt{2}}}$$
Here, $\beta_{i,t}$ represents the sum of squared daily logarithmic high-low ranges over two consecutive days, and $\gamma_{i,t}$ is the squared logarithmic high-low range over the same two-day interval:
$$\beta_{i,t} = \sum_{j=0}^{1} \left[\ln\left(\frac{H_{i,t+j}}{L_{i,t+j}}\right)\right]^2, \qquad \gamma_{i,t} = \left[\ln\left(\frac{\max(H_{i,t}, H_{i,t+1})}{\min(L_{i,t}, L_{i,t+1})}\right)\right]^2$$
where $H_{i,t}$ and $L_{i,t}$ denote the high and low prices of stock $i$ on day $t$. Following Corwin and Schultz (2012), negative estimates are set to zero. We calculate the monthly measure for each firm by averaging the daily spread estimates within the month. A higher value indicates a wider spread and lower liquidity.
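The estimator above can be sketched in Python as follows. This is a minimal illustration with hypothetical function names; it applies the formulas as written to each pair of consecutive days and omits any further data adjustments (for example, for overnight price moves), which the text does not describe.

```python
import math

# Constant 3 - 2*sqrt(2) appearing in the expression for alpha.
_K = 3.0 - 2.0 * math.sqrt(2.0)


def two_day_spread(high0, low0, high1, low1):
    """High-low spread estimate from two consecutive days of prices."""
    # beta: sum of squared daily log high-low ranges over the two days.
    beta = math.log(high0 / low0) ** 2 + math.log(high1 / low1) ** 2
    # gamma: squared log range of the combined two-day window.
    gamma = math.log(max(high0, high1) / min(low0, low1)) ** 2
    alpha = (math.sqrt(2.0 * beta) - math.sqrt(beta)) / _K - math.sqrt(gamma / _K)
    spread = 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))
    # Negative estimates are set to zero, as stated in the text.
    return max(spread, 0.0)


def monthly_hls(highs, lows):
    """Average the two-day estimates over all consecutive day pairs in a month."""
    if len(highs) != len(lows) or len(highs) < 2:
        raise ValueError("need at least two days of matching high/low prices")
    estimates = [
        two_day_spread(highs[d], lows[d], highs[d + 1], lows[d + 1])
        for d in range(len(highs) - 1)
    ]
    return sum(estimates) / len(estimates)
```

As a sanity check on the logic, if the high and low are identical on both days (for example 101 and 100), the two-day range equals the daily range, $\alpha$ reduces to $\ln(H/L)$, and the estimate is $2(H - L)/(H + L)$: the whole range is attributed to the spread. When prices trend strongly across the two days, the two-day range dominates and the estimate is floored at zero.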



Copyright © 2026 by Korean Securities Association.
