Optimal capital income taxation is a subarea of optimal tax theory which studies the design of taxes on capital income such that a given economic criterion like utility is optimized.
Some have theorized that the optimal capital income tax is zero. Starting from the conceptualization of capital income as future consumption, the taxation of capital income corresponds to a differentiated consumption tax on present and future consumption. Consequently, a capital income tax results in the distortion of individuals' saving and consumption behavior as individuals substitute the more heavily taxed future consumption with current consumption. Due to these distortions, zero taxation of capital income might be optimal, a result postulated by the Atkinson–Stiglitz theorem (1976) and the Chamley–Judd zero capital income tax result (1985/1986).
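The equivalence between a capital income tax and a heavier tax on future consumption can be illustrated with a two-period sketch. The notation is assumed here for illustration and is not taken from the cited papers: $w$ is first-period income, $s$ is saving, $r > 0$ is the pre-tax return, $\tau$ is the capital income tax rate, and $t$ is the implied ad valorem tax on second-period consumption.

```latex
\begin{align*}
c_1 + s &= w, \qquad c_2 = \bigl(1 + (1-\tau) r\bigr) s \\
\Rightarrow\quad c_1 + \frac{c_2}{1 + (1-\tau) r} &= w \\
\frac{1+t}{1+r} = \frac{1}{1 + (1-\tau) r}
\quad&\Longrightarrow\quad
t = \frac{\tau r}{1 + (1-\tau) r}
\end{align*}
```

With $\tau = 0$ the implied tax $t$ is zero. Any $\tau > 0$ raises the price of future consumption relative to present consumption, which is the distortion described above.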
Subsequent work on optimal capital income taxation has elucidated the assumptions underlying the theoretical optimality of a zero capital income tax. Additionally, diverse arguments for a positive optimal capital income tax have been advanced.
The assertion that a zero capital income tax may be optimal rests on two distinct economic intuitions: (1) the Atkinson–Stiglitz theorem and (2) the result derived by Chamley (1986) and Judd (1985) from a dynamic Ramsey model. While Mankiw, Weinzierl and Yagan (2009) invoke the Diamond–Mirrlees production efficiency theorem (DMPET) as a third intuition for not taxing capital income, their arguments are disputed by Diamond and Saez (2011).
The Atkinson–Stiglitz theorem states that if non-linear taxes on earnings are available as a policy tool, differential taxation of first- and second-period consumption is not optimal, provided that all consumers have preferences that are weakly separable between consumption and labor. Furthermore, consumers need to share the same (homogeneous) subutility function of consumption. When applied to capital income taxation, the theorem implies that, since present and future consumption are equally complementary to leisure under weakly separable preferences (so that there is no Corlett–Hague motive for capital income taxation), capital income taxes do not alleviate the distortions caused by labor income taxation while additionally distorting saving. Thus, capital income taxation, i.e. differentiated consumption taxation, is more costly (and therefore less desirable) than pure non-linear labor income taxation.
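The separability condition can be written compactly. The following is a sketch in notation assumed here (individual $i$, labor supply $\ell$, a subutility function $v$ common to all individuals); it is not a full statement of the theorem's conditions.

```latex
\[
U^i(c_1, c_2, \ell) = u^i\bigl(v(c_1, c_2), \ell\bigr)
\quad\Longrightarrow\quad
\frac{\partial U^i / \partial c_1}{\partial U^i / \partial c_2}
= \frac{v_1(c_1, c_2)}{v_2(c_1, c_2)}
\]
```

Under this assumption, the marginal rate of substitution between present and future consumption depends neither on labor supply nor on the individual's type. Taxing the two periods' consumption differently therefore adds a distortion without helping the non-linear earnings tax separate types.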
The Chamley–Judd zero capital income tax result, developed in Chamley (1986) and Judd (1985), states that in a dynamic Ramsey model featuring agents with infinite lives, an asymptotically zero tax on capital income is optimal. The result rests on the intuition that the tax wedge between current and future consumption compounds as the time horizon grows. To avoid unbounded growth of this compounded wedge as the horizon extends, the optimal average capital tax rate approaches zero. The result can also be interpreted in Corlett–Hague terms: as the horizon grows to infinity, present and future consumption become equally complementary to leisure as their elasticities become constant; since, according to the Corlett–Hague rule, the taxation of commodities should depend on their complementarity to leisure, present and future consumption should be taxed at equal rates. Although Chamley (1986) and Judd (1985) rely on the steady-state properties of constant consumption and labor (and hence a constant elasticity of consumption) to argue that current and future consumption are equally complementary to leisure, Judd (1999) shows that a steady state is a sufficient but not a necessary condition for the zero capital income tax result.
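The compounding intuition can be illustrated numerically. The sketch below is a partial-equilibrium illustration, not the Ramsey model itself: it assumes a constant pre-tax return `r` and a constant capital income tax rate `tau`, and the function name and parameter values are hypothetical. It computes the implicit ad valorem tax on consumption a given number of periods ahead, relative to consumption today.

```python
def implicit_future_consumption_tax(r: float, tau: float, horizon: int) -> float:
    """Implicit ad valorem tax on consumption `horizon` periods ahead.

    One unit saved today grows to (1 + r)**horizon without tax and to
    (1 + r * (1 - tau))**horizon under a capital income tax at rate tau,
    so the price of future consumption rises by the ratio of the two.
    Assumes constant r and tau (an illustrative simplification).
    """
    untaxed = (1.0 + r) ** horizon
    taxed = (1.0 + r * (1.0 - tau)) ** horizon
    return untaxed / taxed - 1.0


if __name__ == "__main__":
    # Illustrative numbers only: 5% pre-tax return, 30% capital income tax.
    for horizon in (1, 10, 30, 60):
        wedge = implicit_future_consumption_tax(r=0.05, tau=0.30, horizon=horizon)
        print(f"horizon {horizon:>2}: implicit tax on future consumption = {wedge:.1%}")
```

For any positive `r` and `tau`, the implicit tax rises with the horizon and grows without bound, even though the statutory rate is constant. This is the sense in which a constant positive capital income tax implies an ever-increasing wedge on distant consumption.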
The Chamley–Judd model can also be invoked to argue that taxing existing wealth is superior to taxing future capital income, because a tax on current wealth is lump-sum whereas a tax on future capital income distorts intertemporal decisions. This argument appears in analyses of the composition of taxation in overlapping generations models, e.g. Auerbach, Kotlikoff and Skinner (1983).
While criticisms of the Chamley–Judd model vary, a central theme attacks its critical assumption regarding infinite lives, which can also be interpreted as dynastic linkages. This assumption has been notably challenged both by general criticism leveled by behavioral economics against the standard model of intertemporal decision-making used in the Chamley–Judd model and by empirical analyses of bequests, which do not support the rigorous dynasty model required by the Chamley–Judd model.
A 2020 study in the American Economic Review found that the Chamley–Judd model's conclusion that capital should not be taxed in the long run "does not follow from the very models used to derive it."
A number of arguments relating to concerns for efficiency and equity may be found in the literature supporting the taxation of capital income, including (1) Corlett–Hague motives, (2) increases in consumption inequality over the life cycle, (3) heterogeneous preferences, (4) correlation between returns on savings and ability, (5) incomplete or imperfect insurance markets, (6) borrowing or liquidity constraints, (7) human capital distortions, (8) economic rents, and (9) avoidance of arbitrage between capital income and labor income taxation.
For a government, distinguishing between capital and labor income can be difficult. This shortcoming becomes critical when individuals relabel labor income as capital income to take advantage of tax differentials, as evidenced in Finland by Pirttilä and Selin (2011) and in the United States by Gordon and MacKie-Mason (1995) and, more recently, Gordon and Slemrod (2000). The difficulty of distinguishing labor from capital income might be the most important reason for governments' reluctance to fully exempt capital income from tax. Specifically, Christiansen and Tuomala (2008) find a positive optimal tax on capital income when individuals are able to shift income, while Reis (2007) demonstrates that the Chamley–Judd result does not hold when the tax authority cannot effectively distinguish between entrepreneurial labor income and capital income.
Both the Atkinson–Stiglitz theorem and the Ramsey model used to derive the Chamley–Judd zero capital income tax result assume perfect capital markets. In practice, however, individuals are often borrowing-constrained, i.e. they cannot borrow as much as they would like against future income. By taxing capital income and transferring the revenue to the borrowing-constrained individuals, the capital market imperfection (the liquidity constraint) is alleviated at the cost of distorting saving. Equivalently, taxing saving may reduce the implicit subsidy on saving created by the borrowing constraints and thus restore efficiency in saving. Furthermore, Aiyagari (1995) and Chamley (2001) show that capital income taxation is desirable when consumption is positively correlated with savings in a model featuring borrowing-constrained agents with infinite lives and uncertainty.
According to the second theorem of welfare economics, any Pareto-efficient allocation can be achieved through the appropriate redistribution of endowments, which in the context of optimal taxation refers to the taxation of individuals' earnings ability. If, unlike the assumption in the model, the returns on saving are not equal for everyone but are instead positively correlated with ability, capital income contains new information about an individual's ability and should be taxed for redistributive reasons.
As demonstrated by Judd (1999), a zero capital income tax is no longer neutral regarding human capital investments if these partially consist of costs which cannot be deducted against the tax rate of future returns on saving. In that case, reducing the labor income tax distortions on investments in human capital provides a motive for a positive capital income tax. By increasing the relative price of future consumption and causing substitution away from financial saving toward human capital investment, capital taxes act as an implicit subsidy for human capital investments at the cost of creating a distortion in financial saving.
The zero optimal capital tax relies on the assumption of preference homogeneity. Both Mirrlees (1976) and Saez (2002) argue that high-ability individuals might have higher saving rates due to different preferences. If this is the case, then capital income taxation is optimal for income redistribution, as the individual savings level reveals information about individuals' ability, thereby facilitating the redistribution of income from high-ability to low-ability individuals. This argument is empirically borne out by research on the correlation between individuals' willingness to save and earnings ability.
As argued by Abel, if investment is fully deductible, the capital tax has no adverse impact on investment and is non-distorting; under restrictive assumptions, all taxes should then fall on capital and none on labor. Given that capital income is concentrated among high-income earners, if the social welfare function is inequality-averse, the optimal capital tax may be arbitrarily close to 100%, as increases in the capital tax rate lower inequality but impose no deadweight loss. This contrasts with the standard assumption in optimal labor tax research, in which inequality can be reduced by increased progression of the tax system, but only at the cost of a deadweight loss from a distorting reduction in labor supply. Thus, for any given level of post-tax-and-transfer income inequality, reducing the progression of the labor tax system and increasing the capital tax rate in the context of instant depreciation can lead to welfare gains. However, if there are relative income effects or if the degree of inequality aversion is sufficiently high, the optimal marginal labor tax will still be positive.
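The neutrality of a capital tax under full deductibility can be seen in a one-project sketch. The notation is assumed here for illustration and is not taken from Abel's paper: a firm invests $I$ today, the project pays $(1+r)I$ next period, and the same tax rate $\tau < 1$ applies to the deduction and to the later receipts, with full loss offset.

```latex
\begin{align*}
\text{net outlay today} &= (1-\tau)\, I \\
\text{net receipts next period} &= (1-\tau)(1+r)\, I \\
\text{after-tax gross return} &= \frac{(1-\tau)(1+r)\, I}{(1-\tau)\, I} = 1 + r
\end{align*}
```

Under these assumptions the after-tax return on a marginal project equals the pre-tax return for any $\tau$, so the tax does not change which projects are worth undertaking. The government in effect shares in both the cost and the payoff at rate $\tau$.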
Apps and Rees (2012) argue against the direction of tax reform recommended by the Mirrlees Review, saying that the appropriate direction for tax reform is towards more progressive taxation of both labor earnings and capital income, although not necessarily under the same rate scale.