Services Trade Policies and Economic Integration: New Evidence for Developing Countries

This paper provides the first quantitative evidence on the restrictiveness of services policies in 2016 for a sample of developing countries, based on recently released regulatory data collected by the World Bank and WTO. We use machine learning to recreate to a high degree of accuracy the OECD’s Services Trade Restrictiveness Index (STRI), which takes account of nonlinearities and dependencies across measures. We use the resulting estimates to extend the OECD STRI approach to 23 additional countries, producing what we term a Services Policy Index (SPI). Converting the SPI to ad valorem equivalent terms shows that services policies are typically much more restrictive than tariffs on imports of goods, in particular in professional services and telecommunications. Developing countries tend to have higher services trade restrictions, but less so than has been found in research using data for the late 2000s. We show that the SPI has strong explanatory power for bilateral trade in services at the sectoral level, as well as for aggregate goods and services trade.


Introduction
Services play an important role in economic development. Because services account for a significant share of total output in even very poor countries, the operation of services sectors matters for overall economic performance. The importance of services for development is augmented as a result of their role as inputs into production for a broad cross-section of industries, including agriculture as well as manufacturing. The cost, quality and variety of services available in an economy help determine the productivity of 'downstream' sectors. Services also matter for the achievement of the sustainable development goals (SDGs): improving access to health, education, and finance or enhancing connectivity through investment in information and communications technologies and transport and logistics networks all involve services activities. 1 Restrictive trade and investment policies may impact negatively on firms using services as inputs, reduce the competitiveness of services exporters and increase prices and/or lower the quality of services available to households. 2 Trade in services is like trade in goods in allowing specialization according to comparative advantage, inducing competitive pressures and knowledge spillovers, but differs in that it often requires the cross-border movement of providers, whether legal entities (firms) or natural persons (services suppliers). A consequence is that trade in services involves a much broader range of policy instruments than trade in goods (Francois and Hoekman, 2010).
Well-known data weaknesses hamper analysis of how policies towards imports and exports of services, foreign direct investment and, more generally, regulation affect the operation of services sectors. Although data on services activities in developing economies have been improving, in part as the result of periodic firm-level surveys that have resulted in large panel datasets (e.g., the World Bank enterprise surveys), comparable information on external service-sector policies of developing countries is very limited. Information on policies often is patchy at best. Time series data on relevant policy variables generally are not available on a cross-country, comparable basis. This situation began to change in the late 2000s with a World Bank project to collect information on services trade and investment policies and to create services trade restrictiveness indicators (STRIs) that constitute a numerical summary of applied services policies believed to affect trade flows (Borchert et al., 2014). These STRIs in turn have been used to estimate sectoral ad valorem tariff equivalents for 103 countries (Jafari and Tarr, 2017). The OECD has gone further than the World Bank by compiling STRIs for its member countries as well as major emerging economies that span a broader range of policies and services sectors, including both discriminatory and regulatory measures. The OECD STRI is available on an annual basis starting in 2014, and covers 45 countries.
A problem for applied policy research on developing country services trade policies is that the OECD STRI database covers only a small number of emerging countries, while the World Bank STRI data, which cover 103 countries, are limited to one year, 2008. 3 As a result, extant empirical research on developing country services trade policies has been constrained to cross-section analysis, using increasingly outdated information. The World Bank has been collaborating with the WTO secretariat to update the information on developing countries. A first result of this joint venture was the recent publication on the jointly managed Integrated Trade Intelligence Portal (I-TIP) website of a database of applied services trade policies for the year 2016. These data span many emerging and developing economies as well as OECD member countries. To date, the World Bank and WTO have not released 2016 STRIs calculated using the policy data made available through I-TIP. In this paper, we utilize the World Bank-WTO information on 2016 services policies to generate new indicators of services policy restrictiveness in eight services sectors for 23 countries not included in the OECD STRI. 4 The new data provide an opportunity to analyze services trade policies using information that post-dates the 2008 global financial crisis. In addition to describing the pattern of services trade restrictiveness across regions and income groups, we use the 2016 indicators to analyze their role as determinants of trade and real incomes and the potential effects of several liberalization scenarios, both unilateral (on a nondiscriminatory basis) and through preferential trade agreements.
1 See https://sustainabledevelopment.un.org/topics/sustainabledevelopmentgoals for more detail on SDG targets.
2 See, e.g., Borchert et al. (2011; 2016), Balchin et al. (2016), Fiorini and Hoekman (2018), and Helble and Shepherd (2019).
3 The World Bank data are at https://datacatalog.worldbank.org/dataset/services-trade-restrictions-database but were not accessible at the time of writing this paper (last accessed December 8, 2019).
A challenge in generating indicators of services trade policy from information on applied measures is the need to appropriately weigh and aggregate policies on a sector-by-sector basis. A contribution of this paper is to apply a machine-learning algorithm to the policy data to construct indicators that are broadly consistent with the STRI methodology used by the OECD in that they correlate well with the OECD STRIs. Because the full detail of the methodology used to produce the OECD indices is proprietary and not published, it is not possible to simply apply the OECD methodology to generate STRIs that are strictly comparable to those reported in the OECD database.
The plan of the paper is as follows. In Section 1, we discuss briefly the new data on 2016 services policies published by the WTO. Sections 2 and 3 describe the methodology used to generate services policy indicators (SPIs) from this information and present the resulting policy indicators and associated ad valorem equivalents. Section 4 validates the SPIs by assessing their ability to act as statistically significant predictors of trade flows using a standard structural gravity model of total trade and specific services sectors. Section 5 conducts counterfactual services policy reform experiments using the gravity model. Section 6 concludes.

New Data on Services Policies
In November 2019, the World Bank and WTO released an update to their jointly managed I-TIP platform containing extensive data on services policies in a large number of countries. In its raw state, the dataset includes 121 countries, 25 sectors and three modes of supply: cross-border trade in services (Mode 1 in WTO speak), Mode 3 (establishment of a commercial presence in a foreign country, essentially foreign direct investment in a services sector), and Mode 4 (temporary cross-border movement of services suppliers). The data exclude Mode 2, where trade occurs through movement of consumers to a foreign country (e.g., tourism), as this is generally unrestricted.
The dataset pertains to policies observed in 2016 that potentially affect services trade. It has nearly a quarter of a million observations (244,949), distinguishing up to 445 different measures, both sector specific and horizontal. If attention is restricted to countries and sectors for which information is reported fully at the level of these individual measures, the country coverage falls to 68 countries and 24 sectors. 5 I-TIP data are freely downloadable from the WTO website. Although the WTO provides no comprehensive guide to data collection or treatment methodology, Borchert et al. (2018) discuss the measures captured by the coding exercise. The source for 45 of the 68 countries is the OECD STRI database, so that I-TIP adds information on 23 countries not covered by the OECD (Appendix Table 1 lists the countries). As with the 2008 iteration of the World Bank STRI, questionnaires administered to law firms in the countries of interest generated the raw data, which were treated by the World Bank and WTO team to ensure consistency and correctness. Table 1, taken directly from Borchert et al. (2018), lists the general categories of measures included in the database.
4 The OECD produces STRIs for OECD member countries and nine (mostly large) emerging economies: Brazil, China, Colombia, Costa Rica, India, Indonesia, Malaysia, the Russian Federation, and South Africa. See https://qdd.oecd.org/subject.aspx?Subject=063bee63-475f-427c-8b50-c19bffa7392d. The additional countries that are the focus of this paper include one (Rwanda) for which data were produced by Shepherd et al. (2019b) with assistance from the OECD Secretariat. This brings the total to 24. See Appendix 1.
5 Many of the measures are coded for only a handful of countries, precluding use in empirical analysis in a cross-country setting. As it is important for empirical analysis to have data availability across all relevant data points, we limit consideration to the countries and sectors we have identified as satisfying that criterion.

Constructing an Index of Services Policies from I-TIP Data
There are two key analytical decisions in designing an STRI given the choice to collect data on particular measures: weighting those measures, and aggregating them into an index. The first problem can be solved in different ways, such as application of purely statistical methods (e.g., factor analysis; see Dihel and Shepherd, 2007) or by using external expert judgment, as in the OECD STRI, which is based on a weighting and aggregation system driven by expert input (Grosso et al. 2015). Once weights have been assigned, the aggregation problem can be likened to a dimension reduction problem in the applied mathematics literature, in the sense that the objective is to produce a single index from a potentially large number of individual measures and a set of weights.
As noted above, the selection of I-TIP data we use spans 445 individual policy measures in 68 countries and 24 sectors. The challenge is to produce an overall index of services policy by sector, and then in the aggregate, using those data. Our starting point is an analytical choice to favor economic impact: the resulting index must be strongly correlated with trade in services in the context of a standard model (Van der Marel and Shepherd, 2013). Another basic premise that guides our approach is that there is no such thing as a "perfect" STRI. Small changes to introduce nuances in weighting and aggregation are unlikely to lead to major differences in analytical findings. As long as an indicator sits well with the analytical and qualitative literature on services policies in particular countries, has explanatory power for trade flows and is robust, we consider it satisfies the general criteria of a "good" index in this context. The OECD has published the STRI annually since 2014. There is an active research program based on it, showing the index is robustly linked with trade in services (e.g., Nordås and Rouzet, 2017) and investigating questions such as the extent and effects of regulatory heterogeneity (Nordås, 2018) and the services content of regional integration in the EU (Benz and Gonzalez, 2019). 6 The OECD policy databases are freely available online, along with a simulation tool that allows users to obtain counterfactual STRIs based on discrete policy changes. 7 Rather than reinvent the wheel and develop our own version of an STRI, we take the OECD STRI as a good benchmark for analysis. Aside from the substantive arguments, doing so is appropriate for the simple reason that over 60% of the I-TIP data come from the OECD database.
The problem then is to reproduce the OECD STRI for the 23 countries included in I-TIP but not in the OECD database, in circumstances where the weighting and aggregation codes have not been published. A particular issue is that services policies can sometimes be interdependent: for instance, if foreign providers are completely locked out of a market, it is irrelevant to policy restrictiveness that the business environment for firms in the market is very liberal. It is therefore crucial to take account of interaction effects as well as the raw weights attached to particular provisions. A further challenge we face is that if attention is restricted to cases where data are fully available, the I-TIP source sometimes only contains a subset of the full range of measures used by the OECD to construct its STRI. Our aim is to reduce the dimensionality of our dataset, from 445 measures to one single index, while retaining as much of the complexity of the OECD approach as possible in circumstances where we do not directly observe the weights and aggregation procedure.
This problem is well suited to a basic machine learning application. We construct a dataset containing OECD STRIs by sector, then all horizontal and sector specific measures from I-TIP for all 68 countries for which full data are available. For the analysis to be feasible, we limit consideration to those sectors that correspond well between the two databases, taking simple averages of measures where necessary. This reduces the number of sectors we can work with to eight: accounting, legal, commercial banking, insurance, air transport, road freight transport, distribution, and telecom. We believe these sectors represent a large share of services activity in most countries. Although we lose some of the nuance in the I-TIP data, which distinguish sectors at a micro level (such as insurance versus reinsurance, or air passenger transport versus air cargo transport), we believe this approach is justifiable given our overall objectives as set out above.
We split the sample into three groups. We randomly assign 75% of observations for which there is an OECD STRI to a "training" subsample, with the remaining 25% assigned to a "prediction" subsample. Finally, those countries and sectors where no OECD STRI is available are assigned to an "out of sample prediction" subsample.
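The three-way split described above can be sketched as follows. This is a minimal illustration only: the function name, the tuple layout, the placeholder country and sector labels, and the fixed seed are all assumptions made for the example, not details taken from the estimation code.

```python
import random


def assign_subsamples(observations, train_share=0.75, seed=0):
    """Split (country, sector, stri) tuples into three groups.

    stri is None when no OECD STRI exists for that country-sector pair.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    with_stri = [obs for obs in observations if obs[2] is not None]
    without_stri = [obs for obs in observations if obs[2] is None]
    rng.shuffle(with_stri)
    n_train = round(train_share * len(with_stri))
    return {
        "training": with_stri[:n_train],
        "prediction": with_stri[n_train:],
        "out_of_sample": without_stri,
    }


# Hypothetical data: 45 countries with an STRI and 23 without, eight sectors each.
sectors = [f"sector_{s}" for s in range(8)]
data = [(f"oecd_{c}", s, 0.3) for c in range(45) for s in sectors]
data += [(f"new_{c}", s, None) for c in range(23) for s in sectors]
groups = assign_subsamples(data)
```

With an exact 75% cut on 360 labelled observations, the illustrative split yields 270 training observations, which need not match the realized size of a random assignment.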

Developing Services Policy Indices with Simple Machine Learning
Our general approach is to use an elastic net as a prediction tool, where the objective is to use the data available to produce the most accurate prediction possible of the OECD STRI. The elastic net solves the following problem, where $\hat{\beta}$ is the vector of parameters of interest:

$$\hat{\beta} = \arg\min_{\beta} \left\{ \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - x_i \beta' \right)^2 + \lambda \sum_{j=1}^{p} \left[ \alpha |\beta_j| + \frac{1 - \alpha}{2} \beta_j^2 \right] \right\}$$

The first term is the standard ordinary least squares (OLS) loss function. $\lambda$ is a penalty term that shrinks parameter estimates towards zero in two ways, with a higher parameter resulting in greater shrinkage. The first term in square brackets penalizes coefficients that are large in absolute value, while the second performs shrinkage based on the square of the parameter value. With $\lambda = 0$, the elastic net collapses to standard OLS. With nonzero $\lambda$ and $\alpha = 1$, it is the least absolute shrinkage and selection operator (LASSO), while with $\alpha = 0$, it is ridge regression. The essence of the procedure is that $\lambda$ is iterated for given values of $\alpha$, with zero coefficients dropped from the model progressively due to the shrinkage effect. Iteration continues until a model is selected based on its cross-validation performance, i.e. the ability of a model estimated on the training subsample only to produce close estimates of the values in the prediction subsample. By proceeding in this way, we can identify a subset of variables that have the best explanatory power in terms of the observed OECD STRI, and then use the estimated values from the elastic net regression to predict values out of sample, where no OECD STRI exists.
6 The body of evidence using the World Bank STRI is smaller, likely reflecting the one-time nature of the exercise, which limits researchers to cross-sectional analysis. See e.g., Borchert et al. (2014), Hoekman and Shepherd (2017), Beverelli et al. (2017) and Su et al. (2019) for analyses using the World Bank STRI.
7 The 2008 World Bank regulatory database was also public, although the website is not available as of writing. However, producing counterfactuals is much more involved, as there is no equivalent of the OECD online tool.
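To make the mechanics of the penalty concrete, the following is a minimal sketch of an elastic net fitted by coordinate descent. It assumes the common scaling of the objective (squared error divided by 2N, and the squared penalty weighted by (1 - α)/2); the function names, the toy data and the fixed number of sweeps are illustrative assumptions, and the sketch omits the cross-validated search over λ and α.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, n_sweeps=500):
    """Coordinate descent for the elastic net; returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    # Centre the data so the intercept can be recovered after fitting.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    residual = [value - y_mean for value in y]  # residual when all betas are zero
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            column = [Xc[i][j] for i in range(n)]
            z = sum(v * v for v in column) / n
            # Covariance of column j with the residual that excludes its own fit.
            rho = sum(column[i] * (residual[i] + column[i] * beta[j]) for i in range(n)) / n
            denom = z + lam * (1.0 - alpha)
            new_beta = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [residual[i] - column[i] * delta for i in range(n)]
                beta[j] = new_beta
    intercept = y_mean - sum(x_mean[j] * beta[j] for j in range(p))
    return intercept, beta


# Toy data: y depends on the first regressor only (y = 1 + 2 * x1).
X = [[1, 0], [2, 1], [3, 0], [4, 1], [5, 0], [6, 1]]
y = [1 + 2 * row[0] for row in X]
ols_like = elastic_net(X, y, lam=0.0, alpha=0.5)   # no penalty
lasso_big = elastic_net(X, y, lam=10.0, alpha=1.0)  # heavy LASSO penalty
mixed = elastic_net(X, y, lam=1.0, alpha=0.5)       # intermediate case
```

With no penalty the sketch recovers the OLS fit, a heavy LASSO penalty sets every coefficient to zero, and the intermediate case shrinks the informative coefficient while dropping the uninformative one. Readers using scikit-learn should note that its `ElasticNet` names the penalty strength `alpha` and the mixing weight `l1_ratio`, which is the reverse of the notation used here.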
The elastic net is well suited to prediction problems with large numbers of potential predictors, even exceeding the number of observations, and deals well with situations where they are closely correlated. To power the tool, we construct a set of explanatory variables that comprises all sectoral responses, all horizontal measures, and a set of sector dummies. We then also create interactions to allow for nonlinear effects and dependencies. Specifically, we interact all measures with all other measures, and we create a triple interaction between all horizontal measures, all sector specific measures, and the sectoral dummies. The I-TIP dataset contains missing entries for many response variables, presumably because they are believed to be relevant only to certain sectors. To facilitate the empirical analysis, we therefore code these missing values as zero, which means that they do not have any restrictive impact on trade in sectors where World Bank and WTO analysts have made an a priori determination of no effect. This approach is equivalent to interacting those response variables with a set of sectoral dummies.
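The zero-coding of missing responses and the construction of pairwise interactions can be sketched as below. The measure names are hypothetical placeholders, and the sketch covers only the two-way interactions, not the triple interactions with sectoral dummies.

```python
from itertools import combinations


def zero_code(measures):
    """Replace missing responses (None) with zero, i.e. no restrictive effect."""
    return {name: (0.0 if value is None else float(value)) for name, value in measures.items()}


def add_pairwise_interactions(measures):
    """Return the zero-coded measures plus the product of every pair of measures."""
    coded = zero_code(measures)
    features = dict(coded)
    for a, b in combinations(sorted(coded), 2):
        features[f"{a}*{b}"] = coded[a] * coded[b]
    return features


# Hypothetical responses for one country-sector observation.
example = {"foreign_equity_cap": 1, "licence_quota": None, "board_nationality": 1}
features = add_pairwise_interactions(example)
```

Because a missing response is coded as zero, every interaction involving it is also zero, so the measure contributes nothing in sectors where it was not coded.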
Proceeding in this way gives a dataset of 544 observations, which is eight sectors for 68 countries. It is only feasible to proceed with this smaller number of sectors as some of the sectors where I-TIP reports data do not correspond to any identified sector in the OECD STRI database. By interacting all of the potential explanatory variables, we have 16,974 variables. Many of those variables are constant within subsamples, often zero, and so are automatically dropped from the model. In practice, the elastic net works with a starting set of 1,606 variables. A standard regression technique like OLS cannot handle this problem given the number of observations, but the elastic net can, because the absolute value and square penalty terms regularize the optimization problem, and the kink in the absolute value term sets many coefficients exactly to zero. Since OLS is unavailable, we use two other dimension reduction techniques on the sectoral and horizontal measures to give a point of comparison, but ignoring interaction terms: principal factor analysis, and a simple mean. As a robustness check, we also set $\alpha = 1$, which yields LASSO estimates, and $\alpha = 0$, which yields ridge estimates.
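The step in which constant variables drop out can be illustrated with a short filter. This is a sketch under the assumption that the data are held as a list of equal-length rows; in practice the estimation software performs this step automatically, and the function name is illustrative.

```python
def drop_constant_columns(rows):
    """Keep only columns that take more than one distinct value across rows."""
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols) if len({row[j] for row in rows}) > 1]
    return [[row[j] for j in keep] for row in rows], keep


# Hypothetical design matrix: column 1 is always zero, column 3 is always one.
design = [
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.5, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
reduced, kept = drop_constant_columns(design)
```

A constant column carries no information for distinguishing observations, which is why the set of candidate variables shrinks so sharply before estimation.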
Given that the problem in this case is prediction, not inference, we do not report coefficient estimates. For the training sample (272 observations), the elastic net retains 59 variables, a mix of measures in levels and interactions, and selects $\alpha = 0.25$. The LASSO retains 55 variables, while the ridge estimator retains the full set of informative variables, namely 1,606. Table 2 summarizes the performance of the three machine learning methods, looking separately at the training and prediction subsamples.
The three methods perform quite similarly on the training subsample: model fit is tight considering the relatively small amount of information used. The mean value of the OECD STRI is 0.279, so a mean squared error of only 0.005 using the elastic net indicates that model fit is good. Comparing the two parts of Table 2 shows that of the three machine learning methods, the elastic net has the best performance: $R^2$ is highest both on the training and prediction subsamples. We therefore prefer the elastic net version of our synthetic STRI, but we note that it is relatively close in performance to the other two models.
Table 3 reports the correlations at the sectoral level among the various measures computed as described above. The elastic net again is the strongest performer on this overall criterion, although the other two machine learning methods also perform well. The comparator indices, constructed using principal factor analysis and a simple mean, have a negative correlation with the OECD index, and thus represent a radically different way of summarizing the data. The evidence in Table 3 suggests that the OECD's approach to weighting and aggregating measures results in an output that is substantially different from what can be obtained by naïve methods. But our three simple machine learning applications, using limited data, do a remarkable job of reproducing the OECD index. Moreover, our preferred method, the elastic net, produces predicted values that lie exclusively between zero and unity, as does the original OECD index. The alternative approaches do not have this property, nor would a simple OLS regression model. Figure 1 shows the correlation between the elastic net index, which we name the Services Policy Index (SPI), and the OECD STRI at the sector level. The association is not perfect, as would be expected with any statistical approach to reproduction of an existing index, but the figure shows that our SPI fits the original data well, which gives us confidence that out of sample estimates for the countries not in the OECD database should perform well, in particular given the similarity of the $R^2$ measures for the training and prediction subsamples, as noted above.
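The two fit statistics used to compare methods can be computed as in the sketch below. It assumes the usual definition of $R^2$ as one minus the ratio of the residual to the total sum of squares; the text does not state which definition underlies Table 2, and the index values shown are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average squared gap between observed and predicted index values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """One minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical STRI values and predictions for four held-out observations.
stri_observed = [0.10, 0.20, 0.30, 0.50]
spi_predicted = [0.12, 0.18, 0.33, 0.47]
mse = mean_squared_error(stri_observed, spi_predicted)
r2 = r_squared(stri_observed, spi_predicted)
```

Computing both statistics separately on the training and prediction subsamples is what allows the comparison of in-sample fit with held-out performance.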

Figure 1: Correlation between the STRI and SPI, sector level
To avoid terminological confusion in the remainder of the paper, we refer consistently to the OECD STRI as the STRI. Our constructed indices based on I-TIP data are referred to as Services Policy Indices (SPIs). The difference in terminology highlights that we are simply mimicking the OECD's original approach using a broader dataset. Ownership of the full methodology used to produce the OECD's indices lies with that organization, and we use a simple data-driven technique to extend database coverage.

Descriptive Evidence on Services Policies in the Developing World
Having shown that our machine learning approach provides an acceptable approximation to the OECD's STRIs, and having used it to produce an SPI that closely mimics the STRI, we present some descriptive evidence on services policies in the developing world. Our approach, based on the I-TIP data, expands country coverage by 23 middle-income countries where there is full and complete data availability across all measures. 8 Figures 2 and 3 show average values of the elastic net SPI by developing region, with the OECD considered separately. Interpretation of these results requires caution, because the I-TIP data only cover a small number of countries in each region (see Appendix 1). Nonetheless, some indicative results emerge from the data. Figure 2 considers business and financial services in four subsectors. While most developing regions are more restrictive than the OECD in these subsectors, the differences are not always large in absolute terms, although detailed modeling would be required to establish what these differences equate to in terms of economic impacts. South Asia and the Middle East and North Africa are typically the most restrictive developing regions, while policies tend to be more liberal in the other regions. Sub-Saharan African economies have relatively liberal policies compared with other developing regions, and the region is typically one of the closest to the OECD average in these subsectors. Looking across sectors, average restrictiveness is highest in legal services. Figure 3 considers the remaining four sectors. The pattern is generally similar, although South Asia is relatively more liberal, and East Asia and the Pacific appears more restrictive relative to other developing regions. The OECD is again generally more liberal in most sectors, while Sub-Saharan African countries perform relatively well compared with other developing regions.
In the absence of time series data for all 68 countries in our sample, it is difficult to draw strong conclusions as to the direction of policy change. The fact that average scores follow a relatively narrow distribution suggests policy convergence may be taking place with respect to the OECD. The extent of convergence obviously differs substantially by sector, and would need to be confirmed by subsequent work, but would be indicative of an important shift in applied services policies relative to bound policies under the GATS. An important question for future research will be to examine the political economy dynamics underlying any observed changes in policies over time. It is to be hoped that the I-TIP database will be expanded to include the original World Bank STRI data concorded with the I-TIP horizontal and sectoral measures. Once these data become available, our simple machine learning methodology can produce close correlates to the OECD STRI for 2008 in addition to 2016. With such a long gap between observations, the data should provide clearer evidence of policy change and possible convergence.
While index scores are of interest in their own right, it is important to have some gauge of the extent to which they affect the incentives facing economic operators. A convenient concept is the ad valorem equivalent (AVE), namely the rate of ad valorem tariff protection that would, if applied, effect the same degree of market insulation as the bundle of regulations summarized by the SPI. The next section estimates gravity models of trade at the sectoral and aggregate levels. At the expense of a parameter assumption, it is straightforward to derive AVEs from this kind of model, as in Benz (2017) and Shepherd et al. (2019a, 2019b). Concretely, we take the gravity model estimates of the elasticity of bilateral trade with respect to the SPI from column 2 of Tables 5-8 and convert them to AVEs for the STRI sectors covered by the available trade data. Using the notation developed in the next section, the calculation is straightforward: the AVE follows from the estimated SPI coefficient, the index value, and the assumed trade elasticity. As for the counterfactual exercises in Section 5, we assume that the trade elasticity is equal to 8.25, which is a midpoint of recent estimates. Appendix 2 reports full results. These are summarized in Figures 4 and 5. Of course, the general pattern within sectors is the same as for the SPI results, as there is a simple, though nonlinear, relationship between the two. We therefore focus on the relative distortions that are present across sectors. The most restrictive sectors based on our AVEs are telecom, legal, and air transport.
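A minimal sketch of the AVE conversion follows. It assumes that the SPI enters the log of iceberg trade costs linearly, so that $-\theta \log t = \beta \cdot SPI$ and hence $AVE = \exp(-\beta \cdot SPI / \theta) - 1$. This functional form is an assumption consistent with a log-linear trade cost function, and the coefficient value in the example is hypothetical rather than an estimate from Tables 5-8.

```python
import math


def ad_valorem_equivalent(beta_spi, spi, trade_elasticity=8.25):
    """Tariff-equivalent trade cost implied by an SPI value.

    beta_spi: gravity coefficient on the SPI (expected to be negative).
    spi: index value between zero and one.
    trade_elasticity: assumed value of theta (8.25 in the text).
    """
    return math.exp(-beta_spi * spi / trade_elasticity) - 1.0


# Hypothetical coefficient of -2.0 and an SPI of 0.3.
ave_example = ad_valorem_equivalent(-2.0, 0.3)
```

The conversion is nonlinear but monotonic in the SPI, which is why the ranking of countries within a sector is the same for the SPI and the AVE.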

Figure 4: Average AVEs by developing region and OECD, business and financial services
Electronic copy available at: https://ssrn.com/abstract=3827224

Figure 5: Average AVEs by developing region and OECD, transport, distribution, and telecom
In a qualitative sense these findings accord well with previous work based on the 2008 World Bank STRI, such as Jafari and Tarr (2017), who also find that professional services and telecom (primarily fixed line) are the sectors with the highest AVEs. The main takeaway from this exercise is that AVEs in services sectors are high relative to applied rates of tariff protection in goods markets. An AVE of 10%, 20%, or 30% represents a significant restriction to consumers and firms accessing services from foreign suppliers.

Validating the SPI with Trade Data
We have already shown that our SPI closely mirrors the OECD's STRI, which helps establish its validity as a measure of services policies. An important additional step in validating the SPIs is demonstrating their ability to act as statistically significant predictors of trade flows. We therefore estimate a standard gravity model of total trade (goods and services combined), as it is established that services policies affect not only trade in services, but also trade in goods that use services as inputs (Hoekman and Shepherd, 2017; Shepherd, 2019). We use a structural gravity model in line with current best practice, as embodied in Anderson et al. (2018). Estimation is by Poisson Pseudo Maximum Likelihood (PPML), which means that estimates are robust to heteroskedasticity, take account of zero flows, and produce fixed effects (by exporter and by importer) that correspond exactly to the quantities prescribed by theory in Anderson and Van Wincoop (2003)-type models (Fally, 2015).
To formalize the above statements, the standard gravity model takes the following form, considering a single year and single sector cross-section only:

$$X_{ij} = F_i F_j t_{ij}^{-\theta} e_{ij} \qquad (1)$$

where $X_{ij}$ is exports from country i to country j; the $F$ terms are exporter and importer fixed effects; $t_{ij}$ is bilateral trade costs; $\theta$ is a parameter capturing the sensitivity of demand to cost; and $e_{ij}$ is an error term satisfying standard assumptions. Numerous theoretical frameworks are consistent with this model, including the Armington-type model of Anderson and Van Wincoop (2003), the Ricardian model of Eaton and Kortum (2002), and the heterogeneous firms model of Chaney (2008). Arkolakis et al. (2012) and Costinot and Rodriguez-Clare (2014) show that a wide class of quantitative trade models, including the canonical ones just cited, have the same macro-level implications for the relationship between trade flows and trade costs even though their micro-level predictions are quite different.
Trade costs t are specified in the usual iceberg form. These costs are unobserved, but can be specified in terms of observable proxies. For present purposes, we include standard gravity model controls based on geography and history, along with tariffs, a preferential trade agreement (PTA) dummy, and an indicator of service sector restrictiveness (STRI for presentational purposes), as well as an interaction between the STRI and a dummy for countries that are members of an Economic Integration Agreement (EIA), the services equivalent of a PTA for goods. Formally:

$$-\theta \log t_{ij} = \beta_1 STRI \times Intl_{ij} + \beta_2 STRI \times Intl_{ij} \times EIA_{ij} + \beta_3 \log(1 + Tariff_{ij}) + \beta_4 PTA_{ij} + \beta_5 \log(Distance_{ij}) + \beta_6 Contiguous_{ij} + \beta_7 CommonLanguage_{ij} + \beta_8 Colony_{ij} + \beta_9 CommonColonizer_{ij} + \beta_{10} SameCountry_{ij} \qquad (2)$$

Table 4 provides variable definitions and sources, along with those for equation (1). With the exception of trade flows, the data sources are largely standard. Equation (1) should in principle cover all directions of trade, i.e. including trade from country i to country i, or intra-national trade. Inclusion of intra-national trade data is crucial in order for PPML to produce theory-consistent fixed effects estimates (Fally, 2015). International trade data do not include this term, so we use the Eora multi-region input-output table to supply it. 9 Eora covers 183 countries and 26 sectors through a single harmonized input-output table.
We use data for 2015 only, the latest available year, corresponding most closely to the year of our SPI data (2016).
As noted above, our SPI data start from 24 sectors defined in the World Bank/WTO dataset, which we concord to 8 sectors in the OECD STRI classification. We then further concord those data to four Eora sectors by taking simple averages of the relevant indices: distribution, finance and business services, telecom, and transport. It is not possible to estimate gravity models at a more detailed level as the Eora database in harmonized form is necessarily highly aggregated. We note in passing that a substantial number of the sectoral categories in the original World Bank/WTO dataset may be meaningful to professionals within a given sector or for historical reasons, but they will prove difficult to map to economic data in a systematic way. Examples are reinsurance and internet services, which are typically not separately captured either by trade or production data, and fixed line telephony, which is now superseded by mobile telephony in most countries.
9 See https://worldmrio.com/.
Electronic copy available at: https://ssrn.com/abstract=3827224

Table 4: Variable definitions and sources.

Colony
Dummy variable equal to one if one country in a pair was in a colonial relationship with the other. CEPII.

Common colonizer
Dummy variable equal to one if the two countries were colonized by the same power.

Common language
Dummy variable equal to one if both countries in a pair have a language in common, spoken by at least 9% of the population. CEPII.

Contiguous
Dummy variable equal to one if the two countries share a common land border. CEPII.

EIA
Dummy variable equal to one if the two countries are members of the same Economic Integration Agreement. Egger and Larch (2008).

Exports
Gross exports from country i to country j in sector s (2015). Eora.

Intl
Dummy variable equal to one if country i and country j are different. Authors.

SPI
Services Policies Index (Elastic Net, Lasso, Principal Factors, and Simple Mean). Authors.

Log(Distance)
Logarithm of distance between country i and country j. CEPII.

Log(Tariff)
Logarithm of 1 + applied tariff rate. TRAINS.

PTA
Dummy variable equal to one if country i and country j are part of the same preferential trade agreement in 2015. Egger and Larch (2008).

Same Country
Dummy variable equal to one if the two countries were ever part of the same country. CEPII.

STRI
OECD Services Trade Restrictiveness Index. OECD.
A second point that requires explanation is the interaction term between services policies and EIA membership. The services policies in I-TIP apply on a most favored nation (non-preferential) basis, which is why we map them to MFN policies from the OECD data. The OECD has collected preferential data for services trade within the EU, but there is no systematic dataset covering preferential services policies around the world. However, many countries are members of trade agreements that potentially provide substantially improved market access conditions for their service providers relative to the MFN benchmark. By interacting MFN policies with a dummy for joint EIA membership, we seek to capture that effect. Our expectation is that the coefficient on MFN policies will be negative (trade reducing), while the coefficient on the interaction term will be positive (showing that trade reduction is attenuated by regional integration). Benz et al. (2018) show conclusively in the case of the EU that intra-bloc services policies are far more liberal than those pertaining to non-EU countries.

Table 5 reports gravity model regression results for the distribution sector. Column 1 includes the OECD STRI, and as expected, the policy variable has a negative coefficient, while the interaction term with EIA membership has a positive one, with both estimates statistically significant at the 10% level. The baseline data therefore support the view above that the measures captured by the STRI tend to restrict trade, in line with Nordas and Rouzet (2017), with that effect attenuated by joint membership of a trade agreement covering services. The same pattern of signs and magnitudes applies to four of the SPIs: elastic net, LASSO, ridge, and principal factors. The simple mean has no statistically significant coefficients.
We therefore conclude that the most naïve of our testbed of SPI measures does not have significant predictive value for trade, but that other measures that attempt to summarize the available data more systematically do have such power.

Table 6 repeats the exercise for financial and business services. Results are similar to those for distribution. The elastic net, LASSO, and ridge SPIs perform somewhat better than the STRI in that the levels term and the interaction term both have coefficients with the expected signs and magnitudes, and are statistically significant at the 5% level or better. This is likely due to the larger sample available for the SPIs: Column 1 contains data on 183 exporters and 45 importers, while the remaining columns all use 183 exporters and 68 importers. The principal factors SPI does not have any statistically significant coefficients, while the simple mean SPI has a negative coefficient in levels that is statistically significant at the 1% level, but a statistically insignificant coefficient on the interaction term. The most naïve measures of services policies again have at best limited explanatory power, in contrast to more sophisticated measures like the STRI and the SPIs.

Table 7 reports results for telecom services. The pattern of findings is again quite similar: the STRI, as well as the elastic net, LASSO, and ridge SPIs, all have explanatory power for bilateral trade flows in this sector, although none of the interaction terms except for the LASSO model has a statistically significant coefficient, which suggests that regional integration may not be a strong force for global trade in this sector. By contrast, the principal factors and simple mean SPIs have positive coefficients that are statistically significant at the 1% level, which is contrary to expectations.
Finally, Table 8 presents results for the transport sector. The STRI, elastic net SPI, and ridge SPI all have coefficients that are statistically significant at the 5% level or better, both in levels and on the interaction term. By contrast, the principal factors SPI and the simple mean SPI do not have any statistically significant coefficients. Results for this sector therefore accord well with those from the other sectors.

Note: All models are estimated by PPML. Robust standard errors adjusted for clustering by country pair are in parentheses below parameter estimates. Statistical significance is indicated as follows: * (10%), ** (5%), and *** (1%).

Taken together, these results indicate that the OECD STRI has much greater explanatory power for bilateral trade flows in services than naïve measures like a principal factor or simple mean. Moreover, our three SPIs generally exhibit very similar performance to the OECD STRI, albeit with a substantially larger sample due to greater importer coverage. The difference in observations is just over 50%, so there are clear advantages to these extended measures based on data collected by the World Bank/WTO but aggregated into indices based on our machine learning-based reproduction of the OECD's approach. Given the strong and consistent explanatory power of the STRI and its derivative SPIs, the bar for producing a "better" indicator of services trade restrictions is very high. In the absence of substantial additional benefits, it is far from obvious that further work in this area (in the sense of changing weights or adopting different aggregation schemes) passes a cost-benefit test, given the substantial time and resources that need to be devoted to dealing with the problems of weighting and aggregation discussed above.
While any indicator of services trade restrictiveness should be a strong predictor of bilateral services trade, recent work has shown that because of the input-output relationships that exist between services and other sectors, it is also likely that services policies affect total trade (i.e., goods and services). 10 We test this hypothesis and the predictive power of our SPIs compared with the STRI using aggregate Eora data summed across all 26 goods and services sectors in the database. The specification is the same as in the preceding tables, except that we use a dummy for PTA rather than EIA membership, to capture goods agreements as well as services agreements, and we include the log of the applied tariff rate as an additional explanatory variable. We aggregate the STRI and our SPIs by taking simple averages across sectors. Table 9 reports the results. We again use the full sample, but as tariff data are not available for all country pairs, the number of observations is lower than in the previous tables. As in the regressions using sectoral services trade, the STRI, elastic net and ridge SPIs have the expected negative coefficients, and are statistically significant at the 1% level. In addition, all three variables also have positive coefficients on the interaction term with the EIA variable, again statistically significant at the 1% level. The simple mean SPI also displays this pattern of coefficients, but the principal factor SPI has unexpected signs.

Note: Robust s.e. adjusted for clustering by country pair. Statistical significance: * (10%), ** (5%), *** (1%).
We conclude that in addition to being a strong predictor of sectoral services trade, the OECD STRI is also a strong predictor of total trade, which is consistent with the important role services play as inputs into the production of exports in other sectors. Moreover, the performance of the elastic net SPI mimics that of the OECD STRI closely but with a significantly expanded sample. These results, along with those presented above, suggest that our choice to use simple machine learning techniques to produce SPIs that mimic the OECD STRI in an efficient way results in measures that are relatively parsimonious in their use of data, but have similar explanatory power for the outcomes of interest.

Services Liberalization by Developing Countries: Trade and Income Impacts
The previous section developed and validated new measures of services policies in 23 countries not covered by the OECD STRI, in a way that generates SPIs that are as close as possible to what the STRI would be if extended directly to those countries. The resulting measures are strongly predictive of bilateral trade in services at the sectoral level, as well as of aggregate trade. Their performance is very close to that observed for the OECD STRI, but with significantly expanded country coverage. To demonstrate the usefulness of data on services policies, this section conducts a counterfactual experiment using the gravity model. Since we have estimated the model in a theory-consistent way, these experiments are straightforward to implement, albeit at the expense of some changes in data setup.
The gravity models we have estimated fall into the general class described by Arkolakis et al. (2012) in that they satisfy the following primitive assumptions: 1. Dixit-Stiglitz preferences. 7. The import demand system is CES.
As noted above, these assumptions are satisfied by numerous commonly used gravity models, such as Anderson and Van Wincoop (2003), Eaton and Kortum (2002), and Chaney (2008). A remarkable feature of this class of models is that they can all be solved very straightforwardly in terms of relative changes. Arkolakis et al. (2012) and Costinot and Rodriguez-Clare (2014) show that all models in this class have the same macro-level implications for the relationship between trade flows and trade costs even though their micro-level predictions are quite different. Building on these insights, Baier et al. (2019) develop a simple algorithm for solving for counterfactual changes in bilateral trade given a change in trade costs and an assumption for the trade elasticity. We adopt their model here, using a Stata package made publicly available by the authors. Concretely, their approach uses exact hat algebra (Dekle et al., 2008) to solve for counterfactual trade (and other endogenous variables, such as wages, prices, and expenditure). The resulting expression for changes in trade depends on w, the wage rate; P, a CES price aggregate; and E, expenditure. Hat notation means that for any variable v, $\hat{v} \equiv v'/v$, where a prime indicates variable v's counterfactual value. Arkolakis et al. (2012) show that once counterfactual values of trade have been calculated, it is straightforward to calculate the corresponding change in real income (welfare, Y).

Running counterfactuals in this way requires a square dataset, with the number of importers equal to the number of exporters. In additional results available on request, we show that the regressions in Table 9 perform in a qualitatively and quantitatively similar way with the smaller dataset (4624 observations for our SPIs). Using the square dataset, the parameters of interest have coefficients equal to -2.544 (elastic net SPI) and 2.233 (elastic net SPI interacted with EIA dummy), both of which are statistically significant at the 1% level. As discussed above, our preferred SPI is the elastic net, owing to its out-of-sample predictive power.
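For reference, the two expressions discussed in this passage can be sketched in the standard notation of Baier et al. (2019) and Arkolakis et al. (2012). The symbol $\theta$ for the trade elasticity, the domestic expenditure share $\lambda_{ii}$, and the country subscripts are assumptions of this sketch rather than notation confirmed by the text.

```latex
% Change in bilateral trade under exact hat algebra (assumed notation):
% w = wage rate, t = iceberg trade cost, P = CES price aggregate, E = expenditure.
\hat{X}_{ij} = \frac{\hat{w}_i^{-\theta}\,\hat{t}_{ij}^{-\theta}}{\hat{P}_j^{-\theta}}\,\hat{E}_j,
\qquad \hat{v} \equiv \frac{v'}{v}

% Change in real income as a function of the change in the domestic
% expenditure share (Arkolakis et al., 2012):
\hat{Y}_i = \hat{\lambda}_{ii}^{-1/\theta},
\qquad \lambda_{ii} \equiv \frac{X_{ii}}{E_i}
```

Under this form, a policy change that lowers a country's domestic expenditure share ($\hat{\lambda}_{ii} < 1$) raises real income, and by less the larger the trade elasticity.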
A key assumption that affects the level but not the pattern of estimated trade and welfare effects is the value of the trade elasticity. Anderson and Van Wincoop (2004) report gravity-based estimates equivalent to a trade elasticity of between 5 and 10. Other work has narrowed that range considerably. Eaton and Kortum (2002) find a value of 8.28, while recent work by Caliendo and Parro (2015) reports an average value across sectors of 8.22. Given the availability of recent, high quality estimates, we do not re-estimate the parameter directly, but instead assume a value of 8.25, which is the midpoint of the Eaton and Kortum (2002) and Caliendo and Parro (2015) estimates.
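The sensitivity of welfare results to the trade elasticity can be illustrated with a short sketch. It assumes the standard Arkolakis et al. (2012) result that the change in real income equals the change in the domestic expenditure share raised to the power of minus one over the trade elasticity. The 2% fall in the domestic share is a hypothetical input, not a result from the paper.

```python
def welfare_change(domestic_share_hat, theta):
    """Ratio of counterfactual to baseline real income: lambda_hat ** (-1 / theta)."""
    return domestic_share_hat ** (-1.0 / theta)


lambda_hat = 0.98  # hypothetical: the domestic expenditure share falls by 2%

# Lower bound, assumed value, and upper bound of the elasticity range in the text.
for theta in (5.0, 8.25, 10.0):
    gain = 100 * (welfare_change(lambda_hat, theta) - 1)
    print(f"theta = {theta}: real income change = {gain:.3f}%")
```

For this input the gain ranges from roughly 0.2% to 0.4% across the three elasticities, which shows how the choice of elasticity scales the level of the welfare effect without changing its sign or the ranking of countries.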
Our chosen counterfactual is a partial liberalization scenario where we look at the trade and welfare impacts of reducing tariffs and services restrictiveness separately by similar proportions. We consider 10% cuts in each. While a 10% cut in applied tariffs has a concrete policy interpretation, a 10% cut in a country's SPI score is harder to interpret, and could take many forms depending on the exact measures that are changed. However, as the OECD's online simulation tool for the STRI shows, it is quite possible for analysts and policymakers to translate these kinds of percentage changes into concrete differences in regulation, albeit with more latitude as to final form than in the case of tariffs. We simulate the model using the approach set out above, based on a gravity model re-estimated using a square dataset of 68 exporters and importers (estimation results available on request). Table 10 reports results from the counterfactual. It is apparent that the trade and welfare impacts of reducing the restrictiveness of services policies by 10% are greater in most cases than those of a similar proportional reduction in applied tariffs. Note that these results take full account of preferential trade arrangements through the interaction term with the EIA dummy in the case of services, and by data construction in the case of tariffs. Taking a simple average across the 27 developing (non-high-income) countries in the sample, reducing the restrictiveness of services policies by 10% would boost real income by 0.5%, compared with 0.4% for a 10% cut in applied tariffs. Both figures are modest, but given that the policy changes are relatively small, that should not be surprising. They suggest that developing countries stand to benefit from reforming services policies. We are agnostic as to what those reforms might comprise, as SPIs can be reduced by 10% in many ways.
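To give a sense of magnitudes, the direct (partial equilibrium) trade effect of a 10% SPI cut can be sketched from the square-dataset coefficients reported earlier (-2.544 on the elastic net SPI and 2.233 on its interaction with the EIA dummy). The sketch holds wages, prices, and expenditure fixed, so it is not the general equilibrium calculation behind Table 10. The baseline SPI score of 0.4 is hypothetical.

```python
import math

BETA_SPI = -2.544      # elastic net SPI coefficient reported in the text
BETA_SPI_EIA = 2.233   # coefficient on the SPI x EIA interaction reported in the text


def direct_trade_effect(spi, cut=0.10, eia=False):
    """Percentage change in bilateral trade from a proportional SPI cut,
    holding wages, price indices, and expenditure fixed."""
    slope = BETA_SPI + (BETA_SPI_EIA if eia else 0.0)
    delta_spi = -cut * spi
    return 100 * (math.exp(slope * delta_spi) - 1)


baseline_spi = 0.4  # hypothetical SPI score
print(f"No EIA:   {direct_trade_effect(baseline_spi):.2f}%")
print(f"With EIA: {direct_trade_effect(baseline_spi, eia=True):.2f}%")
```

For this hypothetical score the direct effect is close to 11% for a pair without a common EIA and a little over 1% for a pair with one, which reflects the attenuation captured by the interaction term.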
Another point that emerges from Table 10 is that in both scenarios, trade changes are typically an order of magnitude greater than changes in real GDP. Mathematically, such a result is not surprising given the form of the Arkolakis et al. (2012) formula for welfare changes, but it is important to keep in mind, as policy debates often privilege large trade effects while downplaying that these changes primarily involve redistribution of economic resources from producers to consumers. Reforms generally produce much smaller pure gains through the elimination of deadweight losses. The largest economic gains accrue to the countries that are currently the most restrictive. The case of India stands out: it has the second highest aggregate SPI score of any country in our sample, after Indonesia, despite the fact that services are a major source of export earnings, and play a more important role economically than in most other countries at similar income levels. 11 Our results suggest that India's services economy, but also the broader economy, could gain substantially from reform. This point is true even when compared with significant tariff reductions, as India is also relatively protective in goods markets.

Conclusion
This paper provides new quantitative evidence on the state of services policies in 23 non-OECD countries in 2016, based on regulatory data recently released by the World Bank and WTO. Starting from the premise that the OECD STRI represents a proven approach to summarizing the restrictiveness of services policies, we use simple machine learning techniques to estimate SPIs for the new data that correlate very closely with OECD measures within sample, and therefore essentially constitute an extension of the OECD methodology to an additional set of mostly developing countries. Our SPIs have significant explanatory power for bilateral trade flows at the sectoral and aggregate levels. In line with previous research (see Francois and Hoekman, 2010, for a review), a simple quantification exercise shows that the trade and welfare gains from a 10% cut in applied services policies are typically larger than those from a similar reduction in import tariffs for goods.
Our SPIs provide the first quantitative snapshot of applied services policies in a significant number of developing countries since the World Bank's STRI in 2008. Averaging by World Bank region shows that while there is variation across sectors and OECD member countries are typically more liberal than developing economies, the differences are not always large in terms of the index scores and AVEs. This finding requires cautious interpretation, as the number of countries is relatively small. The SPIs line up well with those of Borchert et al. (2014) using the World Bank STRI for 2008. The relatively small differences observed in applied policies across regions could be suggestive of a process of policy convergence to more liberal settings, but that can only be determined using data spanning multiple years. It is therefore very desirable that the World Bank and WTO make available the original data used to generate the World Bank 2008 STRIs in comparable format through the I-TIP platform to facilitate this kind of analysis.
A contribution of this paper to the literature is to provide a "proof of concept" for the use of statistical tools, such as machine learning, to capture the complexities, nonlinearities, and dependencies of different services policy measures. This is relevant for at least two reasons. One is that the use of such techniques allows analysts to extend datasets in instances where a given source of information is limited to a subset of countries and the detailed methodology used to calculate published indicators is confidential. This is the case for the OECD STRI, arguably the gold standard at the time of writing given extensive industry consultation and expert input into the weighting of measures across sectors. Insofar as other organizations (in this case the World Bank and WTO) collect similar types of policy data, SPIs that correlate well with the OECD STRIs offer a way to extend the country coverage of services restrictiveness indicators. Although the focus in this paper is on services trade restrictions, the methodology may be useful in other contexts where similar conditions prevail as regards the scope and periodicity of efforts to collect information on policies for a given area.
Another reason the exercise undertaken in this paper is relevant is that the use of statistical tools may help to identify potential ways to reduce data collection costs. The OECD STRI involves the collection of a large amount of data, entailing significant direct and time costs for agencies involved in this kind of work. Further work with machine learning algorithms like those deployed here may identify a subset of measures that in fact do most of the explanatory work in terms of bilateral trade flows. In our view, this is the primary value of generating these kinds of indices, rather than simply summarizing a vast amount of data in a single number. Data collection is distinct from research to fine-tune STRI methodologies and improve the associated weighting and aggregation measures. The latter is very important but should be independent of the policy collection process. Analysts should have the ability to define their own indicators, and it is therefore very welcome that I-TIP has released the 2016 services policy information independently of associated STRIs.

Robert Schuman Centre for Advanced Studies Working Papers

Although the release of services trade-related policy data in I-TIP is laudable, as of 2020 the most up-to-date compilation of such measures is for 2016, and then only for some 30 developing countries, without any coverage of most low-income countries. It is unknown whether and when a new wave of data will be collected and thus whether over time a panel dataset will emerge. The contrast with other initiatives to compile information on development-relevant policies, such as the annual World Bank Doing Business report, is striking.
A similar effort to generate services policy data on a regular basis for a broad range of countries to complement the information reported for its member countries by the OECD would allow governments to track their policies, compare them to those of other countries, and inform autonomous policy reforms and regional integration processes. We hope this first output of services data collection efforts by the World Bank and WTO will be followed with the regular updates needed to allow assessments of the effects of policies over time, and that the coverage will be extended to more countries.
The resource costs of a systematic effort to collect services policy data are not large. In our experience, assembling the full OECD dataset for one country-sector pair involves one to two weeks of time for a junior legal consultant, along with supervision time from a more senior economist. Focusing on just five major sectors per country and seeking to cover 50 non-OECD countries would therefore involve costs in the range of $400,000 to $750,000, with additional resources required for reporting and publishing, though they would be an order of magnitude less than those required for data compilation. Doubling coverage to 10 sectors would involve an investment of less than $2 million. Average costs could be reduced by making the data collection a biennial process. Given how limited services policy data are relative to information on merchandise trade policies, allocating this level of resources to filling the gap would have a very high benefit-cost ratio, especially if one considers the opportunity costs of not having up-to-date information on services policies. These opportunity costs may be high, not least because absence of data means policymakers may be less inclined to devote adequate attention to this important area of policy. 12 If over time application of statistical methods can isolate a smaller number of key measures that have most of the explanatory power in terms of bilateral trade, data collection costs will fall accordingly and help target attention on the policies that matter most.
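The cost figures above imply a unit cost per country-sector pair, which the following sketch backs out. It assumes that costs scale linearly with the number of country-sector pairs, which the text does not state explicitly; the function name is hypothetical.

```python
def cost_per_pair(total_cost, countries, sectors):
    """Implied cost of one country-sector pair under linear scaling."""
    return total_cost / (countries * sectors)


COUNTRIES = 50

# Range quoted in the text for five sectors per country.
low_unit = cost_per_pair(400_000, COUNTRIES, 5)
high_unit = cost_per_pair(750_000, COUNTRIES, 5)
print(f"Implied unit cost: ${low_unit:,.0f} to ${high_unit:,.0f} per pair")

# Doubling sector coverage to 10 at the same unit cost.
low_total = low_unit * COUNTRIES * 10
high_total = high_unit * COUNTRIES * 10
print(f"Ten sectors: ${low_total:,.0f} to ${high_total:,.0f}")
```

Under this assumption, the ten-sector range is $800,000 to $1,500,000, which is consistent with the statement that doubling coverage would cost less than $2 million.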
One priority in this regard is to incorporate the preferential dimension into measures of services policy restrictiveness. Another is to expand country coverage. In particular, very few African countries are included in I-TIP. Given the salience of regional integration in Africa, it is important to fill in the policy blanks to allow assessments of the utility of dealing with services in the context of pursuing continental free trade. Benz and Gonzalez (2019) have shown that the EU single market for services is much more liberal than any member country's MFN policies. The extent to which other trade agreements effectively liberalize services markets is unclear, but is a vital policy question in an environment where bilateral, plurilateral, and mega-regional agreements are becoming more common. Shepherd et al. (2019a), for example, find little evidence of substantial liberalization in the Canada-EU Trade Agreement (CETA), and the same appears to be true for the Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP) (Gootiiz and Mattoo, 2017).
A related important question concerns the value of making binding policy commitments in trade agreements, even if these do not entail liberalization. The 'water' in the services policy commitments in trade agreements often is considerable (see, e.g., Borchert, Gootiiz and Mattoo, 2011; Miroudot and Shepherd, 2014; Miroudot and Pertel, 2015; Ciuriak et al. 2017). Research on the value of reducing the difference between bound and applied services policies has shown that this may be an important source of welfare gain, driven by a reduction in policy uncertainty (Lamprecht and Miroudot, 2018; Ciuriak et al. 2019; Egger et al. 2019).

12 Other compilations of policy indicators such as the World Bank Doing Business project attract extensive attention by the press and have become focal points for governments because they are undertaken on an annual basis.
Again, such analysis requires good quality, comparable information on applied policies collected regularly. The OECD does this for its members and is the source for the majority of the 68 countries for which I-TIP reports comprehensive information. Looking forward, we hope the collaboration between the World Bank and WTO will do so as well. If not, other development organizations should fill the gap. Services policies matter too much to continue to be neglected.

Note: The table includes only those countries covered by the SPIs that are not included in the OECD STRI. * Rwanda is not in I-TIP but comparable policy data for Rwanda were collected by Shepherd et al. (2019b), permitting its inclusion in the analysis. Kenya is included in I-TIP but data have been augmented by additional information reported in Shepherd et al. (2019b).