Title: | Extensions for Synthetic Controls Analysis |
---|---|
Description: | Extends the functionality of the package 'Synth' as detailed in Abadie, Diamond, and Hainmueller (2011) <doi:10.18637/jss.v042.i13>. Includes generating and plotting placebos, post/pre-MSPE (Mean Squared Prediction Error) significance tests and plots, and calculating average treatment effects for multiple treated units. |
Authors: | Bruno Castanho Silva [aut, cre] |
Maintainer: | Bruno Castanho Silva <[email protected]> |
License: | GPL-3 |
Version: | 0.3.3 |
Built: | 2025-02-26 03:30:11 UTC |
Source: | https://github.com/bcastanho/sctools |
A set of functions to extend the synthetic controls analyses performed by the package 'Synth'. Includes generating and plotting placebos, significance tests and plots, and calculating average treatment effects for multiple treated units.
It has several goals:
Allow easy generation of placebos
Generate figures for inference on SCM outputs
Extend the existing Synth package
Maintainer: Bruno Castanho Silva [email protected] (ORCID)
Authors:
Michael DeWitt [email protected] (ORCID)
Useful links:
Report bugs at https://github.com/bcastanho/SCtools/issues
This data set was compiled from World Health Organization (WHO) and World Bank (WB) data. Its primary purpose is to investigate the effects of the alcohol-consumption policy changes enacted in the Russian Federation in 2003, which makes it a useful case study for SCM approaches. You can read more about the policy changes at https://www.theguardian.com/world/2019/oct/01/russian-alcohol-consumption-down-40-since-2003-who
alcohol
a data.frame with 5107 rows and 8 columns:
The name of the country
year
Alcohol consumption per capita (liters/person); all types
Three letter country code
Labor force participation rate, total (percent of total population ages 15+)
Mobile cellular subscriptions (per 100 people)
Inflation, consumer prices (annual percent)
Manufacturing, value added (percent of GDP)
The country number
WHO data available at https://apps.who.int/gho/data/node.main.A1039?lang=en.
WB data available at https://data.worldbank.org/.
Constructs a synthetic control unit for each unit in the donor pool of an implementation of the synthetic control method for a single treated unit. Used for placebo tests (see plot_placebos, mspe.test, mspe.plot) to assess the strength and significance of a causal inference based on the synthetic control method. On placebo tests, see Abadie and Gardeazabal (2003), and Abadie, Diamond, and Hainmueller (2010, 2011, 2014).
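The reassignment idea behind placebo tests can be sketched in base R without running any optimization. This is only an illustration: the object name placebo_design is hypothetical, and leaving the truly treated unit out of every placebo donor pool is an assumption of this sketch, not a statement about how the package builds its donor pools.

```r
# Unit numbers taken from the toy example used throughout this manual.
treated  <- 7
controls <- c(29, 2, 13, 17)

# One placebo run per control unit: that unit plays the treated role and
# the remaining controls form its donor pool.
# Assumption: the truly treated unit is left out of every placebo donor pool.
placebo_design <- lapply(controls, function(u) {
  list(placebo_unit = u, donor_pool = setdiff(controls, u))
})
names(placebo_design) <- paste0("unit_", controls)

# Donor pool used when unit 29 plays the treated role.
placebo_design$unit_29$donor_pool
```

Each element of the list corresponds to one synthetic control that would have to be estimated, which is why computing time grows with the size of the donor pool.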
generate.placebos(
  dataprep.out,
  synth.out,
  Sigf.ipop = 5,
  strategy = "sequential"
)

generate_placebos(
  dataprep.out,
  synth.out,
  Sigf.ipop = 5,
  strategy = "sequential"
)
dataprep.out | A data.prep object produced by the dataprep command. |
synth.out | A synth.out object produced by the synth command. |
Sigf.ipop | The precision setting for the ipop optimization routine. Default is 5. |
strategy | The processing method to use: "sequential", "multicore" or "multisession". Use "multicore" or "multisession" to parallelize operations and reduce computing time. Default is "sequential". |
Data frame with outcome data for each control unit and their respective synthetic control and for the original treated and its control
Mean squared prediction error for the pretreatment period for each placebo
First time unit in time.optimize.ssr
First time unit after the highest value in time.optimize.ssr
Unit number of the treated unit
Dataframe with two columns showing all unit numbers and names from control units
Number of control units
Unit name of the treated unit
Pretreatment MSPE of the treated unit's synthetic control
## Example with toy data from Synth
library(Synth)
library(SCtools)

# Load the simulated data
data(synth.data)

# Execute dataprep to produce the necessary matrices for synth
dataprep.out <- dataprep(
  foo = synth.data,
  predictors = c("X1"),
  predictors.op = "mean",
  dependent = "Y",
  unit.variable = "unit.num",
  time.variable = "year",
  special.predictors = list(
    list("Y", 1991, "mean")
  ),
  treatment.identifier = 7,
  controls.identifier = c(29, 2, 13, 17),
  time.predictors.prior = c(1984:1989),
  time.optimize.ssr = c(1984:1990),
  unit.names.variable = "name",
  time.plot = 1984:1996
)

# Run the synth command to create the synthetic control
synth.out <- synth(dataprep.out, Sigf.ipop = 2)

## Run the generate.placebos command to reassign treatment status
## to each unit listed as control, one at a time, and generate their
## synthetic versions. Sigf.ipop = 2 for faster computing time.
## Increase to the default of 5 for better estimates.
tdf <- generate.placebos(dataprep.out, synth.out, Sigf.ipop = 2)
This function returns 'TRUE' for the object returned from the generate.placebos function and 'FALSE' for all other objects, including regular data frames.
is_tdf(x)
x | An object. |
'TRUE' if the object inherits from the 'tdf' class.
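A class check of this kind can be sketched with base R's inherits(). The helper is_tdf_sketch and the object fake_tdf below are hypothetical stand-ins written for illustration; they are not the package's own code.

```r
# Hypothetical stand-in for an object carrying the 'tdf' class.
fake_tdf <- structure(list(n = 4), class = c("tdf", "list"))

# Hypothetical helper mirroring the documented behaviour of is_tdf().
is_tdf_sketch <- function(x) inherits(x, "tdf")

is_tdf_sketch(fake_tdf)             # TRUE
is_tdf_sketch(data.frame(a = 1:3))  # FALSE
```

The same pattern with the class name 'tdf_multi' would describe the check documented for is_tdf_multi.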
This function returns 'TRUE' for the object returned from the multiple.synth function and 'FALSE' for all other objects, including regular data frames.
is_tdf_multi(x)
x | An object. |
'TRUE' if the object inherits from the 'tdf_multi' class.
Plots the post/pre-treatment mean squared prediction error (MSPE) ratio for the treated unit and placebos.
mspe.plot(
  tdf,
  discard.extreme = FALSE,
  mspe.limit = 20,
  plot.hist = FALSE,
  title = NULL,
  xlab = "Post/Pre MSPE ratio",
  ylab = NULL
)

mspe_plot(
  tdf,
  discard.extreme = FALSE,
  mspe.limit = 20,
  plot.hist = FALSE,
  title = NULL,
  xlab = "Post/Pre MSPE ratio",
  ylab = NULL
)
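tdf | An object constructed by generate.placebos. |
discard.extreme | Logical. Whether or not placebos with high pre-treatment MSPE should be excluded from the plot. |
mspe.limit | Numerical. Used if discard.extreme is TRUE. Indicates how many times larger than the treated unit's pre-treatment MSPE a placebo's pre-treatment MSPE must be for that placebo to be considered extreme and discarded. Default is 20. |
plot.hist | Logical. If FALSE, the plot shows the post/pre MSPE ratio of each unit individually. If TRUE, a histogram of the ratios is produced. Default is FALSE. |
title | Character. Optional. Title of the plot. |
xlab | Character. Optional. Label of the x axis. |
ylab | Character. Optional. Label of the y axis. |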
The post/pre-treatment MSPE ratio compares how far a unit's observed outcome lies from its synthetic control after treatment with how far it lay before treatment. A high ratio means a small pre-treatment prediction error (a good synthetic control) combined with a large post-treatment MSPE, that is, a large difference between the unit and its synthetic control after the intervention. Calculating this ratio for all placebos indicates how likely it is that the result obtained for the single treated case could have occurred by chance, given no treatment. For a more detailed description, see Abadie, Diamond, and Hainmueller (2011, 2014).
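The ratio can be written out as follows. The notation is an assumption introduced here for illustration and does not appear in the package documentation: Y_{jt} is the observed outcome of unit j at time t, \hat{Y}_{jt} is the outcome of its synthetic control, T_0 is the set of pre-treatment periods, and T_1 is the set of post-treatment periods.

```latex
\mathrm{MSPE}^{\mathrm{pre}}_j
  = \frac{1}{|T_0|} \sum_{t \in T_0} \left( Y_{jt} - \hat{Y}_{jt} \right)^2,
\qquad
\mathrm{MSPE}^{\mathrm{post}}_j
  = \frac{1}{|T_1|} \sum_{t \in T_1} \left( Y_{jt} - \hat{Y}_{jt} \right)^2,
\qquad
r_j = \frac{\mathrm{MSPE}^{\mathrm{post}}_j}{\mathrm{MSPE}^{\mathrm{pre}}_j}
```

A unit with a close pre-treatment fit and a large post-treatment gap therefore has a large r_j.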
Plot with the post/pre MSPE ratios for the treated unit and each placebo indicated individually. Returned if plot.hist is FALSE.
Histogram of the distribution of post/pre MSPE ratios for all placebos and the treated unit. Returned if plot.hist is TRUE.
Abadie, A., Diamond, A., Hainmueller, J. (2014). Comparative Politics and the Synthetic Control Method. American Journal of Political Science Forthcoming 2014.
Abadie, A., Diamond, A., Hainmueller, J. (2011). Synth: An R Package for Synthetic Control Methods in Comparative Case Studies. Journal of Statistical Software 42 (13) 1–17.
Abadie, A., Diamond, A., Hainmueller, J. (2010). Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program. Journal of the American Statistical Association 105 (490) 493–505.
generate.placebos, mspe.test, plot_placebos, synth
## Example with toy data from 'Synth'
library(Synth)
library(SCtools)

# Load the simulated data
data(synth.data)

# Execute dataprep to produce the necessary matrices for 'Synth'
dataprep.out <- dataprep(
  foo = synth.data,
  predictors = c("X1"),
  predictors.op = "mean",
  dependent = "Y",
  unit.variable = "unit.num",
  time.variable = "year",
  special.predictors = list(
    list("Y", 1991, "mean")
  ),
  treatment.identifier = 7,
  controls.identifier = c(29, 2, 13, 17),
  time.predictors.prior = c(1984:1989),
  time.optimize.ssr = c(1984:1990),
  unit.names.variable = "name",
  time.plot = 1984:1996
)

# Run the synth command to create the synthetic control
synth.out <- synth(dataprep.out, Sigf.ipop = 2)

## Run the generate.placebos command to reassign treatment status
## to each unit listed as control, one at a time, and generate their
## synthetic versions. Sigf.ipop = 2 for faster computing time.
## Increase to the default of 5 for better estimates.
tdf <- generate.placebos(dataprep.out, synth.out, Sigf.ipop = 2)

## Test how extreme the observed treatment effect was given the placebos:
ratio <- mspe.test(tdf)
ratio$p.val

mspe.plot(tdf, discard.extreme = FALSE)
Computes the post/pre-treatment mean squared prediction error (MSPE) ratio for a treated unit in a synthetic control analysis and for all placebos produced with generate.placebos. Returns the ratios for all units and a p-value of how extreme the treated unit's ratio is in comparison with that of the placebos. Equivalent to a significance test of a synthetic control result.
mspe.test(tdf, discard.extreme = FALSE, mspe.limit = 20)

mspe_test(tdf, discard.extreme = FALSE, mspe.limit = 20)
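tdf | An object constructed by generate.placebos. |
discard.extreme | Logical. Whether or not placebos with high pre-treatment MSPE should be excluded from the count and significance testing. |
mspe.limit | Numerical. Used if discard.extreme is TRUE. Indicates how many times larger than the treated unit's pre-treatment MSPE a placebo's pre-treatment MSPE must be for that placebo to be considered extreme and discarded. Default is 20. |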
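The effect of discard.extreme and mspe.limit can be illustrated with a small base R sketch. The numbers are made up, and the rule shown (drop a placebo when its pre-treatment MSPE exceeds mspe.limit times that of the treated unit) is an assumption about how the threshold is applied, stated here for illustration only.

```r
# Made-up pre-treatment MSPEs; none of these numbers come from the package.
treated_pre_mspe <- 2
placebo_pre_mspe <- c(A = 1.5, B = 30, C = 45, D = 8)
mspe.limit <- 20

# Assumption: a placebo is "extreme" when its pre-treatment MSPE is more
# than mspe.limit times that of the treated unit.
keep <- placebo_pre_mspe <= mspe.limit * treated_pre_mspe
kept_placebos <- names(placebo_pre_mspe)[keep]
kept_placebos
```

Here the cut-off is 20 * 2 = 40, so only placebo C is dropped. Lowering mspe.limit removes more poorly fitted placebos from the comparison.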
The post/pre-treatment MSPE ratio compares how far a unit's observed outcome lies from its synthetic control after treatment with how far it lay before treatment. A high ratio means a small pre-treatment prediction error (a good synthetic control) combined with a large post-treatment MSPE, that is, a large difference between the unit and its synthetic control after the intervention. Calculating this ratio for all placebos indicates how likely it is that the result obtained for the single treated case could have occurred by chance, given no treatment. For a more detailed description, see Abadie, Diamond, and Hainmueller (2011, 2014).
The p-value of the treated unit's post/pre MSPE ratio. It is the proportion of units (placebos and treated) that have a ratio equal to or higher than that of the treated unit.
Dataframe with two columns. The first is the post/pre MSPE ratio for each unit. The second indicates unit names.
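The ratio and the p-value described above can be reproduced by hand on a toy example. The sketch below uses made-up gaps (observed outcome minus synthetic control) and hypothetical names such as gaps and mspe_ratio; it illustrates the calculation and is not the package's implementation.

```r
# Made-up gaps (observed minus synthetic) for illustration only.
gaps <- data.frame(
  period  = 1:6,
  treated = c(0.1, -0.1, 0.1, 2.0, 2.5, 3.0),
  ctrl_a  = c(0.5, -0.5, 0.5, 0.5, -0.5, 0.5),
  ctrl_b  = c(1.0, -1.0, 1.0, 2.0, -2.0, 2.0)
)
pre  <- gaps$period <= 3  # assumed pre-treatment periods
post <- !pre

# Post/pre MSPE ratio for one column of gaps.
mspe_ratio <- function(g) mean(g[post]^2) / mean(g[pre]^2)
ratios <- sapply(gaps[, c("treated", "ctrl_a", "ctrl_b")], mspe_ratio)

# Share of all units (treated included) with a ratio at least as large
# as the treated unit's ratio.
p_val <- mean(ratios >= ratios["treated"])
p_val
```

With three units and the treated unit holding the largest ratio, the smallest attainable p-value is 1/3, which shows why a small donor pool limits how significant a result can look.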
generate.placebos, mspe.plot, synth
## Example with toy data from 'Synth'
library(Synth)
library(SCtools)

# Load the simulated data
data(synth.data)

# Execute dataprep to produce the necessary matrices for 'Synth'
dataprep.out <- dataprep(
  foo = synth.data,
  predictors = c("X1"),
  predictors.op = "mean",
  dependent = "Y",
  unit.variable = "unit.num",
  time.variable = "year",
  special.predictors = list(
    list("Y", 1991, "mean")
  ),
  treatment.identifier = 7,
  controls.identifier = c(29, 2, 13, 17),
  time.predictors.prior = c(1984:1989),
  time.optimize.ssr = c(1984:1990),
  unit.names.variable = "name",
  time.plot = 1984:1996
)

# Run the synth command to create the synthetic control
synth.out <- synth(dataprep.out, Sigf.ipop = 2)

## Run the generate.placebos command to reassign treatment status
## to each unit listed as control, one at a time, and generate their
## synthetic versions. Sigf.ipop = 2 for faster computing time.
## Increase to the default of 5 for better estimates.
tdf <- generate.placebos(dataprep.out, synth.out, Sigf.ipop = 2)

## Test how extreme the observed treatment effect was given the placebos:
ratio <- mspe.test(tdf)
ratio$p.val

mspe.plot(tdf, discard.extreme = FALSE)
Generates one synthetic control for each treated unit and calculates the difference between each treated unit and its synthetic control. Returns a vector with outcome values for the synthetic controls and a plot of average treatment effects and, if required, generates placebos out of the donor pool to be used in conjunction with plac.dist. All arguments are the same as those used for dataprep in the Synth package, except for treated.units, treatment.time, and gen.placebos.
multiple.synth(
  foo,
  predictors,
  predictors.op,
  dependent,
  unit.variable,
  time.variable,
  special.predictors,
  treated.units,
  control.units,
  time.predictors.prior,
  time.optimize.ssr,
  unit.names.variable,
  time.plot,
  treatment.time,
  gen.placebos = FALSE,
  strategy = "sequential",
  Sigf.ipop = 5
)

multiple_synth(
  foo,
  predictors,
  predictors.op,
  dependent,
  unit.variable,
  time.variable,
  special.predictors,
  treated.units,
  control.units,
  time.predictors.prior,
  time.optimize.ssr,
  unit.names.variable,
  time.plot,
  treatment.time,
  gen.placebos = FALSE,
  strategy = "sequential",
  Sigf.ipop = 5
)
foo | Dataframe with the panel data. |
predictors | Vector of column numbers or column-name character strings that identifies the predictors' columns. All predictors have to be numeric. |
predictors.op | A character string identifying the method (operator) to be used on the predictors. Default is "mean". |
dependent | The column number or a string with the column name that corresponds to the dependent variable. |
unit.variable | The column number or a string with the column name that identifies unit numbers. The variable must be numeric. |
time.variable | The column number or a string with the column name that identifies the period (time) data. The variable must be numeric. |
special.predictors | A list object identifying additional predictors and their pre-treatment years and operators. |
treated.units | A vector identifying the unit.variable numbers of the treated units. |
control.units | A vector identifying the unit.variable numbers of the control units. |
time.predictors.prior | A numeric vector identifying the pre-treatment periods over which the values for the outcome predictors should be averaged. |
time.optimize.ssr | A numeric vector identifying the periods of the dependent variable over which the loss function should be minimized between each treated unit and its synthetic control. |
unit.names.variable | The column number or string with column name identifying the variable with units' names. The variable must be a character. |
time.plot | A vector identifying the periods over which results are to be plotted. |
treatment.time | A numeric value with the value in time.variable that marks the intervention. |
gen.placebos | Logical. Whether a placebo (a synthetic control) for each unit in the donor pool should be constructed. Will increase computation time. |
strategy | The processing method to use: "sequential", "multicore" or "multisession". Use "multicore" or "multisession" to parallelize operations and reduce computing time. Default is "sequential". |
Sigf.ipop | The precision setting for the ipop optimization routine. Default is 5. |
The function runs dataprep and synth for each unit identified in treated.units. It saves the vector with predicted values for each synthetic control, to be used in estimating average treatment effects in applications of Synthetic Controls for multiple treated units.
For further details on the arguments, see the documentation of Synth.
Data frame. Each column contains the outcome values for every time-point for one unit or its synthetic control. The last column contains the time-points.
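An average treatment effect can be computed from a data frame laid out this way, as the base R sketch below suggests. The column names (unit2, synth_unit2, and so on), the numbers, and the reading of treatment.time as the last pre-treatment period are all assumptions made for illustration; the actual column names of the returned object may differ.

```r
# Hypothetical layout: one column per treated unit, one per synthetic
# control, and the time points in the last column. Names are invented.
paths <- data.frame(
  unit2       = c(10, 11, 12, 15, 16),
  synth_unit2 = c(10, 11, 12, 13, 13),
  unit7       = c(20, 21, 22, 26, 27),
  synth_unit7 = c(20, 21, 22, 23, 24),
  time        = 1988:1992
)
treatment.time <- 1990  # assumed to be the last pre-treatment period here

post <- paths$time > treatment.time
gap2 <- paths$unit2 - paths$synth_unit2
gap7 <- paths$unit7 - paths$synth_unit7

# Average the per-unit gaps, then average over post-treatment periods.
avg_gap <- (gap2 + gap7) / 2
ate <- mean(avg_gap[post])
ate
```

Averaging across units first and across periods second gives every treated unit equal weight, which is one reasonable convention among several.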
## Using the toy data from 'Synth':
library(Synth)
library(SCtools)
data(synth.data)
set.seed(42)

multi <- multiple.synth(
  foo = synth.data,
  predictors = c("X1"),
  predictors.op = "mean",
  dependent = "Y",
  unit.variable = "unit.num",
  time.variable = "year",
  treatment.time = 1990,
  special.predictors = list(
    list("Y", 1991, "mean")
  ),
  treated.units = c(2, 7),
  control.units = c(29, 13, 17),
  time.predictors.prior = c(1984:1989),
  time.optimize.ssr = c(1984:1990),
  unit.names.variable = "name",
  time.plot = 1984:1996,
  gen.placebos = FALSE,
  Sigf.ipop = 2
)

## Plot with the average path of the treated units and the average of their
## respective synthetic controls:
multi$p
Takes the output object of multiple.synth and creates a distribution of placebo average treatment effects, to test the significance of the observed ATE. It does so by sampling k placebos (where k is the number of treated units) nboots times and calculating the average treatment effect of the k placebos each time.
plac.dist(multiple.synth, nboots = 500)

plac_dist(multiple.synth, nboots = 500)
multiple.synth | An object returned by the function multiple.synth. |
nboots | Number of bootstrapped samples of placebos to take. Default is 500. |
The plot.
The observed average treatment effect.
Dataframe where each row is the ATT for one bootstrapped placebo sample, used to build the distribution plot.
Proportion of bootstrapped placebo-sample ATTs that are more extreme than the observed average treatment effect. Equivalent to a p-value in a two-tailed test.
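The resampling scheme can be sketched in a few lines of base R. All numbers and object names below are invented for illustration, and sampling with replacement is an assumption of this sketch; the package's own resampling details may differ.

```r
set.seed(42)

# Invented per-placebo average post-treatment gaps; not package output.
placebo_effects <- c(0.4, -0.6, 0.1, -0.2, 0.8, -0.3)
observed_ate <- 2.75
k <- 2         # number of treated units
nboots <- 500  # number of resamples

# Each draw samples k placebos and averages their effects.
# Assumption: placebos are drawn with replacement.
boot_ate <- replicate(
  nboots,
  mean(sample(placebo_effects, k, replace = TRUE))
)

# Two-tailed share of placebo ATEs more extreme than the observed one.
p_two_tailed <- mean(abs(boot_ate) > abs(observed_ate))
p_two_tailed
```

Because no placebo effect in this toy set exceeds 0.8 in absolute value, no resampled average can reach the observed 2.75 and the resulting proportion is zero.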
## Using the toy data from Synth:
library(Synth)
library(SCtools)
data(synth.data)
set.seed(42)

## Run the function similar to the dataprep() setup:
multi <- multiple.synth(
  foo = synth.data,
  predictors = c("X1", "X2", "X3"),
  predictors.op = "mean",
  dependent = "Y",
  unit.variable = "unit.num",
  time.variable = "year",
  treatment.time = 1990,
  special.predictors = list(
    list("Y", 1991, "mean"),
    list("Y", 1985, "mean"),
    list("Y", 1980, "mean")
  ),
  treated.units = c(2, 7),
  control.units = c(29, 13, 17, 32),
  time.predictors.prior = c(1984:1989),
  time.optimize.ssr = c(1984:1990),
  unit.names.variable = "name",
  time.plot = 1984:1996,
  gen.placebos = TRUE,
  Sigf.ipop = 2,
  strategy = 'multicore'
)

## Plot with the average path of the treated units and the average of their
## respective synthetic controls:
multi$p

## Bootstrap the placebo units to get a distribution of placebo average
## treatment effects, and plot the distribution with a vertical line
## indicating the actual ATT:
att.test <- plac.dist(multi)
att.test$p
Creates plots with the difference between observed units and synthetic controls for the treated and control units. See Abadie, Diamond, and Hainmueller (2011).
plot_placebos(
  tdf = tdf,
  discard.extreme = FALSE,
  mspe.limit = 20,
  xlab = NULL,
  ylab = NULL,
  title = NULL,
  alpha.placebos = 1,
  ...
)
tdf | An object with a list of outcome values for placebos, constructed by generate.placebos. |
discard.extreme | Logical. Whether or not units with high pre-treatment MSPE should be excluded from the plot. Takes a default of FALSE. |
mspe.limit | Numerical. Used if discard.extreme is TRUE. Indicates how many times larger than the treated unit's pre-treatment MSPE a placebo's pre-treatment MSPE must be for that placebo to be considered extreme and discarded. Default is 20. |
xlab | Character. Optional. Label of the x axis. |
ylab | Character. Optional. Label of the y axis. |
title | Character. Optional. Title of the plot. |
alpha.placebos | The transparency setting. Default of 1. |
... | Optional arguments (currently not used). |
p.gaps | Gaps plot indicating the difference between the treated unit, the placebos, and their respective synthetic controls. |
generate.placebos, gaps.plot, synth, dataprep
## Example with toy data from Synth
library(Synth)
library(SCtools)

# Load the simulated data
data(synth.data)

# Execute dataprep to produce the necessary matrices for synth
dataprep.out <- dataprep(
  foo = synth.data,
  predictors = c("X1"),
  predictors.op = "mean",
  dependent = "Y",
  unit.variable = "unit.num",
  time.variable = "year",
  special.predictors = list(
    list("Y", 1991, "mean")
  ),
  treatment.identifier = 7,
  controls.identifier = c(29, 2, 13, 17),
  time.predictors.prior = c(1984:1989),
  time.optimize.ssr = c(1984:1990),
  unit.names.variable = "name",
  time.plot = 1984:1996
)

# Run the synth command to create the synthetic control
synth.out <- synth(dataprep.out, Sigf.ipop = 2)

## Run the generate.placebos command to reassign treatment status
## to each unit listed as control, one at a time, and generate their
## synthetic versions. Sigf.ipop = 2 for faster computing time.
## Increase to the default of 5 for better estimates.
tdf <- generate.placebos(dataprep.out, synth.out, Sigf.ipop = 2, strategy = 'multicore')

## Plot the gaps in outcome values over time of each unit --
## treated and placebos -- to their synthetic controls
p <- plot_placebos(tdf, discard.extreme = TRUE, mspe.limit = 10, xlab = 'Year')
p
Synth Data
Synthetic data that can be used to explore SCtools.
synth.data
a data.frame with 168 rows and 7 columns:
The experimental unit number
year
name of the experimental unit
outcome of interest
Covariate 1
Covariate 2
Covariate 3