Title: | Optimal Level of Significance for Regression and Other Statistical Tests |
---|---|
Description: | The optimal level of significance is calculated based on a decision-theoretic approach. The optimal level is chosen so that the expected loss from hypothesis testing is minimized. A range of statistical tests are covered, including the test for the population mean, population proportion, and a linear restriction in a multiple regression model. The details are covered in Kim and Choi (2020) <doi:10.1111/abac.12172>, and Kim (2021) <doi:10.1080/00031305.2020.1750484>. |
Authors: | Jae H. Kim <[email protected]> |
Maintainer: | Jae H. Kim <[email protected]> |
License: | GPL-2 |
Version: | 2.2 |
Built: | 2024-11-12 02:39:04 UTC |
Source: | https://github.com/cran/OptSig |
The optimal level of significance is calculated based on a decision-theoretic approach. The optimal level is chosen so that the expected loss from hypothesis testing is minimized. A range of statistical tests are covered, including the test for the population mean, population proportion, and a linear restriction in a multiple regression model. The details are covered in Kim and Choi (2020) <doi:10.1111/abac.12172>, and Kim (2021) <doi:10.1080/00031305.2020.1750484>.
The DESCRIPTION file:
Package: | OptSig |
Type: | Package |
Title: | Optimal Level of Significance for Regression and Other Statistical Tests |
Version: | 2.2 |
Imports: | pwr |
Date: | 2022-06-29 |
Author: | Jae H. Kim <[email protected]> |
Maintainer: | Jae H. Kim <[email protected]> |
Description: | The optimal level of significance is calculated based on a decision-theoretic approach. The optimal level is chosen so that the expected loss from hypothesis testing is minimized. A range of statistical tests are covered, including the test for the population mean, population proportion, and a linear restriction in a multiple regression model. The details are covered in Kim and Choi (2020) <doi:10.1111/abac.12172>, and Kim (2021) <doi:10.1080/00031305.2020.1750484>. |
License: | GPL-2 |
NeedsCompilation: | no |
Packaged: | 2022-07-03 03:23:48 UTC; jh808 |
Date/Publication: | 2022-07-03 12:30:14 UTC |
Repository: | https://jh8080.r-universe.dev |
RemoteUrl: | https://github.com/cran/OptSig |
RemoteRef: | HEAD |
RemoteSha: | 2fbf7b2e65234a14d88d32245210faeb8c5164d2 |
Index of help topics:
Opt.sig.norm.test       Optimal significance level calculation for the mean of a normal distribution (known variance)
Opt.sig.t.test          Optimal significance level calculation for t-tests of means (one sample, two samples and paired samples)
OptSig-package          Optimal Level of Significance for Regression and Other Statistical Tests
OptSig.2p               Optimal significance level calculation for the test for two proportions (same sample sizes)
OptSig.2p2n             Optimal significance level calculation for the test for two proportions (different sample sizes)
OptSig.Boot             Optimal Significance Level for the F-test using the bootstrap
OptSig.BootWeight       Weighted Optimal Significance Level for the F-test based on the bootstrap
OptSig.Chisq            Optimal Significance Level for a Chi-square test
OptSig.F                Optimal Significance Level for an F-test
OptSig.Weight           Weighted Optimal Significance Level for the F-test based on the assumption of normality in the error term
OptSig.anova            Optimal significance level calculation for balanced one-way analysis of variance tests
OptSig.p                Optimal significance level calculation for proportion tests (one sample)
OptSig.r                Optimal significance level calculation for correlation test
OptSig.t2n              Optimal significance level calculation for two samples (different sizes) t-tests of means
Power.Chisq             Function to calculate the power of a Chi-square test
Power.F                 Function to calculate the power of an F-test
R.OLS                   Restricted OLS estimation and F-test
data1                   Data for the U.S. production function estimation
The package accompanies the paper: Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach. Abacus. Wiley.
It provides functions for computing the optimal level of significance for the test of a linear restriction in a regression model.
Other basic statistical tests, including those for population mean and proportion, are also covered using the functions from the pwr package.
Jae H. Kim <[email protected]>
Maintainer: Jae H. Kim <[email protected]>
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(0.94,nrow=1)
# Model Estimation and F-test
M=R.OLS(y,x,Rmat,rvec)
# Degrees of Freedom and estimate of non-centrality parameter
K=ncol(x)+1; T=length(y)
df1=nrow(Rmat); df2=T-K; NCP=M$ncp
# Optimal level of Significance: Under Normality
OptSig.F(df1,df2,ncp=NCP,p=0.5,k=1, Figure=TRUE)
U.S. output, capital, and labor in natural logs for the year 2005
data("data1")
A data frame with 51 observations on the following 3 variables.
lnoutput
natural log of output
lnlabor
natural log of labor
lncapital
natural log of capital
The data contain 51 observations: one for each of the 50 U.S. states and Washington, DC.
Gujarati, D. 2015, Econometrics by Example, Second edition, Palgrave.
See Section 2.2 of Gujarati (2015).
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach, Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
data(data1)
Computes the optimal significance level for the mean of a normal distribution (known variance)
Opt.sig.norm.test(ncp=NULL,d=NULL,n=NULL,p=0.5,k=1,alternative="two.sided",Figure=TRUE)
ncp |
Non-centrality parameter |
d |
Effect size, Cohen's d |
n |
Sample size |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
Either the ncp or the d value should be given.
In general, if X ~ N(mu, sigma^2), let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
d = (mu1 - mu0)/sigma (Cohen's d)
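As an illustration of this relationship (a minimal sketch, not taken from the package manual; the numerical values are arbitrary), supplying d together with n, or supplying the implied ncp directly, should give the same optimal level:
d = 0.2; n = 60
ncp = sqrt(n)*d   # ncp = sqrt(n)*(mu1 - mu0)/sigma when d = (mu1 - mu0)/sigma
Opt.sig.norm.test(d=d, n=n, alternative="two.sided", Figure=FALSE)
Opt.sig.norm.test(ncp=ncp, n=n, alternative="two.sided", Figure=FALSE)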
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
Opt.sig.norm.test(d=0.2,n=60,alternative="two.sided")
Computes the optimal significance level for t-tests of means (one sample, two samples and paired samples)
Opt.sig.t.test(ncp=NULL,d=NULL,n=NULL,p=0.5,k=1, type="one.sample",alternative="two.sided",Figure=TRUE)
ncp |
Non-centrality parameter |
d |
Effect size |
n |
Sample size |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
type |
Type of t-test: one-sample, two-sample, or paired-sample |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
Either the ncp or the d value should be given, together with the value of n.
In general, if X ~ N(mu, sigma^2), let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
d = (mu1 - mu0)/sigma (Cohen's d)
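For example (a sketch with arbitrary values, assuming the package converts d and n to ncp exactly as above), the following two calls should agree for a one-sample t-test:
d = 0.2; n = 60
Opt.sig.t.test(d=d, n=n, type="one.sample", alternative="two.sided", Figure=FALSE)
Opt.sig.t.test(ncp=sqrt(n)*d, n=n, type="one.sample", alternative="two.sided", Figure=FALSE)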
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
Opt.sig.t.test(d=0.2,n=60,type="one.sample",alternative="two.sided")
Computes the optimal significance level for the test for two proportions (same sample sizes)
OptSig.2p(ncp=NULL,h=NULL,n=NULL,p=0.5,k=1,alternative="two.sided",Figure=TRUE)
ncp |
Non-centrality parameter |
h |
Effect size, Cohen's h |
n |
Number of observations (per sample) |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
Either ncp or h value should be specified.
For h, refer to Cohen (1988) or Champely (2017)
In general, if X ~ N(mu, sigma^2), let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
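As a hedged sketch (the proportions 0.65 and 0.55 below are illustrative values, not taken from the manual), Cohen's h can be obtained with ES.h() from the pwr package and passed to OptSig.2p:
library(pwr)
h = ES.h(0.65, 0.55)   # Cohen's h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2))
OptSig.2p(h=h, n=60, alternative="two.sided", Figure=FALSE)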
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package,
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
OptSig.2p(h=0.2,n=60,alternative="two.sided")
Computes the optimal significance level for the test for two proportions (different sample sizes)
OptSig.2p2n(ncp=NULL,h=NULL,n1=NULL,n2=NULL,p=0.5,k=1,alternative="two.sided",Figure=TRUE)
ncp |
Non-centrality parameter |
h |
Effect size, Cohen's h |
n1 |
Number of observations (1st sample) |
n2 |
Number of observations (2nd sample) |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
Either ncp or h value should be specified.
For h, refer to Cohen (1988) or Champely (2017).
Assume X ~ N(mu, sigma^2), and let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
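A minimal sketch with illustrative proportions (0.70 vs 0.55, chosen arbitrarily) and unequal sample sizes:
library(pwr)
h = ES.h(0.70, 0.55)   # Cohen's h from the pwr package
OptSig.2p2n(h=h, n1=80, n2=245, alternative="greater", Figure=FALSE)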
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
OptSig.2p2n(h=0.30,n1=80,n2=245,alternative="greater")
Computes the optimal significance level for balanced one-way analysis of variance (ANOVA) tests
OptSig.anova(K = NULL, n = NULL, f = NULL, p = 0.5, k = 1, Figure = TRUE)
K |
Number of groups |
n |
Number of observations (per group) |
f |
Effect size |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
For the value of f, refer to Cohen (1988) or Champely (2017)
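For example (a sketch; the values of K and n are arbitrary), a conventional "medium" ANOVA effect size can be looked up with cohen.ES() from the pwr package:
library(pwr)
f = cohen.ES(test="anov", size="medium")$effect.size   # f = 0.25
OptSig.anova(K=4, n=20, f=f, Figure=FALSE)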
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
OptSig.anova(f=0.28,K=4,n=20)
The function calculates the optimal level of significance for the F-test
The bootstrap can be conducted using either iid resampling or wild bootstrap.
OptSig.Boot(y,x,Rmat,rvec,p=0.5,k=1,nboot=3000,wild=FALSE,Figure=TRUE)
y |
a T by 1 matrix (column vector) of the dependent variable |
x |
a T by K matrix of the K independent variables |
Rmat |
a matrix for J restrictions, J by (K+1) |
rvec |
a vector for restrictions, J by 1 |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
nboot |
the number of bootstrap iterations, the default is 3000 |
wild |
if TRUE, the wild bootstrap is conducted; if FALSE (default), the bootstrap is based on iid residual resampling |
Figure |
show graph if TRUE (default). No graph otherwise |
See Kim and Choi (2020)
alpha.opt |
Optimal level of significance |
crit.opt |
Critical value at the optimal level |
beta.opt |
Type II error probability at the optimal level |
Applicable to a linear regression model
The black curve in the figure plots the density under H0; the blue curve plots the density under H1.
Jae H. Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach, Abacus, Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
# Define Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(0.94,nrow=1)
OptSig.Boot(y,x,Rmat,rvec,p=0.5,k=1,nboot=1000,Figure=TRUE)
The function calculates the weighted optimal level of significance for the F-test
The weights are obtained from the bootstrap distribution of the non-centrality parameter estimates
OptSig.BootWeight(y,x,Rmat,rvec,p=0.5,k=1,nboot=3000,wild=FALSE,Figure=TRUE)
y |
a T by 1 matrix (column vector) of the dependent variable |
x |
a T by K matrix of the K independent variables |
Rmat |
a matrix for J restrictions, J by (K+1) |
rvec |
a vector for restrictions, J by 1 |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
nboot |
the number of bootstrap iterations, the default is 3000 |
wild |
if TRUE, the wild bootstrap is conducted; if FALSE (default), the bootstrap is based on iid residual resampling |
Figure |
show graph if TRUE (default); No graph if FALSE |
The bootstrap can be conducted using either iid resampling or wild bootstrap.
alpha.opt |
Optimal level of significance |
crit.opt |
Critical value at the optimal level |
Applicable to a linear regression model
Jae H. Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach. Abacus, Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
# Define Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(0.94,nrow=1)
OptSig.BootWeight(y,x,Rmat,rvec,p=0.5,k=1,nboot=1000,Figure=TRUE)
The function calculates the optimal level of significance for a Chi-square test
OptSig.Chisq(w=NULL, N=NULL, ncp=NULL, df, p = 0.5, k = 1, Figure = TRUE)
OptSig.Chisq(w=NULL, N=NULL, ncp=NULL, df, p = 0.5, k = 1, Figure = TRUE)
w |
Effect size, Cohen's w |
N |
Total number of observations |
ncp |
a value of the non-centrality parameter |
df |
the degrees of freedom |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
Figure |
show graph if TRUE (default); No graph if FALSE |
See Kim and Choi (2020)
alpha.opt |
Optimal level of significance |
crit.opt |
Critical value at the optimal level |
beta.opt |
Type II error probability at the optimal level |
Applicable to any Chi-square test. Either ncp or w (with N) should be given.
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae. H Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
# Optimal level of Significance for the Breusch-Pagan test: Chi-square version
data(data1)   # call the data: Table 2.1 of Gujarati (2015)
# Extract Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrix: the slope coefficients sum to 1
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(1,nrow=1)
# Model Estimation
M=R.OLS(y,x,Rmat,rvec); print(M$coef)
# Breusch-Pagan test for heteroskedasticity
e = M$resid[,1]   # residuals from unrestricted model estimation
# Restriction matrices for the slope coefficients being 0
Rmat=matrix(c(0,0,1,0,0,1),nrow=2); rvec=matrix(0,nrow=2)
# Model Estimation for the auxiliary regression
M1=R.OLS(e^2,x,Rmat,rvec)
# Degrees of Freedom and estimate of non-centrality parameter
df1=nrow(Rmat); NCP=M1$ncp
# LM stat and p-value
LM=nrow(data1)*M1$Rsq[1,1]
pval=pchisq(LM,df=df1,lower.tail = FALSE)
OptSig.Chisq(df=df1,ncp=NCP,p=0.5,k=1, Figure=TRUE)
The function calculates the optimal level of significance for an F-test
OptSig.F(df1, df2, ncp, p = 0.5, k = 1, Figure = TRUE)
OptSig.F(df1, df2, ncp, p = 0.5, k = 1, Figure = TRUE)
df1 |
the first degrees of freedom for the F-distribution |
df2 |
the second degrees of freedom for the F-distribution |
ncp |
a value of the non-centrality parameter |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
Figure |
show graph if TRUE (default); No graph if FALSE |
See Kim and Choi (2020)
alpha.opt |
Optimal level of significance |
crit.opt |
Critical value at the optimal level |
beta.opt |
Type II error probability at the optimal level |
Applicable to any test statistic that follows an F-distribution
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae. H Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
# Define Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(0.94,nrow=1)
# Model Estimation and F-test
M=R.OLS(y,x,Rmat,rvec)
# Degrees of Freedom and estimate of non-centrality parameter
K=ncol(x)+1; T=length(y)
df1=nrow(Rmat); df2=T-K; NCP=M$ncp
# Optimal level of Significance: Under Normality
OptSig.F(df1,df2,ncp=NCP,p=0.5,k=1, Figure=TRUE)
Computes the optimal significance level for proportion tests (one sample)
OptSig.p(ncp=NULL,h=NULL,n=NULL,p=0.5,k=1,alternative="two.sided",Figure=TRUE)
ncp |
Non-centrality parameter |
h |
Effect size, Cohen's h |
n |
Number of observations (per sample) |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
Either the ncp or the h value should be given.
For h, refer to Cohen (1988) or Champely (2017).
In general, if X ~ N(mu, sigma^2), let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
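A hedged sketch (hypothesized proportions 0.60 under H1 against 0.50 under H0, chosen purely for illustration):
library(pwr)
h = ES.h(0.60, 0.50)   # Cohen's h for a one-sample proportion test
OptSig.p(h=h, n=60, alternative="two.sided", Figure=FALSE)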
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
OptSig.p(h=0.2,n=60,alternative="two.sided")
Computes the optimal significance level for the correlation test
OptSig.r(r=NULL,n=NULL,p=0.5,k=1,alternative="two.sided",Figure=TRUE)
r |
Linear correlation coefficient |
n |
sample size |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II error, k = L2/L1, default is k = 1 |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
In general, if X ~ N(mu, sigma^2), let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
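As a sketch (simulated data, purely for illustration and not from the manual), the correlation under H1 may be set from a sample estimate:
set.seed(1)
x1 = rnorm(60); x2 = 0.2*x1 + rnorm(60)   # simulated data for illustration only
OptSig.r(r=cor(x1, x2), n=60, alternative="two.sided", Figure=FALSE)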
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
OptSig.r(r=0.2,n=60,alternative="two.sided")
Computes the optimal significance level for two samples (different sizes) t-tests of means
OptSig.t2n(ncp=NULL,d=NULL,n1=NULL,n2=NULL,p=0.5,k=1,alternative="two.sided",Figure=TRUE)
ncp |
Non-centrality parameter |
d |
Effect size |
n1 |
Number of observations in the first sample |
n2 |
Number of observations in the second sample |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less" |
Figure |
show graph if TRUE (default); No graph if FALSE |
Refer to Kim and Choi (2020) for the details of k and p
Either the ncp or the d value should be specified.
In general, if X ~ N(mu, sigma^2), let H0: mu = mu0 and H1: mu = mu1. Then
ncp = sqrt(n)*(mu1 - mu0)/sigma
d = (mu1 - mu0)/sigma (Cohen's d)
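With unequal sample sizes the usual conversion is ncp = d*sqrt(n1*n2/(n1+n2)); this formula is an assumption here (it is not stated in the manual), so the sketch below simply illustrates the two ways of calling the function with arbitrary values:
d = 0.6; n1 = 90; n2 = 60
OptSig.t2n(d=d, n1=n1, n2=n2, alternative="greater", Figure=FALSE)
# assuming ncp = d*sqrt(n1*n2/(n1+n2)), the two calls should agree
OptSig.t2n(ncp=d*sqrt(n1*n2/(n1+n2)), n1=n1, n2=n2, alternative="greater", Figure=FALSE)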
alpha.opt |
Optimal level of significance |
beta.opt |
Type II error probability at the optimal level |
Also refer to the manual for the pwr package
The black curve in the figure is the line of enlightened judgement: see Kim and Choi (2020). The red dot indicates the optimal significance level that minimizes the expected loss: (alpha.opt, beta.opt). The blue horizontal line indicates the case of alpha = 0.05 as a reference point.
Jae H. Kim (using a function from the pwr package)
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach: Abacus: a Journal of Accounting, Finance and Business Studies. Wiley. <https://doi.org/10.1111/abac.12172>
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,NJ: Lawrence Erlbaum.
Stephane Champely (2017). pwr: Basic Functions for Power Analysis. R package version 1.2-1. https://CRAN.R-project.org/package=pwr
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
OptSig.t2n(d=0.6,n1=90,n2=60,alternative="greater")
The function calculates the weighted optimal level of significance for the F-test
The weights are obtained from a folded-normal distribution with mean m and standard deviation delta.
OptSig.Weight(df1, df2, m, delta = 2, p = 0.5, k = 1, Figure = TRUE)
df1 |
the first degrees of freedom for the F-distribution |
df2 |
the second degrees of freedom for the F-distribution |
m |
a value of the non-centrality parameter, the mean of the folded-normal distribution |
delta |
standard deviation of the folded-normal distribution |
p |
prior probability for H0, default is p = 0.5 |
k |
relative loss from Type I and II errors, k = L2/L1, default is k = 1 |
Figure |
show graph if TRUE (default); No graph if FALSE |
See Kim and Choi (2020)
alpha.opt |
Optimal level of significance |
crit.opt |
Critical value at the optimal level |
The figure shows the folded-normal distribution
Jae H. Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach, Abacus, Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
# Define Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(0.94,nrow=1)
# Model Estimation and F-test
M=R.OLS(y,x,Rmat,rvec)
# Degrees of Freedom and estimate of non-centrality parameter
K=ncol(x)+1; T=length(y)
df1=nrow(Rmat); df2=T-K; NCP=M$ncp
OptSig.Weight(df1,df2,m=NCP,delta=3,p=0.5,k=1,Figure=TRUE)
This function calculates the power of a Chi-square test, given the value of non-centrality parameter
Power.Chisq(df, ncp, alpha, Figure = TRUE)
df |
the degrees of freedom |
ncp |
a value of the non-centrality parameter |
alpha |
the level of significance |
Figure |
show graph if TRUE (default); No graph if FALSE |
See Kim and Choi (2020)
Power |
Power of the test |
Crit.val |
Critical value at the alpha level of significance |
See the Application Section and Appendix of Kim and Choi (2020)
Jae H. Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach, Abacus, Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
Power.Chisq(df=5,ncp=5,alpha=0.05,Figure=TRUE)
This function calculates the power of an F-test, given the value of non-centrality parameter
Power.F(df1, df2, ncp, alpha, Figure = TRUE)
df1 |
the first degrees of freedom for the F-distribution |
df2 |
the second degrees of freedom for the F-distribution |
ncp |
a value of the non-centrality parameter |
alpha |
the level of significance |
Figure |
show graph if TRUE (default); No graph if FALSE |
See Kim and Choi (2020)
Power |
Power of the test |
Crit.val |
Critical value at the alpha level of significance |
See Application Section and Appendix of Kim and Choi (2020)
Jae H. Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach, Abacus, Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
# Define Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(0.94,nrow=1)
# Model Estimation and F-test
M=R.OLS(y,x,Rmat,rvec)
# Degrees of Freedom and estimate of non-centrality parameter
K=ncol(x)+1; T=length(y)
df1=nrow(Rmat); df2=T-K; NCP=M$ncp
Power.F(df1,df2,ncp=NCP,alpha=0.20747,Figure=TRUE)
Function to calculate the restricted (under H0) OLS estimators and the F-test statistic
R.OLS(y, x, Rmat, rvec)
y |
a T by 1 matrix (column vector) of the dependent variable |
x |
a T by K matrix of the K independent variables |
Rmat |
a matrix for J restrictions, J by (K+1) |
rvec |
a vector for restrictions, J by 1 |
Rmat and rvec are the matrices for the linear restrictions, which a user should supply.
Refer to an econometrics textbook for details.
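For instance (a purely illustrative sketch, not from the manual): with the intercept added automatically, the coefficient vector is (b0, b1, b2) for two regressors, and each row of Rmat, paired with the matching element of rvec, encodes one restriction of the form Rmat %*% beta = rvec.
# Sketch: jointly test H0: b1 + b2 = 1 and b2 = 0.5 (hypothetical values)
Rmat=matrix(c(0,1,1,
              0,0,1), nrow=2, byrow=TRUE)   # one row per restriction, J by (K+1)
rvec=matrix(c(1,0.5), nrow=2)               # J by 1 vector of restriction values
# M=R.OLS(y, x, Rmat, rvec)                 # with y and x defined as in the Examples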
coef |
matrix of estimated coefficients, (K+1) by 2, under H1 and H0 |
RSq |
R-square values under H1 and H0, 2 by 1 |
resid |
residual vector under H1 and H0, T by 2 |
F.stat |
F-statistic and p-value |
ncp |
non-centrality parameter, estimated by replacing the unknown parameters with their OLS estimates |
The function automatically adds an intercept, so the user need not include a vector of ones in the x matrix.
Jae H. Kim
Kim and Choi, 2020, Choosing the Level of Significance: A Decision-theoretic Approach, Abacus, Wiley. <https://doi.org/10.1111/abac.12172>
Leamer, E. 1978, Specification Searches: Ad Hoc Inference with Nonexperimental Data, Wiley, New York.
Kim, JH and Ji, P. 2015, Significance Testing in Empirical Finance: A Critical Review and Assessment, Journal of Empirical Finance 34, 1-14. <DOI:http://dx.doi.org/10.1016/j.jempfin.2015.08.006>
Kim, Jae H., 2020, Decision-theoretic hypothesis testing: A primer with R package OptSig, The American Statistician. <https://doi.org/10.1080/00031305.2020.1750484>
data(data1)
# Define Y and X
y=data1$lnoutput; x=cbind(data1$lncapital,data1$lnlabor)
# Restriction matrices to test for constant returns to scale
Rmat=matrix(c(0,1,1),nrow=1); rvec=matrix(1,nrow=1)
# Model Estimation and F-test
M=R.OLS(y,x,Rmat,rvec)