Regression Tables

The 10 Commandments for Regression Tables

These commandments are ordered by how controversial I think they might be. If you are my student or I am your referee, they are not optional.

  1. Report the number of observations, the R-squared, and the root mean squared error for each regression.

  2. Report the dependent variable and the estimation method, in the table’s caption if they are common to all specifications or as a column heading if they vary across specifications.

  3. Use self-explanatory labels for your explanatory variables. Cryptic abbreviations or symbols from the model section force the reader to page back and forth to understand your results. With five or six columns of regression results there should be enough room to use words to describe each regressor. Put the symbol used in the model in parentheses below the label.

  4. Choose sensible units for variables. The coefficients should not be very small (e.g. 0.000032) or very large (e.g. 75432.8). As a rule of thumb, coefficients should only use the first two or three places to the left or right of the decimal point. One exception is the case where variables are unit-free because you are estimating a log-log model. In that case coefficient size is inherently meaningful.

  5. The presentation version of the table should be in large type. Don’t show a table full of tiny numbers and say “I know you can’t read this but…” If necessary, place some of your control variables in an auxiliary table so you can focus attention on the variables of interest.

  6. Put standard errors in the same column as the coefficients. Regression packages put standard errors alongside coefficients in separate columns, but you should present each regression as a single column in your results table. Columns should be used for 4-8 alternative specifications and samples. Thus the standard error should appear in parentheses below the related coefficient. Use the Stata estout package. Install it by typing the following on the Stata command line:

    . ssc install estout, replace
    
  7. Insert key tables inside the body of the paper. Journals insist upon tables at the end for the final submitted version of the paper. This does not mean you should do it for working papers or first submissions. There is a reason why the printed version of your article puts the tables back into the text: it is easier to read a paper that way without having to constantly flip to the end to find results and then flip back to the text for interpretation. By putting the tables in the text you will also be more aware of whether your paper has the right mix of text and tabular information.

  8. Display standard errors, not t-statistics or p-values. Unless the test that the coefficient is not equal to zero is the only conceivable test of interest, display standard errors. These give readers a direct view of the precision with which you are estimating the coefficient. They are useful information for a variety of possible tests and are still valuable even if the reader prefers not to engage in classical hypothesis testing at all. If my arguments have not persuaded you, let me appeal to a higher authority:

    We’re not only (or even primarily) interested in formal hypothesis testing: we like to see the standard errors in parentheses under our regression coefficients. These provide a summary measure of precision that can be used to construct confidence intervals, compare estimators, and test any hypothesis that strikes us, now or later.

    Angrist and Pischke, Mostly Harmless Econometrics, p. 302

  9. Use “a” (1%), “b” (5%), and “c” (10%) superscripts to show statistical significance, if you show it at all. Using multiple asterisks (***) to display statistical significance wastes scarce horizontal space in a table. In my perception, a table stuffed with asterisks looks like you are showing off. If you really like asterisks, and there is something to be said for following common practice, then just pick a level of significance (five percent or possibly one percent) that seems appropriate for your study and then use a single asterisk for that level. You could use a squiggle for coefficients that are only marginally significant. Another approach that is gaining favour is to put the significant coefficients in bold font.

  10. Report significance for two-tailed tests only. You may think it is OK to use one-tailed tests if your theory tells you the sign of the coefficient. However, this potential justification is overwhelmed by the common practice of using two-tailed test criteria. Many readers will view the use of one-tailed tests as a cynical ploy to exaggerate the significance of your results. Thus, with infinite degrees of freedom, variables are significant at the 5% level for t-stats over 1.96, NOT 1.645.
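
Several of the commandments above can be combined in one esttab call. What follows is a minimal do-file sketch, not a definitive recipe. It assumes estout is already installed via ssc, and it uses Stata's built-in auto dataset purely for illustration. The rescaled variables price_k and weight_k, the three specifications, and the caption text are hypothetical choices made for this example.

    * Illustrative sketch: the data, variables, and specifications are assumptions
    sysuse auto, clear

    * Commandment 4: rescale so coefficients have a readable magnitude
    generate price_k  = price / 1000
    generate weight_k = weight / 1000

    * Commandment 3: self-explanatory labels instead of cryptic names
    label variable weight_k "Weight (1000 lbs)"
    label variable mpg      "Mileage (miles per gallon)"
    label variable foreign  "Foreign make"

    * One stored estimate per column of the table
    eststo clear
    eststo: regress price_k weight_k
    eststo: regress price_k weight_k mpg
    eststo: regress price_k weight_k mpg foreign

    * Commandments 1, 2, 6, 8, 9: N, R-squared and RMSE in the footer,
    * dependent variable and method in the caption, standard errors in
    * parentheses under each coefficient, a/b/c letters instead of asterisks
    esttab, b(3) se(3) label nomtitles                          ///
        star(c 0.10 b 0.05 a 0.01)                              ///
        stats(N r2 rmse, fmt(0 3 3)                             ///
              labels("Observations" "R-squared" "RMSE"))        ///
        title("Dependent variable: price (1000 USD). Method: OLS.")

    * Commandment 10: large-sample critical values at the 5% level
    display invnormal(0.975)   // two-tailed cutoff (the 1.96 above)
    display invnormal(0.95)    // one-tailed cutoff (the 1.645 above)

In the plain-text output the significance letters appear as suffixes on the coefficients rather than as true superscripts. The nomtitles option drops the per-column model titles because the dependent variable is the same in every column and is already stated in the caption. If it varied across specifications, column headings would be the place for it.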