
Digital Education team blog


Ideas and reflections from UCL's Digital Education team


Stata – reporting without all the cut and paste

By Jim R Tyson, on 7 November 2022

It’s more than a year now since Chuck Huber (Stata Corp Director for Statistical Outreach) posted a series of long, detailed posts about new functionality in Stata 17: the revised table command and the new collect command.

Over this past year, I’ve been getting to grips with them by reading the Stata blog, hanging out in Stata forums, and even venturing (timidly) a question on Stack Overflow.  It’s only an impression, but it seems to me that the new stuff is only slowly gaining traction among Stata users.  Which is a shame.  It may be a small exaggeration to call it exciting, but this stuff really will make life easier.

The new table command alone is a boon for regular users of Stata.  It provides a nice, logical, flexible approach to building tables in Stata.  Once, we had to navigate a smallish bundle of user-contributed commands such as tabout, estout, and esttab, and of course the old table and tabulate, to get the output we wanted, formatted as we wanted it, and then wrangle it into Word or some other format.  Now there’s one command to rule them all.  People might well still use their old favourites, but there is much less need to.  And behind the table functionality lurks the new collect command: a uniform and (relatively) simple way of accessing and laying out all the stuff computed by Stata commands that we don’t usually see in the default output.
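To give a flavour of the new table syntax, here is a minimal sketch.  It assumes Stata 17 or later and uses the auto dataset that ships with Stata; the choice of variables and the number format are mine, purely for illustration.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Rows: domestic vs foreign cars; columns: repair record
* Each cell shows the mean price, displayed without decimal places
table (foreign) (rep78), statistic(mean price) nformat(%9.0f)
```

The first set of parentheses names the row variables and the second the column variables.  Adding further statistic() options adds more results to each cell.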

Previously, laying out table output was unnecessarily complicated, and the results were rather basic.  The new command not only makes it easier but, combined with putdocx, putexcel, and putpdf, provides a simple way to push your output into a suitable format for reporting.  With collect, you can capture the output of Stata commands and use the same techniques to format, lay out, label, and decorate it in almost any way you want.  So it’s not only about cross tabulations of data but also, for example, running multiple regression models and laying the results out for comparison.
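As a rough sketch of the regression case, the do file below collects two models, lays their coefficients and standard errors out side by side, and writes the table to a Word document.  It assumes Stata 17 or later; the models, the heading text, and the file name models.docx are hypothetical choices of mine, not anything prescribed by Stata.

```stata
* Load the example dataset and start with an empty collection
sysuse auto, clear
collect clear

* The collect prefix stores each command's results in the collection
collect: regress price mpg
collect: regress price mpg weight

* Rows: coefficient names; columns: coefficients and standard errors per model
collect layout (colname) (cmdset#result[_r_b _r_se])

* Write the laid-out table to a Word document (file name is illustrative)
putdocx begin
putdocx paragraph
putdocx text ("Regression models for car price")
putdocx collect
putdocx save models.docx, replace
```

Because the whole thing is a script, changing a model or a format means editing one line and re-running, rather than repeating the cut and paste.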

To my mind, the real impact of this is to encourage us all to stop the cut, paste, wrangle approach – using Stata and Word interactively – and move to scripting more of the reporting stage of research.  This is good for us – the code we write to do it is re-usable; the analysis is more reproducible; we can standardize reporting more easily.

It’s true that moving from interactive use to scripting requires some up-front effort: you have to spend some time learning table and collect.  But it will be worth it.  Scripting this stuff means more time thinking through what output we really want, and less time hunting through menus and searching help for options.

So, after my year with all this, I’m taking a second shot at spreading the word.  On December 2nd I’m running a two-hour workshop to introduce these commands.  The workshop requires some basic Stata knowledge and a commitment to scripting your analysis with do files, but you don’t have to be a Stata maven to attend (I’m not), just someone who wants an easier life and better Stata reporting.

You can book at https://tinyurl.com/StataTablesDSD and I hope you will!
