X Close

Digital Education team blog

Home

Ideas and reflections from UCL's Digital Education team

Menu

Archive for the 'Quantitative subjects' Category

Half the struggle with digital knowledge is knowing what it’s for

By Jim R Tyson, on 10 April 2024

Sometimes I hear an announcement about an update or improvement to some technology or ‘app’ (as the youngsters say) that sounds exciting and eminently worth investigating.  When I start sifting through Google results to find out more, I can spend a day or two sometimes working out what’s going on.  That’s OK, it’s part of my job to do this and then, if what I’ve learned is useful, to find ways to communicate it to other people.

For example, Excel now allows users to create ad hoc and custom functions using lambda().  Now, if you are a computer scientist, mathematician, philosopher or linguist, you will probably have heard of the lambda calculus, an important mathematical invention of the twentieth century that influenced all those disciplines.  It provided a way to formally characterise computation as function application and abstraction (roughly, don’t quote me on this – it was a long time ago).  Now, even students of computer science may sometimes encounter the calculus and end up wondering ‘OK, but what’s it for?’

Well, one way to demonsrate its practical use is by introducing the world of the lambda() function in Excel: it allows you to formally define new Excel functions.  I immediately spotted some  uses  for this.

Descriptive statistics for sub-populations

Excel provides all the most common and useful statistical calculations as basic functions such as count(), average(), var(), stdev() and for some functions there are conditional versions countif(), averageif(),which allow for subsetting your data.  So, it might be that you have two columns of data, the first is some interesting measure (temperature? height? resting bpm?) and the second some characteristic such as ethnicity or gender.

It might be that you want to know what is the average resting heart rate (for example) of the male participants in  your study.  You can do this using averageif(): =averageif(B1:B100, “male”, C1:C100).  Assuming that the sex data are in the range B1:B100 and the resting heart rate data are in C1:100.  In fact, Excel has an averageifs() to help with cases with multiple selection criteria.  So that’s good for data analysts using Excel because it’s a common analytical approach.

However, this only works for averageif(), countif(), maxif(), minif() – there is no conditional var(), stdev(), skew() or kurt().  Well, for variance and standard deviation, we can construct a pivot table to get the subpopulation analysis we want and that’s great.  But occasionally (probably not that often) we want to calculate the skew in resting heart rate for all male participants, and maybe even the kurtosis.  And here, after that long lead in, is where the lambda() function proves it’s worth.  First, lets look at how we would caculate the kurtosis in some measure as a function of gender, using a combination of built-in Excel formulas: Given a data table where column A is the categorical (eg gender) variable and B is the measure of interest:

1 A B
2 1 56
3 1 62
4 1 48
5 1 58
6 1 58
7 1 55
8 1 42
9 1 54
10 1 47
11 2 52
12 2 59
13 2 56
14 2 45
15 2 63
16 2 52
17 2 44

The formula we want is kurt(if(A1:A17=1,B1:B17))

So there are three parameters, the first is the range to which we apply the second (the selection criterion value (here, 1 = male)), the third is the range from which this conditional statement will select data.  I tested this formula and it works fine, it’s just a bit clunky to use compared to the built-in averageif() etc.  So, I decided to reconstruct it using the lambda() function, to produce my own kurtif() function.  In an empty cell, we put the lambda expression:

=LAMBDA(a,n,b, KURT(IF(a = n, b)))

with the three parameters represented (arbitrarily) by a, n and b.  A moment’s reflection and we see the relation between the lambda expression and our previous Excel formula.  From a practical point of view, the lambda expression tells us where in the calculation to plug in the values a, n and b to get our result.

When you enter this expression in a blank cell and hit return you will see the warning message

#CALC!

 

which is Excel recognising that the cell contains a lambda expression.  The next step is to name the new function kurtif().  Copy the new lambda expression to the clipboard (highlight and control-c) then from the formula tab on the ribbon select and open the name manager.  In the dialog that this opens press the new button; give kurtif as the name for the new function and then paste the lambda expression from the clipboard then OK and close the name manager dialog.

Now you can carry out the computation as:

=kurtif(A1:17,1,B1:B17)

which is simpler and has the advantage of looking very like averageif() and the other, similar functions.

A missing significance calculation in Excel

Excel has a very simple to use function that will calculate a correlation coefficient (R) from two arrays of data: =correl(Array1,Array2).  It is a minor annoyance that this calculation doesn’t return a p value for R, allowing us to test the null hypothesis that the true correlation between Array1 and Array2 is zero.  So, let’s assume that we have calculated R for two arrays of data, and we know that n is just the count of one array, and the result is in cell H1 (for no particular reason).

Now, it’s a fairly simple trick to calculate a t value based on the correlation coefficient.  The formula is

and since we have just calculated r, it is simple to calculate t with the formula (and put the result in H3):

=H1*(sqrt(count(Array1)-1))/sqrt(1*-H1^2)

The last (and for now separate) step is to find the significance for this t score with the two-tailed t distribution function with n-2 degrees of freedom

=t.dist.2t(H3, count(Array1)-2)

And there we have it.  So, it would be useful to have a little helper function we could apply simply,  to calculate  from r.  Here is the lambda function code:

=LAMBDA(r,n,(r*SQRT(n-2))/SQRT(1-(r^2)))

and we can name it and use it as before.  I named mine ‘convertRtoT’ and used it like this with the correlation coefficient in H1 and n = 30:

convertRtoT(H1,30)

So now we know what the lambda() function is for.  The example is perhaps a little obscure, but the principle – that half the struggle with digital knowledge is knowing what it’s for – holds for far more mundane cases: I’ve been learning Power BI, and while there are simple answers – power BI is for visualisation – it’s only after I’ve been through a few hours of tutorials that I’m really understanding what it’s about.

Homework

If you want to check out the example calculations, please go ahead.  I checked them all in Excel (and for the t to r conversion, I checked my result against R), but it’s always possible to make a bluder when copying and pasting.  But finding and fixing errors is good practice.  If you want more practice, then I would suggest creating a function skewif() that works like kurtif(), taking two arrays ( a score and a selection criterion) a gives the skewness for the cases selected by the criterion.  Good luck.

(this blog post was supported by the music of Iannis Xenakis, “Six Chansons No 1, ca sent le musc”)

Starting up Stata with personalised options

By Jim R Tyson, on 4 March 2024

There are often things one can do to personalise and improve ones experience with software that involve some cusotmization, and that may be easy on your own machine, but less easy if your machine is managed by the organisation (in this case UCL).  My Laptop is managed by UCL (although I do have some elevated rights).

In an effort to improve my Stata workflow and output, I have several graphing options that I want to apply to all graphis I produce in Stata.  Typically, I want the Title left justified, in black, to take up the whole width of the graph (rather than the plot region) and to appear top left (at 11 o’clock).  The graph region colour should be white with no axis lines for x or y axes and with no fill-colour or border colour.

To simplify this I put these in a global macro ‘graph_opts‘ and add the macro to the start of any graphing command as $graph_opts . Anyone who knows how lazy and inconsistent I am, will already be guessing that while I may aspire to do this, I more often just hack away at my graphs until they look (more or less) as I want.  This is the worst kind of laziness because a little effort in setting this up would make for less work.

So I decided to investigate – could I automate this?  And I can.  At first, my heart sank slightly when I realised I would have to deal with the system paths on my managed machine, but it turned out to be very straight forward.  You can use sysdir on the stata console to find your stata program files folder.  When you navigate to this folder, use dir *.do to check for the presence of the file sysprofile.do -this means you are in the right directory.  Now, create a new do file called profile.do  Any code you add to this file is executed on Stata start up.  Knowing that I added these lines to my own profile.do (I used the Stata do file editor, but any plain text editor such as Windows Notepad would do as well):

// For -twoway- graphs
global graph_opts ///
  title(, justification(left) color(black) span pos(11)) ///
  graphregion(color(white)) ///
  xscale(noline) xtit(,placement(left) justification(left)) ///
  yscale(noline) ylab(,angle(0) nogrid) ///
  legend(region(lc(none) fc(none)))

// For -graph- graphs
global graph_opts_1 ///
  title(, justification(left) color(black) span pos(11)) ///
  graphregion(color(white)) ///
  yscale(noline) ylab(,angle(0) nogrid) ///
  legend(region(lc(none) fc(none)))

Of course you will want to change these to meet your own preferences – which may mean a deep dive into the Stata documentation.  It is however worth it given the time and effort you will save in hacking at graph code (or [shudder] gph files) to ensure that your graphs are all consistently presented in your reports.

Using the Premium version of Equatio with Moodle

By Kerry, on 10 November 2023

Following my post on Using the free Equatio Chrome extension with Moodle, I have finally had the chance to explore how the Premium version works. Currently, UCL has a licence for the Premium version of Equatio for staff and students until July 2024.

Equatio can help you create accessible mathematical content without having to use any code or programming languages. You can easily add formulas and equations to your content through keyboard input, handwriting recognition or voice recording. It is also compatible with LaTeX for more advanced users. In addition, it allows some graphing input.

Screenshot of Equatio toolbar (Premium version)

Equatio toolbar (Premium version)

You may have already explored the free Chrome extension and if not, you can check out our mini guide. While the free version of Equatio can help anyone create accessible maths, it lacks features such as prediction, Equatio Mobile integration, and the screenshot reader. Here is a handy overview of what is included in the premium version compared to the free version.

In case you were not aware (and I only learnt this myself recently), the Premium version of Equatio with unlimited use for Google, Windows and Mac can be installed on any UCL machine. If you would like to try this out when creating Moodle content such as Text and Media areas (formerly Labels), Pages, Books, Discussion Forum messages, Assignment instructions, Quiz questions etc, we have now tested this and created a wiki mini guide.

This will take you through the set up for the Premium version of Equatio for Chrome, the toolbar, setting options and the main features which include the Equation Editor, Handwriting Recognition, Speech Input, LaTeX Editor, Screenshot Reader, Equatio Mobile, Graph Editor, Mathspace and STEM Tools. The mini guide also include links to some handy video guides from Texthelp as well as further information. We hope some of you find it helpful.

Using the free Equatio Chrome Extension with Moodle

By Kerry, on 14 September 2023

You may have heard of texthelp‘s literacy support tool Read&Write but did you know that they also have an education tool called Equatio which helps support the creation of accessible mathematical content online?

Equatio can help you create maths expressions without having to use any code or programming languages. You can easily add formulas and equations to your content through keyboard input, handwriting recognition or voice recording. It is also compatible with LaTeX for more advanced users. In addition, it allows some graphing input.

The Premium version of Equatio with unlimited use for Google, Windows, Mac and LMS can now be installed on UCL machines. Advice and guidance on this will follow in due course but in the meantime, we have created a wiki mini guide on what it is possible to achieve in Moodle with the free version of the Equatio Chrome Extension. This guide will be useful for anyone who would like to dip their toe in to explore what Equatio can do before installing a full version! There is also further information for anyone who is keen to learn more.

Screenshot of using the Equatio chrome extension to insert speech you have recorded and converted to Math.

Using the Equatio chrome extension to insert speech, which you have recorded and converted to Math, into a Moodle Text and media area.