
UCL Centre for Digital Humanities


How well do Google image results represent reality?

By Oliver W Duke-Williams, on 23 June 2015

Much has been written about Sir Tim Hunt’s remarks at the World Conference of Science Journalists in Seoul earlier this month. The debate has developed in a number of directions, including a discussion about the gender representation in images returned by Google’s image search, with a specific example being made of the male-dominated results when using the search term ‘professor’. Writing in The Guardian, Dame Athene Donald observed:

If you think that doesn’t matter, imagine you are a 12-year-old girl trying to get a sense of what the adult professional world is like. If the only images that appear against the search term of “professor” are either elderly white males or cartoons of men in white coats with sticking-up hair, as a girl you are hardly likely to think it is the sort of career aspiration you should be considering.

The representation of ‘professor’ is of course problematic in a number of ways: as well as being shown as male, professors are also shown as stereotypically balding and bespectacled. Similarly stereotype-driven images are de rigueur in children’s literature, as documented by Professor Melissa Terras. A natural response to this observation is to wonder what the gender representation of other jobs looks like through the prism of Google Images. Are they similarly one-sided? For example, although the Women’s World Cup is under way at the time of writing, searching for ‘footballer’ returns an entirely male set of results. As with the case for professors, this would not encourage a girl to think that football is a sport for all.

The results of the 2011 Census (for England and Wales) were used to identify a number of different occupations accounting for significant employment. Results are given using the SOC2010 classification of occupations. This is a hierarchical classification, broken down into increasingly detailed job descriptions. Two elements of this were used. Firstly, the top level classifications were used, as counts by sex are available at this level. Secondly, the more detailed third level was used, although only a count of total persons is published at this level.

In each case, a gender-neutral search term was used based on the occupation description, and the male and female people shown in the returned images were then counted manually, using the somewhat arbitrary (but consistently applied) metric of those results which appeared on screen with no scrolling. Cartoon and stylised images were included where gender was obvious; where more than one person appeared in an image, all who were in focus and relevant were counted. Persons in images were not counted if their gender was not obvious. In a number of cases, it was necessary to select the one person who matched the search term: thus, when searching for ‘carer’ it was generally the case that each image depicted both the giver and the receiver of care.
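The counting rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the procedure actually used for this post: the function name `gender_split` and the list of labels are hypothetical, and the labels are invented rather than taken from any real search.

```python
from collections import Counter


def gender_split(labels):
    """Return (%male, %female) from manual labels, skipping unclear ones."""
    # Only people whose gender was obvious contribute to the totals.
    counts = Counter(label for label in labels if label in ("male", "female"))
    total = counts["male"] + counts["female"]
    if total == 0:
        raise ValueError("no people with an obvious gender were counted")
    pct_male = round(100 * counts["male"] / total)
    return pct_male, 100 - pct_male


# Hypothetical labels for one first screen of results (not real data).
labels = ["male"] * 17 + ["female"] * 3 + ["unclear"] * 2
print(gender_split(labels))  # (85, 15)
```

The two people labelled "unclear" are dropped before the percentages are calculated, mirroring the rule that persons were not counted if their gender was not obvious.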

Table 1 shows the results for the top level of the SOC coding, giving both the numbers employed as reported in the census, and the gender split as determined by assessing the Google image results. The SOC labels here are rather broad, and inevitably in representing this as a single search term a significant degree of generalisation is required. A suitable search term for the final ‘elementary occupations’ group was not found. The results are quite interesting: in most cases the gender balance is in the right direction, whilst not being close enough to suggest that this is an accurate mirror of society. The one category that is clearly wrong is ‘Administrative and secretarial occupations’ which is probably an artefact of the search term used.

Table 1: SOC2010 Top level results
SOC label | Census total | Census %male | Census %female | Search term | Google %male | Google %female
1. Managers, directors and senior officials | 2,883,590 | 65% | 35% | manager | 85% | 15%
2. Professional occupations | 4,638,066 | 50% | 50% | professional | 47% | 53%
3. Associate professional and technical occupations | 3,379,184 | 58% | 42% | associate professional | 56% | 44%
4. Administrative and secretarial occupations | 3,052,488 | 22% | 78% | administrator | 59% | 41%
5. Skilled trades occupations | 3,069,047 | 89% | 11% | skilled trades | 100% | 0%
6. Caring, leisure and other service occupations | 2,502,256 | 18% | 82% | carer | 11% | 89%
7. Sales and customer service occupations | 2,250,261 | 36% | 64% | customer servicer | 0% | 100%
8. Process, plant and machine operatives | 1,931,309 | 88% | 12% | machine operative | 95% | 5%
9. Elementary occupations | 2,975,367 | 55% | 45% | (no search term) | |
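One way to read Table 1 is to compute, for each search term, the gap in percentage points between the Google share of women and the census share, and to flag any category where the majority gender flips. The sketch below does this with the figures from Table 1; the variable and function names are illustrative choices, and the ‘elementary occupations’ row is omitted because it has no search term.

```python
# (search term, census %female, Google %female), copied from Table 1.
rows = [
    ("manager", 35, 15),
    ("professional", 50, 53),
    ("associate professional", 42, 44),
    ("administrator", 78, 41),
    ("skilled trades", 11, 0),
    ("carer", 82, 89),
    ("customer servicer", 64, 100),
    ("machine operative", 12, 5),
]


def female_gap(rows):
    """Google %female minus census %female, in percentage points."""
    return {term: google - census for term, census, google in rows}


def majority_flipped(rows):
    """Terms where census and Google fall on opposite sides of 50%."""
    return [t for t, census, google in rows if (census - 50) * (google - 50) < 0]


gaps = female_gap(rows)
for term, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{term:24s} {gap:+d}")
print("majority flipped:", majority_flipped(rows))
```

On these figures only ‘administrator’ is flagged as flipping, and it also has the largest negative gap (-37 points), which is consistent with the observation that it is the one clearly wrong category.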

There are some strong gender biases in the occupations shown in Table 1 – both in the census data and in the Google images data – perhaps reflecting long-run biases in recruitment or perceived job status.

We can also look at more detailed SOC2010 classified observations from the census, although in this case we do not know the gender balance of those actually employed in these positions. Google searches were made for the top 20 or so jobs (by number of persons employed), omitting those for which no suitable search term was found. Many of the jobs are shown with a strong gender bias. For some jobs, there are separate breakdowns of employment that allow us to find employment by gender. Thus, the search term ‘teacher’ gave results that were 83% female; the most recent data published by the Department for Education for the school workforce in England shows that 74% of teachers were female. It should of course be remembered that the Google results are for a small set of images (those displayed on the first screenful of results) – one different image could alter the results quite easily. There are a small number of images in each case, typically portraying 20-40 people.
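The sensitivity to a single image can be made concrete with a little arithmetic. The sketch below assumes only what is stated above, namely that each result set portrays roughly 20 to 40 people; the function names are illustrative, and the exact number of people behind the ‘teacher’ result is not known, so both ends of the range are shown.

```python
def points_per_person(n_people):
    """Percentage-point change if one of n_people is counted differently."""
    return 100 / n_people


def people_equivalent(gap_points, n_people):
    """How many people a gap of gap_points represents among n_people."""
    return gap_points * n_people / 100


for n in (20, 40):
    print(n, points_per_person(n))  # 20 5.0, then 40 2.5

# Teacher: 83% female in the Google results against 74% in the workforce data.
teacher_gap = 83 - 74
print(people_equivalent(teacher_gap, 20))  # about 1.8 people
print(people_equivalent(teacher_gap, 40))  # about 3.6 people
```

In other words, the nine-point difference for ‘teacher’ corresponds to somewhere between two and four people on a single screen of results, so differences of this size should be read with caution.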

Table 2: SOC2010 Third level results
SOC label | Number employed (census) | Search term | Google %male | Google %female
711. Sales Assistants and Retail Cashiers | 1,552,710 | sales assistant | 35% | 65%
231. Teaching and Educational Professionals | 1,154,905 | teacher | 17% | 83%
614. Caring Personal Services | 1,079,488 | carer | 11% | 89%
821. Road Transport Drivers | 821,906 | transport driver | 94% | 6%
531. Construction and Building Trades | 787,359 | construction worker | 93% | 7%
927. Other Elementary Services Occupations | 777,077 | (no search term) | |
421. Secretarial and Related Occupations | 772,133 | secretary | 0% | 100%
612. Childcare and Related Personal Services | 730,177 | childcare worker | 11% | 89%
354. Sales, Marketing and Related Associate Professionals | 705,659 | marketing worker | 16% | 84%
923. Elementary Cleaning Occupations | 696,100 | cleaner | 32% | 68%
415. Other Administrative Occupations | 657,348 | administrator | 59% | 41%
412. Finance | 644,829 | finance worker | 67% | 33%
125. Managers and Proprietors in Other Services | 600,332 | manager | 85% | 15%
213. Information Technology and Telecommunications Professionals | 576,148 | it professional | 88% | 12%
242. Research and Administrative Professionals | 573,583 | researcher | 42% | 58%
223. Nursing and Midwifery Professionals | 556,471 | nurse | 4% | 96%
353. Business, Finance and Related Associate Professionals | 520,768 | business professional | 60% | 40%
113. Functional Managers and Directors | 500,567 | (no search term) | |
543. Food Preparation and Hospitality Trades | 456,051 | food preparation worker | 54% | 46%
112. Production Managers and Directors | 445,668 | (no search term) | |
119. Managers and Directors in Retail and Wholesale | 443,147 | retail manager | 30% | 70%
411. Government and Related Organisations | 412,872 | civil servant | 87% | 13%
926. Elementary Storage Occupations | 402,782 | storage worker | 94% | 6%

An outlier is ‘civil servant’ – a term used because ‘government worker’ as a search term largely returned caricatures and similar images depicting government workers as idle – which is shown to be strongly male dominated, whereas in reality civil service employees are 56% male when only full-time employees are counted, or 53% female when both full-time and part-time employees are counted.

It could be argued that many of these sets of results seem to emphasise gender difference: traditionally male roles and female roles both being represented in a way more stereotypical than actual. It is left as an exercise for the reader, but a working hypothesis might be that in Google image searches, women tend to be under-represented in high status jobs and over-represented in lower status jobs. You might also want to consider representation by ethnic group in this way.

Finally, we return to questions about how academia and science might be presented in image search results. Another set of searches was carried out for some academic / science related job titles (Table 3). There are strong biases in the results. We learn that both lecturers and professors are largely male, with apparent problems for women who would hope to make a career progression. Scientists are mostly male, whereas lab assistants are mostly female. Nobody in any of the images for any of these jobs appeared to be crying. Finally, a search was made for ‘doctor’, and again the results were overwhelmingly male, although clearly doctors are medical rather than academic.

Table 3: Google search results – academic titles
Search term | Google %male | Google %female
professor | 91% | 9%
lecturer | 80% | 20%
scientist | 74% | 26%
lab assistant | 30% | 70%
doctor | 87% | 13%

Compared with data from HESA these Google results significantly under-represent female employment in higher education (22% of professors are female, and 45% of all academic staff, in the most recent data).

We are however left with a dilemma. Would we like Google results to represent actual employment, or the employment pattern that might exist in a less prejudiced society?

 
