Lies, damned lies and statistics? Telling accurate stories from data

By Nina Herriman, Chief Storyteller at the National Council of Women

In my last blog, I briefly mentioned the importance of accurate stories and promised to talk about how we’re going to use the Gender Dashboard to tell accurate stories about gender inequality.

But first, let’s talk about why accuracy is about more than just testing that the numbers are right. Much of this resonates with the things I learned in my history degree. But I’ve resisted calling this post, “why you need an historian on your data science team”, because these are things that everyone working with and talking about data should know.

The past matters

The data that we have can tell us something about now. Some good longitudinal data can tell us something about ten, twenty or, if we’re lucky, thirty years ago. It can’t tell us about the impact of centuries of inequalities on people’s lives today, whether that be on the basis of gender, ethnicity, disability or something else.

There’s no such thing as the full story

Whether you’re writing a history book or a data story, you’re making choices about which stories to tell and consequently which stories you won’t tell. This is partly driven by the information that you have available – and there are huge gaps in NZ data to inform our understanding of gender inequality.

For example, we will struggle to tell the story of trans women while data collection continues to fail to recognise the fact that gender identity is not determined by the sex someone is assigned at birth.

Stats NZ have indicated a commitment to moving toward more inclusive data collection, but to tell these stories we will need to rely on the small number of non-government studies that have been done and international data.

It’s also about the story that you’re trying to tell. We’re producing a gender dashboard so this of course means that we are looking at gender data. But we need to make sure that we include other variables where we know there are inequalities to provide a more accurate story.

The data can’t speak for itself

It doesn’t matter if your data scientist has used the biggest computers and run the most sophisticated algorithms over all the data you can get your hands on: there’s still bias and there are still gaps in the data. Someone (or some committee) chose which data to collect and the process for collecting it, and which data to keep and for how long. Someone probably cleaned that data, and it may still contain errors. Someone wrote the algorithm, and someone decided which results to report and how to report them.

https://xkcd.com/1838/

Beware the outliers

Remember that rollicking good book you read about the politics and intrigue of sixteenth-century England? That does not reflect the everyday lives of most people who actually lived in sixteenth-century England.

It’s the same with data, particularly data about people and social change. Because as much as we’d quite like to change the world in a day, problems embedded in societal and institutional structures are hard and slow to change. So if you see a huge change in your data, there’s a good chance there’s an error somewhere and you should go back and check your analysis.
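
For anyone doing the checking, here is a minimal sketch of what "go back and check" might look like in practice. It flags any period-to-period change larger than a chosen threshold so a person can look at it again. The function name, the 25% threshold and the numbers are all assumptions made up for illustration; they are not Gender Dashboard code or real data, and a flag only means "look closer", not "this is wrong".

```python
def flag_large_changes(series, threshold=0.25):
    """Return (previous_period, period, relative_change) for each large jump.

    series: list of (period, value) pairs in chronological order.
    threshold: hypothetical cut-off; 0.25 means a change of more than 25%.
    """
    flagged = []
    # Pair each observation with the one that follows it.
    for (prev_period, prev_value), (period, value) in zip(series, series[1:]):
        if prev_value == 0:
            continue  # relative change is undefined when the earlier value is zero
        change = (value - prev_value) / abs(prev_value)
        if abs(change) > threshold:
            flagged.append((prev_period, period, change))
    return flagged


# Made-up illustrative numbers, NOT real data.
example = [(2014, 100.0), (2015, 102.0), (2016, 101.0), (2017, 150.0), (2018, 152.0)]

flagged = flag_large_changes(example)
for prev_period, period, change in flagged:
    print(f"Check {prev_period} to {period}: change of {change:+.0%}")
```

A sensible threshold depends on the measure and on how noisy it usually is, so treat the value here as a starting point to adjust rather than a rule.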

So, what’s the point then? How do we tell accurate stories from this data?

Tell multiple stories, even if they appear to be conflicting

We tell stories from different perspectives about different groups of people who need different solutions. For example, there’s a narrative about more part-time work opportunities that allow people of all genders to have adequate resources or continue their career while fulfilling other responsibilities in their lives such as childcare and adult care.

There’s also a narrative about policies that allow people who work part-time to move into full-time work or at least more hours if they want or need it.

A third narrative may be about better pay and conditions for those who currently work part-time and wish to continue to do so.

And a fourth may be about more equal sharing of unpaid work, that allows those not in the labour force to take on part-time work.

Talk to the experts

As an umbrella organisation with over 200 member organisations and a reach of over 450,000 people, the National Council of Women have a lot of expertise to draw upon in our work. For each of our four key areas of inequality, we will have a group of expert advisors drawn from these people, and we will also have conversations with experts beyond that group.

Nothing about us without us

Consultation with stakeholders who are working with, or representing, the most marginalised communities in New Zealand will be given a high priority. We will endeavour to talk to the groups whose stories we want to tell – particularly those groups with compounding negative outcomes due to discrimination, e.g. Māori, Pacific, Asian, migrant and refugee women, rural women, women with disabilities, queer, trans and gender diverse women – “nothing about us, without us”.

Test

Test that your infrastructure works, test that your numbers are right, test that your dashboard is easy to use, test that your website is easy to navigate, test that your stories resonate with your communities and stakeholders and test that your stories are useful!

In my next blog I’m going to explain a little further about context. Contextualising data is a major aim of the Gender Dashboard – we need to be able to show users what the data means, how the data connects and contextualise it within a narrative arc. Stay tuned!
