Can you outline for our readers how you are leveraging AI technology to improve gender representation on information-gathering websites like Wikipedia? How does the model you are deploying work?
To be clear, we are not deploying any model. We view our work as a starting point. We focus on mimicking the human writing process.
A human writer decides which person to write a biography about and provides information about their occupation. The AI then reads various articles on the web, predicts the structure of the Wikipedia article, and writes the article paragraph by paragraph. We then append the reference articles the AI used to these paragraphs as citations.
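The steps described in this answer (retrieve web sources, predict the article structure, write section by section, attach citations) can be sketched roughly as follows. This is only an illustrative outline, not the researchers' system: the `OUTLINES` lookup, the keyword-overlap `retrieve` function, and the sentence-quoting `draft_section` are hypothetical stand-ins for the learned components, and the example URLs and the name "Jane Doe" are made up.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    text: str


# Hypothetical lookup standing in for a model that predicts section headings.
OUTLINES = {"scientist": ["Early life", "Career", "Awards"]}
DEFAULT_OUTLINE = ["Early life", "Career"]


def tokens(text):
    """Lowercased word set with basic punctuation stripped."""
    return {w.strip(".,;:()").lower() for w in text.split()}


def retrieve(query, sources, k=2):
    """Rank sources by word overlap with the query (stand-in for web retrieval)."""
    q = tokens(query)
    scored = [(len(q & tokens(s.text)), s) for s in sources]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


def draft_section(evidence):
    """Stand-in for text generation: quote the first sentence of each source."""
    return " ".join(s.text.split(". ")[0].rstrip(".") + "." for s in evidence)


def write_biography(name, occupation, sources):
    """Build the article one section at a time, recording which sources were used."""
    article = []
    for heading in OUTLINES.get(occupation, DEFAULT_OUTLINE):
        evidence = retrieve(f"{name} {occupation} {heading}", sources)
        if not evidence:
            continue  # no supporting source, so no section is written
        article.append({
            "heading": heading,
            "text": draft_section(evidence),
            "citations": [s.url for s in evidence],
        })
    return article


if __name__ == "__main__":
    demo = [
        Source("https://example.org/a",
               "Jane Doe began her career as a scientist at a university lab."),
        Source("https://example.org/b",
               "The awards ceremony honoured Jane Doe for her research."),
    ]
    for section in write_biography("Jane Doe", "scientist", demo):
        print(section["heading"], section["citations"])
```

The point of the sketch is the ordering: each section is drafted only from retrieved evidence, and the citation list is recorded at the moment the section is written, so every paragraph can be traced back to its sources.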
Your primary objective is to use AI to ensure that there is greater inclusion on Wikipedia, but in your opinion, and based on the comprehensive research you conducted, why is there such social bias against women and marginalised groups on prominent websites like Wikipedia – and what does that tell us about the claim that these social biases are fundamentally embedded in a lot of the hierarchical structures in which our society functions?
Fundamentally, there is bias in society. We don’t live in a world where everyone gets the recognition they deserve for their achievements. Wikipedia is based on credible, third-party sources from the web, meaning that to have a Wikipedia article, there needs to be sufficient information available about that individual.
This makes Wikipedia very susceptible to societal influence. Furthermore, the Wikipedia editing community is not reflective of society at large, which impacts the kinds of articles that are created. One easy example to recognise is that most Wikipedia articles are written in English, even though English is not the most widely spoken first language.
Many people have claimed that AI technology has only served to exacerbate racial and gender inequities, and that bias is baked into the outcomes that AI is asked to predict. Many feel the data used is ‘discriminatory’ towards marginalised groups. How can you be sure that the AI you use will alleviate the challenges facing AI when it comes to levelling the playing field?
It’s extremely important that AI is developed in a representative way. Looking at facial recognition technology, for instance, it’s clear that some populations are not represented as well as others. This has a lot of impact on how that technology is used when deployed.
Thus, as a community, the first thing that we should be focusing on is measurement and understanding: how can we recognise when the models we create are not representative? The dataset that we contribute in this paper is a step towards that goal. If we are able to understand and measure how differently models treat different groups, then we can make progress on developing more equitable and inclusive AI.
Despite some of the negativity that has been aimed at AI, we know that it is a force for good. How do you see the technology developing in terms of helping to create a society that better promotes inclusion and greater representation for those who are underrepresented?
I envision a world where people are recognised equally for what they have achieved, regardless of gender, race, ethnicity, or other characteristics. If they are notable enough to have a Wikipedia article, they should have one.