Leading Senior Constable Dr Janis Dalins is looking for 100,000 happy images of children – a toddler in a sandpit, a nine-year-old winning an award at school, a sullen teenager unwrapping a present at Christmas and pretending not to care.
The search for these safe, happy pictures is the goal of a new campaign to crowdsource a database of ethically obtained images that Dalins hopes will help build better investigative tools to use in the fight against what some have called a “tsunami” of child sexual assault material online.
Dalins is the co-director of AiLecs lab, a collaboration between Monash University and the Australian federal police, which builds artificial intelligence technologies for use by law enforcement.
In its new My Pictures Matter campaign, people above 18 are being asked to share safe photos of themselves at different stages of their childhood. Once uploaded with information identifying the age and person in the image, these will go into a database of other safe images. Eventually a machine learning algorithm will be made to read this album again and again until it learns what a child looks like. Then it can go looking for them.
When a computer is seized from a person suspected of possessing child sexual abuse material, the algorithm will be used to quickly point investigators to where they are most likely to find images of children – an otherwise slow and labour-intensive process that Dalins encountered while working in digital forensics.
“It was totally unpredictable,” he says. “A person gets caught and you think you’ll find a couple hundred pictures, but it turns out this guy is a massive hoarder and that’s when we’d spend days, weeks, months sorting through this stuff.”
“That’s where the triaging comes in; [the AI] says if you want to look for this stuff, look here first because the stuff that is likely bad is what you should be seeing first.” It will then be up to an investigator to review each image flagged by the algorithm.
Monash University will retain ownership of the photograph database and will impose strict restrictions on access.
The AiLecs project is small and targeted but is among a growing number of machine learning algorithms law enforcement, NGOs, business and regulatory authorities are deploying to combat the spread of child sexual abuse material online.
These include tools such as SAFER, an algorithm developed by the not-for-profit group Thorn that runs on a company’s servers and identifies images at the point of upload, and web crawlers such as the one operated by Project Arachnid, which trawls the internet looking for new troves of known child sexual abuse material.
Whatever their function, Dalins says the proliferation of these algorithms is part of a wider technological “arms race” between child sexual offenders and authorities.
“It’s a classic scenario – the same thing happens in cybersecurity: you build a better encryption standard, a better firewall, then someone, somewhere tries to find their way around it,” he says.
“[Online child abusers] were some of the most security-conscious people online. They were far more advanced than the terrorists, back in my day.”
‘A veritable tsunami’
It is an uncomfortable reality that there is more child sexual abuse material being shared online today than at any time since the internet was launched in 1983.
Authorities in the UK have confronted a 15-fold increase in reports of online child sexual abuse material in the past decade. In Australia the eSafety Commission described a 129% spike in reports during the early stages of the pandemic as a “veritable tsunami of this shocking material washing across the internet”.
The acting eSafety commissioner, Toby Dagg, told Guardian Australia that the issue was a “global problem” with similar spikes recorded during the pandemic in Europe and the US.
“It’s massive,” Dagg says. “My personal view is that it is a slow-rolling catastrophe that doesn’t show any sign of slowing soon.”
Though there is a common perception that offenders are limited to the back alleys of the internet – the so-called dark web, which is heavily watched by law enforcement agencies – Dagg says there has been considerable bleed into the commercial services people use every day.
Dagg says the full suite of services “up and down the technology stack” – social media, image sharing, forums, cloud sharing, encryption, hosting services – are being exploited by offenders, particularly where “safety hasn’t been embraced as a core tenet of industry”.
The flood of reports about child sexual abuse material has come as these services have begun to look for it on their systems – most material detected today is already known to authorities as offenders collect and trade it as “sets”.
As many of these internet companies are based in the US, their reports are made to the National Centre for Missing and Exploited Children (NCMEC), a non-profit organisation that coordinates reports on the matter – and the results from 2021 are telling. Facebook reported 22m instances of child abuse imagery on its servers in 2021. Apple, meanwhile, disclosed just 160.
These reports, however, do not immediately translate into takedowns – each has to be investigated first. Even where entities like Facebook make a good faith effort to report child sexual abuse material on their systems, the sheer volume is overwhelming for authorities.
“It’s happening, it’s happening at scale and as a consequence, you have to conclude that something has failed,” Dagg says. “We are evangelists for the idea of safety by design, that safety should be built into a new service when bringing it to market.”
A fundamental design flaw
How this situation developed owes much to how the internet was built.
Historically, the spread of child sexual abuse material in Australia was limited owing to a combination of factors, including restrictive laws that controlled the importation of adult content.
Offenders often exploited existing adult entertainment supply chains to import this material and needed to form trusted networks with other like-minded individuals to obtain it.
This meant that when one was caught, all were caught.
The advent of the internet changed everything when it created a frictionless medium of communication where images, video and text could be shared near instantaneously to anyone, anywhere in the world.
University of New South Wales criminologist Michael Salter says the development of social media only took this a step further.
“It’s a bit like setting up a kindergarten in a nightclub. Bad things are going to happen,” he says.
Salter says a “naive futurism” among the early architects of the internet assumed the best of every user and failed to consider how bad faith actors might exploit the systems they were building.
Decades later, offenders have become very effective at finding ways to share libraries of content and form dedicated communities.
Salter says this legacy lives on, as many services do not look for child sexual abuse material in their systems and those that do often scan their servers periodically rather than take preventive steps like scanning files at the point of upload.
Meanwhile, as authorities catch up to this reality, there are also murky new frontiers being opened up by technology.
Lara Christensen, a senior lecturer in criminology with the University of the Sunshine Coast, says “virtual child sexual assault material” – video, images or text of any person who is or appears to be a child – poses new challenges.
“The key words there are ‘appears to be’,” Christensen says. “Australian legislation extends beyond protecting actual children and it acknowledges it could be a gateway to other material.”
Though this kind of material has existed for some years, Christensen’s concern is that more sophisticated technologies are opening up a whole new spectrum of offending: realistic computer-generated images of children, real photos of children made to look fictional, deep fakes, morphed photographs and text-based stories.
She says each creates new opportunities to directly harm children and/or attempt to groom them. “It’s all about accessibility, anonymity and affordability,” Christensen says. “When you put those three things in the mix, something can become a huge problem.”
A human in the loop
Over the last decade, the complex mathematics behind algorithms combating the wave of this criminal material has evolved significantly, but the algorithms are still not without issues.
One of the biggest concerns is that it’s often impossible to know where the private sector has obtained the images it has used to train its AI. These may include images of child sexual abuse or photos scraped from open social media accounts without the consent of those who uploaded them. Algorithms developed by law enforcement have traditionally relied on images of abuse captured from offenders.
This runs the risk of re-traumatising survivors whose images are being used without their consent and baking in the biases of the algorithms’ creators thanks to a problem known as “overfitting” – a situation where algorithms trained on bad or limited data return bad results.
In other words: teach an algorithm to look for apples and it may find you an Apple iPhone.
“Computers will learn exactly what you teach them,” Dalins says.
This is what the AiLecs lab is attempting to prove with its My Pictures Matter campaign: that it is possible to build these essential tools with the full consent and cooperation of those whose childhood images are being used.
But for all the advances in technology, Dalins says child sexual abuse investigation will always require human involvement.
“We’re not talking about identifying stuff so that algorithm says x and that’s what goes to court,” he says. “We’re not seeing a time in the next five, 10 years where we would completely automate a process like this.
“You need a human in the loop.”
Members of the public can report illegal and restricted content, including child sexual exploitation material, online with the eSafety Commission.