TL;DR
- An Android Authority teardown has revealed that Reddit is using an AI model to detect harassment.
- The model is trained on content that was previously flagged for violating Reddit’s terms.
We’ve seen large language models (LLMs) used for a variety of features in the last year or so, from text/image generation to virtual assistants and beyond. Now, it looks like we can add one more use case to the list thanks to Reddit.
An APK teardown helps predict features that may arrive on a service in the future based on work-in-progress code. However, it is possible that such predicted features may not make it to a public release.
A teardown of version 2024.10.0 of the Reddit app for Android has revealed that Reddit is now using an LLM to detect harassment on the platform. You can view the relevant strings below.
Code
<string name="hcf_answer_how_model_trained">The harassment model is an large language model (LLM) that is trained on content that our enforcement teams have found to be violating. Moderator actions are also an input in how the model is trained.</string>
<string name="hcf_faq_how_model_trained">How is the harassment model trained?</string>
Reddit also updated its support page a week ago to mention the use of an AI model as part of its harassment filter.
“The filter is powered by a Large Language Model (LLM) that’s trained on moderator actions and content removed by Reddit’s internal tools and enforcement teams,” reads an excerpt from the page.
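Reddit hasn’t published how the filter plugs into its moderation stack, but the description above implies a familiar shape: a trained model scores incoming content, and anything over a threshold lands in a moderator review queue rather than being removed outright. The Kotlin sketch below illustrates that general flow. To be clear, every name here, the scoring interface, and the 0.8 threshold are our own assumptions for illustration, not anything taken from Reddit’s code.
Code
// Hypothetical sketch of an LLM-backed harassment filter in a moderation
// pipeline. All names, the scoring interface, and the threshold are
// illustrative assumptions; none of this comes from Reddit's app.

data class Post(val id: String, val body: String)

// Stand-in for the trained model; a real system would call an LLM
// fine-tuned on past enforcement decisions and moderator actions.
fun interface HarassmentModel {
    fun score(text: String): Double // 0.0 (benign) .. 1.0 (likely harassment)
}

class HarassmentFilter(
    private val model: HarassmentModel,
    private val threshold: Double = 0.8, // assumed cutoff, not Reddit's
) {
    private val reviewQueue = mutableListOf<Post>()

    // Flagged posts are routed to moderators for review rather than
    // deleted automatically, matching the support page's description
    // of a filter rather than an enforcement action.
    fun submit(post: Post) {
        if (model.score(post.body) >= threshold) reviewQueue += post
    }

    fun pendingReview(): List<Post> = reviewQueue.toList()
}

fun main() {
    // Toy keyword heuristic standing in for the LLM.
    val toyModel = HarassmentModel { text ->
        if ("insult" in text.lowercase()) 0.95 else 0.1
    }
    val filter = HarassmentFilter(toyModel)
    filter.submit(Post("t3_abc", "A friendly comment"))
    filter.submit(Post("t3_def", "An insult aimed at another user"))
    println(filter.pendingReview().map { it.id }) // [t3_def]
}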
In any case, it looks like moderators have another tool in their arsenal to fight objectionable content on Reddit. Will it actually do a good job of flagging content, though? We’ll just have to wait and see.