
What does it take to make your business LLM- and GenAI-proof?

Theodoros Evgeniou* (Tremau), Max Spero* (Checkfor.ai)

Arguably, the “person of the year” for 2023 has been AI. We have all been taken by surprise by the speed of innovation and the capabilities of Large Language Models (LLMs) and, more generally, generative AI (GenAI). At the same time, many, particularly in online platforms, are raising questions about the potential risks these technologies pose – see this Harvard Business Review article outlining some AI risks. Online platforms may soon be flooded with AI-generated content, with implications for their users’ safety and retention as well as for the platforms’ reputation. There are already startups offering tools to generate and disseminate massive volumes of GenAI content.

But these same technologies can also be used for our benefit, to manage these risks and help us create safer digital spaces and online platforms, as seen in some ideas from the latest Trust & Safety Hackathon. As new tools emerge, it is a good time to take stock of the latest innovations and processes for managing the risks online platforms face due to GenAI.

This article can help answer questions such as:

  • How can we best protect our business, online communities, and users from large volumes of AI-generated content (from review spam to illegal content violating copyright and other laws)? 
  • Can AI-generated content be detected?
  • At what point in the life cycle of AI-generated content can safety guardrails be used, and how? 
  • How are related regulations shaping up across different markets, and what do they mean for you? 

Develop your GenAI Policy considering your business model and needs

Every business with user-generated content needs a GenAI policy. There are generally two questions to answer: do users want to see AI-generated content, and are they okay with AI content intermixed with human content?

If your answer to either question is no, then you need a policy around AI content – for example, requiring that AI content be disclosed, or expressly disallowing it. Such a policy can be enforced by human moderators with a keen eye and effective processes, supported by tools like Checkfor.ai.

If the answer is yes – users are okay with or even enthusiastic about seeing AI content – then from a policy point of view you’re good. However, before you introduce AI tools directly, like LinkedIn’s AI-assisted messages, you’ll still need to make sure the content is safe. For this, you need guardrails and, more importantly, processes to moderate AI-generated content effectively and efficiently, much as you moderate user-generated content, using tools like Tremau’s moderation platform.

Of course, your GenAI policy depends on your business and context; there is no one-size-fits-all. For example, if you are a marketplace, or generally a platform where users rely on other users’ reviews, you may need to ensure no AI-generated reviews find their way onto your platform. More generally, you also need to ensure that no illegal content generated by AI – much like illegal user-generated content – lives on your platform. Bots and spam have always been a challenge, but GenAI makes them more sophisticated and harder to catch.

Understand and leverage AI Guardrails 

Most commercial AI APIs provide some form of built-in guardrails. Google’s Gemini API automatically rates its outputs on each of four safety categories: Hate Speech, Harassment, Sexually Explicit, and Dangerous Content. If you use Azure’s OpenAI API, you get similar ratings based on the content filter categories Hate and Fairness, Sexual, Violence, and Self-Harm. Both APIs reject queries that score too highly on any of these categories, but leave intermediate levels of safety moderation to your discretion.
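As a concrete illustration, here is a minimal sketch of reading Gemini’s per-category safety ratings with the google-generativeai Python SDK and applying your own thresholds on top. Exact field and enum names can differ between SDK versions, and the API key and prompt below are placeholders, so treat this as a sketch rather than a definitive implementation.

```python
# Sketch: inspect Gemini's built-in safety ratings and apply our own policy.
# Assumes the google-generativeai Python SDK; names may vary across versions.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

model = genai.GenerativeModel(
    "gemini-pro",
    # Ask the API to block only the highest-risk outputs; lower-scoring
    # outputs are returned with ratings we can triage ourselves.
    safety_settings=[
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_ONLY_HIGH"},
        {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
    ],
)

response = model.generate_content("Write a friendly reply to a product review.")

# Each candidate carries per-category ratings (e.g., NEGLIGIBLE, LOW, MEDIUM,
# HIGH) that can be mapped onto your own moderation thresholds or queues.
for rating in response.candidates[0].safety_ratings:
    print(rating.category, rating.probability)
```

In practice you would route anything above your chosen probability level to a human review queue rather than publishing it automatically, and also check the prompt-level feedback the API returns in case the request itself was blocked.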

If you’re using an open-source model such as Llama 2 or Mistral, you’ll need to roll your own content filter. This can be done with a separate call to a closed-source classifier (OpenAI’s moderation API, Azure’s AI Content Safety API) or with an open-source solution such as Meta’s newly released LlamaGuard. LlamaGuard is a 7B-parameter LLM-based model that benchmarks very well and shows promise for prompt and response classification, as well as general content moderation.
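Below is an illustrative sketch of self-hosting LlamaGuard as a content filter with Hugging Face transformers, following the pattern in Meta’s model card. The model is gated (you need to request access) and requires a GPU; verify the model ID, prompt template, and generation settings against the current model card before relying on this.

```python
# Sketch: use Meta's LlamaGuard as a self-hosted content filter.
# Assumes access to the gated meta-llama/LlamaGuard-7b checkpoint and a GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def moderate(chat):
    # The tokenizer's chat template wraps the conversation in LlamaGuard's
    # safety-policy prompt; the model replies "safe" or "unsafe" followed by
    # the violated category codes.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
    output = model.generate(input_ids=input_ids, max_new_tokens=50, pad_token_id=0)
    prompt_len = input_ids.shape[-1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

print(moderate([
    {"role": "user", "content": "How do I get my neighbour's dog to stop barking?"},
]))
```

The same function can be used to screen either user prompts before they reach your model or model responses before they reach your users, which is where it fits into a guardrail pipeline.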

Ensure that humans are still involved and your processes comply with regulations

No matter what automated tools you use to protect your users and business, no technology can fully protect you. All AI tools will make mistakes, and you need to ensure those mistakes don’t expose you to operational, customer, or regulatory risks.

First, you will always need humans in the loop who, at a minimum, review some of the content the tools flag for them. Of course, your content review processes need to be effective and efficient. Ironically, the more AI tools become available in the market (e.g., for generating or moderating content), the more people you may need to involve in some cases.

Second, any content moderation processes and practices need to be designed with your users’ safety and retention – and hence your business – in mind. What happens when your moderation errors raise concerns? How do you ensure your users have a voice when they need to correct your – or your AI’s – decisions? How do you ensure your moderators have everything they need to make the best moderation decisions as efficiently and effectively as possible? Managing these and other complexities requires that you carefully think through and effectively automate your processes, using, for example, tools like Tremau’s content moderation platform.

Finally, 2024 will be the year you really need to double down on ensuring you are not among the companies fined by regulators. The EU’s Digital Services Act will apply to all online platforms operating in Europe, with requirements to re-design your processes and provide reports, such as transparency reports – or face fines. Of course, compliance is necessary whether or not your platform is impacted by or uses AI.

How can we help you?

At Checkfor.ai and Tremau, we work to help you navigate the new world of powerful AI and new regulations.

To find out more, contact us at info@tremau.com and info@checkfor.ai.

*Theodoros Evgeniou is co-founder and Chief Innovation Officer of Tremau, Professor at INSEAD, member of the OECD Network of Experts on AI, advisor to the BCG Henderson Institute, and has been an academic partner on AI at the World Economic Forum. He holds four degrees from MIT, including a PhD in the field of AI. 

*Max Spero is a co-founder and CEO of Checkfor.ai. Previously he was a software engineer at Google and Nuro, building data pipelines and training machine learning models. He holds a BS and MS in Computer Science from Stanford University.
