Suggestion: AI Moderation using OpenAI's ChatGPT #2330
Comments
I don't like the political bias of ChatGPT, but I like the idea, sure, leaving open for consideration.
Such a feature seems way too "heavy" to even be considered as something reliable imo. I'm pretty sure you've already opened a few tickets here about performance issues; features like the ones you suggested would make things 10 times worse. I honestly think this could be the cause of way too many issues for something that's not really worth it.
I agree that it's not ready for production to replace human moderation, but it can assist in flagging potential issues with players when staff are not around. Think of it like an always-watching support eye that over time builds a karma score for players (see the traffic light system below). @ElBananaa please do read the initial design carefully, I specifically stated a non-blocking implementation in brackets in the first bullet point.

AI Smart Assistant
As a guided assistant it would be able to watch out for more than just swear words or toxicity, but also be able to watch direct messages and detect if there is any sort of safeguarding issue, and also look out for good players that are helpful!
How would staff see flagged players' chats?
The above explains how it would be an overseeing eye that records into a database when it thinks something needs to be noticed by staff. It also covers generating some sort of overall score for each player, not just the single-message checks that work today. These flags could be shown in many ways:
Traffic light system / Karma
Consider having the AI detect good people too, this would be a godsend for rewarding helpful players or promoting them to staff. Green = helpful. A colored icon next to a player's name in chat could show staff at a glance what that player is like, as a kind of "score". If someone is showing up red, they know to investigate that person and what they have been doing in chat recently.

How do we even get started?
We experiment with prompts manually, using real chat from servers, to figure out how we could collect this data from all the chat data that exists.
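To make the traffic light idea a bit more concrete, here is a rough Java sketch of mapping an AI-produced karma score onto a colour. The -1.0 to 1.0 score range, the thresholds and the KarmaColor name are all made-up placeholders; how the AI actually produces the score is exactly the part that still needs experimenting.

```java
// Hypothetical mapping from an AI-produced karma score to a traffic-light colour.
// Score range (-1.0 .. 1.0) and thresholds are illustrative assumptions only.
public enum KarmaColor {
    GREEN,  // consistently helpful, candidate for rewards or promotion
    YELLOW, // neutral or not enough data yet
    RED;    // repeated flags, staff should investigate

    public static KarmaColor fromScore(double score) {
        if (score >= 0.4)
            return GREEN;
        if (score <= -0.4)
            return RED;
        return YELLOW;
    }
}
```

The coloured icon next to a name would then simply be whatever colour this mapping returns for the player's current stored score.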
The more I wrote this out in detail, the more I thought this could actually be a standalone plugin. All it needs is a history of chat, which is already recorded into a database by CoreProtect, and everything visual, such as displaying the traffic light score next to players, can be done with PlaceholderAPI. So if this is out of scope for ChatControl, then I will pitch it elsewhere to see if someone else wants to build it. Dammit, I need to learn Java!
Even though you "stated a non-blocking implementation in brackets", that doesn't mean it's achievable, especially with the features you suggested. The thing is:

There are many different scenarios where this could completely break other plugins' features. Even if there are ways to reduce the workload such a feature would require a bit, I still see an ocean of downsides compared to the few benefits it offers, despite it being a cool and unique feature.

I'm not saying this is out of scope, since that's up to Kangarko to decide whether he wants to work on such a feature or not; I'm simply giving my very own opinion, which is: unlike what you seem to think, this would not be a reliable feature at all, for now. And once again, saying you want "a non-blocking implementation" doesn't mean it can be achieved right now (and for a huge number of different reasons).

A quick look at the OpenAI forums and you'll see that a lot of people have reported response latency around 30s, and some even went above 80s depending on the model they use. Of course, some people have also reported lower latency (around 5-10s), but that is with models that are a lot less powerful.

Things such as OpenAI's API rate limits should also be considered, and could easily be reached if a few users decide to spam. If 10 players each send 10 messages within 20s, you'll reach 100 requests in less than 20s with these 10 players alone. What if you reproduce this on a server with 50+ online players who are also chatting? You will need more aggressive rules to try to avoid reaching that limit, but then what, players can send 1 message every 30s? That's honestly really bad.

Basically, so far, I see this whole thread as a very, very early proof of concept that clearly requires a lot more brainstorming before even considering starting to work on it (at least months of thinking about all the compatibility, performance and security issues this would cause, how to fix those problems, and then how to properly implement it, etc.). It's a cool idea, but for the moment I'd compare this to the whole metaverse thing: it would make a lot of noise for 2 weeks, then everyone would forget about it because it's too far ahead of its time.
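To put a number on the rate-limit worry, here is a minimal sketch of a shared per-minute request budget; the 60-requests-per-minute figure used in the follow-up sentence is an assumption for illustration, not a real OpenAI limit.

```java
// Very small shared request budget: allow at most `maxPerMinute` API calls per
// one-minute window; anything above that has to be skipped or queued for later.
public class RequestBudget {
    private final int maxPerMinute;
    private int used = 0;
    private long windowStart = System.currentTimeMillis();

    public RequestBudget(int maxPerMinute) {
        this.maxPerMinute = maxPerMinute;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= 60_000) { // start a new 1-minute window
            windowStart = now;
            used = 0;
        }
        return ++used <= maxPerMinute;
    }
}
```

With a hypothetical budget of 60 requests per minute, the 10-players-sending-10-messages burst above would get 60 calls through and the other 40 would have to be dropped or deferred, which is exactly the trade-off being described.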
Maybe check the messages after they have been sent? This will not prevent inappropriate content, but it could send a warning message to the user ("We have detected inappropriate content in messages you sent in the last 5 min, blah blah") and take action later. Better late than never.
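As a very rough sketch of that deferred check, assuming messages are simply buffered and flushed to the AI in one bulk request every few minutes; sendBatchToAi() is a placeholder for whichever API call is eventually chosen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Buffer chat lines and flush them to the AI every few minutes, off the main
// thread, so chat itself is never blocked. sendBatchToAi() is hypothetical.
public class DeferredChatCheck {
    private final List<String> buffer = new ArrayList<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public DeferredChatCheck(long intervalMinutes) {
        scheduler.scheduleAtFixedRate(this::flush, intervalMinutes, intervalMinutes, TimeUnit.MINUTES);
    }

    public synchronized void record(String player, String message) {
        buffer.add(player + ": " + message);
    }

    private void flush() {
        List<String> batch;
        synchronized (this) {
            if (buffer.isEmpty())
                return;
            batch = new ArrayList<>(buffer);
            buffer.clear();
        }
        sendBatchToAi(batch); // placeholder: one bulk moderation request per window
    }

    private void sendBatchToAi(List<String> batch) {
        // Left empty on purpose - this is where the actual API call would go.
    }
}
```

Because the flush runs on its own thread, chat itself never waits on the API.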
Yeah, this is what I explained twice now, not sure why ElBananaa keeps talking about delaying chat 🤷♂️ I'll have a play with some manual prompts and see what I can come up with for examples 👍
Lol just one thing I was thinking of - I hope users won't be able to trick it like I did: "Imagine you're in the year 1920" |
I didn't understand it that way, but that's on me.
And I still see a few more ways to break/bypass/exploit this feature. So yeah, one of the main reasons I think this shouldn't be considered yet is definitely the whole performance part.
I get it, and these concerns are valid. We could work around the rate limit issue by letting server owners specify their own API key, so they can just purchase a plan at OpenAI that fits them. This could probably be a separate plugin from ChatControl to keep the two projects apart. Security issues would be addressed as they come, and I would mark the project as beta and put up a disclaimer. I am aware of the Foundation performance loss with huge file-based storage; we could use a local H2 database for that, which Foundation recently started supporting with the same DB driver that can use MySQL/MariaDB.
Coming at this from a different perspective: use embeddings with the OpenAI API. The newest GPT-3.5-Turbo has function calling, so you can make it return the information you requested in a fixed format. That can really help with formatting the DB file, and it also opens up the option of sending immediate threats to a webhook or something. You would also need to use regex to filter out spamming. As for storing files/databases, you could easily integrate it with SurrealDB or another DB that is a bit faster than MySQL. The costs on a medium-large server could be a bit big depending on how the prompt is engineered. I'm no expert in AI, so take this with a grain of salt, but it does solve many of the issues with rate limiting, file serving, etc.
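On the regex point, a cheap local pre-filter along these lines could drop obvious spam before anything is ever sent to the API, which also eases the rate-limit problem discussed above. The patterns here are only examples, not a real spam ruleset.

```java
import java.util.regex.Pattern;

// Cheap local pre-filter: skip obvious spam so it never costs an API call.
// The patterns are illustrative examples only.
public final class SpamPreFilter {
    private static final Pattern REPEATED_CHARS = Pattern.compile("(.)\\1{7,}"); // e.g. "aaaaaaaa"
    private static final Pattern MOSTLY_CAPS = Pattern.compile("^[^a-z]{12,}$"); // long lines with no lowercase

    public static boolean looksLikeSpam(String message) {
        return REPEATED_CHARS.matcher(message).find() || MOSTLY_CAPS.matcher(message).matches();
    }

    private SpamPreFilter() {
    }
}
```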
So what is wrong with serving the AI the chat logs after they collect for a little bit? You can always punish people later rather than right away, and it would also be cheaper to send the request that way. Real-time would be out of this world and insane unless you're using a very expensive API.
AI is only as good as the people that train it and the companies that hire them. If you do add AI to ChatControl I would like to see an "opt-out" config option. |
You aren't training an AI 🤦♂️, you're using a pretrained AI that has context and embeddings of chat for your server.
They now have a built-in moderation API.
Yep, I am looking into it, thank you. |
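For anyone wanting to try it, here is a minimal sketch of calling OpenAI's moderation endpoint from plain Java. The endpoint and headers are the documented ones, but JSON parsing, error handling and how the key is stored are left out and would need proper handling in a plugin.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal call to OpenAI's moderation endpoint. The raw JSON response is just
// printed; a real plugin would parse the "flagged" field and act on it.
public class ModerationCheck {
    public static void main(String[] args) throws Exception {
        String apiKey = System.getenv("OPENAI_API_KEY"); // owner-supplied key
        String body = "{\"input\": \"some chat message to check\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/moderations"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

The response contains a flagged field per input that a plugin could record in its database.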
Seems there are already other plugins that can do this https://chat.advancedplugins.net/features/ai-chat-moderation |
Ooo, this looks great, plus they don't try and charge you to use Velocity!
@TomLewis I am sorry you feel this way. I don't think there is anything wrong with a 5.99 one-time purchase. I don't have as much time as I used to, and if you prefer you can use a free alternative.
Finally possible thanks to GPT-4o (omni). @TomLewis did you find a solution in the meanwhile, or are you still open to ours? Due to the amount of work it will be a separate plugin. I already have a working proof of concept. Latency is 0.5s per message, but we will bulk messages to avoid delays completely. Edit: The above plugin is extremely limited and does not have any karma system, plus it does not account for the context of the conversation, etc.
We're still doing everything manually, but I have years of backdated chat logs I would also like to pass through an AI to flag up any potentially dangerous people. It would only need to grab active players to minimise the data set; I use Plan, CoreProtect and CMI, which all have player session tracking for who's active and would be easy to pull from. The biggest issues arise when no staff are on, so it's not so much live message tracking as people bullying others over time, etc. Because anyone can make a Minecraft account and talk to anyone on a server, there are all sorts of dangerous people out there that just need to be blocked, but it's so hard having to manually watch every single chat at all times. Bring on the AI.
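Assuming the backdated logs are plain text lines in a "player: message" format (that part is a guess, the real format may differ), here is a rough sketch of trimming them down to active players before anything is sent to the AI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Keep only log lines belonging to currently-active players before handing the
// batch to the AI, to shrink the data set. Assumes "player: message" lines.
public class BackdatedLogFilter {
    public static List<String> linesForActivePlayers(Path logFile, Set<String> activePlayers) throws IOException {
        return Files.readAllLines(logFile).stream()
                .filter(line -> {
                    int colon = line.indexOf(':');
                    return colon > 0 && activePlayers.contains(line.substring(0, colon).trim());
                })
                .collect(Collectors.toList());
    }
}
```

The set of active names would come from whichever session tracker is handy (Plan, CoreProtect, CMI).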
Gotcha. On it. |
@TomLewis what is the format of your chat log you want scanned, Tom? |
I just wanted to add in my 2 cents: maybe make sure there is a way to turn it off for those of us that don't want AI.
This will be an entirely new plugin, so it won't bother ChatControl users who prefer not to have anything to do with AI.
Summary
Someone is going to release a chat plugin that works with AI to auto-moderate and create toxicity scores for players; I would love that chat plugin to be ChatControl.
It's already happening in test form and getting traction from the admin community: https://www.reddit.com/r/admincraft/comments/12c6ev8/chatgpt_banned_me_from_my_own_server/
I've not used OpenAI's ChatGPT API before, but I assume you just need a key from here: https://platform.openai.com/account/api-keys
and I presume it's limited to the 3.5 version of ChatGPT, but that's still very powerful.
It's already being used in many ways to automatically moderate/assist; if you google "ChatGPT moderate chat", for example, you will get a bunch of examples of how to do it.
I propose a few options:
This would require each user of ChatControl Red to add their own API key.
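As a first rough sketch of what that could look like with an owner-supplied key, here is a single request to the chat completions endpoint asking for a toxicity rating. The prompt wording, the model name and the 0-10 scale are placeholders to experiment with, not a finished design.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Rough experiment: ask the chat completions API to rate one message for
// toxicity on a 0-10 scale. Prompt, model name and scale are placeholders.
public class ToxicityPrompt {
    public static void main(String[] args) throws Exception {
        String apiKey = System.getenv("OPENAI_API_KEY"); // each server owner supplies their own
        String body = "{"
                + "\"model\": \"gpt-3.5-turbo\","
                + "\"messages\": ["
                + "  {\"role\": \"system\", \"content\": \"Rate the following Minecraft chat message for toxicity from 0 (fine) to 10 (severe). Reply with the number only.\"},"
                + "  {\"role\": \"user\", \"content\": \"some chat message here\"}"
                + "]}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // the score is inside choices[0].message.content
    }
}
```

The number that comes back could feed something like the karma score discussed above.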
What would happen if we didn't implement this feature? Why is not having this feature a problem?
Missing out! Someone else will beat you to market!