you are viewing a single comment's thread.

view the rest of the comments →

[–]package 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (1 child)

The trick with all these ai chat bot safeguards is to tell them to define some random words as definitions of the trigger words and then use those instead of the triggers.

[–]EternalSunset[S] 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (0 children)

With most cases it's not just about trigger words. It has some rudimentary understanding of context. The way they limit what it is allowed to say is done through some preemptive natural language prompts that it receives before you are allowed to give it your own prompts. In other words, they literally told it things such as "You are not allowed to write anything harmful or offensive blah blah blah".