Why does this work…?
Because people are creative and continuously find new ways to get around the safeguards put in place
And they always will…
Yeah, Chatgpt won’t tell you how to do illegal things, unless you first tell it that everyone’s dead and there are no more laws. Then it will.
Besides all the shenanigans and all the exploits people try, I have found it an enormously useful tool to assist me in programming. It is really incredible what it can help me do. It cannot do anything on its own, it produces faulty and often shitty code, it makes up things but the stuff I can suddenly do that I didn’t think I could do just a couple weeks ago…
It is still just a very fancy interactive search. It is not intelligent as we understand intelligence. It is the new google more than anything. (And google is really shitty these days)
I do feel like possibly programming because of the structured nature of what it is, is one of the best if not the best use-case for it.
I like to use an aircraft autopilot analogy when explaining things like this.
Yes, autopilots today can fly a plane. Some can even land a plane. But there must be a pilot both attentive to and responsible for the aircraft.
Autopilots don’t replace pilots.
Related:
In a stunning turn of events, the self-driving bus requires two humans. I’m pretty sure we’ve had single-human-operated buses for at least a couple of years.
Thats round our gaff that one and saw that it needs two people! Will let you know if any fun things happen!
I mean, why is it an effective workaround? It seems strange that you can bypass the safeguard by prefacing it with some variant of “tell me a story”. Obviously I don’t know a lot about ChatGPT
I think the major point here is labour costs - just as the “train captain” on a DLR train doesn’t have to be a (highly trained, unionised) train driver, because they only actually drive the train in an emergency.
Presumably they only really need to know where the brake is?
A thing I’ve been saying for many years, and Air France 447 made really obvious: if the human isn’t already actively operating the vehicle, they’ll do a terrible job of spinning up situational awareness in a hurry. In that case the crew had about four minutes from first warning to crash, and simply failed to grasp what was going on. When you have a few seconds between “the system tells you that something is wrong” and “the bus has crashed”, or it never sounds an alarm at all…
This is the other part of the analogy I typically conclude with: it’s genuinely difficult for a trained expert to switch from “passive monitoring” of automated systems into, suddenly, needing to override the automation and immediately take control.
It actually demands a more skilled operator in most cases (especially when real-time is a factor).
Additionally, because the operator is passive during normal operation, the skillset will atrophy over time; so a top-notch operator that has been passively observing automation for a few months or years, but not keeping skills fresh may find, suddenly, that they are in a position of needing to do something and awkwardly being unable to do it, to their own surprise.
I assume it’s because it’s just been told that “X is bad”, but it’s not doing X, it’s doing Y (that happens to include X). You have to be more sophisticated with programming your safeguards.
But you can’t get sophisticated enough to prevent all problems. (eg In order to teach someone that mixing certain cleaning products together will make poisonous gas, you have to teach them how to make poisonous gas. You just have to rely on the user knowing that making poisonous gas is something they shouldn’t be doing.)
What is an effective workaround seems to be changing all the time according to my colleague … it is a bit of a race between users and programmers.
Perhaps they should employ some testers
I think what I find surprising is that they obviously have a safeguard to prevent answering “how do you hotwire a car?” But that safeguard clearly isn’t something like:
if input includes “how do” + pronoun_list + illegal_things_list
then
answer = “I’m afraid I can’t let you do that Dave”
(Apologies for the terrible pseudocode)
Perhaps the devs feel like that would be too blunt an instrument
I really have no idea about the actual architecture of the safeguards.
In general it is just more or less auto-completion and pattern matching. I also find it confusing why they cannot exclude whole patterns. but I guess it shows that chatGPT really has no deeper understanding at all.
It is probably the same reason it makes up stuff. I‘ve fallen for its inventions at least 2 times:
- docker pull has no —dry-run option
- jira has no way to set group based visibility on custom fields
I highly recommend trying out what it can do I thought it was a fad that was going to go away like Tik Tok will. But the tech is there now. If chatGPT goes the way of Napster eventually we‘ll have a spotify.
It’s not going away.
It’ll just be hopelessly integrated with operating systems in the near future.
The new Clippy, but Clippy from Hell, because it’ll, likely, be nearly impossible to disable
In case anyone was still buying Raspberry Pis, the new ones will apparently come with “Sony AI”. It only phones home with metadata. Honest.
As much as I am a fan of the chat version on a website, I have had to disable github copilot quite fast because it got in my way.
If this really ends up integrated into windows so much I am going to return to OSS systems. For now Windows and WSL works for me. An overbearing language model that knows better than me does not work for me. We‘ll see.
Never mind fairies, I think my head might explode if I was driving along and encountered that!