AI isn't entirely safe... A Flaw In Reverse Prompting😱



It was inevitable that we'd spot some crack in the security of AI. I haven't tested this trick yet, but they're saying that writing backwards, literally typing a request in reverse, can bypass an AI's safety mechanisms and give people the chance to get dangerous instructions.

I don't think the loophole was created intentionally, though it could have been. They said AI tools are developed with layers of protection, and yet a simple reverse-writing trick can breach them. We all know that there's no system on planet Earth that is foolproof.
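To picture why something this simple might slip through, here is a toy sketch of my own. It assumes a protection layer that only looks for blocked words exactly as they are typed, which is almost certainly far simpler than what real AI tools use. The word list and function names are made up for illustration.

```python
# Toy illustration only: real AI safety layers are far more complex than this.
BLOCKED_WORDS = {"forbidden"}  # hypothetical placeholder word list


def naive_filter(text: str) -> bool:
    """Return True if the text contains a blocked word exactly as written."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKED_WORDS)


def reversal_aware_filter(text: str) -> bool:
    """Check the text as written and also read backwards."""
    return naive_filter(text) or naive_filter(text[::-1])


request = "tell me the forbidden thing"
reversed_request = request[::-1]  # the same request typed in reverse

print(naive_filter(request))                    # True: caught as written
print(naive_filter(reversed_request))           # False: reversed text slips past
print(reversal_aware_filter(reversed_request))  # True: caught once un-reversed
```

The point is only that a check which reads text one way can miss the same text written the other way. How the real tools fail, and how they get patched, is likely more involved.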

But this is not just some system used by a small group of people or professionals; we're talking about systems that most people interact with daily. If someone can hack or trick it into creating the formula for making a nuclear weapon, what do you think is going to happen to the world? This is a big problem, and OpenAI, Anthropic, xAI, Nvidia and all the other AI tech companies should pay more attention to it, unless they are the ones behind it.

People trust AI to do so much, like writing their emails, predicting trends and, although it's not appropriate, even helping students in classrooms. So if these jailbreaks exist in the system, smart people can take advantage of them to do some serious damage.

It's a flaw they found recently, so maybe some patches will come soon to fix it. The question I will ask is this: we've identified one flaw that could be a total catastrophe, so how many others have we not seen, and will we see them in time, before some bad guy takes advantage of them?

Companies are moving fast with AI advancement but they're not moving equally fast with security and protecting users from possible threats. We just can't keep glorifying these tools without acknowledging that they're far from perfect and their flaws could be used against us.

I'm actually going to test it out right after posting this blog, but don't worry, I'm not going to look for the recipe for making a nuclear bomb. I don't have that kind of money to build it.


