Originally published on Cybersecurity Magazine.
Cybersecurity is an industry that evolves fast. Systems are more complex, more critical and more public than ever.
There is another element to this trend however. Cybersecurity moves fast because threats & nefarious actors move even faster. Techniques to compromise systems and evade detection become more and more sophisticated, and it’s about to get a hell of a lot worse with AI in the mix.
You’d have to have been hiding under a pretty big rock to miss the recent news about various billion dollar chatbots taking on everything from image creation to creative writing. Artificial Intelligence and Machine Learning have been used in a variety of cybersecurity tools - but let’s talk about the flip side of that coin.
How could AI be used to attack, rather than defend?
Let’s take a look at some of the approaches in AI.
Generative AI is a type of machine learning where a neural network is trained with large amounts of data, and can create data similar to the training material. An example of this could be thousands of images of a single subject, and you then ask the model to create a new one.
Supervised learning is training an algorithm with data that has a specified output, and then asking the algorithm to assess new data and predict the outcome. Think taking a million emails, and telling the algorithm which are spam and which aren’t. The model should then be able to tell whether a new email is spam or not, and flag it as such.
Unsupervised Learning is when you train an algorithm, but without a set outcome. You then ask the algorithm to find patterns and structure in the data. An example of this could be looking at similarities in shopping patterns across a large number of customers in a retail app.
Reinforcement Learning is when an algorithm is rewarded or punished for outcomes, based on trial and error. Like a robot trying to navigate a maze, and being rewarded for finding the exit, or punished for making wrong turns.
Now think of some nefarious applications for the technology.
While AI is using better algorithms to detect, identify and block spam by training models with a huge number of example emails - the threat actors could be using technology to write more convincing content that continually changes. Making it both harder to detect, and more likely to fool users when it makes it through the net.
That robot finding its way through the maze via trial and error? Picture an algorithm with plenty of patience testing all possible angles of attack on a web application. Not only can it systematically worth through all possible attacks, it can look for combinations of attacks and vulnerabilities that would be otherwise unexploitable independently, and it can be trained to be rewarded for successful exploits and seek this pattern out on hundreds of thousands of other applications in almost real time.
Another concern is data poisoning, where algorithms are trained using manipulated data to create a bias that advantages the attacker, such as misinformation in the run up to an election.
What’s interesting (read, terrifying) is that this isn’t theoretical. Proof of concept malware was developed by IBM in 2018, using AI to evade detection, and deliver tailored payloads at the opportune moment of attack. Popular chatbots have been flooded with poisoned data within days of going live.
Another emerging attack vector from large language models comes from prompt injection attacks. Similar to data poisoning, these attacks breach the nature of the AI systems to generate nefarious responses. They have the potential to leak data, execute unauthorized code, or exploit other vulnerabilities on systems such as ChatGPT or software built on top of it.
Does that mean that Skynet is poised to attack?
In a word, no. These systems aren’t sentient (yet) but they are being wielded by humans, for both good and bad. A major risk could be that a piece of misunderstood code is deployed in the wild, and the consequences exceed the expectations of the author. History would be repeating itself, with examples like the Morris Worm and the Samy Worm both were far more effective than originally intended.
The other concern is that these technical advances create a “next gen” moment for attackers, where they can rapidly adopt new techniques, and outpace the defenses currently in place.
Ok, pass me the tin foil hat. I’m going off grid.
Cybersecurity companies have been using machine learning and AI for a decade, and haven’t been caught napping. While these technologies are new to the general public, they have been evolving how we think about the systems we secure for years.
As always, defending your assets requires a multi-layered approach. Security tools using AI are trained to identify patterns of incoming attacks, detect unusual behavior, and automatically take action to isolate systems if a breach is suspected. Natural language models can train employees using the latest phishing emails, and simulated attacks can be used to quickly assess security of a new system, or check for regressions/new threats on existing systems.
Existing generative AI models are able to code at an entry level software developer level (passing the Google technical entrance exam) and can spot vulnerabilities in static code with better coverage than industry leading cyber security scanners, covering more languages and with a lower false positive ratio, and they will only get better over time.
And don’t forget humans. As security evolves, so do security professionals. These heroes of the infosec world aren’t waiting around for AI to replace them or crash their networks, they are working day and night to find new ways to utilize this new wave of technology to their advantage.
Keep calm, and carry on. We got this.