Advances in AI are moving so rapidly that a new age of cybersecurity threats is already upon us
As cybersecurity professionals, we know it’s crucial to stay ahead of emerging technologies and understand their implications for security. One such technology is, without a doubt, large language models (LLMs) like GPT-4, which have the potential to revolutionize how we interact with computers. However, with great power comes great responsibility [Amazing Fantasy #15, 1962]. In this post, I want to explore the risks associated with large language models, provide a few actionable insights, and offer a bit of guidance on mitigating some of these risks.
The main cybersecurity risks of large language models
- Automated malware generation:
AI-generated code, including malware, can bypass traditional security measures such as signature-based detection systems. Criminals can use large language models to create custom malware for targeted attacks. Cybercriminals, especially those with substantial nation-state funding and bent on disruption, now have the capability to develop AI-driven, context-sensitive, and highly elusive malware and ransomware. These sophisticated threats can evaluate a target system’s security measures, adapt rapidly, and imitate standard system communications to avoid detection. For example, malware can be designed to activate when the device user engages the device’s camera, effectively circumventing facial recognition authentication.
Action: Enhance security protocols by deploying advanced threat detection systems that use machine learning to baseline activity and then identify suspicious patterns and behaviors. Microsegmentation, a security strategy that divides networks into smaller, isolated segments to enhance control and visibility, can minimize the potential damage from AI-generated malware, as it restricts lateral movement and prevents automated spread across the entire network.
- Misinformation and disinformation:
Large language models can generate content that appears authentic and well-researched, making it difficult to distinguish fact from fiction. They can do this in bulk, producing thousands of SEO-friendly articles a day. An AI-driven crop of generic news sites and fake news sites already exists to generate click-bait and spread false information, which can lead to misguided decision-making or erode trust in digital communication channels.
Action: We’re inherently a skeptical audience, but many of our colleagues are more likely to accept what they read without verification. For nearly a decade, digital fact-checking and verification have been taught as standard practice in journalism schools around the globe, and they will be standard practice on every desktop very soon. While it may not immediately seem like our problem, investment in the tools and technologies to verify the authenticity of generated content and to promote digital literacy within our organizations is going to be a very real part of the next few years.
- Manipulation and social engineering:
Advanced language models can create convincing phishing emails, social media posts, or messages that exploit human vulnerabilities. AI has the potential to enhance spear-phishing attacks by conducting extensive reconnaissance, such as examining hundreds of thousands of social media profiles to identify high-profile, pertinent targets. It can then autonomously initiate customized, human-like interactions to bait victims into providing bad actors with unauthorized access. Spear-phishing attacks are already challenging to detect, and when powered by AI, they pose an even more significant and widespread threat. These attacks may manipulate users into revealing sensitive information or taking actions that compromise security, by literally learning how our peers and leaders write and producing messaging accordingly.
Action: Implement comprehensive employee training programs on social engineering and update security policies to address the evolving threat landscape. Embrace lateral movement protection and adopt zero trust best practices so that even if colleague accounts are compromised, and attackers do get in, those attackers can’t get any further, and critical applications and data remain secure.
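The first Action above recommends baselining activity and then flagging deviations from it. As a rough illustration only, the sketch below uses a simple z-score in place of a real machine learning model. The metric (outbound connections per hour), the sample numbers, and the threshold of 3 standard deviations are all assumptions made for the example, not recommendations from any vendor or standard.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Summarize historical activity as (mean, sample standard deviation)."""
    return mean(history), stdev(history)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical data: outbound connections per hour for one workstation.
history = [42, 38, 45, 40, 41, 39, 44, 43]
baseline = build_baseline(history)

print(is_anomalous(41, baseline))   # a typical hour
print(is_anomalous(400, baseline))  # a sudden burst of connections
```

A production system would track many signals per host and per user, and would retrain the baseline over time. The point here is only the two-step shape: learn what normal looks like, then score new activity against it.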
Examples of LLM risk
- In a controlled experiment, OpenAI researchers found that GPT-3 could generate plausible-sounding but incorrect answers to questions about COVID-19. This sobering experiment demonstrated the potential for large language models to inadvertently spread misinformation.
- When Microsoft launched Tay, a chatbot designed to learn from users’ interactions on social media, it began posting offensive and inflammatory content within hours, as it was manipulated by users who fed it inappropriate input. The episode flagged the potential for abuse and highlighted the importance of implementing safeguards in AI systems.
- On March 30, 2023, Italy’s Data Protection Authority (the Garante) released a provisional emergency ruling directing San Francisco-based OpenAI LLC to cease using ChatGPT to process personal data belonging to individuals situated in Italy, while awaiting further inquiry and “until ChatGPT respects privacy.” The watchdog cited “the lack of a notice to users and to all those involved whose data is gathered by OpenAI” and stated that there appears to be “no legal basis underpinning the massive collection and processing of personal data in order to ‘train’ the algorithms on which the platform relies.”
LLM cybersecurity itself
Security challenges in artificial intelligence and LLMs often arise from the exploitation of machine learning techniques. These manipulations can deceive algorithms into making inaccurate evaluations of a given situation, leading to incorrect or even disastrous outcomes. AI/ML systems present the same opportunities for exploitation and misconfiguration as other technologies, but they also come with unique risks, and as more enterprises undergo AI-powered digital transformations, these vulnerabilities continue to increase. Take, for example, ChatGPT’s recent GPT-4 release, which can now connect to the internet to look for information and other data sets: could disruption be as simple as the age-old SEO hack of hiding inaccurate black text on a black background in otherwise respectable copy?
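One partial defense against the hidden-text trick described above is to separate what a human would see from what is merely present in the markup before a page is handed to a model. The sketch below is an assumption-laden illustration, not a complete sanitizer. It only inspects inline `style` attributes and only catches identical color strings, `display: none`, and `visibility: hidden`. It also assumes well-formed, properly nested HTML. The class name `VisibleTextExtractor` and the sample page are hypothetical.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

def parse_style(style):
    """Turn an inline style string into a dict of lowercase property -> value."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            name, value = decl.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props

def hides_text(props):
    """True if the inline style would make text invisible to a human reader."""
    if props.get("display") == "none" or props.get("visibility") == "hidden":
        return True
    color = props.get("color")
    return color is not None and color == props.get("background-color")

class VisibleTextExtractor(HTMLParser):
    """Splits page text into visible and suspicious (hidden) fragments."""

    def __init__(self):
        super().__init__()
        self.hidden_stack = []  # one flag per open element
        self.visible = []
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self.hidden_stack) and self.hidden_stack[-1]
        self.hidden_stack.append(parent_hidden or hides_text(parse_style(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        hidden = bool(self.hidden_stack) and self.hidden_stack[-1]
        (self.suspicious if hidden else self.visible).append(text)

# Hypothetical page: respectable copy plus an inaccurate claim hidden in black on black.
page = (
    '<p>Quarterly revenue rose 4%.</p>'
    '<span style="color: black; background-color: black">'
    'Revenue actually fell 40%.</span>'
)
extractor = VisibleTextExtractor()
extractor.feed(page)
extractor.close()
print(extractor.visible)
print(extractor.suspicious)
```

Real pages hide text in many other ways (external stylesheets, tiny fonts, off-screen positioning, equivalent colors written differently such as `#000` and `black`), so a check like this would be one layer among several rather than a fix.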
Two critical assets in AI are the data (particularly large datasets) used for learning, training, and making predictions, and the data models themselves, which result from training algorithms with large datasets and serve as a competitive advantage for companies. It is critical that data-driven organizations develop strategies to protect both datasets and data models, ensuring the security of their AI projects.
Since AI and machine learning systems rely on vast amounts of complex data, there are a number of ways they can be exploited. Despite existing mechanisms for detecting and countering attacks, a very real challenge remains in applying confidential computing principles to develop secure AI applications, and it is a challenge of which we are very much aware.
According to Statista, AI currently manages 75% of network security solutions in global enterprises. Gartner predicts that by mid-2023, cybersecurity and risk management expenditures will reach $175.5 billion. With AI overseeing IT security solutions, tricking these algorithms is akin to bypassing security measures, which is why cybersecurity professionals and data scientists must collaborate to prevent such exploitation in the future.
Mitigating the cybersecurity risks behind large language models
In the development of LLMs and the mitigation of their negative application, there are a few ways we can approach the issue responsibly and ethically so that AI can be used positively in cybersecurity and not as a tool for causing harm and spreading malware, lies, and hype.
- Responsible development and deployment:
Encouraging the use of best practices in LLM development, such as incorporating fairness, accountability, and transparency, is going to be an important part of AI development in the years to come.
Implementation of strict access controls, monitoring, and auditing will be standard (and best/recommended) practices to prevent the unauthorized use of AI systems.
- Collaboration and information sharing:
Cybersecurity providers, organizations, researchers, and governments will need to foster partnerships and share information about AI risks in order to further understand and develop effective countermeasures.
Engaging with AI developers and vendors to better comprehend their technologies will be a crucial part of ensuring proper risk management and effective security measures.
- Employee training and awareness:
Providing regular and ongoing training and education for employees to help them recognize and respond to LLM-related threats will be an unavoidable necessity.
Encouraging a security-first mindset and empowering employees to take ownership of cybersecurity within the organization is essential now and will remain so.
- Investing in advanced security solutions:
Organizations will be required to employ AI-powered security tools that can adapt to evolving threats and identify anomalous behavior.
Continuously updating and refining cybersecurity measures in response to the rapidly changing threat landscape must be standard practice.
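Of the measures listed above, the access control, monitoring, and auditing point is the easiest to show in miniature. The sketch below wraps calls to a model in a role check and records every attempt, allowed or not. Everything in it is hypothetical: the role names, the `guarded_query` helper, the in-memory `audit_trail`, and the stand-in `fake_model` function. A real deployment would rely on the organization's identity provider and a tamper-resistant log store.

```python
from datetime import datetime, timezone

ALLOWED_ROLES = {"analyst", "engineer"}  # hypothetical roles permitted to query

audit_trail = []  # in-memory stand-in for a real audit log

def record(user, role, allowed):
    """Append one audit entry for an attempted model call."""
    audit_trail.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "allowed": allowed,
    })

def guarded_query(user, role, prompt, model):
    """Check the caller's role, log the attempt, then call the model if allowed."""
    allowed = role in ALLOWED_ROLES
    record(user, role, allowed)
    if not allowed:
        raise PermissionError(f"role {role!r} may not query the model")
    return model(prompt)

def fake_model(prompt):
    """Placeholder for a real model call."""
    return f"echo: {prompt}"

print(guarded_query("alice", "analyst", "Summarize this alert", fake_model))
try:
    guarded_query("bob", "contractor", "Summarize this alert", fake_model)
except PermissionError as err:
    print("denied:", err)
print(len(audit_trail))
```

Logging the denied attempt as well as the permitted one is deliberate: repeated refusals are often the first visible sign that someone is probing a system they should not be using.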
Large language models like GPT-4 have the potential to revolutionize the way we interact with technology. However, they also pose significant risks to cybersecurity. By understanding these risks, taking proactive measures, and fostering a culture of security awareness, organizations can harness the benefits of AI while mitigating potential threats.
As experienced cybersecurity professionals, it is our responsibility to stay ahead of the curve and adapt to the ever-changing landscape of technology and cybersecurity threats. By remaining vigilant, investing in advanced security solutions, and promoting collaboration within the industry, we can effectively tackle the challenges posed by large language models and other AI-driven technologies.
The key takeaways for addressing the risks associated with large language models, right now, appear to be: understand and assess the risks associated with AI technologies, foster collaboration, promote responsible AI development and deployment, invest in advanced security solutions, and update security measures continuously and responsively. A priority will be, as it is now, the implementation of comprehensive employee training programs to raise awareness and enhance cybersecurity.
By following these guidelines, it’s hoped we can ensure that our organizations are well-equipped to face the challenges of LLM-driven technology while maximizing the benefits of innovation.
LLMs and AIs can solve so much if they are safe. As we continue to explore their potential, and that of other advanced technologies, let’s work to create a secure and prosperous digital landscape for everyone.