Are we ready for AI-generated code?

In recent months, we’ve been blown away by the quality of computer-generated faces, cat photos, videos, articles, and even art. Artificial intelligence (AI) and machine learning (ML) have also quietly slipped into software development, with tools like GitHub Copilot, Tabnine, Polycoder, and others taking the next logical step of putting existing code autocomplete functionality on AI steroids. Unlike cat images, though, the origin, quality, and security of an app’s code can have far-reaching implications, and at least for security, research shows that the risk is real.

Academic research has already shown that GitHub Copilot often generates code with security vulnerabilities. More recently, hands-on analysis by Invicti security engineer Kadir Arslan demonstrated that insecure code suggestions are still the rule rather than the exception with Copilot. Arslan found that suggestions for many common tasks include only the absolute bare bones, often taking the most basic and least secure route, and that accepting them without modification can result in functional but vulnerable implementations.
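As a hedged illustration of the pattern Arslan describes (this is not actual Copilot output, just a sketch of the kind of bare-bones suggestion an assistant might produce for a common task), consider a user-lookup function: the minimal version concatenates user input straight into an SQL query and is open to SQL injection, while the safer version uses a parameterized query.

```python
import sqlite3

def get_user_insecure(conn: sqlite3.Connection, username: str):
    # Bare-bones approach: user input is formatted directly into the SQL
    # statement, so a crafted username can alter the query (SQL injection).
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchone()

def get_user_safe(conn: sqlite3.Connection, username: str):
    # Same task, but the input is passed as a bound parameter and the
    # database driver handles escaping.
    return conn.execute(
        "SELECT id, email FROM users WHERE username = ?", (username,)
    ).fetchone()
```

Both functions work on a happy-path test, which is exactly why the insecure one can slip through review if the suggestion is accepted as-is.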

A tool like Copilot is (by design) autocompletion taken up a notch: it is trained on open-source code to suggest snippets that could be relevant in a similar context. This makes the quality and security of its suggestions closely tied to the quality and security of the training set. So the bigger questions are not about Copilot or any other specific tool but about AI-generated code in general.

It is reasonable to assume that Copilot is only the tip of the spear and that similar generators will become commonplace in the coming years. This means that we, the tech industry, need to start asking how such code is generated, how it is used, and who will take responsibility when things go wrong.

Satnav syndrome

Traditional code autocompletion, which looks up function definitions to complete function names and remind you of the arguments you need, is a huge time-saver. Because these suggestions are merely a shortcut to looking up the documentation yourself, we’ve learned to implicitly trust whatever the IDE suggests. Once an AI tool enters the picture, its suggestions are no longer guaranteed to be correct, yet they still feel friendly and trustworthy, so they are more likely to be accepted.

Especially for less experienced developers, the convenience of getting a free block of code encourages a shift in mindset from “Is this code close enough to what I would write?” to “How can I tweak this code so it works for me?”

GitHub states very clearly that Copilot suggestions should always be carefully analyzed, reviewed, and tested, but human nature dictates that even subpar code will occasionally make it into production. It’s a bit like driving while looking more at your GPS than at the road.

Supply chain security issues

The Log4j security crisis has pushed software supply chain security, and specifically open-source security, into the spotlight, most recently with the White House memo on secure software development and a bill aimed at improving open-source security. With these and other initiatives, the presence of any open-source code in your applications may soon need to be recorded in a software bill of materials (SBOM), which is only possible if you knowingly include a specific dependency. Software composition analysis (SCA) tools also rely on that knowledge to detect and flag outdated or vulnerable open-source components.

But what if your app includes AI-generated code that ultimately originates from an open-source training set? In theory, if even one accepted suggestion matches existing code, you could have open-source code in your software that does not appear in your SBOM. That could lead to compliance issues, not to mention the potential for liability if the code turns out to be insecure and results in a breach. And SCA won’t help you here, since it can only find vulnerable dependencies, not vulnerabilities in your own code.
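For context, here is a minimal sketch of the kind of component record an SBOM typically contains, using CycloneDX-style field names but a made-up package, expressed as plain Python data for illustration. A dependency you add deliberately gets an entry like this; an AI suggestion that happens to replicate open-source code leaves no such trace, so neither the SBOM nor an SCA scan will ever see it.

```python
# Illustrative only: a CycloneDX-style component entry for a hypothetical package.
sbom_component = {
    "type": "library",
    "name": "example-http-client",   # hypothetical package name
    "version": "2.4.1",
    "licenses": [{"license": {"id": "Apache-2.0"}}],
    "purl": "pkg:pypi/example-http-client@2.4.1",
}

# SCA tools match records like this against known-vulnerability databases.
# Code generated directly into your own source files never shows up here.
```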

Licensing and attribution pitfalls

Continuing this train of thought: to use open-source code, you have to comply with its license terms. Depending on the specific open-source license, you may at minimum need to provide attribution, or sometimes release your own code as open source. Some licenses prohibit commercial use entirely. Whatever the license, you need to know where the code came from and how it is licensed.

Again, what if you had AI-generated code in your app that just happened to be identical to existing open-source code? If you were audited, would the audit find that you are using code without the required attribution? Or might you need to open source some of your commercial code to remain compliant? Maybe that’s not a realistic risk with current tools, but these are the kinds of questions we should all be asking today, not in 10 years. (And to be clear, GitHub Copilot does have an optional filter to block suggestions that match existing code, which reduces this supply chain risk.)

Deeper security implications

Coming back to security, an AI/ML model is only as good (and as bad) as its training set. We have seen this in the past, for example when facial recognition algorithms showed racial biases because of the data they were trained on. So if we have research showing that a code generator frequently produces suggestions without regard for security, we can infer that this is what its training set (i.e. publicly available code) looks like. And what if insecure AI-generated code then feeds back into that code base? Can the suggestions ever be secure?

The security questions don’t stop there. If AI-based code generators gain popularity and start accounting for a meaningful proportion of new code, it is likely that someone will try to attack them. It is already possible to fool AI image recognition by poisoning its training set. Sooner or later, malicious actors will try to plant uniquely vulnerable code in public repositories in the hope that it will come up in suggestions and eventually end up in a production application, opening it up to an easy attack.

And what about monoculture? If many applications end up using the same highly vulnerable suggestion, whatever its origin, we could be looking at vulnerability epidemics or perhaps even AI-specific vulnerabilities.

AI monitoring

Some of these scenarios may seem far-fetched today, but they are all things that we in the tech industry need to discuss. Again, GitHub Copilot is only in the spotlight because it currently leads the pack, and GitHub provides clear warnings about the caveats of AI-generated suggestions. As with autocomplete on your phone or route suggestions in your satnav, they are only hints to make our lives easier, and it is up to us to take them or leave them.

With their potential to dramatically improve development efficiency, AI-based code generators are likely to become a permanent part of the software world. In terms of application security, though, they are yet another source of potentially vulnerable code that needs to pass rigorous security testing before being allowed into production. We are looking at a brand new way to introduce security vulnerabilities (and potentially unchecked dependencies) directly into your first-party code, so it makes sense to treat AI-augmented codebases as untrusted until tested, and that means testing everything as often as you can.

Even relatively transparent ML solutions like Copilot already raise some legal and ethical questions, not to mention security concerns. But just imagine that one day a new tool starts generating code that works perfectly and passes security tests, except for one minute detail: no one knows how it works. That’s when it’s time to panic.
