What is AI-Assisted Coding?

AI-assisted coding is the use of AI tools (machine learning models such as large language models) to generate source code. These tools allow developers to describe in English or another natural language what they want their code to do, and then have the tool generate code to accomplish that task. This includes routine elements such as naming variables and calling external modules or services. AI code tools can also translate code from one language to another, a very useful capability for programmers updating legacy applications, for example by transforming old COBOL programs into modern Java.

The algorithms powering these tools are trained on vast amounts of existing source code, typically from publicly available open-source projects. Based on these examples, the algorithms can generate custom code on demand, via text prompts.

It’s important to note that, just like other text-based generative AI, the machine learning models behind code generation have no understanding of how code works or of the logic behind a request. The code they provide is simply based on algorithmic predictions of what comes next in a text sequence. It’s quite possible for an AI tool to write code that runs but does not do what the programmer intended, or code that is syntactically valid and appears to do what was asked but could never actually work. An LLM cannot recognise falsity within its own answers; all you can do is ask it to try again.
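To make the idea of predicting what comes next in a text sequence concrete, here is a deliberately tiny sketch in Python. It is not how any real coding assistant is built: it simply counts which token most often followed another in a made-up training snippet and uses that count as its prediction. The function names and the training text are assumptions invented for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count, for each token, which tokens followed it in the training text."""
    following = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    candidates = model.get(token)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical "training data": a few tokens of Python-like code.
training_tokens = (
    "for i in range ( n ) : print ( i ) "
    "for j in range ( m ) : print ( j )"
).split()

model = train_bigram_model(training_tokens)

print(predict_next(model, "in"))     # range
print(predict_next(model, "while"))  # None, because "while" never appeared
```

The sketch has no notion of what a loop is or whether the output would run. It only reproduces statistical patterns from its training text, which is the limitation described above, albeit at a vastly smaller scale.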

A Brief History of AI-Assisted Coding

Automatically generating source code is not new. Low-code and no-code development tools have been available for decades, and Integrated Development Environments (IDEs) have long offered auto-complete features. AI-assisted code generation is different: low-code and no-code tools rely on pre-built code modules to construct applications without the developer writing much (or possibly any) code, whereas AI-assisted coding tools generate custom code from scratch each time. In theory, there are no limitations on what AI-assisted code generation tools can do. If the algorithms are sophisticated enough and the training data is sufficiently comprehensive, they can write highly complex applications based solely on natural-language descriptions of how those applications should work.

Although developers have long wished for a faster way to write code, AI-assisted coding solutions have matured only over the past couple of years. This is certainly a major innovation in software development and, given that programming languages have a finite set of instructions and syntax, we can expect the accuracy and efficacy of generated code to keep increasing.

Popular AI-Assisted Coding Tools

Several vendors now offer AI-assisted coding tools or services, with GitHub Copilot (based on OpenAI Codex) and Microsoft IntelliCode among the best known. Smaller companies such as Tabnine also offer production-ready tools for AI-assisted code generation. Others include AlphaCode and DeepCode. Some, such as Kite, have fallen by the wayside because they did not achieve sufficient reliability quickly enough. In the open-source realm, PolyCoder is a favoured coding tool, although open-source solutions in this area tend to be less polished than their commercial counterparts.

How AI-Assisted Coding Tools Actually Work

As noted earlier, AI-assisted coding tools use large language models trained on massive datasets of code from open-source repositories, including public repositories on GitHub. These models learn patterns and correlations between code and natural-language descriptions, allowing them to generate code snippets based on text prompts from developers. When a developer provides an English description of the functionality they want, most AI models generate a list of potential code completions, ranked by likelihood. The developer can then select the most appropriate completion or continue refining the prompt until the desired code is generated.
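The ranking step described above might look something like the following Python sketch. The Candidate class, the scoring rule (average log-probability per token) and all of the numbers are assumptions made purely for illustration; real tools may score and rank completions quite differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    code: str
    token_logprobs: List[float]  # hypothetical per-token log-probabilities


def score(candidate: Candidate) -> float:
    """Average log-probability per token, so longer completions are not penalised for length alone."""
    return sum(candidate.token_logprobs) / len(candidate.token_logprobs)


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from most to least likely under the scoring rule."""
    return sorted(candidates, key=score, reverse=True)


# Invented completions for the prompt "add two numbers a and b".
candidates = [
    Candidate("return a - b", [-0.9, -1.2, -0.6]),
    Candidate("return a + b", [-0.1, -0.2, -0.1]),
    Candidate("return sum([a, b])", [-0.5, -0.4, -0.6, -0.5]),
]

for position, candidate in enumerate(rank(candidates), start=1):
    print(position, candidate.code, round(score(candidate), 3))
```

Note that the ranking reflects likelihood only, not correctness: a wrong completion can still be ranked first if it resembles the training data closely enough, which is why the developer makes the final selection.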

This iterative process allows developers to rapidly prototype and explore different coding approaches without having to write everything from scratch. The AI essentially acts as a highly advanced auto-complete and code generation assistant.

Programming languages commonly supported by AI code assistants include Python, Java, JavaScript and Ruby. There are also AI tools that will generate code in PHP, TypeScript, C++, C# and Swift.

Pros and Cons of AI-Powered Coding

Of course there are advantages and disadvantages to coding with the help of AI tools. Let’s have a look at these.

Pros

The main benefit of AI-assisted code generation is that it saves developers time, by allowing them to generate code without actually having to write it. AI tools can also be helpful in situations where developers want to implement specific functionality but are unsure how to do so. In addition, AI coding assistants can improve developer productivity by automating repetitive coding tasks, suggesting optimisations and providing instant feedback on code quality and best practices.

Cons

There are several drawbacks to AI-assisted coding.

Low accuracy: Advanced solutions such as OpenAI Codex generate accurate code only 37% of the time, according to OpenAI itself. This indicates that there is still a long way to go before programmers could assume that AI code is ready to be used with just a few tweaks.

Extensive code review: Developers must accept or reject automatically generated code as they work, which some find distracting, perhaps to the point of concluding that they may as well write all of their code manually.

Legal and ethical issues: Many AI tools, such as OpenAI Codex and Copilot, are trained on open-source code, meaning they may essentially reproduce code from other projects. Generated code could therefore include snippets identical to code in the training data. This means that if the AI-generated code includes open-source licensed code, the license of that open-source code may need to be applied to the developed code. Other vendors, such as Tabnine, claim that their AI models are trained only on source code with permissive licenses, such as MIT and Apache 2.0.

We can at least assume that there are legal and ethical questions of possible plagiarism or license violation, and courts have yet to decide what constitutes derivative work and fair use in the context of AI-generated code.

Security risks: Since AI-generated code usually comes from existing code, it is dependent on the quality of the original. If the training data contains code that is flawed in some way, this could inadvertently introduce security vulnerabilities or bugs into derived code.

Lack of understanding: While AI can generate functional code, it does not understand the underlying logic or intent behind the code. This may make it difficult for developers to maintain or extend the AI-generated portions.
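To make the security risk listed above more concrete, the following Python sketch shows the kind of flaw that is common in existing code and could plausibly resurface in generated code: building an SQL query by string formatting. The table, the function names and the "malicious" input are all hypothetical, and the unsafe function is a hand-written example rather than actual output from any AI tool.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable pattern: user input is pasted straight into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer pattern: a parameterised query keeps the input as data, not SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Throwaway in-memory database with two made-up users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious_input = "x' OR '1'='1"

print(find_user_unsafe(conn, malicious_input))  # returns every row in the table
print(find_user_safe(conn, malicious_input))    # returns no rows
```

Both functions behave identically for ordinary input, so the flaw is easy to miss when skimming a suggestion. This is one reason generated code that touches databases, authentication or user input deserves particularly careful review.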

Clearly, challenges exist in the world of AI-assisted coding. These include code security, intellectual property rights and the inability of AI to truly understand the intent and context behind the code it generates. One obvious conclusion is that AI tools may be most useful for developers who want a fast way to generate relatively simple code and are willing to take whatever time is needed to review and tweak it until it is accurate. They might also be more appropriate as tools to use on proprietary projects whose source code will not be exposed to public scrutiny, at least until legal and ethical issues are clarified.
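One lightweight way to carry out the kind of review mentioned above is to write a few checks before accepting a suggestion. The Python sketch below is an illustration under stated assumptions: median_generated is a hand-written stand-in for a plausible but wrong AI suggestion (it is not real tool output), and passes_review with its three test cases is a hypothetical helper, not part of any library.

```python
def median_generated(values):
    # Stand-in for a plausible suggestion: it takes the middle element
    # but forgets to sort first and ignores even-length lists.
    return values[len(values) // 2]


def median_reviewed(values):
    # Version after human review: sort, then handle odd and even lengths.
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def passes_review(func):
    """Run a candidate function against a few cases with known answers."""
    cases = [
        ([1, 2, 3], 2),        # already sorted, so the flawed version happens to pass
        ([3, 1, 2], 2),        # unsorted input exposes the missing sort
        ([4, 1, 3, 2], 2.5),   # even length needs the average of the middle pair
    ]
    return all(func(list(values)) == expected for values, expected in cases)


print(passes_review(median_generated))  # False
print(passes_review(median_reviewed))   # True
```

The first test case shows why a quick manual try is not enough: the flawed function gives the right answer on sorted input and only fails once the checks cover less convenient cases.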

The Future of AI-Assisted Coding

We can safely assume that it will take several more years before we fully understand how effective AI coding tools really are and whether they expose users to complicated legal or ethical challenges. But there is no doubt that they are already used by countless programmers the world over and this use is growing. As language models and training data continue to improve, the accuracy and capabilities of AI coding assistants will increase. For now, AI-assisted coding is a promising technology that can save developers time and effort, as long as it is used with caution and a critical eye.
