In 2021, I discovered something exciting — an application of machine learning that was both mind-blowing and practical.
The premise was simple. Type a description of the code you want in your editor, and GitHub Copilot will generate the code. It was terrific, and many people, including myself, were excited to use it.
The idea that you can prompt a machine to generate work for you is obviously interesting for contract lawyers too. I believe we are getting closer every day; I am waiting for my early access to Spellbook.
As a poorly trained and very busy programmer, I feel like a target of GitHub Copilot. The cost wasn't ridiculous either (Spellbook Legal costs $89 a month, compared to Copilot's $10 a month). Even so, I didn't try it for over a year. I wasn't comfortable with the idea, and I wasn't sure how to express why.
Now I can. I recently came across a website proposing an investigation of GitHub Copilot. The main author is Matthew Butterick. He's the author of Typography for Lawyers, and the site proudly uses the Equity typeface.
In short, training GitHub Copilot on the open-source repositories GitHub hosts raises questions about whether such use complies with those repositories' copyright licenses. Is it fair use to use publicly accessible code for computational analysis? You might recall that Singapore recently passed an amendment to the Copyright Act providing an exception for computational data analysis. If GitHub is right that this is fair use, then any code anywhere is fair game to be consumed by the learning machine.
Of course, the idea that it might be illegal hasn’t exactly stopped me from trying.
The key objection to GitHub Copilot is that it is not open source. By packaging the world's open-source code in an AI model and spitting it out with no context, Copilot becomes the only thing its user ever interacts with. It is, in essence, a coding walled garden.
Copilot introduces what we might call a more selfish interface to open-source software: just give me what I want! With Copilot, open-source users never have to know who made their software. They never have to interact with a community. They never have to contribute.
For someone who wants to learn to code, this enticing idea is probably a double-edged sword. You could swim around using prompts with your AI pair programmer, but without any context, you are not learning much. If I want to know how something works, I would rather run it, read its code, and interact with its community. I want to be a member of a group of people with shared goals, not someone who just consumes other people's work.
Matthew Butterick might end up with enough material to sue Microsoft, and the legal issues raised will be interesting for the open-source community. For now, though, I am going to stick to programming the hard way.