I wouldn’t use GitHub Copilot


In 2021, I discovered something exciting — an application of machine learning that was both mind-blowing and practical.

The premise was simple. Type a description of the code you want in your editor, and GitHub Copilot will generate the code. It was terrific, and many people, including myself, were excited to use it.

🚀 I just got access to @github Copilot and it's super amazing!!! This is going to save me so much time!! Check out the short video below! #GitHubCopilot I think I'll spend more time writing function descriptions now than the code itself :D pic.twitter.com/HKXJVtGffm

— abhishek (@abhi1thakur) June 30, 2021

The idea that you can prompt a machine to generate code for you is obviously interesting for contract lawyers. I believe we are getting closer every day. I am waiting for my early access to Spellbook.

As a poorly trained and very busy programmer, I feel like a target customer for GitHub Copilot. The cost is also not ridiculous: Spellbook Legal costs $89 a month, compared to Copilot's $10 a month. Even so, more than a year on, I still haven't tried it. I wasn’t comfortable enough with the idea, and I wasn’t sure how to express why.

Now I can. I recently came across a website proposing to investigate GitHub Copilot. The main author is Matthew Butterick. He’s the author of Typography for Lawyers, and his site proudly uses the Equity typeface.

GitHub Copilot investigation · Joseph Saveri Law Firm & Matthew Butterick

In short, training GitHub Copilot on the open-source repositories that GitHub hosts raises the question of whether such use complies with the copyright licenses attached to that code. Is it fair use to use publicly accessible code for computational analysis? You might recall that Singapore recently passed an amendment to the Copyright Act providing an exception for computational data analysis. If GitHub is right that it is fair use, any code anywhere is fair game to be consumed by the learning machine.

Of course, the idea that it might be illegal hasn’t exactly stopped me from trying.

The key objection to GitHub Copilot is that it is not open source. By packaging the world’s open-source code in an AI model and spitting it out to users with no context, it ensures that a user only ever interacts with GitHub Copilot. It is, in essence, a coding walled garden.

Copilot introduces what we might call a more selfish interface to open-source software: just give me what I want! With Copilot, open-source users never have to know who made their software. They never have to interact with a community. They never have to contribute.

For someone who wants to learn to code, this enticing idea is a double-edged sword. You could probably get by on prompts to your AI pair programmer, but without any context, you are not learning much. If I wanted to know how something works, I would like to run it, read its code and interact with its community. I want to be a member of a group of people with shared goals, not someone who just consumes other people’s work.

Matthew Butterick might end up with enough material to sue Microsoft, and the legal issues raised will be interesting for the open-source community. For now, though, I am going to stick to programming the hard way.

#OpenSource #Programming #GitHubCopilot #DataMining #Copyright #MachineLearning #News #Newsletter #tech #TechnologyLaw

Love.Law.Robots. – A blog by Ang Hou Fu