GitHub Copilot: What are the legal questions about the AI-driven coding assistant?

Copilot is a Visual Studio Code extension developed by GitHub in collaboration with OpenAI that employs machine learning to suggest functions or lines of code as developers write their software.

The Free Software Foundation has raised some salient questions about the legality and legitimacy of GitHub’s AI-driven coding assistant, calling it unfair and therefore, from its perspective, unacceptable and unjust.

According to the foundation, Copilot requires running software that is not free, namely Visual Studio or parts of Visual Studio Code, and it is a case of Service as a Software Substitute. It also raises many other questions that require deeper examination.

Why is GitHub Copilot ‘unacceptable and unjust’ according to the Free Software Foundation?

The Free Software Foundation stated that Copilot's use of freely licensed software has many implications for an incredibly large portion of the free software community.

The foundation says it has received many inquiries about its position on these questions. Developers want to know whether training a neural network on their software can be considered fair use, while others who want to use Copilot wonder whether code snippets copied from GitHub-hosted repositories could result in copyright infringement.

Even if everything is legally copacetic, activists wonder whether there isn’t something fundamentally unfair about a proprietary software company building a service off their work. While all topics related to Copilot's effect on free software may be in scope, the following questions are of particular interest:

Is Copilot's training on public repositories infringing copyright? Is it fair use?
How likely is the output of Copilot to generate actionable claims of violations on GPL-licensed works?
How can developers ensure that any code to which they hold the copyright is protected against violations generated by Copilot?
Is there a way for developers using Copilot to comply with free software licenses like the GPL?
If Copilot learns from AGPL-covered code, is Copilot infringing the AGPL?
If Copilot generates code which does give rise to a violation of a free software licensed work, how can this violation be discovered by the copyright holder on the underlying work?
Is a trained artificial intelligence (AI) / machine learning (ML) model resulting from machine learning a compiled version of the training data, or is it something else, like source code that users can modify by doing further training?
Is the Copilot trained AI/ML model copyrighted? If so, who holds that copyright?
Should ethical advocacy organizations like the FSF argue for change in copyright law relevant to these questions?

The Free Software Foundation is offering $500 for each developer-submitted white paper on the topic that it publishes, and it is also inviting requests for funding to do further research leading to a later paper. Submissions are open until Monday, August 23, with guidelines for the papers available at fsf.org.

GitHub, for its part, has responded by expressing its willingness to be open about any issues, stating that this is a new space and that it is keen to engage in a discussion with developers on these topics and to lead the industry in setting appropriate standards for training AI models.