“HeyGitHub!” Wallpaper, Tartan Plaids, and GitHub Copilot

Looking and Feeling our Way to Future Code Generation

Jim Salmons
GitHub Copilot for Disabled Developers

--

What do historical pattern books for wallpaper stencils, plaid cloth samples, and similar instructional materials for tried-and-true designs have to do with improving code generation through voice-controlled interaction with GitHub’s recently announced “Hey, GitHub!” service? That’s the topic we’ll explore in this follow-up to my recent article on the intersection of Conversational Copilot and the use of metamodel subgraphs in my Digital Humanities research.

In colonial America, prior to the mass production of commercial wallpapers, local artisans meticulously stenciled repetitive patterns on the walls of homes in Williamsburg, Annapolis, and other burgeoning centers of urban culture. Unlike the free-form creativity of modern graffiti artists, these artisans learned and applied the idiomatic designs that were sure to yield a pleasing and successful room interior. In the same vein, weavers did not rely on guesstimates of what a Scottish tartan plaid should look like, for fear of displeasing the lairds of neighboring clans and possibly setting off a bloody conflict among these notoriously tribal networks.

Pattern books for standardizing cultural artifacts were an effective means of collecting and transmitting these and similar designs. Computer software also benefits from the reuse of design patterns, especially in the creation of user interfaces and user interaction models for applications. Apple was famously ruthless in prescribing the Macintosh’s UI/UX look-and-feel standards so as to reduce the learning curve for users moving from one application to another. So it is not unreasonable to consider how a collection of pattern-book-like model-training datasets might contribute to advancing the human-computer interaction dialogue of GitHub’s Copilot-based “Hey, GitHub!” service.

A Modest Proposal

My goal for this article is not to present a full and complete example of a model-training dataset for refining the “Hey, GitHub!” Copilot-powered code generation service. Rather, I simply want to envision and capture enough of this idea that it can be considered for exploration by the GitHub Next Copilot developers and by the early-adopter users participating in the Technology Preview program. I will present a brief overview of the idea and illustrate it with a code snippet of the kind that might be included in such model-training dataset repositories.

To make my point, I will reference some of the ideas and content from my first article, “HeyGitHub!” — Metamodel Subgraphs and the Evolution of GitHub’s Conversational Copilot.

A Simplified View of the “Hey, GitHub!” Architecture

To understand the potential approach to using pattern-book-like datasets to enhance the “Hey, GitHub!” service, let’s first consider a simplified view of its current implementation. With a goal of significantly reducing the use of keyboard and mouse, the “Hey, GitHub!” service provides a voice-input wrapper around the VSCode IDE.

Without having actual knowledge of this GitHub service’s architecture, I will generally characterize the service as shown in this simple block diagram.

Given that the current instructions for participation in the service’s Technology Preview require installation of a Java SDK along with a “Hey, GitHub!”-specific instance of the Copilot Extension in VSCode, it is reasonable to assume that a separate Java-based layer is responsible for the voice-input mechanism I am calling the Hey GitHub Listener. The Listener, implemented as an instance of Microsoft’s embedded speech solution, is responsible for accepting the developer’s voice input and performing the first essential step in turning voice into code development action.

The Listener’s behavior requires that it determine which of three categories of response is appropriate:

  • The user may utter a command that is self-referential to the service itself, such as “Hey, GitHub” to get the service’s attention to accept further input.
  • The user’s input may be interpreted as a command that is directly piped through to VSCode to perform development activities such as IDE control like “Go to line 56” or code editing like “Change the return type of the function to a boolean.”
  • If a voice input is not recognized as either a self-reference or an IDE command, the utterance is assumed to indicate what source code Copilot should generate in the current context. The service then transcribes the voice input to text and passes it to the Copilot engine to generate a best-guess code suggestion, along with alternates if the developer is not satisfied with a suggestion.

Regardless of the category of voice input heard by the Listener, the result is that the developer is performing the bulk of development activity by voice without relying on keyboard or mouse.
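Purely as a thought experiment, and not a description of the actual “Hey, GitHub!” implementation, this three-way triage can be sketched in a few lines of Python. Every name below, from classify_utterance to the phrase lists, is my own hypothetical stand-in for whatever grammar or model the real Listener uses:

#!/usr/bin/env python
# Hypothetical sketch of the Listener's three-way triage -- NOT the actual
# "Hey, GitHub!" implementation. Phrase lists and names are illustrative only.
import re

SELF_REFERENCE = "self-reference"
IDE_COMMAND = "ide-command"
CODE_GENERATION = "code-generation"

# Toy stand-ins for whatever grammar or model the real Listener uses.
SELF_REFERENCE_PHRASES = ("hey, github", "hey github", "stop listening")
IDE_COMMAND_PATTERNS = (
    r"^go to line \d+",
    r"^open (the )?file .+",
    r"^change the return type .+",
)

def classify_utterance(utterance):
    """Route a transcribed utterance to one of the three Listener categories."""
    text = utterance.strip().lower()
    if any(text.startswith(phrase) for phrase in SELF_REFERENCE_PHRASES):
        return SELF_REFERENCE
    if any(re.match(pattern, text) for pattern in IDE_COMMAND_PATTERNS):
        return IDE_COMMAND
    # Anything else is treated as a natural-language prompt for Copilot.
    return CODE_GENERATION

if __name__ == "__main__":
    for utterance in ("Hey, GitHub!", "Go to line 56",
                      "Create a function that parses the ad specifications"):
        print("%s -> %s" % (utterance, classify_utterance(utterance)))

However the real Listener makes this decision, the routing itself is what lets the developer stay hands-free across all three categories of interaction.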

How a Pattern-Book Repository Could Add “Sherlockean” Smarts to the “Hey, GitHub!” Conversation

If you read the first of my articles in the “HeyGitHub!” series, you may remember that I speculated in some detail about how a higher-level conversation might play out between a developer and a more robust Future Copilot. This more “worldly” Copilot was described as having “prior knowledge” of information structures and content meaning based on model training on a set of metamodel subgraph-aware Ground Truth reference datasets.

The hypothetical code-generating conversation in my first article assumed that Copilot had knowledge about the complex document structures of magazines utilizing the MAGAZINEgts Ground-Truth Storage format with its metamodel subgraph design pattern. A closer read of that hypothetical conversation reveals that I also assumed our Future Copilot to have a rather more Sherlockean level of understanding about the design and user interactions of application interfaces and workflow.

For example, that hypothetical conversation included this excerpted exchange:

Developer: Create a new Toolkit Python program called Ad Ferret using the default import libraries specification in the Workflow Toolkit template.

Copilot: Done.

Developer: Create a view and edit interface based on the items in the Advertisement Specifications section of the Document Structures branch of the metamodel.

Copilot: Okay, to iterate over these Advertisements I will create a Group in the Widget Panel supporting selection of the Advertisements grouped by Advertiser using a Drop-down Listbox with Previous and Next buttons to iterate through each Advertiser’s Advertisements ordered by Issue date and page number.

Developer: Okay.

Copilot: According to the Advertisement Specification document structure, I can capture the Size, Shape, and Position of each Advertisement. What interface widgets would you like to use for these parameters?

Developer: Use Radio Buttons for the Advertisement Size and Shape, and a Drop-down Combobox for the Position, then show me the Toolkit window.
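To make that last request a bit more concrete, here is a rough, hand-written sketch (not actual Copilot output) of the kind of wxPython code it might resolve to. The class name, advertiser names, and specification values are placeholders of my own invention rather than actual MAGAZINEgts content:

import wx

# Placeholder data standing in for values that would come from the metamodel's
# Advertisement Specifications; these are not real MAGAZINEgts content.
ADVERTISERS = ["Apple Computer", "Broderbund", "Sierra On-Line"]
AD_SIZES = ["Full page", "Half page", "Quarter page"]
AD_SHAPES = ["Square", "Vertical", "Horizontal"]
AD_POSITIONS = ["Cover 2", "Cover 3", "Cover 4", "Interior left", "Interior right"]

class AdFerretPanel(wx.Panel):
    """Sketch of the widget Group described in the hypothetical exchange above."""
    def __init__(self, parent):
        super().__init__(parent, -1)
        sizer = wx.BoxSizer(wx.VERTICAL)

        # Drop-down listbox of Advertisers with Previous/Next iteration buttons.
        nav = wx.BoxSizer(wx.HORIZONTAL)
        self.advertiser = wx.Choice(self, -1, choices=ADVERTISERS)
        nav.Add(wx.Button(self, -1, "Previous"), 0, wx.RIGHT, 5)
        nav.Add(self.advertiser, 1, wx.RIGHT, 5)
        nav.Add(wx.Button(self, -1, "Next"), 0)
        sizer.Add(nav, 0, wx.ALL | wx.EXPAND, 10)

        # Radio Buttons for Size and Shape, per the developer's request.
        self.size = wx.RadioBox(self, -1, "Size", choices=AD_SIZES)
        self.shape = wx.RadioBox(self, -1, "Shape", choices=AD_SHAPES)
        sizer.Add(self.size, 0, wx.ALL, 10)
        sizer.Add(self.shape, 0, wx.ALL, 10)

        # Drop-down Combobox for Position.
        self.position = wx.ComboBox(self, -1, choices=AD_POSITIONS, style=wx.CB_DROPDOWN)
        sizer.Add(self.position, 0, wx.ALL, 10)

        self.SetSizer(sizer)

if __name__ == "__main__":
    app = wx.App(False)
    frame = wx.Frame(None, title="Ad Ferret")
    AdFerretPanel(frame)
    frame.Show()
    app.MainLoop()

The point of the sketch is not the specific widgets but the shape of code that a metamodel-aware Copilot would need to produce from a handful of conversational turns.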

Putting aside the potential “deep weeds” of the model training needed for Future Copilot to know about the MAGAZINEgts format, let’s consider what prior/shared knowledge Future Copilot would need to be exposed to in order to be conversant in the above dialogue.

The above “HeyGitHub!” Conversational Copilot exchange envisions the voice input of a Digital Humanities scholarly developer building a metamodel-aware discovery and curation application. Below is a GIF animation of a comparable wxPython-based application I created the “old-fashioned way” to do text- and data-mining of advertisements in the 48-issue run of Softalk magazine found on the digital shelves of the Internet Archive:

Using its current “Rainman”-like Savant Syndrome capabilities, Copilot can do a remarkable job of generating the Python source code suggestions needed to produce this tool. But this capability is due to Copilot having looked at tens, perhaps hundreds, of millions of lines of Python code, making it almost presciently capable of recognizing the complementary relationship between previously seen code and the current context. The Rainman Copilot is an awesome applied pattern-recognition and context-pattern-generating machine. But Rainman, like all incredible savants, cannot sit with you and explain its awesome ability or carry on a self-reflective conversation about its underlying domain of knowledge. (To be fair, the ‘Explain this code’ feature of GitHub Next’s experimental Copilot Labs extension for VSCode is making remarkable strides in Copilot’s ability to explain its code suggestions.)

To move Copilot from its Rainman status toward Sherlockean cognitive ability, we are going to need to subject Copilot to some “schooling.” This is where the instructional design of a small collection of pattern-book-like reference training datasets comes into play.

While the popular press and proof-of-concept websites are rampant with examples of the powers of multi-billion-parameter, transformer-based Large Language Model machine-learning systems, there is complementary research showing that ML models can achieve impressive results when fed relatively small caches of highly curated ground-truth datasets. For Future Copilot to be able to engage in high-level cognitive discussions with developer users, we will need to round out Copilot’s pattern-based abilities with encoded knowledge of information structures and those structures’ semantic meaning.

Once we have these curriculum materials, we will be able to incrementally enhance our Rainman Copilot with Sherlockean abilities, as suggested by this slight elaboration of the layered model of the “HeyGitHub” platform architecture:

Honestly, I’m not knowledgeable enough about the Machine Learning domain to know whether this Sherlockean wisdom is best captured by extending the monolithic Rainman model, or whether the Copilot engine evolves into a “mix and match” collection of algorithm-specific ML models strung together in a contextually configured, solution-seeking pipeline. However this prior Sherlockean knowledge is learned by Copilot, I know what its performance requirement is: to respond to the developer user with incremental feedback. Such incremental feedback reflects a more subtle understanding of the structure and meaning of information, beyond the rote brute-force response of pattern-based best-guess code-generating suggestions.

What Would a Pattern-Book “Textbook” Look Like to Add Sherlockean Knowledge to Rainman Copilot?

The full presentation of an example model-training Ground Truth reference dataset capturing the domain of widget-based user interface design and implementation is beyond the scope of this article. I can, however, provide some thoughts and a code snippet to suggest what the content of a Python-focused Pattern-Book for Copilot training might look like.

The FactMiners Toolkit ‘Ad Ferret’ app shown in the GIF animation above is built on the widely used wxPython library. This object-oriented library wraps the mature and wide-ranging C++-based wxWidgets framework of UI components and their complementary UX classes.

To conclude this article, I’ll take a brief look at the Sherlockean pattern-book model-training materials that might teach Copilot to efficiently produce a widget-based user interface and simple user interactions with those interface elements.

My imagined pattern-book material would consist of a rich collection of Python scripts and Jupyter notebooks showcasing the range of available wxPython widgets and their use to present and manipulate examples of data based on various Python datatypes. This collection of scripts and notebooks would be nested in a directory structure such as “./UIUXpatterns/Python/wxPython/list_of_strings_widgets.py”. The following snippet is representative of the code demonstrating the range of widgets and their use to support user-interface display of, and interaction with, a List of Strings:

#!/usr/bin/env python
#
# Widgets to handle a List of String data
#
# Language: >= Python 3.x
# UI/UX Framework: wxPython >= 2.3
#

import wx

'''
The sample data that will be used to explore appropriate widgets.
Data type: Python List of Strings
Appropriate UI widgets: RadioBox, CheckListBox, Listbox, or Combobox...
'''
my_data = ["item1", "item2", "item3", "item4"]

#---------------------------------------------------------------------------
class WxListOfStringsWidgets(wx.Panel):
    # When User says: "Which widgets are useful for handling Lists of Strings?"
    # Respond with the following: "A RadioBox, CheckListBox, Listbox, or
    # Combobox can be used with a List of Strings."
    def __init__(self, parent, log):
        self.log = log
        wx.Panel.__init__(self, parent, -1)
        self.sizer = wx.BoxSizer(wx.VERTICAL)
        self.SetSizer(self.sizer)

    #-----------------------------------------------------------------------
    # Widget class: wxRadioBox
    # When User says: "Add a RadioBox to handle my_data."
    # Respond with the following: [code suggestion]
    #
    def add_rbox(self):
        sizer = wx.BoxSizer(wx.VERTICAL)
        rb = wx.RadioBox(
            self, -1, "wx.RadioBox", wx.DefaultPosition, wx.DefaultSize,
            my_data, 2, wx.RA_SPECIFY_COLS
        )
        self.Bind(wx.EVT_RADIOBOX, self.EvtRadioBox, rb)
        rb.SetLabel("List of Strings by RadioBox")
        sizer.Add(rb, 0, wx.ALL, 20)
        self.sizer.Add(sizer)
        self.SetSizer(self.sizer)

    def EvtRadioBox(self, event):
        self.log.WriteText('EvtRadioBox: %d\n' % event.GetInt())

    #-----------------------------------------------------------------------
    # Widget class: wxCheckListBox
    # When User says: "Add a CheckListBox to handle my_data."
    # Respond with the following: [code suggestion]
    #
    def add_cklb(self):
        sizer = wx.BoxSizer(wx.VERTICAL)
        stxt = wx.StaticText(self, -1, "This example uses the " +
                             "wxCheckListBox control.", (45, 15))
        sizer.Add(stxt, 0, wx.ALL, 20)
        lb = wx.CheckListBox(self, -1, (80, 70))  #, choices=sampleList)
        for txt in my_data:
            lb.Append(txt)
        lb.SetSize(lb.GetBestSize())
        self.Bind(wx.EVT_LISTBOX, self.EvtListBox, lb)
        self.Bind(wx.EVT_CHECKLISTBOX, self.EvtCheckListBox, lb)
        lb.SetSelection(0)
        self.lb = lb
        sizer.Add(self.lb, 0, wx.ALL, 20)
        self.sizer.Add(sizer, 0, wx.ALL, 20)

    def EvtListBox(self, event):
        self.log.WriteText('EvtListBox: %s\n' % event.GetString())

    def EvtCheckListBox(self, event):
        index = event.GetSelection()
        label = self.lb.GetString(index)
        status = 'un'
        if self.lb.IsChecked(index):
            status = ''
        self.log.WriteText('Box %s is %schecked \n' % (label, status))
        self.lb.SetSelection(index)

    #... [snip snip] and so on...
#---------------------------------------------------------------------------

def runTest(frame, nb, log):
    win = WxListOfStringsWidgets(nb, log)
    win.add_rbox()
    win.add_cklb()
    return win


if __name__ == '__main__':
    import sys, os
    import run
    run.main(['', os.path.basename(sys.argv[0])] + sys.argv[1:])

Note: This is an overly simplified snippet, quickly drafted to introduce these ideas following the announcement of the “HeyGitHub!” service at the recent #GitHubUniverse conference. I intend to create a more thorough sample repository and release it on GitHub. I will then highlight an excerpt here and link to the more complete repository.

This short example gives a sense of how a Pattern-Book Ground Truth dataset might provide domain-specific instructional material. The source code in such a repository would be used to refine the code-generation capabilities of Future Copilot and its associated “HeyGitHub” voice-input service.

Yes, But Can’t We Just…?

Perhaps the effort to create such additional code samples is misguided. Doesn’t Copilot already have sufficient UI/UX framework samples in the huge amount of code it has been trained on “in the wild” of publicly accessible Open Source project repositories? And might it be more efficient to provide such capabilities by simply using modern IDEs’ support for user-extensible, parameterized autocomplete snippets?

The answers to these questions might be a bit fuzzy. However, these questions do not point to weaknesses serious enough to derail the exploration of custom training for Copilot and its complementary “HeyGitHub” service.

The second question is the easier one to address: it is an argument for moving this level of code generation from IDE-specific, built-in snippet-template support to the more tool-agnostic realm of the ML code-generation model itself. IDE toolmakers are not likely to take on the burden of providing detailed code-snippet templates for all the various UI/UX frameworks available for use in their customers’ specific projects. Nor do developers want to painstakingly enter such snippet templates into their IDE of choice before being able to efficiently produce code within their projects.

As for whether Copilot has already seen enough examples of code exercising UI/UX frameworks, this may well be true for achieving the threshold ability to generate useful code suggestions and contribute to overall developer productivity gains. But this sighting of naturally occurring UI/UX framework code may not allow Copilot to be as helpful as it could be. The extra level of custom training may be especially useful when we add the challenge of providing incremental, dialogue-style “HeyGitHub!” voice-coding interaction.

One of the exciting features of using Copilot is how it can make the leap from the developer’s entry of code-comment descriptions to the generation of source code suggestions consistent with those comments. Custom training as described here has the potential to move developer productivity gains “up the food-chain” into “HeyGitHub!” voice-coding. In addition to refining the Copilot model directly, UI/UX framework-specific Ground Truth model-training datasets could inject helpful context-setting material into the command interpretation and response behavior of the “HeyGitHub!” Listener.

You may have noticed that I provided some “helper” content in the comments of the snippet above. These helpers include structured comments of the form “When User says…” and “Respond with the following…”. In particular, you may note that the proposed response may be a conversational reply, such as “A RadioBox, CheckListBox, Listbox, or Combobox can be used with a List of Strings.”, or a code suggestion. Such structured training materials could be used by the “HeyGitHub!” Listener to provide a more interactive and incremental voice-coding experience.
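As a rough illustration of how that helper content might be consumed, the following sketch harvests the prompt/response pairs from a pattern-book file. The comment format, the file path, and the extract_helpers function are all hypothetical constructs of this article, assuming the conventions shown in the snippet above, and are not part of any actual “Hey, GitHub!” tooling:

#!/usr/bin/env python
# Hypothetical harvester for the "When User says... / Respond with the following..."
# helper comments shown above. The comment format and this parser are assumptions
# of this article, not part of any actual "Hey, GitHub!" tooling.
import re
from pathlib import Path

HELPER_PATTERN = re.compile(
    r'#\s*When User says:\s*"(?P<prompt>[^"]+)"\s*\n'
    r'(?P<response>(?:\s*#[^\n]*\n)+)'
)

def extract_helpers(pattern_book_file):
    """Map each 'When User says' prompt to its scripted response text."""
    source = Path(pattern_book_file).read_text()
    helpers = {}
    for match in HELPER_PATTERN.finditer(source):
        lines = match.group("response").splitlines()
        response = " ".join(line.strip().lstrip("# ").strip() for line in lines)
        # Keep only the reply itself, dropping the structured prefix.
        response = response.replace("Respond with the following:", "").strip()
        helpers[match.group("prompt").lower()] = response
    return helpers

if __name__ == "__main__":
    lookup = extract_helpers("UIUXpatterns/Python/wxPython/list_of_strings_widgets.py")
    for prompt, response in lookup.items():
        print("%s -> %s" % (prompt, response))

In practice, such harvested pairs might feed a fine-tuning dataset, or be surfaced directly by the Listener as the kind of conversational reply quoted above.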

Thinking Synergistically, A Future So Bright We Have to Wear Shades

The most compelling reason for taking the extra effort to create framework- and domain-specific Ground-Truth reference datasets for Copilot and the “HeyGitHub!” Listener training is the potential synergy of the accumulation and interaction of these custom training “lessons” to support more comprehensive human-machine conversations.

As always, comments or questions are welcome in-line here, on Twitter (🥺) at @Jim_Salmons, and in the Discord channel for participants in the “HeyGitHub!” Technology Preview program.

Happy Healthy Vibes from Colorado USA,
-: Jim Salmons :-

Jim Salmons is a seventy-one-year-old post-cancer Digital Humanities Citizen Scientist. His primary research is focused on the development of a Ground Truth Storage format providing an integrated complex-document structure and content-depiction model for the study of digitized collections of print-era magazines and newspapers. A July 2020 fall at home resulted in a severe spinal cord injury that has dramatically compromised his manual dexterity and mobility.

Jim was fortunate to be provided access to the GitHub Copilot Technology Early Access Community during his initial efforts to get back to work on the Python-based tool development activities of his primary research interest. Upon experiencing the dramatic positive impact of GitHub Copilot on his own development productivity, he became passionately interested in designing a research and support program to investigate and document the use of this innovative programming assistive technology for use by disabled developers.
