Playing with GitHub Copilot: Interactive Documentation by RAG

This article explores prompt engineering for GitHub Copilot as a way to provide interactive documentation, illustrated by adding support for deploying Pure Cloud Block Store to GitHub Copilot.

Intro

Disclaimer: Just a heads up, I’m not an AI expert. I’m an ordinary App Innovation architect trying to get AI to lend me a hand with my projects. If I slip up here and there, my bad; these are uncharted waters for me.

While experimenting with GitHub Copilot, I was frustrated by its limited or absent knowledge of my specific domain (or product). Even when I clearly specified the topic, GitHub Copilot was unable to load the relevant information, and with only basic or no knowledge of the topic it tried to “invent” answers that were pure hallucination.

My conversation with GitHub Copilot:

vjirovsky: “create me a Bicep template for Pure Cloud Block Store deployment”

GitHub Copilot: Sure, here’s a basic example of a Bicep template for deploying Pure Cloud Block Store on Azure. This template creates a virtual network, a subnet, and a network security group with necessary rules. Please note that this is a simplified example and you may need to adjust it according to your specific requirements.
…nonsense Bicep code (truncated)…
Please note that this template does not actually deploy the Pure Cloud Block Store instance because as of now, there is no Azure Resource Manager (ARM) or Bicep support for Pure Cloud Block Store. You would need to use the Pure Storage console or API to deploy the Cloud Block Store instance.

vjirovsky: “create me a Bicep template for Pure Cloud Block Store deployment, use guidance here (link)”

GitHub Copilot: I’m sorry for any confusion, but as an AI, I don’t have the ability to access external URLs or content outside of the information provided in the conversation. This includes the URL you’ve provided.

I understand that GitHub Copilot cannot encompass all knowledge, but what surprised me was its inability to access any supportive resources available online, even when public URLs to documentation are provided. My assumption is that such online access is restricted due to security/license reasons, preventing Copilot from loading external content, at least directly into my session.

Concept Overview

The idea is pretty straightforward: use Retrieval-Augmented Generation (RAG) style prompt engineering. It’s about loading every prompt with comprehensive knowledge of my domain or product, akin to filling a suitcase with all the essentials. In Figure 1, this knowledge is the component called ‘Enterprise Knowledge Base’.

Figure 1. Overview of RAG pipeline components: ingest and query flows (source: developer.nvidia.com)
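
To make the “suitcase” idea concrete, here is a minimal Python sketch of the augmentation step: the domain knowledge is simply placed in front of the user’s question before the prompt goes to the model. The function name `build_augmented_prompt` and the delimiter strings are illustrative assumptions, not part of any Copilot API, and this is not a description of how GitHub Copilot assembles prompts internally.

```python
def build_augmented_prompt(question: str, knowledge: str) -> str:
    """Prepend domain knowledge to the user's question (the 'augmentation' in RAG)."""
    return (
        "Answer using the knowledge base below.\n"
        "--- KNOWLEDGE BASE ---\n"
        f"{knowledge.strip()}\n"
        "--- END KNOWLEDGE BASE ---\n\n"
        f"Question: {question.strip()}"
    )


# Hypothetical one-line knowledge base, taken from the helper file's first line.
knowledge = "CBS means Pure Cloud Block Store"
prompt = build_augmented_prompt("What does CBS stand for?", knowledge)
print(prompt)
```

In a full RAG pipeline, only the passages relevant to the question would be retrieved and packed in; with a single small helper file, packing the whole file is a reasonable simplification.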

A Knowledge Base, especially for tools like GitHub Copilot, can be any form of text document, ranging from source code files to simple text files. In my experiment, I opted for a Markdown-formatted document. I chose Markdown because its structured yet straightforward syntax is easy for large language models (LLMs) to parse, it can include source code blocks, and it is user-friendly for authors.

Example of helper file

CBS means Pure Cloud Block Store

## Prerequisites before CBS deployment
A CBS requires for deployment following components to be deployed before CBS (headlines 3 in this section):

### Logged user's permissions
A logged user, who deploys a CBS, needs to have a permission Managed Application Contributor on the selected subscription and on the selected subnet(s) for CBS deployment, at the minimum.

### Region / AZ support
See this list of supported regions and Availability Zones to confirm CBS availability in your region and selected Availability zone.

The up-to-date list is always available on URL: https://support.purestorage.com/Pure_Cloud_Block_Store/Pure_Cloud_Block_Store_on_Azure_-__Regions_Support

If a user asks about deployment into any concrete region or AZ, point them to the URL.
...
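
As a rough illustration of why heading-structured Markdown is convenient as a knowledge base, the following Python sketch splits a helper file into sections keyed by heading and looks up the sections that mention a keyword. The helper names `split_sections` and `find_sections` are hypothetical, and the sample text is a shortened excerpt of the file above; Copilot itself does not need this step, it only shows how easily the structure can be recovered.

```python
import re


def split_sections(markdown: str) -> dict[str, str]:
    """Map each Markdown heading to the text that follows it."""
    sections: dict[str, str] = {}
    current = "preamble"  # text before the first heading
    lines: list[str] = []
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            sections[current] = "\n".join(lines).strip()
            current, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    sections[current] = "\n".join(lines).strip()
    return sections


def find_sections(sections: dict[str, str], keyword: str) -> list[str]:
    """Return headings whose title or body mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        heading
        for heading, body in sections.items()
        if needle in heading.lower() or needle in body.lower()
    ]


sample = """CBS means Pure Cloud Block Store

## Prerequisites before CBS deployment
A CBS requires for deployment following components to be deployed before CBS:

### Logged user's permissions
A logged user needs to have a permission Managed Application Contributor.

### Region / AZ support
See this list of supported regions and Availability Zones.
"""

sections = split_sections(sample)
print(list(sections))
print(find_sections(sections, "region"))
```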

Usage in GitHub Copilot

I created a single helper file called cbs-azure-copilot.md, in which I compiled essential details about the Pure Cloud Block Store product, including guidance on deployment, best practices, and a selection of deployment examples.

The GitHub Copilot plugin in IDEs, such as Visual Studio Code, can reference a file in a prompt, but it needs to be done the right way. One key limitation is that the file must either be open or part of the repository you’ve selected.

At the moment, you can reference it with the special command #file: <<filename>>:

Figure 2. GitHub Copilot plugin in IDE - #file command

When using a prompt from this article or similar sources, ensure that the #file: command is recognized correctly. From my experience, you may need to manually type out this portion, as the IDE plugin does not support copying and pasting the entire prompt.

Also, to maintain context from the helper file, I recommend repeating its reference in every prompt to ensure Copilot stays informed.

Figure 3. GitHub Copilot responses enhanced with custom knowledge base

This #file command offers a notable advantage: it lets you reference a helper file and an open source code file together in a single prompt. This is particularly useful for getting proposed changes from GitHub Copilot that are informed by the contents of the helper file.

See example conversation below:

vjirovsky: “please fix my code (something is wrong around Managed App publisher), use helper #file:cbs-azure-copilot.md”

GitHub Copilot: (used reference main.bicep, lines 1-48) Based on the helper file, the publisher ID for ARM deployments of Managed Applications should be purestoragemarketplaceadmin.

Usage in GitHub Copilot Enterprise

In the Enterprise version of GitHub Copilot, you can configure it to draw on your private data (such as our helper file) in the responses generated for all users within your organization. This offers a straightforward way to inject new knowledge into Copilot or to steer its output in a particular direction.

For example, it could be utilized to embed corporate standards directly into the tool’s suggestions. Imagine ensuring that every Bicep template suggested by Copilot automatically aligns more closely with your company’s security or tagging standards.

Additionally, it could help standardize how frequently used content is structured and organized within your organization.

Figure 4. GitHub Copilot Enterprise Pipeline (source: medium.com/@tajinder.singh1985)

Conclusion

In summary, using a helper file like cbs-azure-copilot.md in GitHub Copilot is an easy and effective way to make Copilot more useful in your coding projects by aligning it with your specific needs and standards.

For those interested, you can find the CBS helper file on GitHub, providing a practical example of how to leverage this approach.

Resources

  1. https://medium.com/@tajinder.singh1985/prompt-engineering-get-the-best-out-of-llm-using-these-5-simple-techniques-0cb58e226a07

  2. https://developer.nvidia.com/blog/rag-101-retrieval-augmented-generation-questions-answered/
