A look at Patchwork: a YC-backed LLM startup


In the last couple of days, I’ve spent some hours playing with Patchwork, an open-source framework that leverages AI to accelerate asynchronous development tasks like code reviews, linting, patching, and documentation. It is built by a Y Combinator-backed company.

The GitHub repository for Patchwork can be found here: https://github.com/patched-codes/patchwork.

Patchwork offers two ways to use it. One is their open-source CLI, which uses LLMs such as OpenAI’s models to perform tasks. You can install the CLI using the following command:

pip install 'patchwork-cli[all]' --upgrade

The other option is to use their cloud offering at https://app.patched.codes/signin. There, you can either leverage predefined workflows or create your own using a visual editor.

This post focuses on my experience with their CLI tool, as I haven’t used their cloud offering yet.

Patchwork comes bundled with six patchflows:

  • GenerateDocstring: Generates docstrings for methods in your code.
  • AutoFix: Generates and applies fixes to code vulnerabilities within a repository.
  • PRReview: Upon PR creation, extracts code diffs, summarizes changes, and comments on the PR.
  • GenerateREADME: Creates a README markdown file for a given folder to add documentation to your repository.
  • DependencyUpgrade: Updates your dependencies from vulnerable versions to fixed ones.
  • ResolveIssue: Identifies the files in your repository that need updates to resolve an issue (or bug) and creates a PR to fix it.

A patchflow is composed of multiple steps, each implemented in Python.

To understand how Patchwork works, we’ll explore a couple of predefined Patchflows.

Patchflow 1. GenerateREADME

I started playing with the most basic Patchwork functionality: generating a README file for a project. I was curious to see how it handles converting the codebase into a prompt. Specifically, I wondered which files it passes to the LLM: does it pass all the code, or does it select specific files? How does it keep LLM context length in check? These were some of the questions I had. As their documentation mentions, they leverage an open-source tool called code2prompt. This tool converts a repository’s code into a markdown document that serves as context for the LLM.
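
To get an intuition for what code2prompt does, here is a minimal Python sketch of the same idea: walk a repository and flatten its files into a single Markdown document that can be fed to an LLM. This is an illustration of the concept only, not code2prompt’s actual output format.

```python
from pathlib import Path

def repo_to_prompt(repo_path: str, extensions=(".js", ".ts", ".md")) -> str:
    """Flatten a repository into one Markdown document for LLM context."""
    fence = "`" * 3  # markdown code fence
    root = Path(repo_path)
    sections = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in extensions:
            body = path.read_text(errors="ignore")
            sections.append(f"## {path.relative_to(root)}\n{fence}\n{body}\n{fence}")
    return "\n\n".join(sections)

# The resulting string becomes the code context passed to the LLM.
print(repo_to_prompt("/tmp/query-string")[:500])
```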

The most important piece of code is the GenerateREADME patchflow’s step registration.

PatchflowProgressBar(self).register_steps(
    CallCode2Prompt,
    CallLLM,
    CommitChanges,
    CreatePR,
    ExtractModelResponse,
    ModifyCode,
    PreparePR,
    PreparePrompt,
)

It defines all the steps this patchflow uses. The flow first converts the code to a prompt, then calls the LLM to generate the README in Markdown format, then creates a branch and commits the changes, and finally raises a PR. These predefined steps bring determinism to the workflow: Patchwork is mostly gluing together multiple API calls and Python functions into a reliable pipeline.
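
The pattern is easy to picture. Below is a minimal, self-contained sketch of how a patchflow can thread a shared state dict through its steps in order. The step names mirror Patchwork’s, but the run() bodies are illustrative stubs, not the framework’s actual code.

```python
class Step:
    """Base class: a step reads the shared state and returns new keys."""
    def __init__(self, inputs: dict):
        self.inputs = inputs

    def run(self) -> dict:
        return {}

class CallCode2Prompt(Step):
    def run(self) -> dict:
        # Patchwork shells out to code2prompt here; stubbed for brevity.
        return {"prompt_content": f"# Dump of {self.inputs['folder_path']}"}

class CallLLM(Step):
    def run(self) -> dict:
        # Would call the OpenAI chat API with the prepared prompt; stubbed.
        return {"llm_response": "# query-string\n..."}

def run_patchflow(inputs: dict, steps: list) -> dict:
    state = dict(inputs)
    for step_cls in steps:
        state.update(step_cls(state).run())  # each step enriches the state
    return state

print(run_patchflow({"folder_path": "/tmp/query-string"},
                    [CallCode2Prompt, CallLLM]))
```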

To use the GenerateREADME patchflow, we will run the Patchwork CLI on the react-pdf codebase.

patchwork GenerateREADME folder_path=/tmp/react-pdf

When I ran the above command, it greeted me with an error message.

 Error running patchflow GenerateREADME: Invalid inputs for steps:
 Missing required input: At least one of google_api_key, openai_api_key, patched_api_key has to be set

I already had the OPENAI_API_KEY environment variable set, so I was hoping the CLI would default to using it.

So, I provided my OpenAI key as an argument.

 patchwork GenerateREADME openai_api_key=sk-xxxxx folder_path=/tmp/react-pdf

Again, it failed. I anticipated this failure because, when I examined the code of the GenerateREADME patchflow, I discovered they were passing the output of code2prompt directly to the LLM to generate the README. Anyone who has built substantial applications with LLMs knows that models have finite context lengths. The react-pdf codebase, with its 49,490 lines, far exceeds the default context length of the GPT-3.5-turbo model, which is 16,385 tokens.

I expected them to default to at least the gpt-4o-mini model, which is both cheaper and more capable, with a context length of 128,000 tokens. Until they implement intelligent preprocessing to trim the context to fit within the context window, defaulting to gpt-4o-mini would help.
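
A cheap guardrail here is a token count check before the LLM call. Here is a minimal sketch using the tiktoken library; the idea is a pre-flight check inside the patchflow, with the context limit passed in explicitly:

```python
import tiktoken

def fits_in_context(text: str, model: str = "gpt-3.5-turbo",
                    context_limit: int = 16_385) -> bool:
    """Pre-flight check: will this prompt fit in the model's context window?"""
    encoding = tiktoken.encoding_for_model(model)
    n_tokens = len(encoding.encode(text))
    print(f"{n_tokens} tokens vs. a {context_limit}-token limit")
    return n_tokens <= context_limit

# e.g. fits_in_context(repo_markdown, model="gpt-4o-mini", context_limit=128_000)
```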

I explored the code2prompt code and found that it offers an exclude option to omit unwanted files from the generated markdown. This could help reduce the context length. Unfortunately, Patchwork disappointed me by exposing the filter option of code2prompt instead. This option restricts the files used for generating the README, which is not as helpful for our purpose.

patchwork GenerateREADME openai_api_key=sk-xxxxx filter="*.js" folder_path=/tmp/react-pdf

Even after including only JavaScript files, it was still failing, so I decided to use a smaller library: https://github.com/sindresorhus/query-string, which has only 2,682 lines of code. We will use the gpt-4o-mini model so that we have a larger context window to work with. I was hoping this would work.

patchwork GenerateREADME openai_api_key=sk-xxxx model="gpt-4o-mini" folder_path=/tmp/query-string

It worked. Below is the README.md it generated.

# Query String Library Documentation

The Query String library provides utilities for parsing and stringifying URL query strings. It enables developers to easily manipulate parameters in URLs, providing various options for how arrays and values are handled.

## Overview

The library consists of functions to:
- Parse a query string or a full URL into an object.
- Stringify an object into a query string.
- Extract query parameters from URLs.
- Pick or exclude specific parameters.
- Create URLs with query strings and fragment identifiers.

## Inputs

### Parsing Functions
- **`parse(query: string, options?: ParseOptions): ParsedQuery`**
  - **Input:** 
    - `query`: The query string to be parsed.
    - `options`: Custom options for parsing (like enabling decoding and parsing numbers).

### Stringifying Functions
- **`stringify(object: StringifiableRecord, options?: StringifyOptions): string`**
  - **Input:** 
    - `object`: The object to be converted into a query string.
    - `options`: Custom options for stringifying (like handling array formats and encoding).

Removed for brevity...

## Outputs

### Parsing Functions
- **`parse`** returns an object representation of the query string, with keys corresponding to parameter names and values corresponding to parameter values.
- **`parseUrl`** returns an object with the URL, the parsed query string, and, if applicable, the fragment identifier.

Removed for brevity...

## Example Usage

```javascript
import queryString from 'query-string';

// Parsing query strings
const parsed = queryString.parse('?foo=bar&bar=baz');
console.log(parsed); // { foo: 'bar', bar: 'baz' }

// Stringifying objects
const stringified = queryString.stringify({ foo: 'bar', bar: 'baz' });
console.log(stringified); // 'foo=bar&bar=baz'

// Manipulating URLs
const url = 'https://example.com/?a=1&b=2#fragment';
const parsedUrl = queryString.parseUrl(url);
console.log(parsedUrl);
// { url: 'https://example.com/', query: { a: '1', b: '2' }, fragmentIdentifier: '#fragment' }
```

This library is suitable for both browser and Node.js environments and supports various encoding and decoding options, making it versatile for web development tasks related to URLs and query strings.

Not bad. With the gpt-4o-mini model, it did a decent job of generating a useful README.md.

Below is the default system prompt they use. It is quite basic, and you will likely have to update it to meet your needs.

Summarize the contents of the given code for documentation purposes in markdown. Make sure there is a heading, as well clear sections for inputs and outputs. Consider both what the code does and how it is likely to be used by someone. Use markdown formatting to make it look nice.

Let’s use a different system prompt.

You are an expert technical documentation writer. You are tasked with writing README.md for a Git repository. Write the README.md following the example shown below in the example section.

## Example

"""
# Foobar

Foobar is a Python library for dealing with word pluralization.

## Installation

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install foobar.

```bash
pip install foobar
```

## Usage

```python
import foobar

# returns 'words'
foobar.pluralize('word')

# returns 'geese'
foobar.pluralize('goose')

# returns 'phenomenon'
foobar.singularize('phenomena')
```

## Contributing

Pull requests are welcome. For major changes, please open an issue first
to discuss what you would like to change.

Please make sure to update tests as appropriate.

## License

[MIT](https://choosealicense.com/licenses/mit/)
"""

This time, when we ran the command, it generated a README that follows the example structure. Below is a screenshot of the rendered README content.

Patchflow 2. DependencyUpgrade

Now, let’s look at a slightly more complex patchflow. To use this patchflow, you first need to install a couple of tools.

npm install -g @cyclonedx/cdxgen
pip install owasp-depscan

This patchflow has the following steps.

PatchflowProgressBar(self).register_steps(
    AnalyzeImpact,
    CallLLM,
    CommitChanges,
    CreatePR,
    ExtractDiff,
    ExtractModelResponse,
    ExtractPackageManagerFile,
    ModifyCode,
    PreparePR,
    PreparePrompt,
    ScanDepscan,
)

The above is not the order in which these steps run; the DependencyUpgrade run method calls the steps in the required order.

Now, let’s try the DependencyUpgrade patchflow.

patchwork DependencyUpgrade openai_api_key=sk-xxx model="gpt-4o-mini" folder_path=/tmp/query-string

It greeted me with the following error message.

Error running patchflow DependencyUpgrade: cdxgen version 10.9.2 is installed, but version 10.7.1 is required. 

I am unsure why they require users to install a specific version; ideally, it should work within a version range. Their documentation lacks any mention of this requirement. It also makes me wonder whether they are using Patchwork to develop Patchwork itself.
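
For what it’s worth, a range-based check is only a few lines with the packaging library. A minimal sketch; the >=10.7,<11 range is my assumption, not a requirement Patchwork states:

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

def check_cdxgen(installed: str, required: str = ">=10.7,<11") -> None:
    """Fail only when cdxgen falls outside an allowed range, not an exact pin."""
    if Version(installed) not in SpecifierSet(required):
        raise RuntimeError(
            f"cdxgen {installed} is installed, but {required} is required"
        )

check_cdxgen("10.9.2")  # passes under a range check
```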

So, let’s downgrade the dependency.

npm install -g @cyclonedx/cdxgen@10.7.1

Now, let’s try the command again. It failed once more.

patchwork DependencyUpgrade openai_api_key=sk-xxx model="gpt-4o-mini" folder_path=/tmp/query-string

The reason for the failure is that this patchflow does not support the folder_path option; you have to run it from within the project directory. So, change into the project directory with cd /tmp/query-string and then run the command without the folder_path option.

patchwork DependencyUpgrade openai_api_key=sk-xxx model="gpt-4o-mini" 

It failed this time as well.

Step ScanDepscan failed
Error running patchflow DependencyUpgrade: SBOM VDR file not found from Depscan.

The ScanDepscan step runs the following depscan command.

depscan --debug --reports-dir /var/folders/_r/6md7npdj7_98rq_th2bwpx6h0000gr/T/tmp9m1ezvg2

I looked into the reports directory and found that depscan generates an sbom-universal.json file, while the code was looking for a file matching `*vdr.json`. Since this file did not exist, the step failed. I was not sure why it generates sbom-universal.json; ideally, it should generate sbom-js.json.

VDR stands for Vulnerability Disclosure Report.
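
The failing lookup amounts to something like the following sketch of the pattern (not Patchwork’s exact code):

```python
from pathlib import Path

def find_vdr_file(reports_dir: str) -> Path | None:
    """Return the first *vdr.json report, or None if depscan didn't emit one."""
    matches = sorted(Path(reports_dir).glob("*vdr.json"))
    return matches[0] if matches else None

# With only sbom-universal.json in the reports dir, this returns None,
# and the ScanDepscan step fails.
```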

I looked at the depscan docs and found that you can specify the language using an option as shown below.

depscan --debug -t js --reports-dir /var/folders/_r/6md7npdj7_98rq_th2bwpx6h0000gr/T/tmp9m1ezvg2

This generates sbom-js.json which has the complete dependency graph.

patchwork DependencyUpgrade openai_api_key=sk-xxx model="gpt-4o-mini" language="js"

Now, when it calls depscan, it passes the -t js option. But it failed again because the code was searching for sbom-js.vdr.json.

I tested depscan with a couple of GitHub repositories. I observed that a vdr.json file is only generated when a repository has vulnerabilities. Since the query-string library has no known vulnerabilities, depscan didn’t generate the file. To test the functionality further, I used an older, unmaintained repository: https://github.com/anonrig/url-js.

Check out the code and change into the repo directory.

git clone git@github.com:anonrig/url-js.git
cd url-js

Now, let’s run the patchwork CLI again.

patchwork DependencyUpgrade openai_api_key=sk-xxx model="gpt-4o-mini" language="js"

This time it finished successfully. It created a new branch called dependencyupgrade-main and made one commit to it. Since I didn’t provide a GitHub API token, it couldn’t create the PR on my behalf.

The package.json diff is shown below.

One question you might have is whether this works for all programming languages and build tools. Looking at the find_package_manager_files method in the ExtractPackageManagerFile class, you can see that it supports four language stacks.

package_manager_files = {
    "pypi": ["requirements.txt", "Pipfile", "pyproject.toml"],
    "maven": ["pom.xml", "build.gradle", "build.gradle.kts"],
    "npm": ["package.json", "yarn.lock"],
    "golang": ["go.mod", "go.sum"],
}
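
Detecting these manifests is straightforward. Here is a minimal sketch of what a lookup like find_package_manager_files could boil down to, assuming it simply walks the repository for these known file names; Patchwork’s actual implementation may differ:

```python
from pathlib import Path

PACKAGE_MANAGER_FILES = {
    "pypi": ["requirements.txt", "Pipfile", "pyproject.toml"],
    "maven": ["pom.xml", "build.gradle", "build.gradle.kts"],
    "npm": ["package.json", "yarn.lock"],
    "golang": ["go.mod", "go.sum"],
}

def find_package_manager_files(repo_path: str) -> dict[str, list[Path]]:
    """Map each supported ecosystem to the manifest files found in the repo."""
    found: dict[str, list[Path]] = {}
    for ecosystem, names in PACKAGE_MANAGER_FILES.items():
        for name in names:
            for match in Path(repo_path).rglob(name):
                found.setdefault(ecosystem, []).append(match)
    return found

print(find_package_manager_files("/tmp/url-js"))  # e.g. {'npm': [...]}
```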

Conclusion

The open-source version of Patchwork is still in its early stages and lacks maturity. The tool struggled with basic use cases. For simple tasks like generating a README, I expected it to handle edge cases, such as context length limitations. Additionally, Patchwork would benefit from improved documentation and more informative logging.

Despite its limitations, Patchwork demonstrates the potential of LLMs to automate routine development tasks. By focusing on these smaller, repetitive actions, Patchwork hints at a future where LLMs can significantly reduce developer toil.

