I tested out GLM 4.7[1] for some LLM-assisted programming with Nim[2]. One of these tests was to take handwritten spec docs, based on a PDF describing how the LPeg[3] parsing machine works, and cram through as much of the bootstrap phase as possible. It was able to create a recursive parser based on some examples[4] and mostly get it right. Then, while working on the AST extractor, I realized there was a flaw in the spec.

Arguments were not capturing themselves as a node along with their key and value fields inside. Or, at least, the spec did not call for them to do so.

I fixed the spec and set out to fix the code, something I really did not want to do, as fixing throwaway code is kind of painful. But fixing it now makes extracting the tree simpler to model. So I did.

It turns out the LLM was capturing the argument node but not the keys or values. So in short:

  1. The spec was wrong: it called for capturing only the key and value fields, not the argument node
  2. The LLM implemented the opposite: it captured the argument node and not the fields
  3. So the wrong implementation was created off the wrong spec
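To make the mismatch concrete, here is a minimal sketch of the three capture shapes in play. It's in Python rather than Nim for brevity, and the node names are illustrative, not taken from the actual spec.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                        # e.g. "argument", "key", "value"
    text: str = ""
    children: list["Node"] = field(default_factory=list)

# What the fixed spec calls for: an argument node wrapping its fields.
fixed = Node("argument", children=[Node("key", "name"), Node("value", "42")])

# What the original spec asked for: bare key/value captures, no wrapper node.
spec_only = [Node("key", "name"), Node("value", "42")]

# What the LLM actually implemented: the wrapper node with no fields inside.
llm_only = Node("argument", "name=42")
```

With the wrapper node present, the AST extractor can treat each argument as one subtree instead of pairing up loose key/value captures after the fact.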

I hadn’t caught this initially because I was in a hurry to blast past the bootstrap phase. The tool will generate its own parser, so the quality of the first pass isn’t important. Only when I got to writing the AST code by hand and reading the spec carefully did I find that both the implementation and the spec were wrong.

Another one of those cases that makes you incredibly skeptical of companies in a hurry to replace their human programmers.


  1. Currently a cheap $10/mo LLM from China with capabilities roughly comparable to Claude. ↩︎

  2. As far as “activation energy” goes, it’s been great. If a project is completely unworkable due to executive dysfunction, it’s nice to give it examples and specs and have it push the project a bit further forward. As far as being a programmer goes, it has been pretty bad. ↩︎

  3. LPeg is a popular parsing library for Lua. ↩︎

  4. Which are taken from my own experience writing these things by hand, tediously. ↩︎