Published: 2024-03-05T19:55:51+01:00
The AI-driven text-based adventure game has finally hit an important milestone: the implementation of hyper-specific GBNF grammar constraints at runtime!
https://git.agnos.is/projectmoon/ai-game
This allows restricting the output of the LLM to very specific values at a per-field level. It took quite a long time to implement this properly, as I don't have that much free time on my hands due to family obligations and job searching. The implementation went through several major iterations before I finally got it right:
- Basic limiting of primitive fields.
- Limiting fields but comparing based on rule text.
- Proper recursive output of rules and sorting to de-duplicate.
The addition of this feature also addresses a bunch of shortcomings in the original implementation of the derive macro. Overall, I am happy with how it works, though not necessarily happy with how long it took.
How the GBNF Derive Macro Works
Before explaining how the limiting works, a quick overview of the derive macro itself is useful. The macro, when added as a derive attribute on a struct, will add a function to the type itself called to_grammar. This function produces a set of valid GBNF rules that allows the LLM to output values according to the shape of the annotated type.
Example:
#[derive(Gbnf)]
pub struct CommandResponse {
    pub valid: bool,
    pub text: String,
}
//...
let grammar: &'static str = CommandResponse::to_grammar();
When this grammar is given to the LLM, its output will be a JSON value that matches the struct, which can be directly deserialized by serde. This is very powerful, but suffers from a big limitation, which also plagued the original hand-rolled GBNF rules: With a string value, the LLM could still output nonsense, even though it was constrained to a JSON format. So, if it was instructed to set a field to an ID in the scene, it might do that, or it might not.
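To make the limitation concrete, here is a rough, hand-written idea of what a grammar for CommandResponse could look like. The rule names and layout are assumptions for illustration; the text the derive macro actually emits is not shown in this post and may well differ.

```rust
/// Hand-written GBNF roughly matching `CommandResponse { valid: bool, text: String }`.
/// Illustrative only: this helper and its rule names are hypothetical, not the
/// output of the real derive macro.
pub fn command_response_grammar() -> &'static str {
    r#"root ::= "{" ws "\"valid\":" ws boolean "," ws "\"text\":" ws string ws "}"
boolean ::= "true" | "false"
string ::= "\"" ([^"\\] | "\\" ["\\/bfnrt])* "\""
ws ::= [ \t\n]*
"#
}
```

Note that the string rule accepts any JSON string at all, which is exactly why a field meant to hold a scene ID can still end up containing nonsense.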
This is where the new helper attribute #[gbnf_limit] comes in. Adding this helper attribute to a field on a struct annotated with #[derive(Gbnf)] causes a corresponding “limit struct” to be generated, which lets the developer supply the set of valid values the LLM can produce for that field.
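The effect on the grammar can be sketched without the macro at all: instead of a rule that accepts any string, a limited field gets a rule that is an alternation of the allowed values. The helper below is hypothetical (its name and signature are mine, not the project's), and it assumes a non-empty list of values that contain no quotes or backslashes, which seems reasonable for IDs.

```rust
/// Build a GBNF rule that only admits the given string values.
/// Hypothetical helper, not part of the project's API.
/// Assumes `allowed` is non-empty and its values need no escaping.
pub fn limited_string_rule(rule_name: &str, allowed: &[&str]) -> String {
    let alternatives: Vec<String> = allowed
        .iter()
        // Each alternative is a GBNF literal that contains a quoted JSON string.
        .map(|value| format!("\"\\\"{}\\\"\"", value))
        .collect();
    format!("{} ::= {}", rule_name, alternatives.join(" | "))
}
```

Calling limited_string_rule("text", &["door_1", "key_2"]) yields a rule whose only valid outputs are the JSON strings "door_1" and "key_2", so the LLM cannot invent an ID that is not in the list.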
The New GBNF Limit Implementation
The final implementation of this feature makes use of the following:
- A dynamic trait called GbnfLimitedField.
- A custom-generated limit struct, created by the derive macro.
- A new addition to the GBNF types: GbnfLimit.
The GbnfLimit type is a nested enum that mirrors the structure of the type it was derived from. This limit is passed into the initial GbnfComplex type at the root of the GBNF rule hierarchy, and as rules are recursively generated, each rule type looks into the limit struct (or its nested descendants) to see if its values should be limited. If so, the type is wrapped in an opaque “limited GBNF type” that implements the standard AsGrammar trait. These wrapper trait implementations reference the underlying type to create a limit rule, which is either the actual limited values, or a set of nested limited values (if a field of a subtype is limited).
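As a minimal sketch of the "nested enum that mirrors the type" idea, the following uses a hypothetical Limit enum as a stand-in for GbnfLimit. The variant names and the lookup function are assumptions made for illustration; the real type in the repository may be shaped differently.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the post's `GbnfLimit`; the real variants may differ.
pub enum Limit {
    /// The field may only take one of these values.
    Simple(Vec<String>),
    /// The field is itself a struct; limits apply to some of its fields.
    Complex(HashMap<String, Limit>),
}

impl Limit {
    /// Walk a path of field names and return the allowed values at the end, if any.
    /// Returns `None` when the path leads to a field that is not limited.
    pub fn values_at(&self, path: &[&str]) -> Option<&[String]> {
        match (self, path) {
            // Path exhausted on a simple limit: these are the allowed values.
            (Limit::Simple(values), []) => Some(values.as_slice()),
            // Descend into the nested limit for the next field name.
            (Limit::Complex(fields), [head, rest @ ..]) => {
                fields.get(*head)?.values_at(rest)
            }
            _ => None,
        }
    }
}
```

During rule generation, each field would perform a lookup like this with its own position in the type; a hit means the field's rule gets wrapped in the limited variant, a miss means the ordinary rule is used.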
The derive macro takes care of generating the limit struct itself, along with its implementation to convert the limit struct into a GbnfLimit instance. This was the easy part. Almost all of the complexity lay in the code for actually turning GBNF rules into the final text that would be fed to the LLM.
Implementing the limitation feature exposed numerous flaws in the original GBNF macro implementation, which have now been addressed. Most importantly, the GBNF rules are now generated recursively, with each rule having zero or more dependent rules. This allows proper handling of optional fields, list fields, and of course, limited fields (as well as optional limited fields, and list limited fields!).
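The recursive generation with de-duplication can be sketched roughly as follows. The Rule type and render function are hypothetical and much simpler than whatever the project uses; a sorted set stands in for the "sorting to de-duplicate" step, so a shared rule such as string is emitted once no matter how many fields depend on it.

```rust
use std::collections::BTreeSet;

/// Hypothetical rule node: one grammar line plus the rules it depends on.
pub struct Rule {
    pub name: String,
    pub body: String,
    pub dependents: Vec<Rule>,
}

impl Rule {
    pub fn new(name: &str, body: &str, dependents: Vec<Rule>) -> Self {
        Rule { name: name.to_string(), body: body.to_string(), dependents }
    }

    /// Collect this rule and all of its dependents, depth first.
    /// The set keeps lines sorted and drops exact duplicates.
    fn collect(&self, out: &mut BTreeSet<String>) {
        out.insert(format!("{} ::= {}", self.name, self.body));
        for dependent in &self.dependents {
            dependent.collect(out);
        }
    }
}

/// Render a grammar: the root rule first, then every other rule exactly once.
pub fn render(root: &Rule) -> String {
    let mut lines = BTreeSet::new();
    for dependent in &root.dependents {
        dependent.collect(&mut lines);
    }
    let mut text = format!("root ::= {}\n", root.body);
    for line in lines {
        text.push_str(&line);
        text.push('\n');
    }
    text
}
```

In this sketch an optional field is just a rule whose body adds a "null" alternative while still depending on the same underlying rule, which is why the dependency has to be tracked per rule rather than per struct.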
Next Steps
The next steps are pretty straightforward:
- Spin the GBNF code into its own crate, to make it available to others.
- Attempt to get rid of the requirement to use Boxed dynamic traits for the limit struct.
- Improve the error reporting from the derive macro.
- Make proper use of the limiting feature in the game itself!
- Tests! The game is getting complicated enough that it needs them.
The initial use of the feature in the game itself will simply constrain the LLM to pick from any IDs present in the scene. Later, some prompts will be rewritten, and a more accurate list of IDs can be given to the LLM.
License: CC-BY-SA-4.0.
Written by: @[email protected]