agnos.is

The agnos.is blog

Published: 2024-12-05T11:24:00+01:00

No, I'm not talking about the Google AI model.

I'm talking about the Gemini protocol.

But chances are, if you're reading this post, you already know that I'm talking about the protocol:

Gemini is a new internet technology supporting an electronic library of interconnected text documents. [...] We are out to build a lightweight online space where documents are just documents.

When I created this site, it was essentially my return to having a personal website on the internet. At first, I only had the Gemini capsule; the HTML version was added later. But why create a Gemini capsule and ignore the normal HTTP Web? The first reason was, simply, because I could.

But there were other reasons:

  • Simplicity in user experience.
  • Simplicity of maintaining the site.
  • A content-first experience.

Simplicity

Gemini, by design, is very simple. Perhaps too simple. The gemtext format is similar to Markdown, but stripped down to its barest essential components:

  • One line does one thing (text, link, quote, etc).
  • Only three heading levels.
  • No stylistic formatting.

There is no styling of text in Gemtext. There is no bold, italics, or underlines. There are no inline code snippets. Only code blocks. This forces the author (me!) to focus on the best possible presentation of the text. What little styling is possible over Gemini protocol is up to the user's browser.

The User Experience

Although Geminispace is populated almost exclusively by tech nerds, I try to design my capsule for people who are NOT tech nerds. This is my personal site, not an obscure git repository.

The forced simplicity of Gemini and its native text format, gemtext, carries over to the Web version of this capsule. The Web version is static HTML, and is designed to appear more or less exactly the same way as the capsule appears in a Gemini browser.

The result is a fast-loading website that renders well on both desktop and mobile form factors. There is no JavaScript, and navigating the site is as good as I am able to make it.

The Development Experience

The original goal with creating the site in Gemini was to make it easy to maintain. It is easy to create content for it. Admittedly, though, creating this website did turn into an exercise of making my own static site generation pipeline. This is not the definition of “simple” for regular people. The site is actually stored in a git repo and built and updated via a CI (continuous integration) pipeline. Any change I make to the site is immediately deployed and visible on both the Gemini capsule and the HTML version.

Most of the heavy lifting is done by gempost.

The main Gemlog and the Astroponic Garden are treated as two separate “capsules” by Gempost. The “main site” contains the Gemlog you are now reading, and the rest of the content as pages. The Astroponic Garden is the tinylog formatted stream of consciousness.

Gemlog Astroponic Garden

Because everything for creating and building the site is in place, creating content “just works.” But it took a while to get there.

Content-First

When I was writing this post, I originally used the term “content-driven.” But I quickly changed it to content-first. In my mind, “content-driven” more refers to a marketing-centered experience. I don't want to “drive” things. Content-first is a better way to explain what I want out of Gemini. The idea is that the CONTENT informs the design (what little of it is possible) and not the other way around.

Gemtext forces you to get creative when it comes to presenting a beautiful experience to the reader. I am very particular about how a page looks in Gemini browsers, and I design this capsule with that in mind.

It's hard to quantify exactly WHAT particulars I have. But it tends to be:

  • Language that is engaging and not overly dense.
  • Natural flow of words and links in the document.
  • Headings that promote ease of navigation and reading.

The biggest challenge for me has been presenting an engaging layout without making the text overly dense or hard to follow.

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-11-30T12:43:52+01:00

A quick update: I am now working on a simple tool to allow LLMs to connect to Gemini capsules.

Find the documentation here.

Find the code here: https://openwebui.com/t/projectmoon/gemini_protocol_tool/ https://git.agnos.is/projectmoon/open-webui-filters/src/branch/master/gemini.py

The initial release supports basic direct connection to Gemini pages, and relies on the model's contextual understanding to navigate the capsule. This allows you to interact with Gemini in a conversational way.

To-Do List:

  • Implement handling of non-2x status codes.
  • Convert Gemtext to Markdown for better LLM understanding.
  • Handle file uploads?
  • Handle non-text content.

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-11-29T22:49:14+01:00

A major update for the OpenStreetMap tool for OpenWebUI is now available. You can grab it at one of the links below.

https://openwebui.com/t/projectmoon/open_street_map/ https://git.agnos.is/projectmoon/open-webui-filters

The tool is now on the 2.x iteration, and comes with a number of improvements and bug fixes.

New Features

There are several new features that make the tool more useful and reliable.

  • Navigation: LLMs can now provide navigation information from ORS, and answer questions about distance between places.
  • New Citations: Fancy, styled citations per OSM result.
  • New POI category: tourist attractions.
  • POI ranking system: currently used only for tourist attractions, to push more prominent attractions to the top.

Of the officially tested models, the recommended ones for use with the navigation feature are:

  • Qwen2.5
  • Mistral Nemo
  • Mistral Small

These models (and their derivatives) demonstrate contextual awareness and can understand conversations where the user asks about a destination, and then asks how to get there in a later message.

Llama3 will work, but it requires a very explicit prompt to use the navigation function. Something like: “How do I get to X? Use the OSM navigation tool” usually triggers it.

Other Improvements

Other improvements and bug fixes:

  • Information from OpenRouteService is now cached.
  • Various fixes for crashes and misleading information.

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-10-15T22:48:20+02:00

Short and simple update: two new filters that make use of Open-WebUI's new feature to render collapsible sections in order to separate the reasoning and thought processes from final output for models that support this kind of workflow.

There are two new filters: one for Reflection 70b and one for Artificium 8b.

These filters act on the final output of the models, and separate the LLM's reasoning and thinking steps from the final output, and put them into collapsible sections. It makes it easier to focus on the LLM's final answer. Additionally, the filters can optionally remove the thinking steps from the input when text is submitted to the LLM, in order to save on token use.

Artificium Thinking Filter Collapsible Thought Filter (Reflection 70b)

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-09-30T19:01:37+02:00

The 1.0.0 release of the OpenStreetMap tool for OpenWebUI has arrived!

https://openwebui.com/t/projectmoon/open_street_map/

The 1.0.0 release has the following changes that make the tool a powerful map searching solution OpenWebUI, allowing users to find points of interest (POIs) near any place: – Caching: all data except the actual POI search is now cached, reducing load on public OSM services. – Nameless entities are now handled. This is important for minor landmarks, playgrounds, and other leisure areas. – The tool now can now report its results as a citation in replies. – Breaking change: the Nominatim URL setting has changed.

Along with calculating travel distance to nearby destinations, and various other bug fixes over the past weeks, I feel like it's time for a 1.0 release.

Breaking Config Change

The biggest announcement is the breaking change of the Nominatim URL valve setting for the tool. This used to point directly to Nominatim's search endpoint. But now the setting must point to the root URL of the Nominatim instance, because it's now using two different Nominatim endpoints.

If you need to change the setting, your LLM will refuse to search OpenStreetMap and warn you about this.

Caching and Nameless Entities

The two most important changes are caching and nameless entity handling. The public OSM services are available for free to everyone, and the data does not change THAT often. By caching address lookup data, extra POI information, and navigation travel distance, a lot queries against the OSM servers can be avoided. This is particularly important for the common scenario of the user searching from their house, school, work: “What's the closest X near me?” Instead of hitting Nominatim every time to resolve the user's location, it is now simply pulled from the cache.

Additionally, many entities in OpenStreetMap do not have names. This is either because of lack of proper tagging, or the place may not actually have a name at all. This is very common with things like playgrounds in neighborhoods, small green spaces in cities, and so on. – The logic for determining what entities are “useful” to the tool has changed. – The tool now resolves an address for these nameless entities and uses it as the name.

Sending the results as a citation is just a nice bonus that gives some insight into what the OSM tool is doing when it searches for the user.

The Future

The tool is rapidly approaching feature completion. Releases in the immediate future are going to focus on: – Cleaning up the code for a maintainable future. – Fixing bugs and edge cases. – Adding more predefined POI search functions.

The code is a bit of a mess at the moment because of OpenWebUI's requirement to have all the Python code in one file. I am considering ways to deal with this, ranging from moving a lot of the code to a PyPI module, or simply just having a simple build file that concats the code together into one Python module.

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-09-27T23:29:52+02:00

Version 0.9.0 of the OpenStreetMap tool for Open WebUI integrates OpenRouteService (ORS) for more accurate results. Travel distances between locations are calculated more precisely, rather than relying solely on “as the crow flies” measurements.

Get the latest version of the tool

Haversine Distance

By default, the OSM tool uses a straightforward distance calculation method, known as “Haversine distance,” which can be misleading in real-world scenarios. The Haversine distance is the direct distance between two points on a sphere (Earth, in our case)..

For example, two points might be close by geographically but could take a significantly longer time to reach due to roads, traffic, and other factors. Integrating ORS helps overcome this by providing actual travel distances based on real road networks.

How to Enable ORS in the Tool

Go to the OpenRouteService website.

Sign up for a free ORS account and log in. Once logged in, navigate to the API token section and create a new token. This token will be used to authenticate your requests to ORS.

In the tool’s configuration settings, you’ll find field for the ORS API Key. Enter your API key here, and now the tool will use ORS routing to calculate more accurate distances!

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-09-25T21:51:41+02:00

I have updated the Checkpoint Summary Filter for Open WebUI to account for changes to Open WebUI's code. The filter reaches deep into the internals of Open WebUI to enable per-chat continuous summarization to handle the context limit, so with all of the recent refactorings in Open WebUI, this broke the Checkpoint Summary Filter. But I have patched it to run on the latest versions of Open WebUI (0.3.29+ at the moment).

In other news, I will delete the Narrative Memory Filter, as the Checkpoint Summarization Filter is meant to be its replacement, and there is no point in maintaining two filters. Not to mention, the Narrative Memory Filter's functionality is subpar.

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-09-23T22:16:34+02:00

In the latest versions of the OpenStreetMap tool for OpenWebUI (0.6.3+), I have added proper handling of real-time location. OpenWebUI has the ability to report the user's location via the browser geolocation API.

https://git.agnos.is/projectmoon/open-webui-filters

Previous versions of the OSM tool did not prompt the large language model (LLM) properly when this information was available, so results for questions like “Where is the nearest grocery store?” would either be completely nonsensical, or centered on some nearby major city that the LLM statistically associates with the GPS coordinates reported by the browser.

Proper Handling of Real-Time Location

In order to have the OSM tool be able to answer questions like “where is the nearest grocery store to me?“, it needs access to your realtime location. This can be accomplished with the following steps: – Enable user location in your user settings. – Create a model with a system prompt that references the variable {{USER_LOCATION}}. – OpenWebUI will automatically substitute the GPS coordinates reported by the browser into the model's system prompt on every message.

Without the {{USER_LOCATION}} variable in the system prompt, OpenWebUI will not be able to give your real-time location to the LLM, and the OpenStreetMap tool will not give accurate results.

How Does It Work?

This was quite simple to enable. OpenWebUI uses the docstrings of Python functions to assemble a “tools schema” for the language model to use. OSM tool prior to 0.6.3 generated a schema that told the LLM to search for things near an “address or place.” Because of this, models would dutifully give a place or address, and NOT GPS coordinates.

This resulted in interesting behavior. The model would pick a city that it statistically associates with the user's GPS coordinates, and feed that in as a place to search nearby.

A simple tweak fixes this. The schema now tells the model to search for a “place, address, or GPS coordinates.” Models will now give the OSM tool GPS coordinates when available, and thus the tool can now answer questions like “Where is the nearest grocery store?”

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-09-17T22:35:38+00:00

Two blog posts and two updates in one day. I have now just published 0.5.0 of the OpenStreetMap tool for OpenWbeUI.

Install it from OpenWebUI.com Get the code from agnos.is Git

The reason why this update is getting another blog post is because I added support for searching for Ways in OpenStreetMap, which makes search results much, much more accurate.

While many map features in OSM are nodes, or single points on the map, many others are defined as “ways,” or shapes on the map. Buildings, roads, areas (parks, parking lots, etc), and everything else that you can draw with a line is considered a “way.” In our case, we're mostly concerned with the buildings.

By changing the Overpass Turbo query, I was able to get the bounding information for Ways, which is basically the set of GPS coordinates that make up its shape. Calculate the center coordinate, and we can now add these Ways to the search results!

Of course, the 0.5.0 update includes everything from the 0.4.0 update:

  • Complete rewrite of search result handling to prevent incorrect OSM map links being generated, and bad info given.
  • New setting: Instruction Oriented Interpretation. Controls the result interpretation instructions given to the LLM.
  • The Instruction Oriented Interpretation setting is also a user valve that can be controlled on a per-chat basis.
  • Added ability to search for: public transit, schools/universities, libraries, bike rental locations, car rental locations.

License: CC-BY-SA-4.0.

Written by: @[email protected]

Published: 2024-09-17T20:22:27+00:00

I just published version 0.4.0 of the OpenStreetMap tool for OpenWebUI. This tool allows large language models (LLMs) to search OpenStreetMap to find points of interest near addresses and locations.

Install it from OpenWebUI.com Get the code from agnos.is Git

I rewrote the search result handling to ensure that links to OpenStreetMap are generated more accurately. I noticed that Mistral-Nemo was generating bad links: either the GPS coordinates would be the right numbers, but the wrong sign (negative or positive flipped), or the URL itself was invalid despite having the right GPS coordinates.

  • Result handling now generates a Markdown result list for increased accuracy.
  • Each entry includes a simple bullet list consisting of the GPS coordinates, address, etc.
  • The entry also includes a generated OSM map link.

A new setting called “Instruction Oriented Interpretation” has also been added. This setting controls the level of detail of the instructions sent to the language model. It should improve result consistency, especially with models like Hermes 3. You can also toggle the setting on a per-chat basis for more fine-grained control. Finally, the search functionality has been expanded to include public transit, schools and universities, libraries, bike rental locations, and car rental locations.

License: CC-BY-SA-4.0.

Written by: @[email protected]