https://github.com/djandries/llmvm/blob/master/frontends%2Fcodeassist%2FREADME.md
What will be the way to go?
(Under the umbrella premis of defying proprietary standards and ecosystems)
@Shivaraj B H ^^ helix-bound, as well.
https://github.com/DJAndries/llmvm/issues/7
https://github.com/DJAndries/llmvm/issues/8
For the true elite who have long left CLI-based tools behind and embraced the virtues of the GUI, there is (at least for VSCode) for interacting with locally hosted A.I. models:
https://github.com/continuedev/continue
https://github.com/ex3ndr/llama-coder
I have been playing with these a bit. Not sure what is out there for Neovim, yet. Probably some things will be found.
llmvm is a good approach, indeed for text based editors, the way how it injects itself into the LSP protocol.
(or even for VSCode)
It can run models against:
It's actually nice that every layer of the stack can be provided remotely, even the core component to the frontend.
I mean there is stuff out there like
https://github.com/nomnivore/ollama.nvim
https://github.com/jpmcb/nvim-llama
So people are playing with this obviously, I am just not sure what will stick yet.
Llmvm Core
manages state related to text generation, such as:
Model presets
Prompt templates
Message threads
Projects/workspaces
That's pretty neat and a sound design.
I guess running against locally hosted remote API is what I would go for, most likely either ollama API or OpenAI compatible, which more and more are implementing.
Yes, LLVM core is looking promising.
If you might then integrate something like khoj which can use entire GitHub repos as context apparently, it might get even more interesting
I wonder what would happen if we self-hosted khoj and fed it nixpkgs as a repo.
I'm always a bit wary of the foundations of auch productized "solutions", tbh.
"Solutions to what?"
In LLMVM i was able to exactly identify what I'd be missing in about an hour of research:
I think, this is what we want:
Issue 8 requests project scope context.
But for the LSP-based codeassist feature, too much context might actually be harmful for the result quality.
Of course, if your language is non typed, you're really a bit out of luck. But that's true anyways.
Let's see if you get a discussion on your issues.
Locally hosted remote api:
It appears that from the web of context around specifically this resource https://github.com/rustformers/llm
That llvm.cpp is the current state of the art model runtime (over ONNX, which is a bit more slow moving and has more industry support). Nitro is built on it.
Specifically the GGUF model format appears to be regarded as promising by the domain experts in the same web of context.
The llm
create used by llmvm
is lagging a bit on support for the latest llvm.cpp
developments, which is why separately hosting with nitro and using the offloading backend (e.g. via a OpenAI compatible API) sounds like a good escape hatch.
But it's nice that llmvm
could also manage your model runtime seemlessly.
But for the LSP-based codeassist feature, too much context might actually be harmful for the result quality.
So khoj has it's own client. And they have a GitHub integration. I am not entirely firm yet on the theoretical basis for the technique behind it, which is Retrieval Augmented Generation for LLMs. But basically it runs a Postgres-turned-VectorDB instance to store context if I see this correctly.
Seems like it, yes. And thanks! I just wondered why the heck postgres?
Is that more performant? Or probably it does a transformation ahead of time?
Yeah that illustration is nice. If need to guesstimate this, but I'd say the corpus embeddings need to be stored somewhere.
What is corpus embedding?
The corpus can also be a custom embedding layer, which is specifically designed for the use case when other pre-trained corpora cannot supply sufficient data.
yes, in order to explain exactly what that is, I'd have to read the papers detailing the technique. Again, guesstimating this, it is some linear-algebra based compression of the information in your text corpus (notes, GitHub repo) , which then can be use to enhance the LLM output to refer to information peculiar to your corpus of text.
Since we're not dealing with images and ledgers in a Code Editor scenario, this type of transformation and corpus embedding is probably less relevant.
But for retreival/search it seems key.
I mean Code is text. Text is run through a tokenizer, and embeddings are vector-space-based linear algebra reprsentations of the tokenization of your text (more or less). But that is true for LLMs in general.
Preliminary Conclusion:
well I am not so sure you couldn't somehow fuse these
but as it stand right now, you are right
It doesn't make really sense in a near scope: for retreival you already have LSPs and rich semantic code identifiers, such as types, etc.
Of course that disregards some unknown potential that might just reveal itself after it is there.
But Khoj can run with the same model backend as llmvm, for example, so at that level, they can share a stack (or a credit card). :-)
I mean what I have in mind here is: use these corpus embeddings of a GitHub repo or a whole set of GitHub repos (think: nixpkgs + all commuity repos of sufficient quality; or alternatively, all the repos in the cargo database with a certain number of stars for rust) with an existing code-optimized LLM. Is something you'd have to try and see what it does.
How do you get the corpus embeddings injected into the prompt? Because maybe that's not too far off as a data source to inject from khoj for lvmvm.
But I have no idea if this even makes any sense. But since nixpkgs is a monorepo it might (ironically) be an easy thing to just try and feed it to khoj. And then run khoj with codellama or deepseek-coder and see what comes out.
How do you get the corpus embeddings injected into the prompt?
I mean I assume that is what khoj does when you query it?
And the for getting the repos into khoj they have this:
https://docs.khoj.dev/online-data-sources/github_integration/
Noe it seems like you might have to trick or configure it explicitly it to accept the file containing the actual code as the plaintext stuff you want to create context embeddings for. It seems to be somewhat against the design idea of khoj to do that. However that doesn't necessarily mean it wouldn't work. But it might take some time it seems. At least they say it does for large repos.
Khoj just could give the preprocessimg to the toolchain, not the other way round.
that's all things you'd have to experimentally try out which works and which doesn't. I mean I might try and set something up at some point.
Btw, Eka Types is what I see as our highly rated comtext embedding sources for the Eka ecosystem. Since typed and schemaed they should be of extremely high quality, already.
I mean the funny things about these LLMs is that they do stuff that looks intelligent, but nobody really knows what is going on in there in detail. It might just be a fancy non-linear lossy compression technique like zip files that you can query in natural language (or weird optimized totally unnatural and unexpected ways). So if types add information and you fine-tune a model of a typed variety of an otherwise untyped language it might add something. But until you do you don't know. In the end it's all just text that is being vectorized. (Or context for that matter, which works a bit differently, but don't ask me for the details yet. I will probably read some at some point :grinning_face_with_smiling_eyes: )
This is how an entry is "compiled" from markdown. Nothing fancy, and definitly of very limited value compared to how native LSP can generate entries.
So in sum: khoj isn't really useful for the editor case, as far as I can see.
So in sum: khoj isn't really useful for the editor case, as far as I can see.
That is possible, yes. I will try it nonetheless and see what it does. Maybe it is also just a nice open source thing to allow users ask questions verbally about potential documentation and improve their learning experience.
definitly of very limited value compared to how native LSP can generate entries.
I am not sure this is the right standard of comparison. Because the idea is to enhance the query of some LLM.
Nothing fancy
I guess from a software development perspective LLMs are nothing fancy either. It's just a file with pre-trained weights and a program using the file, doing some input-output. This is why people doing symbolic A.I. for decades were so upset when they found that stupid deep-learning models (like transformer-based LLMs) were magnitudes better at many tasks.
Yes, and my argument actually was that while one enhances the LLM output about types and actual code samples, throwing in an occasional markdown snippet into the context might not necessarily produce better results, because types are already as concise as it can get.
But not actually sure, either: maybe the LLM can benefit from accompanying prose, as well.
In all of this: do input token length on the system prompt ("context") count costs in typical pricing models?
afaik that depends on the company. Here is OpenAI, they calculate both input and output prices:
Last updated: Jan 18 2025 at 04:45 UTC