Stream: nixify-llm

Topic: Editor integration (LSP?)


view this post on Zulip David Arnold (Feb 24 2024 at 09:11):

https://github.com/djandries/llmvm/blob/master/frontends/codeassist/README.md

view this post on Zulip David Arnold (Feb 24 2024 at 09:11):

What will be the way to go?

(Under the umbrella premise of defying proprietary standards and ecosystems)

view this post on Zulip David Arnold (Feb 24 2024 at 09:12):

@Shivaraj B H ^^ helix-bound, as well.

view this post on Zulip David Arnold (Feb 24 2024 at 09:37):

https://github.com/DJAndries/llmvm/issues/7

view this post on Zulip David Arnold (Feb 24 2024 at 10:03):

https://github.com/DJAndries/llmvm/issues/8

view this post on Zulip Andreas (Feb 24 2024 at 10:04):

For the true elite who have long left CLI-based tools behind and embraced the virtues of the GUI, there are (at least for VSCode) these options for interacting with locally hosted A.I. models:

https://github.com/continuedev/continue
https://github.com/ex3ndr/llama-coder

I have been playing with these a bit. Not sure what is out there for Neovim yet; probably some things can be found.

view this post on Zulip David Arnold (Feb 24 2024 at 10:05):

llmvm is a good approach indeed for text-based editors, given the way it injects itself into the LSP protocol.
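
To make that concrete, here is roughly how I picture the trick, as a minimal sketch and not llmvm's actual code: a small proxy that speaks LSP (Content-Length framed JSON-RPC) to the editor, forwards everything to the real language server, and bolts an extra LLM-backed code action onto the replies. The wrapped server ("nil") and the command name are placeholders, and the lock-step request handling is a simplification:

```python
import json
import subprocess
import sys

def read_message(stream):
    # Read LSP headers up to the blank line, then the JSON body.
    length = 0
    while True:
        line = stream.readline().decode()
        if line.strip() == "":
            break
        if line.lower().startswith("content-length:"):
            length = int(line.split(":", 1)[1])
    return json.loads(stream.read(length))

def write_message(stream, msg):
    body = json.dumps(msg).encode()
    stream.write(f"Content-Length: {len(body)}\r\n\r\n".encode() + body)
    stream.flush()

# Hypothetical: wrap the real language server; the editor talks to this script instead.
server = subprocess.Popen(["nil"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

while True:
    msg = read_message(sys.stdin.buffer)      # from the editor
    write_message(server.stdin, msg)          # forward to the real server
    reply = read_message(server.stdout)       # naive lock-step, purely for illustration
    if msg.get("method") == "textDocument/codeAction":
        # Inject an additional code action that hands the request off to an LLM backend.
        reply.setdefault("result", []).append({
            "title": "Ask the LLM to complete this",
            "command": {"title": "Ask LLM", "command": "llm.complete",
                        "arguments": [msg["params"]]},
        })
    write_message(sys.stdout.buffer, reply)   # back to the editor
```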

view this post on Zulip David Arnold (Feb 24 2024 at 10:06):

(or even for VSCode)

view this post on Zulip David Arnold (Feb 24 2024 at 10:08):

It can run models against:

It's actually nice that every layer of the stack can be provided remotely, even the core component as seen from the frontend.

view this post on Zulip Andreas (Feb 24 2024 at 10:09):

I mean there is stuff out there like

https://github.com/nomnivore/ollama.nvim
https://github.com/jpmcb/nvim-llama

So people are playing with this obviously, I am just not sure what will stick yet.

view this post on Zulip David Arnold (Feb 24 2024 at 10:09):

Llmvm Core

manages state related to text generation, such as:
Model presets
Prompt templates
Message threads
Projects/workspaces

That's pretty neat and a sound design.

view this post on Zulip Andreas (Feb 24 2024 at 10:11):

I guess running against a locally hosted remote API is what I would go for, most likely either the ollama API or an OpenAI-compatible one, which more and more runtimes are implementing.
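
For reference, talking to such a locally hosted, OpenAI-compatible endpoint is about this much code; the port, path and model name below are made-up placeholders, not anything a specific runtime guarantees:

```python
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",   # wherever the local server listens
    json={
        "model": "deepseek-coder",                  # placeholder model name
        "messages": [{"role": "user",
                      "content": "Explain what builtins.fetchTarball does."}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```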

Yes, llmvm core is looking promising.

If you then integrate something like khoj, which apparently can use entire GitHub repos as context, it might get even more interesting.

view this post on Zulip Andreas (Feb 24 2024 at 10:12):

I wonder what would happen if we self-hosted khoj and fed it nixpkgs as a repo.

view this post on Zulip David Arnold (Feb 24 2024 at 10:12):

I'm always a bit wary of the foundations of such productized "solutions", tbh.

"Solutions to what?"

In llmvm I was able to identify exactly what I'd be missing in about an hour of research:

I think this is what we want:

view this post on Zulip David Arnold (Feb 24 2024 at 10:14):

Issue 8 requests project-scope context.

But for the LSP-based codeassist feature, too much context might actually be harmful to the result quality.

view this post on Zulip David Arnold (Feb 24 2024 at 10:14):

Of course, if your language is untyped, you're really a bit out of luck. But that's true anyway.

view this post on Zulip Andreas (Feb 24 2024 at 10:16):

Let's see if you get a discussion on your issues.

view this post on Zulip David Arnold (Feb 24 2024 at 10:22):

Locally hosted remote API:

From the web of context around this specific resource, https://github.com/rustformers/llm, it appears that llama.cpp is the current state-of-the-art model runtime (ahead of ONNX, which is a bit slower moving and has more industry support). Nitro is built on it.

Specifically, the GGUF model format appears to be regarded as promising by the domain experts in that same web of context.

The llm crate used by llmvm is lagging a bit on support for the latest llama.cpp developments, which is why separately hosting with nitro and using the offloading backend (e.g. via an OpenAI-compatible API) sounds like a good escape hatch.

But it's nice that llmvm could also manage your model runtime seamlessly.
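
And underneath all of that, the runtime layer itself is pretty small. A rough sketch of loading a GGUF model directly through the llama.cpp Python bindings, assuming you already downloaded a model file (the path below is a placeholder):

```python
from llama_cpp import Llama  # llama.cpp bindings; pip install llama-cpp-python

llm = Llama(model_path="./deepseek-coder-6.7b-instruct.Q4_K_M.gguf", n_ctx=4096)
out = llm("Write a Nix expression that fetches a tarball:", max_tokens=128)
print(out["choices"][0]["text"])
```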

view this post on Zulip Andreas (Feb 24 2024 at 10:40):

But for the LSP-based codeassist feature, too much context might actually be harmful to the result quality.

So khoj has its own client. And they have a GitHub integration. I am not entirely firm yet on the theoretical basis for the technique behind it, which is Retrieval Augmented Generation (RAG) for LLMs. But basically it runs a Postgres-turned-VectorDB instance to store context, if I see this correctly.
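
My guess at what "Postgres-turned-VectorDB" means in practice is the pgvector extension: store one embedding per chunk of text, then do nearest-neighbour search with the distance operator at query time. The table, column and database names below are made up, not khoj's actual schema:

```python
import psycopg2  # plus the pgvector extension installed in Postgres

conn = psycopg2.connect("dbname=khoj_test")  # hypothetical database
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
# 3 dimensions only to keep the example short; real embedding models use hundreds.
cur.execute("""CREATE TABLE IF NOT EXISTS entries (
                   id serial PRIMARY KEY,
                   content text,
                   embedding vector(3));""")
cur.execute("INSERT INTO entries (content, embedding) VALUES (%s, %s::vector);",
            ("mkDerivation builds a package from src and buildInputs", "[0.9, 0.1, 0.0]"))

# At query time: embed the question, then let Postgres rank the stored chunks by distance.
cur.execute("SELECT content FROM entries ORDER BY embedding <-> %s::vector LIMIT 5;",
            ("[0.8, 0.2, 0.1]",))
print(cur.fetchall())
conn.commit()
```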

view this post on Zulip David Arnold (Feb 24 2024 at 10:42):

Seems like it, yes. And thanks! I just wondered: why the heck Postgres?

Is it more performant? Or does it perhaps do a transformation ahead of time?

view this post on Zulip David Arnold (Feb 24 2024 at 10:45):

(image)

view this post on Zulip Andreas (Feb 24 2024 at 10:47):

Yeah, that illustration is nice. I'd have to guesstimate this, but I'd say the corpus embeddings need to be stored somewhere.

view this post on Zulip David Arnold (Feb 24 2024 at 10:47):

What is a corpus embedding?

The corpus can also be a custom embedding layer, which is specifically designed for the use case when other pre-trained corpora cannot supply sufficient data.

view this post on Zulip Andreas (Feb 24 2024 at 10:49):

Yes, in order to explain exactly what that is, I'd have to read the papers detailing the technique. Again, guesstimating: it is some linear-algebra-based compression of the information in your text corpus (notes, a GitHub repo), which can then be used to enhance the LLM output so that it refers to information peculiar to your corpus of text.
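
Guessing at the mechanics (khoj's internals may well differ): embed the corpus chunks once, embed the query at question time, and pull in the nearest chunks by cosine similarity before the LLM sees anything. A minimal sketch with sentence-transformers, where the model name and corpus lines are arbitrary examples:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any small embedding model would do
corpus = [
    "mkDerivation builds a package from src and buildInputs.",
    "Flake outputs expose packages, devShells and nixosModules.",
]
corpus_emb = model.encode(corpus, convert_to_tensor=True)

query_emb = model.encode("How do I build a package?", convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=1)
best = corpus[hits[0][0]["corpus_id"]]
print(best)  # this chunk would be pasted into the prompt as extra context
```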

view this post on Zulip David Arnold (Feb 24 2024 at 10:50):

Since we're not dealing with images and ledgers in a code editor scenario, this type of transformation and corpus embedding is probably less relevant.

But for retrieval/search it seems key.

view this post on Zulip Andreas (Feb 24 2024 at 10:51):

I mean, code is text. Text is run through a tokenizer, and embeddings are vector-space-based linear-algebra representations of the tokenization of your text (more or less). But that is true for LLMs in general.
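
Just to make the "code is text" point concrete, a tokenizer really doesn't care whether it gets prose or Nix code; both become the same kind of integer sequence (tiktoken here is just one example of a tokenizer):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
print(enc.encode("The quick brown fox jumps over the lazy dog."))
print(enc.encode('stdenv.mkDerivation { pname = "hello"; }'))
```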

view this post on Zulip David Arnold (Feb 24 2024 at 10:51):

Preliminary Conclusion:

view this post on Zulip Andreas (Feb 24 2024 at 10:52):

well I am not so sure you couldn't somehow fuse these

view this post on Zulip Andreas (Feb 24 2024 at 10:53):

but as it stands right now, you are right

view this post on Zulip David Arnold (Feb 24 2024 at 10:53):

It doesn't really make sense in the near term: for retrieval you already have LSPs and rich semantic code identifiers, such as types, etc.

Of course, that disregards some unknown potential that might only reveal itself once it's there.

view this post on Zulip David Arnold (Feb 24 2024 at 10:54):

But Khoj can run with the same model backend as llmvm, for example, so at that level, they can share a stack (or a credit card). :-)

view this post on Zulip Andreas (Feb 24 2024 at 10:55):

I mean, what I have in mind here is: use these corpus embeddings of a GitHub repo, or a whole set of GitHub repos (think: nixpkgs + all community repos of sufficient quality; or alternatively, all the repos in the cargo database with a certain number of stars, for Rust), together with an existing code-optimized LLM. It's something you'd have to try to see what it does.

view this post on Zulip David Arnold (Feb 24 2024 at 10:57):

How do you get the corpus embeddings injected into the prompt? Because maybe that's not too far off as a data source for khoj to inject into llmvm.

view this post on Zulip Andreas (Feb 24 2024 at 10:57):

I have no idea if this even makes any sense. But since nixpkgs is a monorepo, it might (ironically) be an easy thing to just try and feed it to khoj, and then run khoj with codellama or deepseek-coder and see what comes out.

view this post on Zulip Andreas (Feb 24 2024 at 10:58):

How do you get the corpus embeddings injected into the prompt?

I mean I assume that is what khoj does when you query it?

And then, for getting the repos into khoj, they have this:

https://docs.khoj.dev/online-data-sources/github_integration/

view this post on Zulip Andreas (Feb 24 2024 at 11:05):

Now it seems like you might have to trick it, or configure it explicitly, to accept the files containing the actual code as the plaintext you want to create context embeddings for. It seems to be somewhat against khoj's design idea to do that. However, that doesn't necessarily mean it wouldn't work. But it might take some time, it seems; at least they say it does for large repos.

view this post on Zulip David Arnold (Feb 24 2024 at 11:06):

Khoj could just hand the preprocessing over to the toolchain, not the other way round.

view this post on Zulip Andreas (Feb 24 2024 at 11:07):

Those are all things you'd have to try out experimentally to see what works and what doesn't. I mean, I might try and set something up at some point.

view this post on Zulip David Arnold (Feb 24 2024 at 11:07):

Btw, Eka Types is what I see as our highly rated context embedding source for the Eka ecosystem. Since it is typed and schemaed, it should already be of extremely high quality.

view this post on Zulip Andreas (Feb 24 2024 at 11:20):

I mean, the funny thing about these LLMs is that they do stuff that looks intelligent, but nobody really knows in detail what is going on in there. It might just be a fancy non-linear lossy compression technique, like zip files that you can query in natural language (or in weird, optimized, totally unnatural and unexpected ways). So if types add information, and you fine-tune a model on a typed variety of an otherwise untyped language, it might add something. But until you do, you don't know. In the end it's all just text that is being vectorized. (Or context for that matter, which works a bit differently, but don't ask me for the details yet. I will probably read some at some point :grinning_face_with_smiling_eyes: )

view this post on Zulip David Arnold (Feb 24 2024 at 11:23):

https://github.com/khoj-ai/khoj/blob/master/src/khoj/processor/content/markdown/markdown_to_entries.py#L79-L96

This is how an entry is "compiled" from markdown. Nothing fancy, and definitely of very limited value compared to how a native LSP can generate entries.
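
My paraphrase of what that boils down to, not the actual khoj code: split the markdown on headings and keep the heading as a prefix for each chunk.

```python
def markdown_to_entries(text: str) -> list[str]:
    entries, heading, body = [], "", []
    for line in text.splitlines():
        if line.startswith("#"):
            if body:
                entries.append(f"{heading}\n{' '.join(body)}".strip())
            heading, body = line, []
        else:
            body.append(line.strip())
    if body:
        entries.append(f"{heading}\n{' '.join(body)}".strip())
    return entries

print(markdown_to_entries("# Install\nRun the installer.\n# Usage\nAsk khoj a question."))
```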

So in sum: khoj isn't really useful for the editor case, as far as I can see.

view this post on Zulip Andreas (Feb 24 2024 at 11:38):

So in sum: khoj isn't really useful for the editor case, as far as I can see.

That is possible, yes. I will try it nonetheless and see what it does. Maybe it is also just a nice open-source thing that lets users ask questions verbally about potential documentation and improves their learning experience.

definitely of very limited value compared to how a native LSP can generate entries.

I am not sure this is the right standard of comparison, because the idea is to enhance the query of some LLM.

view this post on Zulip Andreas (Feb 24 2024 at 11:43):

Nothing fancy

I guess from a software development perspective LLMs are nothing fancy either. It's just a file with pre-trained weights and a program using the file, doing some input-output. This is why people doing symbolic A.I. for decades were so upset when they found that stupid deep-learning models (like transformer-based LLMs) were magnitudes better at many tasks.

view this post on Zulip David Arnold (Feb 24 2024 at 11:55):

Yes, and my argument actually was that while enhancing the LLM output with types and actual code samples helps, throwing in an occasional markdown snippet on top might not necessarily produce better results, because types are already as concise as it gets.

But I'm not actually sure either: maybe the LLM can benefit from accompanying prose as well.

view this post on Zulip David Arnold (Feb 24 2024 at 12:00):

In all of this: does the input token length of the system prompt ("context") count toward costs in typical pricing models?

view this post on Zulip Andreas (Feb 24 2024 at 12:35):

AFAIK that depends on the company. Here is OpenAI; they charge for both input and output tokens:

https://openai.com/pricing
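
To illustrate how that plays out, with made-up per-1K-token prices (check the pricing page for real numbers): everything you send, including the system prompt and any retrieved context, is billed as input tokens, on top of the completion you get back.

```python
input_price_per_1k = 0.01    # hypothetical $ per 1K input tokens
output_price_per_1k = 0.03   # hypothetical $ per 1K output tokens

prompt_tokens = 6_000        # e.g. a large retrieved context plus the actual question
completion_tokens = 500

cost = (prompt_tokens / 1000) * input_price_per_1k \
     + (completion_tokens / 1000) * output_price_per_1k
print(f"${cost:.3f} per request")  # 0.060 + 0.015 = $0.075
```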


Last updated: Nov 15 2024 at 11:45 UTC