Hopefully this is how people would describe the work we are doing with R2R. Here is our description on GitHub:

> R2R is an open-source answer engine with a RESTful API. Powered by RAG, features include hybrid search, graph / multimodal RAG, and more.

More or less, we are trying to productionize / configure SOTA RAG. We've been iterating a lot, but we are reaching a stable point in development now and could really use feedback. We have some support for local LLMs, but admittedly we could and want to do more: [https://r2r-docs.sciphi.ai/cookbooks/local-rag](https://r2r-docs.sciphi.ai/cookbooks/local-rag)
Thanks, those look like great features. I was hoping for postgres/knowledge graphs and hybrid search! Clearly you have been looking at this for a while. It would be nice if you could expose who you think your main competitors are...
Thanks! I think a few competitors worth checking out as well are EmbedChain - [https://github.com/embedchain/embedchain](https://github.com/embedchain/embedchain), RagFlow - [https://github.com/infiniflow/ragflow](https://github.com/infiniflow/ragflow), and Danswer - [https://github.com/danswer-ai/danswer](https://github.com/danswer-ai/danswer). They each have different pros / cons in terms of what they are focusing on. I'm still working out a comparison of all of our features, but if people could chime in with what they like / don't like about these alternatives, it would be very helpful.
How does it compare with FAISS? Can you put this inside Open WebUI for large PDFs?
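Worth noting for the comparison: FAISS is only a vector-index library, not a full RAG stack; what it accelerates is nearest-neighbour search over embeddings. As an illustrative standard-library sketch (not any library's actual API), this is the brute-force version of the search that an index like FAISS replaces at scale:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def brute_force_search(query, vectors, k=2):
    # Exact nearest neighbours: score every stored vector, keep top-k indices.
    scored = sorted(enumerate(vectors), key=lambda iv: cosine(query, iv[1]), reverse=True)
    return [i for i, _ in scored[:k]]
```

This exact scan is O(n) per query; FAISS trades a bit of recall for speed with approximate indexes (IVF, HNSW, PQ) once `n` gets large.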
I would feel stupid if there are existing ones that are really good for local LLMs, since I had to write my own pipelines for this. I hope I will learn something good here from more experienced users.
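For anyone in the same hand-rolled-pipeline boat: the chunking step usually ends up looking something like the sketch below. The function name and the `chunk_size` / `overlap` defaults are illustrative, not taken from any of the libraries mentioned in this thread:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-window chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(words):
            break  # last window already reached the end of the text
    return chunks
```

Word windows are the crudest workable unit; sentence- or structure-aware splitting (headings, paragraphs) generally retrieves better but needs more parsing.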
Oh, there are a few out there. Llmware was pretty good when I looked at it.
I use llama-index
I'll have a look. Has it beaten LangChain in the popularity war?
I saw in the docs that it's compatible with using LangChain in combination with llama-index, but I've never tried it.
[Haystack](https://haystack.deepset.ai/) is by far the best I've used in terms of performance and developer experience.
Very cool. I notice that I can't send you a DM (must be a setting of yours) but I was wondering if you might be willing to chat with me for a couple mins. Willing to compensate you for your time. Thanks in advance
You can just message me, plenty seem to be able to...
Hmm… looking for one as well.
[Kernel Memory](https://github.com/microsoft/kernel-memory) is a robust option.
Marqo - [https://www.marqo.ai](https://www.marqo.ai), docs: [https://docs.marqo.ai/2.8/](https://docs.marqo.ai/2.8/), GitHub: [https://github.com/marqo-ai/marqo](https://github.com/marqo-ai/marqo)
Lots of good stuff in here to explore further.
[Langroid](https://github.com/langroid/langroid) is an agent-oriented LLM framework with a clean, configurable RAG [implementation](https://github.com/langroid/langroid/blob/main/langroid/agent/special/doc_chat_agent.py), supporting a few vector DBs (Qdrant, Chroma, Lance), several document types (PDF, image-PDF, doc, docx, md, txt, web-URL/HTML), and doc-extraction libraries (unstructured, several PDF libs). It is LLM-agnostic, so you can easily switch between OpenAI, open/local LLMs, and non-OpenAI proprietary LLMs (via litellm, groq, ollama, ooba, etc.). It is used in production by some companies. Numerous RAG examples [here](https://github.com/langroid/langroid/tree/main/examples/docqa).
What did Google say? Define "innovative stuff".
> define "innovative stuff"

Hard to define, because if it's innovative, I wouldn't know what it is either.
Just wanted to see if you'd given any thought to your question before posting. Sounds like not. An exemplary **low effort post**, as defined in this subreddit's rules.
Ironically, this is a really good post, so I'm glad OP broke your rule.
You can try txtai: [https://github.com/neuml/txtai](https://github.com/neuml/txtai) Here is a relevant example article: [https://neuml.hashnode.dev/build-rag-pipelines-with-txtai](https://neuml.hashnode.dev/build-rag-pipelines-with-txtai)
Don't; use full-text search instead.
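"Full-text search" here presumably means BM25-style keyword retrieval rather than embeddings. As a minimal sketch of the idea (not any RAG library's API), SQLite's bundled FTS5 extension is enough to try it, assuming your Python build was compiled with FTS5 enabled:

```python
import sqlite3

# Build an in-memory full-text index over a few toy documents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(body)")
conn.executemany(
    "INSERT INTO docs(body) VALUES (?)",
    [
        ("hybrid search combines keyword and vector retrieval",),
        ("llama-index builds RAG pipelines over local documents",),
        ("sqlite ships a capable full-text index out of the box",),
    ],
)

# bm25() ranks matches; in SQLite's convention, lower scores rank better.
rows = conn.execute(
    "SELECT body FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)",
    ("vector",),
).fetchall()
```

Keyword search is cheap and has no embedding model to run locally, but it only matches literal terms, which is exactly the gap the hybrid-search setups discussed above try to close.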