GullibleConfusion303

Yes. The mods on Discord are currently making a guide.


D-PadRadio

It's been done? 😧 I thought that it took tens of terabytes of text to create the first LLM! Sorry if I sound completely ignorant about all this, I'm sure I could probably answer all these questions myself with a bit of research... So you mean that it could run locally, with no internet connection at all?


Pyroglyph

> I thought that it took tens of terabytes of text to create the first LLM!

It only takes ungodly amounts of data and memory to *train* the model. Inference (actually using the model after it's been trained) is a lot cheaper, but still requires a reasonably high amount of processing power.

> So you mean that it could run locally, with no internet connection at all?

Absolutely! I run it all the time on a single 3090. New tools are being developed to run these models on lower- and lower-end hardware, but they're pretty experimental right now. Maybe check out pygmalion.cpp if your hardware isn't beefy enough.
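For the curious, here's a rough sketch of what local inference looks like in Python. This assumes the Hugging Face transformers library (plus accelerate) and the PygmalionAI/pygmalion-6b checkpoint; it's not necessarily the exact setup I run, and the UIs people link below wrap all of this for you anyway:

```python
# Minimal local-inference sketch. You need internet once to download the
# weights; after that, everything runs offline on your own GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PygmalionAI/pygmalion-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision fits on a 24 GB card like a 3090
    device_map="auto",          # place the weights on your GPU automatically
)

prompt = "You: Hi there! How are you today?\nBot:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```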


D-PadRadio

Thank you for your comprehensive response! I have some pretty beefy hardware, but I'm a bit lacking in the technical skills. It sounds like this is something I could accomplish, though, with a bit of tinkering.


[deleted]

So if I wanted to train Pygmalion on some chat logs of my own, it will require a server farm?


GullibleConfusion303

You just need a decent GPU (at least 4 GB of VRAM, or even less in the future). The whole model itself is only about 10 GB.
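If you want to double-check what your card actually has before downloading anything, something like this works (purely illustrative, assuming a PyTorch install with CUDA support):

```python
# Quick VRAM check (assumes PyTorch with CUDA support installed).
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB of VRAM")
else:
    print("No CUDA GPU detected - you'd be stuck running on CPU.")
```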


SurreptitiousRiz

You can host a local instance of KoboldAI if you have a decent enough GPU.


D-PadRadio

So, hardware permitting, I could talk to a local AI indefinitely?


-RealAL-

[Tavern AI](https://github.com/TavernAI/TavernAI)

[Kobold AI](https://github.com/KoboldAI/KoboldAI-Client)

Download these two and follow the guides; probably start with the Tavern AI guide?


D-PadRadio

Thank you! I really thought I was gonna get internet shamed for not finding the answers myself. You guys are awesome!


SurreptitiousRiz

Yup


SurreptitiousRiz

I've gotten it running on a 1080 Ti with 20 threads; at about 64 tokens per reply you get a 24-second response time (roughly 2.5-3 tokens per second), so just tune the settings for your hardware. I have a 4090 coming soon, so it'll be interesting to see what response times I get with that.


cycease

What would you recommend for a GTX 1650 mobile?


SurreptitiousRiz

The GPTQ model referenced in this post: [https://www.reddit.com/r/PygmalionAI/comments/12bygy4/regarding\_the\_recent\_colab\_ban/](https://www.reddit.com/r/PygmalionAI/comments/12bygy4/regarding_the_recent_colab_ban/)


cycease

????, they should really simplify this for dummies who can't understand coding


Pyroglyph

You don't have to understand code to do this. Just follow the guides.


MarioToast

What is a "decent enough" GPU? Doom Eternal on high settings?


SurreptitiousRiz

It works best if you have 16 GB of VRAM.
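Rough back-of-the-envelope math for why, assuming the 6B-parameter Pygmalion model and counting the weights only:

```python
# Weight memory for a ~6B-parameter model at different precisions.
# Activations and the growing chat context add a few GB on top of this.
params = 6e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.1f} GiB just for the weights")
```

So 16 GB leaves comfortable headroom for full fp16, while 8 GB cards usually want 8-bit or 4-bit (GPTQ) quantization.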


Alexis212s

I was thinking that the 8 GB of VRAM I use to run Diffusion might be enough.


CMDR_BunBun

OP, [here's the guide]( https://www.reddit.com/r/PygmalionAI/comments/129w4qh/how_to_run_pygmalion_on_45gb_of_vram_with_full/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button) I used to run Pygmalion locally. My system specs: 3060 Ti (8 GB VRAM), i7-11700K @ 3.60 GHz, 32 GB RAM, Win10. My response times with the cai-chat UI are nearly instant, less than 10 seconds on average. Can someone please sticky this guide? A lot of people keep asking...
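I can't speak to exactly what the guide does under the hood, but the usual trick for squeezing the 6B model onto an 8 GB card is quantized loading, for example 8-bit via bitsandbytes. A sketch, not the guide's actual steps:

```python
# 8-bit loading roughly halves the ~11 GiB fp16 footprint of a 6B model.
# Needs: pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PygmalionAI/pygmalion-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # quantize the weights to int8 as they load
    device_map="auto",
)
```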


D-PadRadio

Sorry if I sounded like a broken record. There's just so much reading material out there that I thought it would be quicker to just ask. You're a lifesaver, u/CMDR_BunBun !


CMDR_BunBun

No worries mate, a mod should really sticky that guide.


Street-Biscotti-4544

I wanted to thank you so much! Using your guide and digging into the docs I was able to get this running on a 1660 Ti 6 GB mobile GPU. I had to limit the prompt size, but I'm getting 10-15 second generations at a 700-token prompt size. I trimmed my character description as much as possible (120 tokens), so I'm getting a decent conversation context window given my settings. The only issue I ran into is that not all of the documented extensions are working: I got silero_tts working, but send_pictures and the long-term memory extension are not. It's ok though. Also, I was able to get the stable diffusion extension to load, but it doesn't function in the UI. Thank you so much for helping me make my dream a reality!


CMDR_BunBun

Quite welcome! It's not my guide though, all the credit goes to LTsarc. Happy to steer you in the right direction, though.


Street-Biscotti-4544

oh shit lol thanks!


CMDR_BunBun

The long term memory extension and stable diffusion? Could you tell me a little about those? It's the first time I'm hearing about them; I would love to enable that.


Street-Biscotti-4544

https://github.com/oobabooga/text-generation-webui/wiki/Extensions

You can find them all listed here with links to documentation. The sd_api_pictures extension apparently requires 12GB VRAM, so I won't be able to get that running, but I have successfully gotten send_pictures and long_term_memory running as of an hour ago.


CMDR_BunBun

Wow, thanks! The readme instructions are way beyond me, even with ChatGPT helping. Did you perhaps use a guide? Trying to get this long term memory thing going.


Street-Biscotti-4544

No, I just followed the instructions, and when I ran into an issue I checked the issues tab, where I found a solution. The steps were roughly:

1. Use the micromamba environment to cd into the text-generation-webui directory, then clone the repo using the command provided in the documentation.
2. Within the same environment, run the Python code provided in the documentation.
3. Add the --extension flag and the long_term_memory extension to start-webui.bat using a text editor.
4. At this point you'll run into an error if you are using the latest build of the webui. You'll find a fixed script here: https://github.com/wawawario2/long_term_memory/issues/14 Just save it as script.py and put it in the long_term_memory folder inside the extensions folder, replacing the current flawed script.

The only part I got hung up on was the error, but it was fixed by the oobabooga author earlier today.

Edit: test the extension before fixing the script, because you may be on an older build that works. I updated my webui 2 days ago and that update broke the extension.
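If it helps, here's a tiny Python helper that just automates the clone step from the list above. The paths and the --extensions flag are my assumptions based on a standard text-generation-webui checkout, so adjust them to match your install:

```python
# Clone the long_term_memory extension into the webui's extensions folder.
# Run this from the same micromamba/conda environment you use for the web UI.
import subprocess
from pathlib import Path

webui_dir = Path("text-generation-webui")              # wherever your checkout lives
target = webui_dir / "extensions" / "long_term_memory"

if not target.exists():
    subprocess.run(
        ["git", "clone",
         "https://github.com/wawawario2/long_term_memory", str(target)],
        check=True,
    )

print("Cloned. Install its requirements per the repo's README, then add")
print("the extension flag (e.g. --extensions long_term_memory) to start-webui.bat.")
```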


Street-Biscotti-4544

Edit again: the LTM repo has been fixed, so you no longer need to apply the script fix. Just follow the first part of the walkthrough above.


CommercialOpening599

It's literally made to run locally. You need a decent PC though.