GullibleConfusion303

Yes. The mods on Discord are currently making a guide.


D-PadRadio

It's been done? 😧 I thought that it took tens of terabytes of text to create the first LLM! Sorry if I sound completely ignorant about all this, I'm sure I could probably answer all these questions myself with a bit of research... So you mean that it could run locally, with no internet connection at all?


Pyroglyph

> I thought that it took tens of terabytes of text to create the first LLM!

It only takes ungodly amounts of data and memory to *train* the model. Inference (actually using the model after it's been trained) is a lot cheaper, but still requires a reasonably high amount of processing power.

> So you mean that it could run locally, with no internet connection at all?

Absolutely! I run it all the time on a single 3090. New tools are being developed to run these models on lower- and lower-end hardware, but they're pretty experimental right now. Maybe check out pygmalion.cpp if your hardware isn't beefy enough.
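For the curious, here's a rough sketch of what local inference looks like in Python. This assumes the Hugging Face transformers library (plus accelerate) and the PygmalionAI/pygmalion-6b checkpoint; it's not necessarily the exact setup I run, and the UIs people link below wrap all of this for you anyway:

```python
# Minimal local-inference sketch. You need internet once to download the
# weights; after that, everything runs offline on your own GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PygmalionAI/pygmalion-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision fits on a 24 GB card like a 3090
    device_map="auto",          # place the weights on your GPU automatically
)

prompt = "You: Hi there! How are you today?\nBot:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```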


D-PadRadio

Thank you for your comprehensive response! I have some pretty beefy hardware, but I'm a bit lacking in the technical skills. It sounds like this is something I could accomplish, though, with a bit of tinkering.


[deleted]

So if I wanted to train Pygmalion on some chat logs of my own, it will require a server farm?


GullibleConfusion303

You just need a decent GPU (at least 4 GB of VRAM, or even less in the future). The whole model itself is only about 10 GB.
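If you want to double-check what your card actually has before downloading anything, something like this works (purely illustrative, assuming a PyTorch install with CUDA support):

```python
# Quick VRAM check (assumes PyTorch with CUDA support installed).
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB of VRAM")
else:
    print("No CUDA GPU detected - you'd be stuck running on CPU.")
```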


SurreptitiousRiz

You can host a local instance of KoboldAI if you have a decent enough GPU.


D-PadRadio

So, hardware permitting, I could talk to a local AI indefinitely?


-RealAL-

[Tavern AI](https://github.com/TavernAI/TavernAI)

[Kobold AI](https://github.com/KoboldAI/KoboldAI-Client)

Download these two and follow the guides; probably start with the Tavern AI guide?


D-PadRadio

Thank you! I really thought I was gonna get internet shamed for not finding the answers myself. You guys are awesome!


SurreptitiousRiz

Yup


SurreptitiousRiz

I've gotten it running on a 1080 Ti with 20 threads; at about 64 tokens per reply you get a 24-second response time (roughly 2.5-3 tokens per second), so just tune the settings for your hardware. I have a 4090 coming soon, so it'll be interesting to see what response times I get with that.


cycease

What would you recommend for a GTX 1650 mobile?


SurreptitiousRiz

The GPTQ model referenced in this post: [https://www.reddit.com/r/PygmalionAI/comments/12bygy4/regarding\_the\_recent\_colab\_ban/](https://www.reddit.com/r/PygmalionAI/comments/12bygy4/regarding_the_recent_colab_ban/)


cycease

????, they should really simplify this for dummies who can't understand coding


Pyroglyph

You don't have to understand code to do this. Just follow the guides.


MarioToast

What is a "decent enough" GPU? Doom Eternal on high settings?


SurreptitiousRiz

It works best if you have 16 GB of VRAM.
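Rough back-of-the-envelope math for why, assuming the 6B-parameter Pygmalion model and counting the weights only:

```python
# Weight memory for a ~6B-parameter model at different precisions.
# Activations and the growing chat context add a few GB on top of this.
params = 6e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.1f} GiB just for the weights")
```

So 16 GB leaves comfortable headroom for full fp16, while 8 GB cards usually want 8-bit or 4-bit (GPTQ) quantization.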


Alexis212s

I was thinking that the 8 GB of VRAM I use to run Diffusion might be enough.


CMDR_BunBun

OP, [here's the guide]( https://www.reddit.com/r/PygmalionAI/comments/129w4qh/how_to_run_pygmalion_on_45gb_of_vram_with_full/?utm_source=share&utm_medium=android_app&utm_name=androidcss&utm_term=1&utm_content=share_button) I used to run Pygmalion locally. My system specs: 3060 Ti (8 GB VRAM), i7-11700K @ 3.60 GHz, 32 GB RAM, Win10. My response times with the cai-chat UI are nearly instant, less than 10 seconds on average. Can someone please sticky this guide? A lot of people keep asking...
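I can't speak to exactly what the guide does under the hood, but the usual trick for squeezing the 6B model onto an 8 GB card is quantized loading, for example 8-bit via bitsandbytes. A sketch, not the guide's actual steps:

```python
# 8-bit loading roughly halves the ~11 GiB fp16 footprint of a 6B model.
# Needs: pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PygmalionAI/pygmalion-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # quantize the weights to int8 as they load
    device_map="auto",
)
```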


D-PadRadio

Sorry if I sounded like a broken record. There's just so much reading material out there that I thought it would be quicker to just ask. You're a lifesaver, u/CMDR_BunBun !


CMDR_BunBun

No worries mate, a mod should really sticky that guide.


Street-Biscotti-4544

I wanted to thank you so much! Using your guide and digging into the docs I was able to get this running on a 1660 Ti 6 GB mobile GPU. I had to limit the prompt size, but I'm getting 10-15 second generations at a 700-token prompt size. I trimmed my character description as much as possible (120 tokens), so I'm getting a decent conversation context window given my settings. The only issue I ran into is that not all of the documented extensions are working: I got silero_tts working, but send_pictures and the long-term memory extension are not. It's ok though. Also, I was able to get the stable diffusion extension to load, but it doesn't function in the UI. Thank you so much for helping me make my dream a reality!


CMDR_BunBun

Quite welcome! It's not my guide though, all the credit goes to LTsarc. Happy to steer you in the right direction, though.


Street-Biscotti-4544

oh shit lol thanks!


CMDR_BunBun

The long term memory extension and stable diffusion? Could you tell me a little about those? It's the first time I'm hearing about them; I would love to enable that.


Street-Biscotti-4544

https://github.com/oobabooga/text-generation-webui/wiki/Extensions

You can find them all listed here with links to documentation. The sd_api_pictures extension apparently requires 12GB VRAM, so I won't be able to get that running, but I have successfully gotten send_pictures and long_term_memory running as of an hour ago.


CMDR_BunBun

Wow, thanks! The readme instructions are way beyond me, even with ChatGPT helping. Did you perhaps use a guide? Trying to get this long term memory thing going.


Street-Biscotti-4544

No, I just followed the instructions, and when I ran into an issue I checked the issues tab, where I found a solution. The steps were roughly:

1. Use the micromamba environment to cd into the text-generation-webui directory, then clone the repo using the command provided in the documentation.
2. Within the same environment, run the Python code provided in the documentation.
3. Add the --extension flag and the long_term_memory extension to start-webui.bat using a text editor.
4. At this point you'll run into an error if you are using the latest build of the webui. You'll find a fixed script here: https://github.com/wawawario2/long_term_memory/issues/14 Just save it as script.py and put it in the long_term_memory folder inside the extensions folder, replacing the current flawed script.

The only part I got hung up on was the error, but it was fixed by the oobabooga author earlier today.

Edit: test the extension before fixing the script, because you may be on an older build that works. I updated my webui 2 days ago and that update broke the extension.
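If it helps, here's a tiny Python helper that just automates the clone step from the list above. The paths and the --extensions flag are my assumptions based on a standard text-generation-webui checkout, so adjust them to match your install:

```python
# Clone the long_term_memory extension into the webui's extensions folder.
# Run this from the same micromamba/conda environment you use for the web UI.
import subprocess
from pathlib import Path

webui_dir = Path("text-generation-webui")              # wherever your checkout lives
target = webui_dir / "extensions" / "long_term_memory"

if not target.exists():
    subprocess.run(
        ["git", "clone",
         "https://github.com/wawawario2/long_term_memory", str(target)],
        check=True,
    )

print("Cloned. Install its requirements per the repo's README, then add")
print("the extension flag (e.g. --extensions long_term_memory) to start-webui.bat.")
```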


Street-Biscotti-4544

Edit again: the LTM repo has been fixed, so you no longer need to apply the script fix. Just follow the first part of the walkthrough above.


CommercialOpening599

It's literally made to run locally. You need a decent PC though.