30B model now needs only 5.8GB of RAM? How? #638

submitted by
Style Pass
2023-04-01 23:00:05

(Edit: apologies, I should have clarified initially that I'm running on Linux. I didn't realize it might not be obvious from the screenshot alone for non-Linux users. All tests were done on Ubuntu-based Linux Mint 21.1.)

I've only been playing with the 30B model so far, since neither 7B nor 13B was very engaging. As recently as yesterday the 30B model filled close to 30GB of RAM, but today's release runs fine with less than 6GB (and that figure includes the system's own memory usage).

Initially I thought it must be a bug, but I couldn't notice any quality loss in responses, and then I saw there was some major change introduced only hours ago, but the fundamentals of those changes are a little over my head.

Maybe someone smarter than me can at least roughly explain, in basic terms (if that's even possible at all), how memory usage dropped fivefold overnight?

It actually did not. You are not seeing the actual RAM usage, because the OS now counts it as filesystem cache. Since #613 the model is a memory-mapped file. It does not do the impossible; it's just that gnome-system-monitor does not show cached files. 😄
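
To make the memory-mapping point concrete, here is a minimal Python sketch. It is not the code from #613, just an illustration of the same mechanism. It writes a throwaway file, maps it read-only, touches every page, and reads the `RssAnon` and `RssFile` counters from `/proc/self/status`. The helper names (`read_rss_fields`, `touch_mapped_file`, `demo`) and the 16 MiB size are made up for the example, and the `/proc` counters are Linux-only, so the helper returns an empty dict elsewhere.

```python
import mmap
import os
import tempfile


def read_rss_fields(status_path="/proc/self/status"):
    """Return RssAnon/RssFile in kB (Linux only); empty dict if unavailable."""
    fields = {}
    try:
        with open(status_path) as f:
            for line in f:
                key, _, rest = line.partition(":")
                if key in ("RssAnon", "RssFile"):
                    fields[key] = int(rest.split()[0])
    except OSError:
        pass  # not Linux, or /proc is not mounted
    return fields


def touch_mapped_file(path):
    """Map a file read-only, read one byte per page, and report memory counters."""
    size = os.path.getsize(path)
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        touched = 0
        for offset in range(0, size, mmap.PAGESIZE):
            _ = mm[offset]  # page fault: the kernel serves this from the page cache
            touched += 1
        # Sample the counters while the mapping is still alive.
        return touched, read_rss_fields()


def demo(size_bytes=16 * 1024 * 1024):
    """Hypothetical stand-in for a model file: create, map, touch, clean up."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x01" * size_bytes)
        path = tmp.name
    try:
        before = read_rss_fields()
        touched, after = touch_mapped_file(path)
        return before, after, touched
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after, touched = demo()
    print("pages touched:", touched)
    print("before:", before)
    print("after: ", after)
```

On Linux with the temporary file on a disk-backed filesystem, I would expect `RssFile` to grow by roughly the file size while `RssAnon` stays about the same. If `/tmp` is a tmpfs, the pages are accounted as shared memory instead. Those file-backed pages are shared with the page cache and the kernel can drop them under memory pressure, which is presumably why a monitor that hides cached files shows such a small number.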
