Yes, this is a recipe for extremely slow inference: I’m running a 2013 Mac Pro with 128 GB of RAM. I’m not optimizing for speed, I’m optimizing for aesthetics and intelligence :)

Anyway, what model would you recommend? I’m looking for something general-purpose but with solid programming skills. Ideally abliterated as well; since I’m running this locally, I might as well have all the freedoms. Thanks for the tips!

  • trave@lemmy.sdf.org (OP) · 6 days ago

    update: I tried GLM-4.5 Air and it was awesome, until I remembered how censored it is by the Chinese government. Which I guess is fine if I’m just coding, but on principle I didn’t like running a model that refuses to talk about things China doesn’t like. I tried Dolphin-Mistral-24B, which will answer anything but isn’t particularly smart.

    So I’m trying out gpt-oss-120b, which was running at an amazing 5.21 t/s, but the reasoning output was broken, and it seems the way to fix it was to switch from the llama-cpp-python wrapper to pure llama.cpp

    …which I did, and it fixed the reasoning output… but now I only get 0.61 t/s :|
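    For anyone following along, the switch looks roughly like this. This is a hedged sketch, not what OP actually ran: the model filename, quant, thread count, and context size are all assumptions you’d tune for your own machine.

    ```shell
    # Before: serving the GGUF through the llama-cpp-python wrapper
    #   pip install llama-cpp-python
    #   python -m llama_cpp.server --model gpt-oss-120b-Q4_K_M.gguf

    # After: pure llama.cpp, built from source
    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp
    cmake -B build && cmake --build build --config Release

    # -m: path to the GGUF (hypothetical filename/quant)
    # -t: CPU threads, -c: context size — both guesses for a 2013 Mac Pro
    ./build/bin/llama-server -m /path/to/gpt-oss-120b-Q4_K_M.gguf -t 12 -c 8192
    ```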

    anyway, I’m on my journey :) thanks y’all

    • Sims@lemmy.ml · 3 days ago

      ? They are just trying to protect themselves from western propaganda, just as all nations should. It’s working great. Everything you’ve heard about China comes from the western propaganda apparatus, and I doubt you have discovered how insanely polluted the western information sphere is — amongst other things, with propaganda towards enemies of the US plutocracy. All models are trained on that nonsense, and you can’t say “Hi” to a western model without being influenced by western ideological pollution/propaganda…

    • humanspiral@lemmy.ca · 6 days ago

      on principle I didn’t like running a model that will refuse to talk about things China doesn’t like.

      A good way to define a political issue is one where there are at least two sides to a narrative. You can’t use an LLM to decide which side to favour, just as you can’t really use Wikipedia for that either. It takes deep expertise and an open mind to determine which side is more likely to contain more truth.

      You may or may not seek confirmation of your political views, but media you like will confirm them more than an LLM will, and arguably the better LLM is one that avoids confirming or denying your views at all.