Yes, this is a recipe for extremely slow inference: I’m running a 2013 Mac Pro with 128 GB of RAM. I’m not optimizing for speed, I’m optimizing for aesthetics and intelligence :)

Anyway, what model would you recommend? I’m looking for something general-purpose but with solid programming skills. Ideally abliterated as well; since I’m running this locally, I might as well have all the freedoms. Thanks for the tips!

  • trave@lemmy.sdf.org (OP) · 6 days ago

    update: I tried GLM-4.5 Air and it was awesome, until I remembered how censored it is by the Chinese government. Which I guess is fine if I’m just coding, but on principle I didn’t like running a model that refuses to talk about things China doesn’t like. I tried Dolphin-Mistral-24B, which will answer anything but isn’t particularly smart.

    So I’m trying out gpt-oss-120b, which was running at an amazing 5.21 t/s, but the reasoning output was broken, and it seems the way to fix it was to switch from the llama-cpp-python wrapper to pure llama.cpp

    …which I did, and it fixed the reasoning output… but now I only get 0.61 t/s :|
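    For anyone following along, the switch looks roughly like this. This is a hedged sketch, not what OP actually ran: the model filename, quant, thread count, and context size are all assumptions you’d tune for your own machine.

    ```shell
    # Before: serving the GGUF through the llama-cpp-python wrapper
    #   pip install llama-cpp-python
    #   python -m llama_cpp.server --model gpt-oss-120b-Q4_K_M.gguf

    # After: pure llama.cpp, built from source
    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp
    cmake -B build && cmake --build build --config Release

    # -m: path to the GGUF (hypothetical filename/quant)
    # -t: CPU threads, -c: context size — both guesses for a 2013 Mac Pro
    ./build/bin/llama-server -m /path/to/gpt-oss-120b-Q4_K_M.gguf -t 12 -c 8192
    ```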

    anyway, I’m on my journey :) thanks y’all

    • Sims@lemmy.ml · 3 days ago

      ? They are just trying to protect themselves from western propaganda, just as all nations should. It’s working great. Everything you’ve heard about China comes from the western propaganda apparatus, and I doubt you have discovered how insanely polluted the western information sphere is — amongst other things, with propaganda towards enemies of the US plutocracy. All models are trained on that nonsense, and you can’t say “Hi” to a western model without being influenced by western ideological pollution/propaganda…

    • humanspiral@lemmy.ca · 6 days ago

      on principle I didn’t like running a model that will refuse to talk about things China doesn’t like.

      A good way to define a political issue is one where there are at least two sides to a narrative. You can’t use an LLM to decide which side to favour, just as you can’t really use Wikipedia for that either. It takes deep expertise and an open mind to determine which side is more likely to contain more truth.

      You may or may not seek confirmation of your political views, but media you like will confirm them more than an LLM will, and arguably the better LLM is one that avoids confirming or denying your views at all.