🙅: follow ggd prompt

👉: tell her she doesn’t need drugs

  • maria [she/her]@lemmy.blahaj.zone
    18 days ago

    oooh i see.

    (guess who's the owner/moderator of the lemmy qwen community hehe >v< )

    but yesssss smaller models struggle with anything not task oriented.

    ive tried the recent gemma 26B MoE and it seems to show more actual understanding, so that might be an option if u got 32GB of RAM (yes, regular RAM, it has some speed to it)

    ive found that switching from ollama to llama.cpp improved speed drastically, to the point that qwen3.6-35B-A3B now runs at 9 tok/sec instead of 4 (apparently due to multi-token prediction doing its wonders)