Temperature is just a matter of randomness: raising it helps generate more variation from the same prompt, but coherence is still a problem.
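To be clear about what I mean by "just randomness": temperature only rescales the logits before sampling, it doesn't change what the model knows. This isn't Dalai's actual code, just a minimal NumPy sketch of the general idea:

```python
import numpy as np

def sample_with_temperature(logits, temperature=0.8):
    """Sample a token id from logits after temperature scaling.

    Lower temperature -> sharper distribution (more deterministic);
    higher temperature -> flatter distribution (more varied, less coherent).
    """
    scaled = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    # softmax with max-subtraction for numerical stability
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

# The same logits give far more varied picks at temp=2.0 than at temp=0.2,
# but a flatter distribution also means more low-probability (incoherent) tokens.
print(sample_with_temperature([2.0, 1.0, 0.1], temperature=0.8))
```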
Apparently I was wrong; the problem is not only quantization. It's also that the model isn't Stanford's Alpaca but another Alpaca-like model. That much I can say for sure.
I guess I found the reason. Dalai quantizes the models, which makes them incredibly fast, but the cost of that quantization is reduced coherence.
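For intuition on where that coherence loss comes from: Dalai's llama.cpp backend stores weights in a blockwise 4-bit format (q4_0), so every weight gets rounded to one of 16 levels. This sketch isn't the exact ggml format, just a simplified symmetric round-trip showing the per-weight error:

```python
import numpy as np

def quantize_4bit(weights):
    """Simplified symmetric round-to-nearest 4-bit quantization, one scale per block."""
    scale = max(np.abs(weights).max() / 7.0, 1e-12)  # map into int4 range [-7, 7]
    q = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(8).astype(np.float32)
q, s = quantize_4bit(w)
print("original:  ", w)
print("round-trip:", dequantize(q, s))  # small per-weight errors, accumulated over billions of weights
```

Each individual error is tiny, but accumulated across every layer it nudges the output distribution, which is why the quantized model drifts off-topic more easily.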