So DeepSeek has this very cool feature that displays what it is “thinking” before it gives you its answer. It’s quite neat in that you can see its “thought” process, but it also has the added benefit of revealing whatever bias it might have picked up from its training data.
In this case, I asked it whether we might be living through a “slow-motion World War 3,” with the Maidan coup in Ukraine as the opening shots. The mf thought that I might “buy Russian propaganda” because I called it a coup rather than a revolution.
So although DeepSeek is Chinese, it was still very clearly trained on a lot of mainstream information.
Did you ask it in Chinese? LLMs mostly learn from the sheer quantity of text in a given language. There’s a lot more propaganda
No, English, so maybe I shouldn’t be so surprised.
I’d think you could build a vector space spanning multiple languages (or use the interlingua representations that pre-LLM machine translation tools relied on). The programmers would have to design it to do that, of course, but there’s no reason the tokens for blue cat, gato azul, and 蓝猫 shouldn’t be correlated.
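The idea can be sketched with toy numbers: in a shared multilingual embedding space, translations of the same phrase should end up as nearby vectors, which you can check with cosine similarity. The vectors below are made up purely for illustration — a real multilingual encoder would learn them from data.

```python
import numpy as np

# Hand-made toy embeddings for illustration only -- a real multilingual
# model would learn these from text. The point: translations of the same
# phrase land near each other in one shared vector space, while unrelated
# phrases do not.
emb = {
    "blue cat":  np.array([0.90, 0.10, 0.40]),
    "gato azul": np.array([0.88, 0.12, 0.41]),
    "蓝猫":       np.array([0.91, 0.09, 0.38]),
    "red dog":   np.array([0.10, 0.90, 0.20]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Cross-lingual synonyms score near 1.0; an unrelated phrase scores lower.
print(round(cosine(emb["blue cat"], emb["gato azul"]), 3))
print(round(cosine(emb["blue cat"], emb["蓝猫"]), 3))
print(round(cosine(emb["blue cat"], emb["red dog"]), 3))
```

Whether the correlation actually emerges depends on the training setup: it has to either train on parallel/mixed-language data or explicitly align the spaces, which is exactly the “programmers would have to design it” caveat.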