I investigated millions of tweets from the Kremlin’s ‘troll factory’ and discovered classic propaganda techniques reimagined for the social media age.

911@lemmynsfw.com · 10 months ago


the post of tom joad@sh.itjust.works · edit-2 10 months ago

It’s 3, right? Am I real? Why can’t AI guess that one?

2pt_perversion@lemmy.world · edit-2 10 months ago

Oversimplification, but partly it has to do with how LLMs split language into tokens, and some of those tokens are multi-letter. When we look for R’s, we split the word like S - T - R - A - W - B - E - R - R - Y, where each character is a token, but LLMs split it into something more like STR - AW - BERRY, which makes predicting the correct answer difficult without a lot of training on the specific problem. If you asked it to count how many times STR shows up in “strawberrystrawberrystrawberry”, it would have a better chance.
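A rough sketch of that difference, in case it helps. The three-piece vocabulary below is just the STR - AW - BERRY split from the comment, not what any real tokenizer produces, and `greedy_tokenize` is a made-up helper for illustration only.

```python
# Hypothetical vocabulary taken from the example split above.
# Real tokenizers learn their own pieces and may split "strawberry" differently.
VOCAB = ["str", "aw", "berry"]


def greedy_tokenize(text, vocab):
    """Split text using the longest matching vocab piece at each position.

    Falls back to a single character when no piece matches.
    """
    ordered = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        for piece in ordered:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # No vocab piece matched here, so emit one character.
            tokens.append(text[i])
            i += 1
    return tokens


word = "strawberry"

# Human view: every character is its own unit, so counting R's is trivial.
char_view = list(word)
r_count = char_view.count("r")

# Token view: the model's units are whole pieces, and none of them is "r".
token_view = greedy_tokenize(word, VOCAB)
r_as_token = token_view.count("r")

# Counting a whole piece lines up with the units the model actually sees.
str_count = greedy_tokenize(word * 3, VOCAB).count("str")

print(char_view, r_count)
print(token_view, r_as_token)
print(str_count)
```

In the token view there is no standalone "r" to count, which is the gist of why the letter question is awkward while the STR question is not.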

the post of tom joad@sh.itjust.works · 10 months ago

Thanks, you explained it well enough that this layman kinda gets it!

tee9000@lemmy.world · edit-2 10 months ago

LLMs look for patterns in their training data. So if you asked it 2+2=, it would look at its training and find a high likelihood that the text following 2+2= is 4. It’s not calculating; it’s finding the most likely completion of the pattern based on the data it has.

So it’s not deconstructing the word strawberry into letters and running a count… it tries to finish the pattern and fails at simple logic tasks that aren’t baked into the training data.
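A toy version of "finish the pattern", for anyone who wants to see it. This is a deliberate caricature: a real LLM is a neural network that generalizes well beyond a lookup table, and the tiny `TRAINING` list and the `train`/`complete` helpers here are invented for illustration.

```python
from collections import Counter, defaultdict

# Made-up "training data": (prompt, continuation) pairs.
# One wrong answer is included on purpose to show that frequency decides.
TRAINING = [
    ("2+2=", "4"),
    ("2+2=", "4"),
    ("2+2=", "5"),
    ("1+1=", "2"),
]


def train(pairs):
    """Count how often each continuation follows each prompt."""
    table = defaultdict(Counter)
    for prompt, continuation in pairs:
        table[prompt][continuation] += 1
    return table


def complete(table, prompt):
    """Return the most frequent continuation seen for this prompt, or None."""
    if prompt not in table:
        return None  # never seen in training, nothing to pattern-match on
    return table[prompt].most_common(1)[0][0]


model = train(TRAINING)

seen = complete(model, "2+2=")      # picked by frequency, no arithmetic done
unseen = complete(model, "17+25=")  # absent from the training data

print(seen)
print(unseen)
```

The "4" comes out because it was the most common follow-up, not because anything was added, and the unseen sum gets no answer at all.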

But a new model, OpenAI o1, checks against itself in ways I don’t fully understand and now scores like 83% on a qualifying exam for the International Mathematics Olympiad, so they are making great improvements there. (Compared to a score of like 13% from the model that can’t count the r’s in strawberry.)