Regular-Fella t1_iu6m44m wrote on October 28, 2022 at 11:08 PM

Hi All, I want to find a relatively simple ML framework best suited for the following task. Let's say I have a total of exactly 20 strings of four characters each: drta, nowm, cite, krom, etc. These strings may be combined in ways that are "correct" and in ways that are incorrect, and every combination (or "ordering") is either correct or incorrect.

My training data would consist of one thousand correct combinations and one thousand incorrect combinations, something like this:

drta, cite, krom, krom, nyan; correct
drta, cite, pace; correct
cite, cite, pace; correct
cite, cite, krom; incorrect
drta, krom, cite, nyan; incorrect
nyan; correct
nyan, cite; incorrect
cite; incorrect

And so on...

(There may be between 1 and 10 strings in each ordering.)

After training on the data, I'd like to be able to input new combinations of the strings and get an AI prediction of the likelihood that the ordering is correct (0 being definitely incorrect and 1 being definitely correct).

What do y'all think would be a good place to start? I know JavaScript and could learn some Python if necessary. I'm trying to keep it as simple as possible for now, just to get a basic model working.

Thanks for any tips!

Consistent_River_959 t1_iudwvod wrote on October 30, 2022 at 4:25 PM

I’m not really sure if this task is suited for machine learning. Using permutations and querying a dictionary will be sufficient to complete this task.
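
A minimal sketch of the dictionary idea in Python, assuming each labeled ordering is stored as a tuple of strings. The names `known` and `lookup` are made up for illustration, and the entries are just a few of the examples from the post. Note that this can only answer for orderings that are already in the dictionary; anything unseen comes back as unknown.

```python
# Labeled orderings from the post, keyed by the exact sequence of strings.
# True means "correct", False means "incorrect".
known = {
    ("drta", "cite", "krom", "krom", "nyan"): True,
    ("drta", "cite", "pace"): True,
    ("cite", "cite", "krom"): False,
    ("nyan",): True,
    ("nyan", "cite"): False,
}

def lookup(ordering):
    """Return True/False for a known ordering, or None if it was never seen."""
    return known.get(tuple(ordering))

print(lookup(["nyan"]))          # an ordering that is in the dictionary
print(lookup(["cite", "pace"]))  # an ordering that is not
```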

However, if you want to play around with an ML model, I suggest tokenization of the inputs and making a simple logistic regression model.
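
A rough sketch of what that could look like in plain Python with no libraries, assuming "tokenization" means turning each ordering into counts of single strings plus adjacent pairs (so that order matters). The feature scheme, the `<s>`/`</s>` boundary markers, and the `epochs`/`lr` values are my own assumptions, not anything specific to this task. In practice, scikit-learn's `LogisticRegression` would be the more usual starting point.

```python
import math
from collections import defaultdict

def features(ordering):
    """Tokenize an ordering into unigram and adjacent-pair (bigram) features."""
    tokens = ["<s>"] + list(ordering) + ["</s>"]  # boundary markers (assumed)
    feats = ["bias"] + [f"uni:{t}" for t in ordering]
    feats += [f"bi:{a}>{b}" for a, b in zip(tokens, tokens[1:])]
    return feats

def predict(weights, ordering):
    """Return the model's estimate, between 0 and 1, that the ordering is correct."""
    z = sum(weights[f] for f in features(ordering))
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for ordering, label in data:
            error = label - predict(weights, ordering)
            # A feature that occurs twice is updated twice, i.e. count features.
            for f in features(ordering):
                weights[f] += lr * error
    return weights

# The eight examples from the post: 1 = correct, 0 = incorrect.
data = [
    (("drta", "cite", "krom", "krom", "nyan"), 1),
    (("drta", "cite", "pace"), 1),
    (("cite", "cite", "pace"), 1),
    (("cite", "cite", "krom"), 0),
    (("drta", "krom", "cite", "nyan"), 0),
    (("nyan",), 1),
    (("nyan", "cite"), 0),
    (("cite",), 0),
]

weights = train(data)
print(predict(weights, ("drta", "pace")))  # score for an unseen ordering
```

With only a handful of examples the scores for unseen orderings don't mean much; the full 2,000 labeled orderings would go in `data` the same way.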