#Cross checking every permutation and combination of set of letters with dictionary of words
57 messages · Page 1 of 1 (latest)
When your question is answered use !solved to mark the question as resolved.
Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For tips on how to ask a good question use !howto ask.
If I understand right, you have a set of letters, and a list of words, and you want to find out which words in that list can be made with your set of letters
do you have to use all the letters, or can there be leftovers?
yep
there can be leftovers
so combinations or permutations
e.g. "doog" -> dog, god, good, go, do
only one of these uses all 4, but all of them are okay
I don't think regex is the right tool for the job, if I'm right you should just be able to use a counting solution
i said this in general:
i have a file with all english words which ive now succesfully parsed
but i want to minimise the dictionary to only be suitable words
so, a simple start was only adding words that were <= amount of starting letters
but i thought i could use regex to contain it to only words that could be made from the letters
the alternative being i create every combination and permutation of the letters, then cross check the dictionary and the permutations then keeping matches only, but it seems a bit inefficient
i thought regex would be a neat solution
my problem is ive only used regex when the letters are in a straight line, not jumbledd
a counting solution?
i havent heard of this before
so basically you want a function from a set of letters, and a list of strings, and filter that list for words that can be made from those letters
You should just be able to count how many times each letter appears in each word, and compare that to your set of letters
a hashmap of occurences?
for example, doog is
d: 1
g: 1
o: 2
and dog is
d: 1
g: 1
o: 1
so you should be able to make dog from doog
a dict is an abstract data structure, a hashmap is one particular implementation that you can use to implement a dict
gotcha
i havent learnt about structures properly yet so my understanding is a bit hazy
i plan on implementing a hashtable later for the dictionary, as i think itll be a good opportunity
a hashmap could work, map from char to int, but given that you are only looking at english words i.e. characters from a-z you could just use a fixed length array
so array of 26 ints, where array[0] is the number of times a appears, array[1] is the number of times b appears and so on
makes things easier to do in c since you're just using native types
ye you will be fine without one for this task
where i do
original[0] = 'a'?
like
it corresponds to letter a?
do i need to come up with some logic to fit it into the array?
so if your set of letters was "abacaba"
your array would look like
{4, 2, 1, 0, 0 ....... 0, 0}
do i use ascii - 91 or whatever?
you'd have to write a loop to compute that array from the word
you can index the array like this array[letter - 'a']
yep, cause char is basically just a byte, and c lets you do arithmetic on it
tho if you wanted it to work with any possible character you'd have to do an array of 256 numbers
yea i was watching some vids about string manipulation and a lot of them use 256
but im only doing words, and i have something to check that the original set of letters only contains letters
so 26 should be okay
as long as you only have lowercase letters a-z then you're fine, otherwise you're going to be indexing that array out of bounds
ive made a function that turns all strings of letters into uppercase
so i think that should be okay
they want us to display things in uppercase unfortunately
easiest way to avoid UB is to just check the index is in bounds before accessing
undefined behaviour
Undefined behavior occurs when you violate rules specified by the language. For example: Reading uninitialized memory, performing out-of-bounds memory access, signed integer overflow, and race conditions.
int i; // default init,
// indeterminate value
while(i < 10) {
printf("%d\n", i++);
}
int arr[10];
for(int i = 0; i < 20; i++) {
arr[i] = i; // [i] out of
// bounds
}
Compilers are not required to provide warnings or errors about UB. Often it is undetectable at compile-time. Performing actions which are UB can render your entire program's behavior undefined, leading to anything from crashing to summoning Eldritch Abominations.
Sanitizers can help identify UB. For more info, look into address sanitizer (ASan) and undefined behavior sanitizer (UBSan). See !howto asan
ahh okay