#review these 2 sorting utils

27 messages · Page 1 of 1 (latest)

crisp pike
#
void sort_id_value_array(int *arr, int count) {
    
    for (int *i = arr + 2 + 1; i < arr + 2 * count; i += 2) {
        
        int *j = i - 2; while (j >= arr && *j > *i) j -= 2; if (j == i - 2) continue;
        int temp[2]; j += 2;
        i--; j--;
        memcpy(&temp, i, sizeof(temp));
        memmove(j + 2, j, (i - j) * sizeof(int));
        memcpy(j, &temp, sizeof(temp));
        i++;
    }
}

void update_sorted_id_value_array(int *arr, int count, int id, int new_value) {
    
    int pos;
    {
        int left = 0, right = count;
        
        while (left < right) {
            int mid = left + (right - left) / 2;
            arr[2*mid + 1] < new_value ? (left = mid + 1) : (right = mid);
        }
        pos = left;
    }
    
    int idx = 0; while (arr[2*idx] != id) idx++;
    
    if (pos != idx) {
    
        int off = pos - idx;
    
        if (off > 0) {
            memmove(arr + 2*idx    , arr + 2*idx + 2, 2 *  off * sizeof(int));
            pos--;
        } else
            memmove(arr + 2*pos + 2, arr + 2*pos    , 2 * -off * sizeof(int));
    
        arr[2*pos] = id;
    }
    arr[2*pos + 1] = new_value;
}

Is the form good enough? Would having a struct .id .value help with clarity?

unreal kestrel
#

Hello

#

The code's current form is challenging to read and maintain due to manual pointer arithmetic and raw array manipulation. Transitioning to a struct-based approach would significantly enhance clarity and reduce errors. Here's a concise breakdown:

#

Pointer Arithmetic Complexity: The use of arr + 2*idx +1 and similar offsets is error-prone and obscures the intent, making it hard to track pairs (ID and value).

#

Maintenance Difficulty: Any change to the data structure (e.g., adding fields) requires meticulous adjustment of all pointer offsets.

#

Inefficient Lookups: The linear search for id in update_sorted_id_value_array is inefficient (O(n)), and the binary search logic is tightly coupled to the array's structure.

noble cloud
#

is this a chatgpt review?

robust yoke
#

I feel like it

robust yoke
#

In any case, @crisp pike , if you could explain what your code is supposed to be doing, I could have a crack

crisp pike
#

@robust yoke it's a selection sort for (id, value) list... And a function that can modify the value of an element in the sorted list, moving it in the correct position to keep the list sorted

#
struct id_value {
    int id;
    int value;
};

void sort_id_value_array(struct id_value *arr, int count) {
    
    for (int i = 1; i < count; i++) {
        int j = i; while (j > 0 && arr[j-1].value > arr[i].value) j--; if (j == i) continue;
        struct id_value temp = arr[i];
        memmove(&arr[j+1], &arr[j], (i - j) * sizeof(struct id_value));
        arr[j] = temp;
    }
}

void update_sorted_id_value_array(struct id_value *arr, int count, int id, int new_value) {
    
    int new_pos;
    {
        int left = 0, right = count;
        
        while (left < right) {
            int mid = left + (right - left) / 2;
            arr[mid].value < new_value ? (left = mid + 1) : (right = mid);
        }
        new_pos = left;
    }
    
    int old_pos = 0; while (arr[old_pos].id != id) old_pos++;
    
    if (new_pos != old_pos) {
        
        int off = new_pos - old_pos;
        
        if (off > 0) {
            memmove(&arr[old_pos], &arr[old_pos + 1], off * sizeof(struct id_value));
            new_pos--;
        } else {
            memmove(&arr[new_pos + 1], &arr[new_pos], -off * sizeof(struct id_value));
        }
        
        arr[new_pos].id = id;
    }
    arr[new_pos].value = new_value;
}
robust yoke
#

Firstly, a point on style: as it is written, sort_id_value_array is very hard to read and does not follow established conventions regarding indentation, line-breaks, and spacing.

void sort_id_value_array(struct id_value *arr, int count) {

    for (int i = 1; i < count; i++) {
        int j = i; while (j > 0 && arr[j-1].value > arr[i].value) j--; if (j == i) continue;
        struct id_value temp = arr[i];
        memmove(&arr[j+1], &arr[j], (i - j) * sizeof(struct id_value));
        arr[j] = temp;
    }
}

I would change it to:

void sort_id_value_array(struct id_value *arr, int count) {
    for (int i = 1; i < count; i++) {
        int j = i;

        while (j > 0 && arr[j-1].value > arr[i].value)
            j--;

        if (j == i) continue;

        struct id_value temp = arr[i];
        memmove(&arr[j+1], &arr[j], (i - j) * sizeof(struct id_value));
        arr[j] = temp;
    }
}
#

Secondly, and more importantly: this is not how selection sort is normally done, you have implemented an ad-hoc sorting algorithm which is only barely related to selection sort.

You current algorithm goes as follows:

  1. for each i in [1,len(l)]
  2. find the lowest index j, such that l[j] <= l[i]
  3. if such a `j` exists:
    
  4.   `tmp = l[i]`
    
  5.   `copying all values from [j,i] to [j+1,i+1]`, sequentially
    
  6.   `[j] = tmp`
    

This involves, in the worst case: 1 + 2 + ... + n = (n(n-1))/2 ~ n^2 loop iterations, which is expected for selection sort
Unfortunately, you also do O(i) copies per loop iteration, turning your total running time to O(n^3), which is significantly worse! This is a serious issue and essentially disqualifies the algorithm you have written from being considered a selection sort.
This is not selection sort, since you are inserting the next element of the unsorted portion of the array directly into its correct place in the sorted section, while selection sort involves selecting the least element of the unsorted portion of the array and inserting it at the end of the sorted portion.

I would recommend reading up on selection sort again and, perhaps, reading some example implementations. I have always recommended Sedgewick & Wayne's "Algorithms" as an accessible introduction to algorithms (and their analysis -- something which would have alerted you to your poor running bounds!).

#

@crisp pike

crisp pike
#

Sorry my bad I said selection, but it's indeed insertion sort.
I picked this algorithm because I expect the data I'm treating to be nearly sorted (or at least having many chuncks sorted) nonetheless since the size of the list it's not really big (couple hundred of elements max) and I know that more perfromant algorithms will bring overhead, probably it's not worth it... But I immagine I should test it

#

Btw thanks for the reading suggestion!

#

I hope this it's at least a solid insertion sort implementation😅

#

Interesting note

#

@robust yoke

robust yoke
# crisp pike <@296208468141015050>

Right, if it's an insertion sort, you still need to may want to swap elements "as you go" in this loop:

while (j > 0 && arr[j-1].value > arr[i].value)
            j--;

otherwise, you still end up with O(n^3) performance.
(Insertion sort can be done in O(n^2) time)

#

Note that, if partially sorted data is expected, then shell sort may give you benefits over insertion sort.

#

Once you have insertion sort done properly, you can generalize it to shell sort -- since shell sort is just running multiple insertion sorts "in parallel"

robust yoke
#

Apologies!!, I mis-analyzed.

#

Your code is O(n^2)

#

not O(n^3)

crisp pike
#

Oh ok it was bugging me because insertion is O(n^2), I think you multiplied memory swap complexity to comparison complexity rather than adding them(?) @robust yoke