#Reading a Unicode file into a Unicode array


coral cosmos

I am trying to read a text file containing Unicode characters. Here's the input file:

yo World 💁👌🎍😍

Unfortunately, when I try to read the file and print it:

  1. The Unicode characters are garbled in the console (Windows PowerShell; the console itself supports these characters, as I can echo them).
  2. The returned array only has 2 items in it for some reason.

This is what I see as output:

yo World 💁👌🎍😍yo World 💁👌🎍😍The size of array : 2
The size of array : 2
y o

I am not sure where I am going wrong and would appreciate any help in fixing my code. I am new to C (I'm used to the usual scripting languages), and simple string handling is blowing my little brain :/

gloomy falcon BOT

When your question is answered use !solved to mark the question as resolved.

Remember to ask specific questions, provide necessary details, and reduce your question to its simplest form. For tips on how to ask a good question run !howto ask.

coral cosmos

Here is my code:


#include <stdio.h>
#include <stdlib.h>
#include <uchar.h>

char32_t *read_unicode_file(const char *filename)
{
    // Open the file for reading
    FILE *file = fopen(filename, "r");
    // Check if the file was successfully opened
    if (file == NULL)
    {
        fprintf(stderr, "Error: unable to open file '%s'\n", filename);
        return NULL;
    }
    // Read the file one character at a time
    char32_t c;
    size_t length = 0;
    while ((c = fgetwc(file)) != WEOF)
    {
        printf("%lc", c);
        length++;
    }
    // Allocate memory for the array
    char32_t *array = malloc(length * sizeof(char32_t));
    // Check if memory was successfully allocated
    if (array == NULL)
    {
        fprintf(stderr, "Error: unable to allocate memory for array\n");
        return NULL;
    }

    // Read the file again, this time storing the characters in the array
    rewind(file);
    size_t i = 0;
    while ((c = fgetwc(file)) != WEOF)
    {
        array[i++] = c;
    }

    // Close the file
    fclose(file);

    printf("The size of array : %zu\n", sizeof(array) / sizeof(array[0]));

    // Return the array
    return array;
}

int main(int argc, char *argv[])
{
    // Read the file and store its contents in an array
    char32_t *array = read_unicode_file(argv[1]);
    // puts(sizeof(array) / sizeof(array[0]));
    printf("The size of array : %zu\n", (sizeof(array) / sizeof(array[0])));

    // Check if the array was successfully returned
    if (array == NULL)
    {
        return 1;
    }

    // Print the array
    size_t length = sizeof(array) / sizeof(char32_t);
    for (size_t i = 0; i < length; i++)
    {
        printf("%lc ", array[i]);
    }
    printf("\n");

    // Free the memory
    free(array);

    return 0;
}

gloomy falcon BOT

This question thread is being automatically closed. If your question is not answered feel free to bump the post or re-ask. Take a look at !howto ask for tips on improving your question.