UTF-8 and the standard C library

Take a look at this:

binf@home:~/codesnippets/C$ cat utf8.c
#include <stdio.h>
#include <string.h>

int main(void)
{
    printf("%s : %i\n","a",(int)strlen("a"));
    printf("%s : %i\n","æ",(int)strlen("æ"));
    return 0;
}

And compile:

binf@home:~/codesnippets/C$ gcc -o utf8 utf8.c

Output:

binf@home:~/codesnippets/C$ ./utf8
a : 1
æ : 2

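Yeah, that is Unicode at work. strlen() counts bytes, not characters, and in UTF-8 the letter æ is encoded as two bytes. If you are used to working in an ISO-8859-1 environment, where every character is exactly one byte, keep in mind that many library and system calls were written with ASCII in mind.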

One example straight to the point; on Linux, the dirent structure is defined as follows:

struct dirent {
               ino_t          d_ino;
               off_t          d_off;
               unsigned short d_reclen;
               unsigned char  d_type;
               char           d_name[256]; /* filename */
};

In a UTF-8 environment with variable-width encoding, each character uses one to four of the 256 bytes the system sets aside for a file name. One of those bytes is needed for the terminating null, so with two-byte characters like the æ in my example you are limited to about half: a file name of 127 characters.

binf@home:~/codesnippets/C$ uname -rvm
3.8.0-35-generic #52~precise1-Ubuntu SMP Thu Jan 30 17:24:40 UTC 2014 x86_64



Sonic Room demo final for Windows


After about 24 hours of fine-tuning, we had to surrender and deliver our demo.
With the file uploaded in time for the soft deadline, I could finally get some sleep.
Enex (Montréal, Canada) was six hours behind me (CET-6), so when Dran had to take a break and sleep, Enex continued. We delivered in time, but with a few flaws.


After the Gathering we all took a break from the computer, which is why we did not upload the final version until a couple of weeks ago. (Yes, that was Easter 2012.)

It’s a blog!

Now I have a blog.

I will try to post some original content, some unreleased code, some unseen pictures and videos, some old-school stuff, and hopefully posts about future interests and links to my music and projects.