English Amiga Board


Old 22 June 2017, 13:43   #1
chocsplease
Registered User
 
Join Date: Dec 2016
Location: london
Posts: 178
Odd problem reading a string containing non-English characters

Hi,

Part of my code reads a file and then changes the app's messages to the ones stored there. This way the user can localise the app simply by copying the prefs file, throwing it at Google Translate and then correcting any stupid and obvious mistakes. Quick and hopefully easy.

I'm testing the code using German, but it's choking on non-English characters.

The format of the lines is message=new message

Here's the code segment.
Code:
msg1=msg2=PrefsFile; /*give pointers default to the start of the prefs file*/
        
        printf("msg1=%s-\n",msg1);
        
        /*now read through prefs setting messages etc*/
        for (i=0;i<size;i++)
        {
            
            if ((i<size) && ((char)PrefsFile[i]==' ') && (msg1==&PrefsFile[i])) /*if not at end of file, and found space and at start of line*/
            {
                printf("found leading space, msg1 -%s- i=%d\n", msg1, i);
                while ((i<size) && ((char)PrefsFile[i]==' '))
                    i++; /*ignore leading spaces*/
                msg1=msg2=&PrefsFile[i]; /*update messages to be past spaces*/
            }    
            if (((char)PrefsFile[i]==';') && (msg1==&PrefsFile[i])) /*found comment and at start of a line*/
            {
                printf("FOUND COMMENT!\n");
                while ((i<size) && ((char)PrefsFile[i]!='\0'))
                    i++; /*scan to end of file or line ending null*/
            }
            
             if ((char)PrefsFile[i]=='=')
            {
                printf("prefs file equals found\n");
                /*temp change current pos to null and store the pos*/
                (char)PrefsFile[i]='\0';
                temp=strlen(msg1);
                printf("length of msg1 is %d\n",temp);
            
                /*make msg1 lowercase*/
                for (ii=0;ii<temp;ii++)
                {
                    msg1[ii]=tolower(msg1[ii]);
                }
                printf("msg1 is now -%s-\n",msg1);
                msg2=&(char)PrefsFile[i+1];    /*set replacement message to 1 char after equals*/
                printf("msg2 is now -%s-\n",msg2); /*<<< Its at this point that I find the problem*/
                err=update_message(msg1,msg2); /*change the required message or error if its not one we know*/
                (char)PrefsFile[i]='=';        /*put equals sign back*/
                if (err!=0) 
                {
                    Printf("Formatting error in prefs file - unknown message type?\n");
                    break;            /*got error so stop*/
                }    
            }
            else if ((i<size) && ((char)PrefsFile[i]=='\0')) /*at end of line so move msg pointers to new line, i will always be < size */
            {
                /*printf("updating msg pointers\n");*/
                msg1=msg2=&(char)PrefsFile[i+1];
                /*printf("updated ok!\n");*/
            }
This works until it hits a line like -

msg_baddisk = Datenträger kann nicht gelesen werden

Then msg2 gets truncated to Datentr

The same thing happens whenever there is a German-specific character in the new message. I know the PrefsFile is OK, as just before this code I scan the whole thing and print out the characters and their ASCII values.

The code works fine on lines that do not contain a language-specific character.

Could someone suggest what's going on? I'm once again baffled.

Last edited by chocsplease; 22 June 2017 at 14:00.
Old 22 June 2017, 14:44   #2
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,322
This smells like the signed char problem...
Does it still occur if you define char as unsigned?
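
A minimal sketch of what the signed char problem looks like, assuming the prefs file is ISO-8859-1 so that 'ä' is the single byte 0xE4. The helper names byte_value and lower_in_place are hypothetical and not from the code above. Where plain char is signed, that byte is held as a negative number, and passing a negative char straight to tolower() is undefined behaviour. Casting to unsigned char first avoids both issues.

```c
#include <ctype.h>

/* Hypothetical helper: return a byte as 0..255 whether or not
   the compiler treats plain char as signed. */
int byte_value(char c)
{
    return (unsigned char)c;
}

/* Hypothetical helper: lowercase a NUL-terminated string in place.
   The <ctype.h> functions require an argument that is representable
   as unsigned char (or EOF), hence the cast before calling tolower(). */
void lower_in_place(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)tolower((unsigned char)*s);
    }
}
```

Called as lower_in_place(msg1), this could stand in for the hand-written lowercase loop without depending on how the compiler treats plain char.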
Old 22 June 2017, 14:50   #3
demolition
Unregistered User
 
Join Date: Sep 2012
Location: Copenhagen / DK
Age: 43
Posts: 4,190
Is the file a true ASCII file, that is, using only 7 bits for each character?
If the prefs file was made on a PC, then it might be in some multi-byte format. You could check in a hex editor.

Edit: If it is written using ISO-8859-1, then perhaps the characters outside the lower 128 could cause problems when printed out, although they might still be correctly contained in the string array.
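
If no hex editor is at hand, the same check can be sketched in C. This assumes the file is already loaded into a buffer. The function names count_high_bytes and dump_hex are made up for illustration. In ISO-8859-1 'ä' is the single byte E4, while in UTF-8 it is the two bytes C3 A4, so the dump should make the encoding obvious.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count bytes that are not 7-bit ASCII.
   Zero means the buffer is plain ASCII. */
size_t count_high_bytes(const unsigned char *buf, size_t len)
{
    size_t i, n = 0;
    for (i = 0; i < len; i++) {
        if (buf[i] >= 0x80) {
            n++;
        }
    }
    return n;
}

/* Hypothetical helper: print each byte as two hex digits.
   Taking the buffer as unsigned char keeps values in 0..255. */
void dump_hex(const unsigned char *buf, size_t len, FILE *out)
{
    size_t i;
    for (i = 0; i < len; i++) {
        fprintf(out, "%02X ", (unsigned int)buf[i]);
    }
    fputc('\n', out);
}
```

For the line in question, one high byte per special character would point to a single-byte charset, and two would point to UTF-8.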
Old 22 June 2017, 22:34   #4
chocsplease
Registered User
 
Join Date: Dec 2016
Location: london
Posts: 178
Hi and many thanks for both your replies

Quote:
Originally Posted by meynaf
This smells like the signed char problem...
Does it still occur if you define char as unsigned?
Meynaf, you were spot on: switching from a plain char to unsigned char for PrefsFile, msg1 and msg2 has fixed the problem. I had no idea that C could store chars as -128 to +127 rather than 0 to 255.
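
To make the range concrete, here is a small sketch using <limits.h>. It assumes 8-bit chars and two's complement storage, and both function names are hypothetical. The byte for 'ä' in ISO-8859-1 is 228 (0xE4), which does not fit in a signed char, so it ends up stored as 228 - 256 = -28.

```c
#include <limits.h>

/* Hypothetical helper: non-zero if v fits in a signed char
   (-128..127 with 8-bit two's complement chars). */
int in_signed_char_range(int v)
{
    return v >= SCHAR_MIN && v <= SCHAR_MAX;
}

/* Hypothetical helper: the value an 8-bit two's complement signed char
   holds for a byte given as 0..255. */
int as_signed_8bit(int byte)
{
    return byte >= 128 ? byte - 256 : byte;
}
```

Plain ASCII characters such as 'D' come back unchanged from as_signed_8bit, which fits the observation that only the lines with special characters misbehave.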

You live and learn.
Old 23 June 2017, 07:56   #5
meynaf
son of 68k
 
Join Date: Nov 2007
Location: Lyon / France
Age: 51
Posts: 5,322
Well, when accented characters start to fail in C, it's probably either a charset problem (but you said you scanned the string before) or signed char...

C itself says nothing about signedness of char. It's up to the compiler. Some have an option for this.
That's for me one more reason to like asm - where this kind of thing never occurs
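
A quick way to see which choice a given compiler made, sketched with <limits.h>. The function name plain_char_is_signed is made up for illustration. As an example of such an option, GCC accepts -funsigned-char and -fsigned-char. Which switch applies to the compiler used in this thread is not stated here.

```c
#include <limits.h>

/* Hypothetical helper: non-zero if plain char is signed with this
   compiler and these options. CHAR_MIN is 0 when plain char is
   unsigned and negative when it is signed. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Code that casts to unsigned char wherever it needs a byte value works the same way whichever answer this gives.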

Anyway, it's a pleasure to have helped.