Translation: Different file formats, why?

Witch
Posts: 293
Joined: Thu Jul 24, 2008 12:30 pm
Location: Stockholm, Sweden

Post by Witch »

When looking at the "first stage setup" code, where you work with rc-files and sometimes C header files, why do the different languages use different file encodings?
When there are uncertainties I'm slowed down and not able to move forward. :cry:

I thought that the standard for all translation files would be UTF-8, but my file format check below leaves me confused and uncertain about the facts :(
:~/svn/reactos/base/setup/usetup/lang$ file sv-SE.h
sv-SE.h: ISO-8859 C program text

:~/svn/reactos/base/setup/usetup/lang$ file en-US.h
en-US.h: ASCII English text

:~/svn/reactos/base/setup/usetup/lang$ file nn-NO.h
nn-NO.h: ASCII text

:~/svn/reactos/base/setup/usetup/lang$ file ru-RU.h
ru-RU.h: Non-ISO extended-ASCII English text, with LF, NEL line terminators

:~/svn/reactos/base/setup/usetup/lang$ file it-IT.h
it-IT.h: Non-ISO extended-ASCII text, with LF, NEL line terminators

:~/svn/reactos/base/setup/usetup/lang$
EmuandCo
Developer
Posts: 4723
Joined: Sun Nov 28, 2004 7:52 pm
Location: Germany, Bavaria, Steinfeld

Re: Translation: Different file formats, why?

Post by EmuandCo »

AFAIK it has to be the DOS codepage for your language... not UTF-8.
ReactOS is still in alpha stage, meaning it is not feature-complete and is recommended only for evaluation and testing purposes.

If my post/reply offends or insults you, be sure that you know what sarcasm is...
Witch
Posts: 293
Joined: Thu Jul 24, 2008 12:30 pm
Location: Stockholm, Sweden

Re: Translation: Different file formats, why?

Post by Witch »

So I'm using Linux (English-US), gedit and a Bash terminal when I download and edit the ReactOS svn files. Do I have to convert my (Swedish) rc-files manually on Linux to be compatible with the Windows DOS standard?
I'm still confused.
EmuandCo
Developer
Posts: 4723
Joined: Sun Nov 28, 2004 7:52 pm
Location: Germany, Bavaria, Steinfeld

Re: Translation: Different file formats, why?

Post by EmuandCo »

The usetup files are NOT resource files. RC files are .rc files, not headers. They are normally UTF-8 or the region's own CPXXX codepage.
hto
Developer
Posts: 2193
Joined: Sun Oct 01, 2006 3:43 pm

Post by hto »

The sv-SE.h file uses CP-850; this is how USetup is implemented. gedit can probably work with different encodings, but if not, convert the file to UTF-8 before editing.

Something like this:

iconv -f CP850 -t UTF8 < sv-SE.h > sv-SE.txt
or this:

recode CP850/..UTF8 < sv-SE.h > sv-SE.txt
Then back to CP-850:

iconv -t CP850 -f UTF8 < sv-SE.txt > sv-SE.h
or:

recode UTF8..CP850/ < sv-SE.txt > sv-SE.h
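
Before editing, the round trip above can be checked for losslessness. This is only a minimal sketch, assuming iconv, mktemp and cmp are available; the function name roundtrip_ok is hypothetical, and CP850 is the codepage for sv-SE.h only (other languages use other codepages).

```shell
# Hypothetical helper: check that FILE survives CODEPAGE -> UTF-8 -> CODEPAGE
# without a single byte changing. Returns 0 when the round trip is lossless.
roundtrip_ok() {
    file=$1
    codepage=$2
    tmp=$(mktemp) || return 1
    # Convert to UTF-8 and straight back to the original codepage.
    if iconv -f "$codepage" -t UTF-8 < "$file" | iconv -f UTF-8 -t "$codepage" > "$tmp"; then
        # cmp -s is silent and only reports through its exit status.
        cmp -s "$file" "$tmp"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

Usage would look like: roundtrip_ok sv-SE.h CP850 && echo "safe to edit as UTF-8"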
igorko
Posts: 145
Joined: Thu Jun 18, 2009 3:12 pm

Re: Translation: Different file formats, why?

Post by igorko »

Easy explanation:

usetup/lang/xx-XX.h should be in your local OEM (DOS) codepage. Just look up your codepage on the internet and add it to gedit's encoding list.

All the rest of the files (xx-XX.rc) are resource files. They should use UTF-8 (without a BOM, if your editor supports that). .rc files can also use your local ANSI codepage, but UTF-8 is better (there will be fewer cross-platform problems with both OS and IDE), and just because UTF-8 ROCKS. :)

So if some rc files already exist for your locale and use ANSI, you can convert them to UTF-8 or leave them as they are. When converting, don't forget to move the file's #include in rsrc.rc below the #pragma code_page(65001) line.
If you want to translate new rc files, get the English version, translate it, and save the file in UTF-8 (without a BOM, if supported).

As for the "DOS/Linux" standard (maybe you didn't mean to ask this, but I will answer anyway ;)), there is also a difference in EOLs (end-of-line characters). It doesn't matter which EOL you use, so you can save files on both Windows and Linux without any additional steps.
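
If the editor gives no control over the BOM, it can be checked from the terminal. The following is a rough sketch rather than anything from the ReactOS tree: the function names has_utf8_bom and strip_utf8_bom are made up for illustration, and it assumes head -c, tail -c and od are available (they are on typical Linux systems).

```shell
# Hypothetical helper: succeed if the file starts with the UTF-8 BOM (EF BB BF).
has_utf8_bom() {
    # Hex-dump the first three bytes and compare them with the BOM.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Hypothetical helper: print the file without a leading BOM.
# Files that have no BOM are printed unchanged.
strip_utf8_bom() {
    if has_utf8_bom "$1"; then
        tail -c +4 "$1"   # start output at byte 4, skipping the 3 BOM bytes
    else
        cat "$1"
    fi
}
```

For example: has_utf8_bom sv-SE.rc && strip_utf8_bom sv-SE.rc > sv-SE.nobom.rc (the output file name is just a placeholder).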
Witch
Posts: 293
Joined: Thu Jul 24, 2008 12:30 pm
Location: Stockholm, Sweden

Re: Translation: Different file formats, why?

Post by Witch »

The most annoying thing about using gedit on Linux is that when I explicitly tell it to "save as" in UTF8 format, it doesn't do what I tell it to do. :x

I've read on the Internet from somebody who had a similar question years ago. And it seems that if the text file doesn't contain any UTF8 characters then it will automatically save my files in ASCII format even though I tell it to explicitly "save as" UTF8.

Only when the text file does contain UTF8 characters will gedit save my file in UTF8 format. That is a little bit annoying since I want consistency even when no UTF8 characters are present in my files.

Converting 100 ANSI or ASCII files to UTF8 through some Linux scripting will probably be a breeze if I want to go for that solution. But I'm just saying I don't like it when a program doesn't do what I tell it to do from the get-go. :)
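
Such a batch conversion might be sketched as below. This is an illustration under assumptions, not a tested ReactOS workflow: the function name convert_to_utf8 is hypothetical, the source encoding has to be supplied by the caller (CP1252 is assumed here to be the ANSI codepage for Swedish), and iconv and mktemp must be installed.

```shell
# Hypothetical helper: convert each listed file from SRC_ENCODING to UTF-8 in place.
convert_to_utf8() {
    src_encoding=$1
    shift
    for f in "$@"; do
        # Files that already decode as UTF-8 (this includes pure ASCII) are skipped.
        if iconv -f UTF-8 -t UTF-8 < "$f" > /dev/null 2>&1; then
            continue
        fi
        tmp=$(mktemp) || return 1
        if iconv -f "$src_encoding" -t UTF-8 < "$f" > "$tmp"; then
            cat "$tmp" > "$f"   # overwrite the contents, keep the file's permissions
        else
            echo "could not convert $f" >&2
        fi
        rm -f "$tmp"
    done
}
```

It would be called with the encoding first and the files after it, for example: convert_to_utf8 CP1252 sv-SE.rc. Running it on a copy or a clean svn checkout first makes it easy to review the result with svn diff.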
hto
Developer
Posts: 2193
Joined: Sun Oct 01, 2006 3:43 pm

Post by hto »

And it seems that if the text file doesn't contain any UTF8 characters then it will automatically save my files in ASCII format even though I tell it to explicitly "save as" UTF8.
That's how it should be. All ASCII texts are also UTF-8 texts, by virtue of the fact that ASCII is a subset of UTF-8. :)
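
The subset relationship can be seen from the terminal. A small sketch, assuming iconv, cmp, od and tr are available; the function names is_pure_ascii and unchanged_as_utf8 are hypothetical:

```shell
# Hypothetical helper: succeed if the file contains only bytes in the ASCII range.
is_pure_ascii() {
    # Delete every byte from octal 000 to 177; anything left over is non-ASCII.
    [ -z "$(LC_ALL=C tr -d '\000-\177' < "$1" | od -An -tx1 | tr -d ' \n')" ]
}

# Hypothetical helper: succeed if re-encoding the file from ASCII to UTF-8
# produces exactly the same bytes as the original file.
unchanged_as_utf8() {
    iconv -f ASCII -t UTF-8 < "$1" | cmp -s - "$1"
}
```

For a file such as en-US.h, which the file command reports as ASCII, both checks would be expected to succeed, so saving it "as UTF-8" leaves the bytes untouched.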
Witch
Posts: 293
Joined: Thu Jul 24, 2008 12:30 pm
Location: Stockholm, Sweden

Re: Translation: Different file formats, why?

Post by Witch »

I see, thanks for the confirmation!