Subject: Re: Bug 1614
From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Fri Jun 22 2001 - 22:51:26 CDT
Dom Lachowicz wrote:
>
> > Can you explain what a Lid is. I'm assuming they're language IDs
> > defined somewhere by Microsoft. Or are they internal to wv?
> > Is there any part of a Word document that tells us it's a mac document
> > and if so does wv look at this?
> >
> > Still looking...
>
> Hey Andrew,
>
> a LID is a Language identifier. It's used to mark the codepage and locale of
> different runs of text. We also use it for spell-checking purposes.
>
> For more info on the MSWord spec or format, see another project of mine:
> www.wvware.com
Thanks. I've looked through the file format doc and the wv source and
it looks like there is no special handling for Mac character sets in wv
at all. In particular it looks like wv needs to handle "chs" and
"chsTables" fields with a value of 256.
Another fuzzy area is that wv uses wvnLocaleToLIDConverter()
and wvLIDToCodePageConverter() to determine the character set.
The Macintosh files definitely have a locale of 0x4D 'M' and a
codepage of "MacRoman" or "MACINTOSH", but do they have a lid?
Since a lid seems to be an MS LCID and I can't find a Mac one on
MSDN, I'm guessing not. Would it be safe to invent a new special
value for Mac encodings?
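To make the question concrete, here is a rough sketch in C of what I have in mind. Everything in it is an assumption for illustration: the function names are made up and are not wv's API, 0xFFFF is an invented sentinel that I have not checked against the real LCID list, and the only real values are the 256 from "chs"/"chsTables" and 0x0409 (English US, normally CP1252).

```c
#include <stddef.h>

/* Value of "chs"/"chsTables" seen in the Mac files. */
#define CHS_MACINTOSH        256u
/* Invented sentinel for this sketch only; NOT a Microsoft LCID. */
#define LID_MAC_ROMAN_SKETCH 0xFFFFu

/* Hypothetical helper: choose the lid to use, overriding whatever
   the run says when the document declares the Mac character set. */
static unsigned int effective_lid(unsigned int chs, unsigned int lid)
{
    return chs == CHS_MACINTOSH ? LID_MAC_ROMAN_SKETCH : lid;
}

/* Hypothetical stand-in for a LID-to-codepage lookup.
   Returns an iconv-style charset name, or NULL if the lid is unknown. */
static const char *codepage_for_lid(unsigned int lid)
{
    switch (lid) {
    case LID_MAC_ROMAN_SKETCH: return "MACINTOSH";
    case 0x0409u:              return "CP1252";   /* English (US) */
    default:                   return NULL;
    }
}
```

With this, codepage_for_lid(effective_lid(256, lid)) gives "MACINTOSH" whatever lid the run carries, and Windows documents go through the normal lookup untouched. It only covers Mac Roman.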
The problem is likely larger than this, since the Mac, like Windows,
uses a bunch of different encodings. This discussion only concerns
the Mac Roman/Western encoding.
Let me know if I should move this discussion to the wv mailing list.
Andrew.
-- http://linguaphile.sourceforge.net
This archive was generated by hypermail 2b25 : Fri Jun 22 2001 - 22:49:43 CDT