Subject: Re: Howto make another convertion with _identifiers_ following '#' in libcpp
Group: Gcc
From: Zack Weinberg
Date: 8 Dec 2007


Lijuan Hai wrote:
>
> I have a plan to convert UCN to alphabet instead of UTF8 in
> GCC-4.2.0, and already handled it in libcpp.

I would like to offer advice, but I don't understand what you are
trying to do. You say you want to "convert UCN[s] to [an] alphabet
instead of UTF8" but that doesn't make any sense. Alphabets are
abstract sets of glyphs commonly used to write a language. They are
not alternatives to UTF8 (a scheme for encoding integers as sequences
of bytes) or even to Unicode (a mapping from integers to glyphs).
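
To make the "encoding integers as sequences of bytes" point concrete, here is a minimal sketch of a UTF-8 encoder in C. The function name utf8_encode is an assumption made for illustration; it is not taken from libcpp. For the code point from the example in this thread, U+1234, it produces the three bytes E1 88 B4.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (hypothetical name, not a libcpp routine).
   Encodes one Unicode code point as UTF-8 into out[] and returns the
   number of bytes written (1 to 4), or 0 if cp is not a valid scalar
   value (a surrogate, or above 0x10FFFF). */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx + three continuation bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The integer 0x1234 is the Unicode side of the picture; the byte sequence is only one possible way to store it, which is why an alphabet cannot be an alternative to either.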

The only thing I can guess is that you want to convert UCNs to some
specific character set other than Unicode, like EUC-JP or ISO8859.n.
In that case the first thing I must ask you is to read up on the
-fexec-charset option, and to explain why that doesn't do what you
need it to do.
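
As a rough illustration of what -fexec-charset already does for string literals, the sketch below reports how many bytes the compiler emitted for a single UCN. The function name e_acute_length is made up for this example, and the byte counts in the comments assume GCC with iconv support for the named character set.

```c
#include <string.h>

/* Illustrative only (hypothetical function name).
   Returns the number of bytes the compiler stored for U+00E9 in the
   execution character set.

   Assumed behaviour:
     gcc file.c                           -> 2 (UTF-8 default: C3 A9)
     gcc -fexec-charset=ISO-8859-1 file.c -> 1 (single byte: E9)      */
size_t e_acute_length(void)
{
    return strlen("\u00e9");
}
```

If the target character set is one that iconv knows about, this conversion happens without any change to libcpp.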

> But I encountered a problem when compiling the code like following:
> -------------------cut-------------------
> 1: #define str(t) #t
> 2: int foo()
> 3: {
> 4: char* cc = str(\u1234);
> 5: if (!strcmp(cc, "\u1234"))
> 6: abort();
> 7: }
> -------------------cut-------------------
> With my changes, \u1234 is converted to alphabet in line 4 while
> kept in line 5. It's incorrect and also unexpected to convert it in
> line 4 for '#' makes it different from plain identifiers.

As I don't know what you mean by "converted to alphabet", I can't say
for sure, but if I had to guess, I'd say you inserted your code into
the routines for scanning identifiers? But at that point there is no
way to know that there is a '#' in effect. You need to postpone the
conversion, whatever it is, until much later; the point where cpplib
hands off identifiers to the compiler proper, or perhaps even the
assembly output macros, depending on your goal.
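
A minimal sketch of that "postpone the conversion" idea, in C. Everything here is hypothetical: the struct, the function names, and the placeholder conversion (backslash replaced by underscore) are invented for illustration and do not correspond to actual cpplib data structures or interfaces. The point is only that the lexer keeps the original spelling, '#' works from that spelling, and the conversion runs in exactly one place at hand-off time.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical token: the lexer stores the spelling exactly as written
   and performs no conversion of its own. */
struct token {
    const char *spelling;              /* e.g. the six characters \u1234 */
};

/* Stringification ('#') uses the untouched spelling. */
static void stringify(const struct token *t, char *out, size_t n)
{
    snprintf(out, n, "\"%s\"", t->spelling);
}

/* Hypothetical hand-off to the compiler proper: the only place where a
   conversion is applied. The conversion itself is a stand-in that
   rewrites each backslash as an underscore. */
static void hand_off_identifier(const struct token *t, char *out, size_t n)
{
    size_t i = 0;

    if (n == 0)
        return;
    for (; t->spelling[i] != '\0' && i + 1 < n; i++)
        out[i] = (t->spelling[i] == '\\') ? '_' : t->spelling[i];
    out[i] = '\0';
}
```

With a token whose spelling is \u1234, stringify yields "\u1234" (quotes included) while hand_off_identifier yields _u1234, so the two uses of the same token no longer interfere.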

(Have you read the long comment at the top of libcpp/charset.c? Do
you understand all of the fine distinctions made there?)

zw

