﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>IT博客-幽灵狼-随笔分类-Development</title><link>http://www.cnitblog.com/SpiWolf/category/1312.html</link><description>狼团队中的每一位成员，共同承担团体生存的责任， 并为此付出自己的独特技能和力量。 我是狼， 相信自己， 相信伙伴。</description><language>zh-cn</language><lastBuildDate>Wed, 28 Sep 2011 19:30:38 GMT</lastBuildDate><pubDate>Wed, 28 Sep 2011 19:30:38 GMT</pubDate><ttl>60</ttl><item><title>Writing Programs with NCURSES</title><link>http://www.cnitblog.com/SpiWolf/archive/2006/02/25/6933.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Sat, 25 Feb 2006 02:51:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2006/02/25/6933.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/6933.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2006/02/25/6933.html#Feedback</comments><slash:comments>6</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/6933.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/6933.html</trackback:ping><description><![CDATA[<A href="http://www.cs.mun.ca/~rod/ncurses/ncurses.html">http://www.cs.mun.ca/~rod/ncurses/ncurses.html</A><img src ="http://www.cnitblog.com/SpiWolf/aggbug/6933.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2006-02-25 10:51 <a href="http://www.cnitblog.com/SpiWolf/archive/2006/02/25/6933.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Linux Netlink 
Sockets</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5516.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Sat, 17 Dec 2005 09:46:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5516.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/5516.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5516.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/5516.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/5516.html</trackback:ping><description><![CDATA[<p>Netlink Sockets are the method that the Linux Kernel uses to pass
routing, interface and other miscellaneous networking information
around, both within the kernel and between the kernel and userspace. It
replaces the old <a href="http://www.wlug.org.nz/ioctl%282%29" class="wiki">ioctl(2)</a>
based method and is far superior. In fact, as soon as the kernel
receives a networking ioctl, it is converted to a netlink message before
being shipped off for further processing.</p>

<h3 id="hdr_basic_introduction">Basic Introduction</h3>

<p>The netlink protocol uses a special type of <a href="http://www.wlug.org.nz/socket%282%29" class="wiki">socket(2)</a>
to communicate with the Linux kernel. This socket is called a "Netlink
Socket", surprisingly enough, and can be created by specifying AF_NETLINK
as the first argument to a <a href="http://www.wlug.org.nz/socket%282%29" class="wiki">socket(2)</a>
call. The socket type (second argument) can be either SOCK_DGRAM or
SOCK_RAW; it makes absolutely no difference. The third argument
(netlink family) specifies which part of the Linux networking stack you
want to modify. For example, NETLINK_ROUTE can be specified to modify
the routing table (including interfaces), or NETLINK_ARPD can be
specified to allow the ARP table to be manipulated. A full list of
available netlink families is found in <a href="http://www.wlug.org.nz/netlink%287%29" class="wiki">netlink(7)</a>.</p>
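<p>As a minimal sketch of the call described above (the helper names <tt>open_netlink_socket</tt> and <tt>both_types_work</tt> are made up for illustration and are not part of any library), the following opens a NETLINK_ROUTE socket once with each socket type and reports whether both calls succeeded:</p>

```c
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Hypothetical helper: open a netlink socket of the given type and family.
 * Returns the file descriptor, or -1 on failure (errno is set by socket()). */
int open_netlink_socket(int type, int family)
{
    return socket(AF_NETLINK, type, family);
}

/* Hypothetical helper: returns 1 if both SOCK_RAW and SOCK_DGRAM give a
 * usable NETLINK_ROUTE socket, 0 otherwise. */
int both_types_work(void)
{
    int raw = open_netlink_socket(SOCK_RAW, NETLINK_ROUTE);
    int dgram = open_netlink_socket(SOCK_DGRAM, NETLINK_ROUTE);
    int ok = (raw >= 0 && dgram >= 0);

    if (raw >= 0)
        close(raw);
    if (dgram >= 0)
        close(dgram);
    return ok;
}
```

<p>Opening a NETLINK_ROUTE socket normally needs no special privileges, so this can be tried as an ordinary user.</p>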

<p>NETLINK_ROUTE is the most commonly used netlink family, as it is used
to add, delete and modify routes in the kernel's routing table and can
also be used to add, delete and modify the interfaces on the machine.</p>

<p>Some of the basic Netlink principles are documented in <a href="http://rfc.net/rfc3549.html" class="interwiki"><img src="http://www.wlug.org.nz/phpwiki/themes/default/images/interwiki.png" alt="[link]" class="linkicon" border="0">RFC:<span class="wikipage">3549</span></a>.</p>

<h3 id="hdr_programming_netlink">Programming Netlink</h3>

<p>Easy-to-read documentation on programming with netlink sockets is
somewhat lacking, but the information is all
there in the end. As a start, try the <a href="http://www.wlug.org.nz/netlink%283%29" class="wiki">netlink(3)</a>, <a href="http://www.wlug.org.nz/netlink%287%29" class="wiki">netlink(7)</a>, <a href="http://www.wlug.org.nz/rtnetlink%283%29" class="wiki">rtnetlink(3)</a> and <a href="http://www.wlug.org.nz/rtnetlink%287%29" class="wiki">rtnetlink(7)</a>
manpages, which provide a very technical description of the netlink
protocol. All the information that you need to write a program using
netlink is contained in these manpages... should be easy from here,
right?</p>

<p>The iproute2 package is the base implementation of the netlink
interface. It replaces the old Linux networking utilities
(ifconfig, route, etc.) with a single binary called ip, which performs
all of their tasks using the netlink interface. I highly recommend that
you use this package as a reference when coding netlink-related
applications. In particular, iproute2 contains a netlink library
(libnetlink) which deals with much of the low-level protocol
interaction between your application and the kernel. Unfortunately, the
library is not separately packaged, and you'll have to spend some time
extracting it from the iproute2 package before it is useful.</p>

<p>Coming Soon - Some basic examples of how to program using libnetlink -- Talk to <a href="http://www.wlug.org.nz/MattBrown" class="wiki">MattBrown</a> if you want them and they're not here yet!</p>

<p>(ha! It's been ages and you've not put up any examples!  So I've written one that shows route add/del events, see <a href="http://www.wlug.org.nz/LinuxNetlinkSocketExample" class="wiki">LinuxNetlinkSocketExample</a> --<a href="http://www.wlug.org.nz/PerryLorier" class="wiki">PerryLorier</a>).</p>

<h3 id="hdr_applications_known_to_use_netlink_sockets">Applications Known to Use Netlink Sockets</h3>

<ul>
<li><a href="http://www.wlug.org.nz/Quagga" class="wiki">Quagga</a></li><li>/sbin/ip (<a href="http://www.wlug.org.nz/IpRoute" class="wiki">IpRoute</a>2 package)</li>
</ul>

<h3 id="hdr_random_notes_things_i_wish_were_documented_somewhere_but_aren_t_">Random notes (things I wish were documented somewhere but aren't)</h3>

<ul>
<li>If you want to receive RTM_NEWNEIGH messages, you need <tt>/proc/sys/net/ipv{4,6}/neigh/*/app_probes</tt> to be nonzero.</li>
</ul>

<p>I don't know why.  They might have been drunk at the time -- <a href="http://www.wlug.org.nz/PerryLorier" class="wiki">PerryLorier</a><br>
The reason is that many of the system parameters are moving this
way, and I suspect they were just too lazy to convert the other ones too -- <a href="http://www.wlug.org.nz/IanMcDonald" class="wiki">IanMcDonald</a><br>
</p>
<p>URL for this article: <a href="http://www.wlug.org.nz/LinuxNetlinkSockets">http://www.wlug.org.nz/LinuxNetlinkSockets</a><br>
</p>
<img src ="http://www.cnitblog.com/SpiWolf/aggbug/5516.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-12-17 17:46 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5516.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Linux Netlink Socket Example</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5515.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Sat, 17 Dec 2005 09:37:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5515.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/5515.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5515.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/5515.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/5515.html</trackback:ping><description><![CDATA[<div id="wikicontent">
            <div class="wikitext"><div><p>This
is a sample program that uses a netlink socket to listen to route
change events and prints out some rudimentary information about them.
It's very simple and boring, but hopefully useful.</p>
<p>This being a wiki, I also expect everyone to hack on this code and
make it nicer, this is pretty hideous, but I want to get on with my
real program now. So if you're reading this page your mission (if you
choose to accept it) is to clean up the below code a little bit
(doesn't need to be much).</p>
<p>See <a href="http://www.wlug.org.nz/LinuxNetlinkSockets" class="wiki">LinuxNetlinkSockets</a></p>
<hr><pre>#include &lt;asm/types.h&gt;<br><br>#include &lt;sys/socket.h&gt;<br>#include &lt;unistd.h&gt;<br>#include &lt;err.h&gt;<br>#include &lt;stdio.h&gt;<br>#include &lt;string.h&gt;   /* memset */<br>#include &lt;netinet/in.h&gt;<br><br>#include &lt;linux/netlink.h&gt;<br>#include &lt;linux/rtnetlink.h&gt;<br><br>#if 0<br>// neighbour (ARP) events: the kernel must be compiled with CONFIG_ARPD, and the<br>// protocol is still NETLINK_ROUTE (not NETLINK_ARPD), since the kernel arp code<br>// {re,ab}uses rtnl (NETLINK_ROUTE)<br>#define MYPROTO NETLINK_ROUTE<br>#define MYMGRP RTMGRP_NEIGH<br><br>#else<br>#define MYPROTO NETLINK_ROUTE<br>#define MYMGRP RTMGRP_IPV4_ROUTE<br>#endif<br><br>struct msgnames_t {<br>        int id;<br>        const char *msg;<br>} typenames[] = {<br>#define MSG(x) { x, #x }<br>        MSG(RTM_NEWROUTE),<br>        MSG(RTM_DELROUTE),<br>        MSG(RTM_GETROUTE),<br>#undef MSG<br>        {0,0}<br>};<br><br>/* Map a message type to its symbolic name, or to "#id" if it is unknown. */<br>const char *lookup_name(struct msgnames_t *db,int id)<br>{<br>        static char name[512];<br>        struct msgnames_t *msgnamesiter;<br>        for(msgnamesiter=db;msgnamesiter-&gt;msg;++msgnamesiter) {<br>                if (msgnamesiter-&gt;id == id)<br>                        return msgnamesiter-&gt;msg;<br>        }<br>        snprintf(name,sizeof(name),"#%i",id);<br>        return name;<br>}<br><br>/* Open a netlink socket subscribed to the MYMGRP multicast group. */<br>int open_netlink(void)<br>{<br>        int sock = socket(AF_NETLINK,SOCK_RAW,MYPROTO);<br>        struct sockaddr_nl addr;<br><br>        if (sock&lt;0)<br>                return sock;<br>        memset(&amp;addr, 0, sizeof(addr));<br>        addr.nl_family = AF_NETLINK;<br>        addr.nl_pid = getpid();<br>        addr.nl_groups = MYMGRP;<br>        if (bind(sock,(struct sockaddr *)&amp;addr,sizeof(addr))&lt;0) {<br>                close(sock);<br>                return -1;<br>        }<br>        return sock;<br>}<br><br>/* Read one event and print the fields of its netlink header. */<br>int read_event(int sock)<br>{<br>        struct sockaddr_nl nladdr;<br>        struct msghdr msg;<br>        struct iovec iov[2];<br>        struct nlmsghdr nlh;<br>        char buffer[65536];<br>        int ret;<br>        iov[0].iov_base = (void *)&amp;nlh;<br>        iov[0].iov_len = sizeof(nlh);<br>        iov[1].iov_base = (void *)buffer;<br>        iov[1].iov_len = sizeof(buffer);<br>        memset(&amp;msg, 0, sizeof(msg)); /* no ancillary data, flags cleared */<br>        msg.msg_name = (void *)&amp;(nladdr);<br>        msg.msg_namelen = sizeof(nladdr);<br>        msg.msg_iov = iov;<br>        msg.msg_iovlen = sizeof(iov)/sizeof(iov[0]);<br>        ret=recvmsg(sock, &amp;msg, 0);<br>        if (ret&lt;0) {<br>                return ret;<br>        }<br>        if ((size_t)ret &lt; sizeof(nlh)) {<br>                return -1; /* too short to hold a netlink header */<br>        }<br>        printf("Type: %i (%s)\n",(nlh.nlmsg_type),lookup_name(typenames,nlh.nlmsg_type));<br>        printf("Flag:");<br>        /* the flags live in nlmsg_flags, not nlmsg_type */<br>#define FLAG(x) if (nlh.nlmsg_flags &amp; x) printf(" %s",#x)<br>        FLAG(NLM_F_REQUEST);<br>        FLAG(NLM_F_MULTI);<br>        FLAG(NLM_F_ACK);<br>        FLAG(NLM_F_ECHO);<br>        FLAG(NLM_F_REPLACE);<br>        FLAG(NLM_F_EXCL);<br>        FLAG(NLM_F_CREATE);<br>        FLAG(NLM_F_APPEND);<br>#undef FLAG<br>        printf("\n");<br>        printf("Seq : %u\n",nlh.nlmsg_seq);<br>        printf("Pid : %u\n",nlh.nlmsg_pid);<br>        printf("\n");<br>        return 0;<br>}<br><br>int main(int argc, char *argv[])<br>{<br>        int nls = open_netlink();<br>        if (nls&lt;0) {<br>                err(1,"netlink");<br>        }<br>        /* keep printing events until a read fails */<br>        while (read_event(nls) == 0)<br>                ;<br>        return 0;<br>}</pre>
</div>
</div>
</div>
<img src ="http://www.cnitblog.com/SpiWolf/aggbug/5515.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-12-17 17:37 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5515.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Netlink Socket for Linux</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5514.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Sat, 17 Dec 2005 09:31:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5514.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/5514.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5514.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/5514.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/5514.html</trackback:ping><description><![CDATA[<h1 class="title">Kernel Korner - Why and How to Use Netlink Socket</h1>

           
         

<!-- begin content -->

  
     
     
    <span class="submitted">By <a href="http://www.linuxjournal.com/user/801239" title="View user profile.">Kevin He</a> on Wed, 2005-01-05 02:00.</span>
    <span class="taxonomy"><a href="http://www.linuxjournal.com/taxonomy/term/8">SysAdmin</a></span>

    
Use this bidirectional, versatile method to pass data between kernel and user space.

<div class="article" lang="en">

<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x8573a08"></a></h2>
</div>
<p>Due to the complexity of developing and maintaining the kernel, only the most essential and
performance-critical code is placed in the kernel. Other things, such as GUI, management and control code,
typically are programmed as user-space applications. This practice of splitting the implementation of certain
features between kernel and user space is quite common in Linux. Now the question is how can kernel code and
user-space code communicate with each other?</p>
<p>The answer is the various IPC methods that exist between kernel and user space, such as system calls,
ioctl, the proc filesystem or netlink sockets. This article discusses the netlink socket and reveals its advantages as
a network feature-friendly IPC.</p>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x8573ab8"></a> Introduction</h2>
</div>
<p>Netlink socket is a special IPC used for transferring information between kernel and user-space processes.
It provides a full-duplex communication link between the two by way of standard socket APIs for user-space
processes and a special kernel API for kernel modules. Netlink socket uses the address family AF_NETLINK, as
compared to AF_INET used by TCP/IP socket. Each netlink socket feature defines its own protocol type in the
kernel header file include/linux/netlink.h.</p>
<p>The following is a subset of features and their protocol types currently supported by the netlink
socket:</p>
<div class="itemizedlist">
<ul type="disc"><li>
<p>NETLINK_ROUTE: communication channel between user-space routing dæmons, such as BGP, OSPF, RIP and
kernel packet forwarding module. User-space routing dæmons update the kernel routing table through this
netlink protocol type.</p>
</li><li>
<p>NETLINK_FIREWALL: receives packets sent by the IPv4 firewall code.</p>
</li><li>
<p>NETLINK_NFLOG: communication channel for the user-space iptable management tool and kernel-space Netfilter
module.</p>
</li><li>
<p>NETLINK_ARPD: for managing the arp table from user space.</p>
</li></ul>
</div>
<p>Why do the above features use netlink instead of system calls, ioctls or proc filesystems for
communication between user and kernel worlds? It is a nontrivial task to add system calls, ioctls or proc
files for new features; we risk polluting the kernel and damaging the stability of the system. Netlink socket
is simple, though: only a constant, the protocol type, needs to be added to netlink.h. Then, the kernel
module and application can talk using socket-style APIs immediately.</p>
<p>Netlink is asynchronous because, as with any other socket API, it provides a socket queue to smooth the
burst of messages. The system call for sending a netlink message queues the message to the receiver's netlink
queue and then invokes the receiver's reception handler. The receiver, within the reception handler's
context, can decide whether to process the message immediately or leave the message in the queue and process
it later in a different context. Unlike netlink, system calls require synchronous processing. Therefore, if
we use a system call to pass a message from user space to the kernel, the kernel scheduling granularity may
be affected if the time to process that message is long.</p>
<p>The code implementing a system call in the kernel is linked statically to the kernel at compilation time;
thus, it is not appropriate to include system call code in a loadable module, which is the case for most
device drivers. With netlink socket, no compile-time dependency exists between the netlink core of the Linux
kernel and the netlink application living in loadable kernel modules.</p>
<p>Netlink socket supports multicast, which is another benefit over system calls, ioctls and proc. One
process can multicast a message to a netlink group address, and any number of other processes can listen to
that group address. This provides a near-perfect mechanism for event distribution from kernel to user
space.</p>
<p>System call and ioctl are simplex IPCs in the sense that a session for these IPCs can be initiated only by
user-space applications. But, what if a kernel module has an urgent message for a user-space application?
There is no way of doing that directly using these IPCs. Normally, applications periodically need to poll the
kernel to get the state changes, although intensive polling is expensive. Netlink solves this problem
gracefully by allowing the kernel to initiate sessions too. We call it the duplex characteristic of the
netlink socket.</p>
<p>Finally, netlink socket provides a BSD socket-style API that is well understood by the software
development community. Therefore, training costs are less as compared to using the rather cryptic system call
APIs and ioctls.</p>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x8573dfc"></a> Relating to the BSD Routing Socket</h2>
</div>
<p>In the BSD TCP/IP stack implementation, there is a special socket called the routing socket. It has an address
family of AF_ROUTE, a protocol family of PF_ROUTE and a socket type of SOCK_RAW. The routing socket in BSD is
used by processes to add or delete routes in the kernel routing table.</p>
<p>In Linux, the equivalent function of the routing socket is provided by the netlink socket protocol type
NETLINK_ROUTE. Netlink socket provides a functionality superset of BSD's routing socket.</p>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x8573eac"></a> Netlink Socket APIs</h2>
</div>
<p>The standard socket APIs (socket(), sendmsg(), recvmsg() and close()) can be used by user-space applications
to access netlink socket. Consult the man pages for detailed definitions of these APIs. Here, we discuss how
to choose parameters for these APIs only in the context of netlink socket. The APIs should be familiar to
anyone who has written an ordinary network application using TCP/IP sockets.</p>
<p>To create a socket with socket(), enter:</p>
<pre class="programlisting"><tt>int socket(int domain, int type, int protocol)<br></tt>
</pre>
<br>
<br>
<p>The socket domain (address family) is AF_NETLINK, and the type of socket is either SOCK_RAW or SOCK_DGRAM,
because netlink is a datagram-oriented service.</p>
<p>The protocol (protocol type) selects for which netlink feature the socket is used. The following are some
predefined netlink protocol types: NETLINK_ROUTE, NETLINK_FIREWALL, NETLINK_ARPD, NETLINK_ROUTE6 and
NETLINK_IP6_FW. You also can add your own netlink protocol type easily.</p>
<p>Up to 32 multicast groups can be defined for each netlink protocol type. Each multicast group is
represented by a bit mask, 1&lt;&lt;i, where 0&lt;=i&lt;=31. This is extremely useful when a group of
processes and the kernel process coordinate to implement the same feature: sending multicast netlink messages
can reduce the number of system calls used and relieve applications of the burden of maintaining the
multicast group membership.</p>
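<p>The bit-mask arithmetic can be sketched as follows. The helper names <tt>nl_group_mask</tt> and <tt>nl_groups_for</tt> are hypothetical and exist only for this illustration; the result is the kind of value that goes into the nl_groups field of the netlink address.</p>

```c
#include <stdint.h>

/* Hypothetical helper: bit mask for multicast group i, where 0 <= i <= 31. */
uint32_t nl_group_mask(unsigned int i)
{
    return (uint32_t)1 << i;
}

/* Hypothetical helper: OR together the masks of several groups to form a
 * single group mask. */
uint32_t nl_groups_for(const unsigned int *groups, int count)
{
    uint32_t mask = 0;
    int k;

    for (k = 0; k < count; k++)
        mask |= nl_group_mask(groups[k]);
    return mask;
}
```

<p>For example, subscribing to groups 0 and 6 gives the mask 0x41.</p>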
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x857400c"></a> bind()</h2>
</div>
<p>As for a TCP/IP socket, the netlink bind() API associates a local (source) socket address with the opened
socket. The netlink address structure is as follows:</p>
<pre class="programlisting"><tt>struct sockaddr_nl<br>{<br>  sa_family_t    nl_family;  /* AF_NETLINK   */<br>  unsigned short nl_pad;     /* zero         */<br>  __u32          nl_pid;     /* process pid */<br>  __u32          nl_groups;  /* mcast groups mask */<br>} nladdr;<br></tt>
</pre>
<br>
<br>
<p>When used with bind(), the nl_pid field of the sockaddr_nl can be filled with the calling process' own
pid. The nl_pid serves here as the local address of this netlink socket. The application is responsible for
picking a unique 32-bit integer to fill in nl_pid:</p>
<pre class="programlisting"><tt>NL_PID Formula 1:  nl_pid = getpid();<br></tt>
</pre>
<br>
<br>
<p>Formula 1 uses the process ID of the application as nl_pid, which is a natural choice if, for the given
netlink protocol type, only one netlink socket is needed for the process.</p>
<p>In scenarios where different threads of the same process want to have different netlink sockets opened
under the same netlink protocol, Formula 2 can be used to generate the nl_pid:</p>
<pre class="programlisting"><tt><br>NL_PID Formula 2:  nl_pid = pthread_self() &lt;&lt; 16 | getpid();<br></tt>
</pre>
<br>
<br>
<p>In this way, different pthreads of the same process each can have their own netlink socket for the same
netlink protocol type. In fact, even within a single pthread it's possible to create multiple netlink sockets
for the same protocol type. Developers need to be more creative, however, in generating a unique nl_pid, and
we don't consider this to be a normal-use case.</p>
<p>If the application wants to receive netlink messages of the protocol type that are destined for certain
multicast groups, the bitmasks of all the interested multicast groups should be ORed together to form the
nl_groups field of sockaddr_nl. Otherwise, nl_groups should be zeroed out so the application receives only
the unicast netlink message of the protocol type destined for the application. After filling in the nladdr,
do the bind as follows:</p>
<pre class="programlisting"><tt><br>bind(fd, (struct sockaddr*)&amp;nladdr, sizeof(nladdr));<br></tt>
</pre>
<br>
<br>
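<p>The bind step can be sketched as below. This sketch does not use either formula above: it relies on the behaviour described in netlink(7), where an nl_pid of 0 passed to bind() asks the kernel to assign a unique value, and then reads the assigned address back with getsockname(). The function name <tt>bind_netlink</tt> is an assumption made for this example.</p>

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: open a NETLINK_ROUTE socket, bind it with nl_pid = 0
 * and the given multicast group mask, and store the address the kernel
 * assigned in *out. Returns the socket fd, or -1 on error. */
int bind_netlink(unsigned int groups, struct sockaddr_nl *out)
{
    struct sockaddr_nl nladdr;
    socklen_t len = sizeof(*out);
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;

    memset(&nladdr, 0, sizeof(nladdr));
    nladdr.nl_family = AF_NETLINK;
    nladdr.nl_pid = 0;          /* let the kernel choose a unique id */
    nladdr.nl_groups = groups;  /* 0 means unicast messages only */

    if (bind(fd, (struct sockaddr *)&nladdr, sizeof(nladdr)) < 0 ||
        getsockname(fd, (struct sockaddr *)out, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

<p>Calling this twice in the same process should yield two sockets with different nl_pid values, which is one way around having to invent a unique number per thread.</p>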
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x857421c"></a> Sending a Netlink Message</h2>
</div>
<p>In order to send a netlink message to the kernel or other user-space processes, another struct sockaddr_nl
nladdr needs to be supplied as the destination address, the same as sending a UDP packet with sendmsg(). If
the message is destined for the kernel, both nl_pid and nl_groups should be supplied with 0.</p>
<p>If the message is a unicast message destined for another process, nl_pid is the other process's pid and
nl_groups is 0, assuming NL_PID Formula 1 is used in the system.</p>
<p>If the message is a multicast message destined for one or multiple multicast groups, the bitmasks of all
the destination multicast groups should be ORed together to form the nl_groups field. We then can supply the
netlink address to the struct msghdr msg for the sendmsg() API, as follows:</p>
<pre class="programlisting"><tt><br>struct msghdr msg;<br>msg.msg_name = (void *)&amp;(nladdr);<br>msg.msg_namelen = sizeof(nladdr);<br></tt>
</pre>
<br>
<br>
<p>The netlink socket requires its own message header as well. This is for providing a common ground for
netlink messages of all protocol types.</p>
<p>Because the Linux kernel netlink core assumes the existence of the following header in each netlink
message, an application must supply this header in each netlink message it sends:</p>
<pre class="programlisting"><tt><br>struct nlmsghdr<br>{<br>  __u32 nlmsg_len;   /* Length of message */<br>  __u16 nlmsg_type;  /* Message type*/<br>  __u16 nlmsg_flags; /* Additional flags */<br>  __u32 nlmsg_seq;   /* Sequence number */<br>  __u32 nlmsg_pid;   /* Sending process PID */<br>};<br></tt>
</pre>
<br>
<br>
<p>nlmsg_len must be set to the total length of the netlink message, including the header, and is
required by netlink core. nlmsg_type can be used by applications and is an opaque value to netlink core.
nlmsg_flags is used to give additional control to a message; it is read and updated by netlink core.
nlmsg_seq and nlmsg_pid are used by applications to track the message, and they are opaque to netlink core as
well.</p>
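<p>A small sketch of filling in the header, using the NLMSG_LENGTH, NLMSG_SPACE and NLMSG_DATA macros from linux/netlink.h. The function name <tt>build_nlmsg</tt>, the fixed sequence number and the choice of getpid() for nlmsg_pid are assumptions for illustration; the caller is expected to pass a buffer that is suitably aligned for a struct nlmsghdr.</p>

```c
#include <string.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Hypothetical helper: write a netlink header followed by the payload into
 * buf. Returns the value stored in nlmsg_len (header plus payload), or 0 if
 * buf is too small to hold the aligned message. */
unsigned int build_nlmsg(void *buf, size_t buflen, unsigned short type,
                         const void *payload, size_t payload_len)
{
    struct nlmsghdr *nlh = buf;

    if (buflen < NLMSG_SPACE(payload_len))
        return 0;

    memset(buf, 0, NLMSG_SPACE(payload_len));
    nlh->nlmsg_len = NLMSG_LENGTH(payload_len); /* header + payload */
    nlh->nlmsg_type = type;
    nlh->nlmsg_flags = 0;
    nlh->nlmsg_seq = 1;                         /* arbitrary for this sketch */
    nlh->nlmsg_pid = getpid();
    memcpy(NLMSG_DATA(nlh), payload, payload_len);
    return nlh->nlmsg_len;
}
```

<p>With a 5-byte payload, nlmsg_len comes out as 21 (a 16-byte header plus 5 bytes), while the buffer space the message occupies is rounded up to 24 bytes by NLMSG_SPACE.</p>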
<p>A netlink message thus consists of nlmsghdr and the message payload. Once the message has been composed in
a buffer pointed to by the nlh pointer, we attach that buffer to the struct msghdr msg through an iovec:</p>
<pre class="programlisting"><tt><br>struct iovec iov;<br>iov.iov_base = (void *)nlh;<br>iov.iov_len = nlh-&gt;nlmsg_len;<br>msg.msg_iov = &amp;iov;<br>msg.msg_iovlen = 1;<br></tt>
</pre>
<br>
<br>
<p>After the above steps, a call to sendmsg() kicks out the netlink message:</p>
<pre class="programlisting"><tt><br>sendmsg(fd, &amp;msg, 0);<br></tt>
</pre>
<br>
<br>
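<p>Putting the sending steps together, here is a hedged sketch that sends one request to the kernel. It is not taken from the article: it assumes a NETLINK_ROUTE socket and uses an RTM_GETLINK dump request (from linux/rtnetlink.h) simply because the kernel accepts it without special privileges. The function name <tt>send_getlink_request</tt> is made up for this example.</p>

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: send an RTM_GETLINK dump request to the kernel on an
 * open NETLINK_ROUTE socket. Returns the number of bytes sent, or -1. */
ssize_t send_getlink_request(int fd)
{
    struct {
        struct nlmsghdr nlh;
        struct ifinfomsg ifi;
    } req;
    struct sockaddr_nl kernel;
    struct iovec iov;
    struct msghdr msg;

    /* netlink header followed by the request payload */
    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req.ifi));
    req.nlh.nlmsg_type = RTM_GETLINK;
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.nlh.nlmsg_seq = 1;
    req.ifi.ifi_family = AF_UNSPEC;

    /* destination: nl_pid = 0 and nl_groups = 0 address the kernel */
    memset(&kernel, 0, sizeof(kernel));
    kernel.nl_family = AF_NETLINK;

    iov.iov_base = &req;
    iov.iov_len = req.nlh.nlmsg_len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &kernel;
    msg.msg_namelen = sizeof(kernel);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    return sendmsg(fd, &msg, 0);
}
```

<p>On success the return value equals nlmsg_len; the kernel's reply can then be read from the same socket with recvmsg().</p>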
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x8574484"></a> Receiving Netlink Messages</h2>
</div>
<p>A receiving application needs to allocate a buffer large enough to hold netlink message headers and
message payloads. It then fills the struct msghdr msg as shown below and uses the standard recvmsg() to
receive the netlink message, assuming the buffer is pointed to by nlh:</p>
<pre class="programlisting"><tt><br>struct sockaddr_nl nladdr;<br>struct msghdr msg;<br>struct iovec iov;<br>iov.iov_base = (void *)nlh;<br>iov.iov_len = MAX_NL_MSG_LEN;<br>msg.msg_name = (void *)&amp;(nladdr);<br>msg.msg_namelen = sizeof(nladdr);<br>msg.msg_iov = &amp;iov;<br>msg.msg_iovlen = 1;<br>recvmsg(fd, &amp;msg, 0);<br></tt>
</pre>
<br>
<br>
<p>After the message has been received correctly, nlh points to the header of the just-received
netlink message. nladdr holds the address recorded for the received message, which consists of the
pid of the sender (0 when the message comes from the kernel) and the multicast group to which the message was sent. The macro NLMSG_DATA(nlh), defined in
netlink.h, returns a pointer to the payload of the netlink message. A call to close(fd) closes the netlink
socket identified by file descriptor fd.</p>
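<p>A single recvmsg() may return several netlink messages back to back, so the buffer is usually walked with the NLMSG_OK and NLMSG_NEXT macros from linux/netlink.h. The sketch below only counts the messages; <tt>count_nlmsgs</tt> is a hypothetical name, and a real program would inspect NLMSG_DATA(nlh) inside the loop.</p>

```c
#include <linux/netlink.h>

/* Hypothetical helper: walk a receive buffer holding len bytes and count the
 * netlink messages in it. Stops at NLMSG_DONE or at the first header that
 * does not fit in the remaining bytes. */
int count_nlmsgs(void *buf, int len)
{
    int count = 0;
    struct nlmsghdr *nlh;

    for (nlh = (struct nlmsghdr *)buf; NLMSG_OK(nlh, len);
         nlh = NLMSG_NEXT(nlh, len)) {
        if (nlh->nlmsg_type == NLMSG_DONE)
            break;
        /* NLMSG_DATA(nlh) points at this message's payload. */
        count++;
    }
    return count;
}
```

<p>The len argument would be the byte count returned by recvmsg(); NLMSG_NEXT decrements it as the loop advances.</p>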
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x8574560"></a> Kernel-Space Netlink APIs</h2>
</div>
<p>The kernel-space netlink API is supported by the netlink core in the kernel, net/netlink/af_netlink.c. From
the kernel side, the API is different from the user-space API. The API can be used by kernel modules to
access the netlink socket and to communicate with user-space applications. Unless you leverage the existing
netlink socket protocol types, you need to add your own protocol type by adding a constant to netlink.h. For
example, we can add a netlink protocol type for testing purposes by inserting this line into netlink.h:</p>
<pre class="programlisting"><tt>#define NETLINK_TEST  17<br></tt>
</pre>
<br>
<br>
<p>Afterward, you can reference the added protocol type anywhere in the Linux kernel.</p>
<p>In user space, we call socket() to create a netlink socket, but in kernel space, we call the following
API:</p>
<pre class="programlisting"><tt><br>struct sock *<br>netlink_kernel_create(int unit, <br>           void (*input)(struct sock *sk, int len));<br></tt>
</pre>
<br>
<br>
<p>The parameter unit is, in fact, the netlink protocol type, such as NETLINK_TEST. The function pointer,
input, is a callback function invoked when a message arrives at this netlink socket.</p>
<p>After the kernel has created a netlink socket for protocol NETLINK_TEST, whenever user space sends a
netlink message of the NETLINK_TEST protocol type to the kernel, the callback function, input(), which is
registered by netlink_kernel_create(), is invoked. The following is an example implementation of the callback
function input:</p>
<pre class="programlisting"><tt><br>void input (struct sock *sk, int len)<br>{<br> struct sk_buff *skb;<br> struct nlmsghdr *nlh = NULL;<br> u8 *payload = NULL;<br> while ((skb = skb_dequeue(&amp;sk-&gt;receive_queue)) <br>       != NULL) {<br> /* process netlink message pointed by skb-&gt;data */<br> nlh = (struct nlmsghdr *)skb-&gt;data;<br> payload = NLMSG_DATA(nlh);<br> /* process netlink message with header pointed by <br>  * nlh and payload pointed by payload<br>  */<br> }   <br>}<br></tt>
</pre>
<br>
<br>
<p>This input() function is called in the context of the sendmsg() system call invoked by the sending
process. It is okay to process the netlink message inside input() if it's fast. When the processing of
netlink message takes a long time, however, we want to keep it out of input() to avoid blocking other system
calls from entering the kernel. Instead, we can use a dedicated kernel thread to perform the following steps
indefinitely. Use <tt>skb = skb_recv_datagram(nl_sk)</tt> where <tt>nl_sk</tt> is the netlink socket returned
by netlink_kernel_create(). Then, process the netlink message pointed to by skb-&gt;data.</p>
<p>This kernel thread sleeps when there is no netlink message in nl_sk. Thus, inside the callback function
input(), we need to wake up only the sleeping kernel thread, like this:</p>
<pre class="programlisting"><tt><br>void input (struct sock *sk, int len)<br>{<br>  wake_up_interruptible(sk-&gt;sleep);<br>}<br></tt>
</pre>
<br>
<br>
<p>This is a more scalable communication model between user space and kernel. It also improves the
granularity of context switches.</p>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d2bd8"></a> Sending Netlink Messages from the Kernel</h2>
</div>
<p>Just as in user space, the source netlink address and destination netlink address need to be set when
sending a netlink message. Assuming the socket buffer holding the netlink message to be sent is struct
sk_buff *skb, the local address can be set with:</p>
<pre class="programlisting"><tt><br>NETLINK_CB(skb).groups = local_groups;<br>NETLINK_CB(skb).pid = 0;   /* from kernel */<br></tt>
</pre>
<br>
<br>
<p>The destination address can be set like this:</p>
<pre class="programlisting"><tt><br>NETLINK_CB(skb).dst_groups = dst_groups;<br>NETLINK_CB(skb).dst_pid = dst_pid;<br></tt>
</pre>
<br>
<br>
<p>Such information is not stored in skb-&gt;data. Rather, it is stored in the netlink control block of the
socket buffer, skb.</p>
<p>To send a unicast message, use:</p>
<pre class="programlisting"><tt><br>int <br>netlink_unicast(struct sock *ssk, struct sk_buff <br>                *skb, u32 pid, int nonblock);<br></tt>
</pre>
<br>
<br>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d2d64"></a></h2>
</div>
<p>where <tt>ssk</tt> is the netlink socket returned by netlink_kernel_create(), <tt>skb-&gt;data</tt> points
to the netlink message to be sent and <tt>pid</tt> is the receiving application's pid, assuming NLPID Formula
1 is used. <tt>nonblock</tt> indicates whether the API should block when the receiving buffer is unavailable
or immediately return a failure.</p>
<p>You also can send a multicast message. The following API delivers a netlink message to both the process
specified by pid and the multicast groups specified by group:</p>
<pre class="programlisting"><tt><br>void <br>netlink_broadcast(struct sock *ssk, struct sk_buff <br>         *skb, u32 pid, u32 group, int allocation);<br></tt>
</pre>
<br>
<br>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d2ef0"></a></h2>
</div>
<p><tt>group</tt> is the bitwise OR of the bitmasks of all the receiving multicast groups. <tt>allocation</tt> is the
kernel memory allocation type: typically GFP_ATOMIC when called from interrupt context and GFP_KERNEL
otherwise. The allocation type matters because the API may need to allocate one or more socket buffers to clone the
multicast message.</p>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d2fcc"></a> Closing a Netlink Socket from the Kernel</h2>
</div>
<p>Given the struct sock *nl_sk returned by netlink_kernel_create(), we can call the following kernel API to
close the netlink socket in the kernel:</p>
<pre class="programlisting"><tt><br>sock_release(nl_sk-&gt;socket);<br></tt>
</pre>
<br>
<br>
<p>So far, we have shown only the bare minimum code framework to illustrate the concept of netlink
programming. We now will use our NETLINK_TEST netlink protocol type and assume it already has been added to
the kernel header file. The kernel module code listed here contains only the netlink-relevant part, so it
should be inserted into a complete kernel module skeleton, which you can find from many other reference
sources.</p>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d30a8"></a> Unicast Communication between Kernel and
Application</h2>
</div>
<p>In this example, a user-space process sends a netlink message to the kernel module, and the kernel module
echoes the message back to the sending process. Here is the user-space code:</p>
<pre class="programlisting"><tt><br>#include &lt;stdio.h&gt;<br>#include &lt;stdlib.h&gt;<br>#include &lt;string.h&gt;<br>#include &lt;unistd.h&gt;<br>#include &lt;sys/socket.h&gt;<br>#include &lt;linux/netlink.h&gt;<br>#define MAX_PAYLOAD 1024  /* maximum payload size */<br>struct sockaddr_nl src_addr, dest_addr;<br>struct nlmsghdr *nlh = NULL;<br>struct iovec iov;<br>struct msghdr msg;  /* global, so zero-initialized */<br>int sock_fd;<br>int main(void) {<br> sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TEST);<br> memset(&amp;src_addr, 0, sizeof(src_addr));<br> src_addr.nl_family = AF_NETLINK;<br> src_addr.nl_pid = getpid();  /* self pid */<br> src_addr.nl_groups = 0;  /* not in mcast groups */<br> bind(sock_fd, (struct sockaddr*)&amp;src_addr, <br>      sizeof(src_addr));<br> memset(&amp;dest_addr, 0, sizeof(dest_addr));<br> dest_addr.nl_family = AF_NETLINK;<br> dest_addr.nl_pid = 0;   /* For Linux Kernel */<br> dest_addr.nl_groups = 0; /* unicast */<br> nlh = (struct nlmsghdr *)malloc(<br>                         NLMSG_SPACE(MAX_PAYLOAD));<br> memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));<br> /* Fill the netlink message header */<br> nlh-&gt;nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);<br> nlh-&gt;nlmsg_pid = getpid();  /* self pid */<br> nlh-&gt;nlmsg_flags = 0;<br> /* Fill in the netlink message payload */<br> strcpy(NLMSG_DATA(nlh), "Hello you!");<br> iov.iov_base = (void *)nlh;<br> iov.iov_len = nlh-&gt;nlmsg_len;<br> msg.msg_name = (void *)&amp;dest_addr;<br> msg.msg_namelen = sizeof(dest_addr);<br> msg.msg_iov = &amp;iov;<br> msg.msg_iovlen = 1;<br> sendmsg(sock_fd, &amp;msg, 0);<br> /* Read message from kernel */<br> memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));<br> recvmsg(sock_fd, &amp;msg, 0);<br> printf(" Received message payload: %s\n", <br>        (char *)NLMSG_DATA(nlh));<br>    <br> /* Close Netlink Socket */<br> close(sock_fd);<br> return 0;<br>}    <br></tt>
</pre>
<br>
<br>
<p>And, here is the kernel code:</p>
<pre class="programlisting"><tt><br>struct sock *nl_sk = NULL;<br>void nl_data_ready (struct sock *sk, int len)<br>{<br>  wake_up_interruptible(sk-&gt;sleep);<br>}<br>void netlink_test() {<br> struct sk_buff *skb = NULL;<br> struct nlmsghdr *nlh = NULL;<br> int err;<br> u32 pid;     <br> nl_sk = netlink_kernel_create(NETLINK_TEST, <br>                                   nl_data_ready);<br> /* wait for message coming down from user-space */<br> skb = skb_recv_datagram(nl_sk, 0, 0, &amp;err);<br> nlh = (struct nlmsghdr *)skb-&gt;data;<br> printk("%s: received netlink message payload:%s\n", <br>        __FUNCTION__, NLMSG_DATA(nlh));<br> pid = nlh-&gt;nlmsg_pid; /*pid of sending process */<br> NETLINK_CB(skb).groups = 0; /* not in mcast group */<br> NETLINK_CB(skb).pid = 0;      /* from kernel */<br> NETLINK_CB(skb).dst_pid = pid;<br> NETLINK_CB(skb).dst_groups = 0;  /* unicast */<br> netlink_unicast(nl_sk, skb, pid, MSG_DONTWAIT);<br> sock_release(nl_sk-&gt;socket);<br>}    <br></tt>
</pre>
<br>
<br>
<p>After loading the kernel module that executes the kernel code above, when we run the user-space
executable, we should see the following dumped from the user-space program:</p>
<pre class="programlisting"><tt>Received message payload: Hello you!<br></tt>
</pre>
<br>
<br>
<p>And, the following message should appear in the output of dmesg:</p>
<pre class="programlisting"><tt>netlink_test: received netlink message payload: <br>Hello you!<br></tt>
</pre>
<br>
<br>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d3260"></a> Multicast Communication between Kernel and
Applications</h2>
</div>
<p>In this example, two user-space applications listen to the same netlink multicast group. The kernel
module sends a message through the netlink socket to the multicast group, and all the listening applications receive it.
Here is the user-space code:</p>
<pre class="programlisting"><tt><br>#include &lt;stdio.h&gt;<br>#include &lt;stdlib.h&gt;<br>#include &lt;string.h&gt;<br>#include &lt;unistd.h&gt;<br>#include &lt;sys/socket.h&gt;<br>#include &lt;linux/netlink.h&gt;<br>#define MAX_PAYLOAD 1024  /* maximum payload size */<br>struct sockaddr_nl src_addr, dest_addr;<br>struct nlmsghdr *nlh = NULL;<br>struct iovec iov;<br>struct msghdr msg;  /* global, so zero-initialized */<br>int sock_fd;<br>int main(void) {<br> sock_fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_TEST);<br> memset(&amp;src_addr, 0, sizeof(src_addr));<br> src_addr.nl_family = AF_NETLINK;<br> src_addr.nl_pid = getpid();  /* self pid */<br> /* interested in group 1&lt;&lt;0 */  <br> src_addr.nl_groups = 1;<br> bind(sock_fd, (struct sockaddr*)&amp;src_addr, <br>      sizeof(src_addr));<br> memset(&amp;dest_addr, 0, sizeof(dest_addr)); <br> nlh = (struct nlmsghdr *)malloc(<br>                          NLMSG_SPACE(MAX_PAYLOAD));<br> memset(nlh, 0, NLMSG_SPACE(MAX_PAYLOAD));      <br>    <br> iov.iov_base = (void *)nlh;<br> iov.iov_len = NLMSG_SPACE(MAX_PAYLOAD);<br> msg.msg_name = (void *)&amp;dest_addr;<br> msg.msg_namelen = sizeof(dest_addr);<br> msg.msg_iov = &amp;iov;<br> msg.msg_iovlen = 1;<br> printf("Waiting for message from kernel\n");<br> /* Read message from kernel */<br> recvmsg(sock_fd, &amp;msg, 0);<br> printf(" Received message payload: %s\n", <br>        (char *)NLMSG_DATA(nlh));<br> close(sock_fd);<br> return 0;<br>}    <br></tt>
</pre>
<br>
<br>
<p>And, here is the kernel code:</p>
<pre class="programlisting"><tt><br>#define MAX_PAYLOAD 1024 <br>struct sock *nl_sk = NULL;<br>/* nl_data_ready() is the callback shown in the unicast example */<br>void netlink_test() {<br> struct sk_buff *skb = NULL;<br> struct nlmsghdr *nlh;<br> nl_sk = netlink_kernel_create(NETLINK_TEST, <br>                               nl_data_ready);<br> skb=alloc_skb(NLMSG_SPACE(MAX_PAYLOAD),GFP_KERNEL);<br> /* reserve the message area so that skb-&gt;len covers it */<br> skb_put(skb, NLMSG_SPACE(MAX_PAYLOAD));<br> nlh = (struct nlmsghdr *)skb-&gt;data;<br> nlh-&gt;nlmsg_len = NLMSG_SPACE(MAX_PAYLOAD);<br> nlh-&gt;nlmsg_pid = 0;  /* from kernel */<br> nlh-&gt;nlmsg_flags = 0;<br> strcpy(NLMSG_DATA(nlh), "Greeting from kernel!");<br> /* sender is in group 1&lt;&lt;0 */<br> NETLINK_CB(skb).groups = 1;<br> NETLINK_CB(skb).pid = 0;  /* from kernel */<br> NETLINK_CB(skb).dst_pid = 0;  /* multicast */<br> /* to mcast group 1&lt;&lt;0 */<br> NETLINK_CB(skb).dst_groups = 1;<br> /*multicast the message to all listening processes*/<br> netlink_broadcast(nl_sk, skb, 0, 1, GFP_KERNEL);<br> sock_release(nl_sk-&gt;socket);<br>}    <br></tt>
</pre>
<br>
<br>
<p>Assuming the user-space code is compiled into the executable nl_recv, we can run two instances of
nl_recv:</p>
<pre class="programlisting"><tt><br>./nl_recv &amp;<br>Waiting for message from kernel<br>./nl_recv &amp;<br>Waiting for message from kernel<br></tt>
</pre>
<br>
<br>
<p>Then, after we load the kernel module that executes the kernel-space code, both instances of nl_recv
should receive the following message:</p>
<pre class="programlisting"><tt>Received message payload: Greeting from kernel!<br>Received message payload: Greeting from kernel!<br></tt>
</pre>
<br>
<br>
</div>
<div class="simplesect" lang="en">
<div class="titlepage">
<h2 class="title"><a name="N0x850ca10.0x85d3418"></a>Conclusion</h2>
</div>
<p>The netlink socket is a flexible interface for communication between user-space applications and kernel
modules. It offers an easy-to-use socket API to both applications and the kernel, along with advanced
communication features, such as full-duplex operation, buffered I/O, multicast and asynchronous communication, that
other kernel/user-space IPC mechanisms lack.</p>
</div>
</div>

<p>Kevin Kaichuan He (<a href="mailto:hek_u5@yahoo.com">hek_u5@yahoo.com</a>) is a principal software
engineer at Solustek Corp. He currently is working on embedded systems, device drivers and networking
protocols. His previous work experience includes senior software engineer at Cisco Systems and research
assistant at CS, Purdue University. In his spare time, he enjoys digital photography, PS2 games and
literature.<br>
</p>
<p><font color="#ff0000">The URL of this article: <a href="http://www.linuxjournal.com/article/7356">http://www.linuxjournal.com/article/7356</a></font><br>
</p>

<img src ="http://www.cnitblog.com/SpiWolf/aggbug/5514.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-12-17 17:31 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/12/17/5514.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>I/O in FreeBSD</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/12/14/5359.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Wed, 14 Dec 2005 03:19:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/12/14/5359.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/5359.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/12/14/5359.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/5359.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/5359.html</trackback:ping><description><![CDATA[<P align=left>buf(9) manual<BR></P><img src ="http://www.cnitblog.com/SpiWolf/aggbug/5359.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-12-14 11:19 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/12/14/5359.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Linux国际化本地化和中文化</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4522.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Tue, 15 Nov 2005 03:39:00 
GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4522.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4522.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4522.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4522.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4522.html</trackback:ping><description><![CDATA[<!--StartFragment -->&nbsp;
<H4>Author:&nbsp;&nbsp;&nbsp;于明俭 &lt;justiny@turbolinux.com.cn &gt;</H4>
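<H2>1. Internationalization, Localization, and Chinese Enablement</H2>
<OL>
<LI>The concepts of internationalization, localization, and multilingualization</LI><BR>
<P></P>
<P>Generally speaking, "internationalization" is the process of reworking a computer system or application originally designed for English so that it supports several languages and cultural conventions at once. In the early days of software, programming languages, compilers, and development tools typically supported only English. To accommodate more languages and cultural conventions, software has to be designed with structures and mechanisms that can be extended to multiple languages; this process is called internationalization. Internationalization only makes the use of multiple languages possible at the design level. </P>
<P>"Localization" is the process of adapting a computer system or application so that it uses, and is compatible with, one particular language. Turning software designed for English into software that supports Chinese is one kind of localization. It mainly involves translating text messages and user-interface strings, redesigning icons, and so on. </P>
<P>Languages and cultural conventions vary widely from region to region. The language environment of a particular region is called a "locale". It covers not only the language and the currency unit but also number formats and date and time formats. Internationalized software takes a "locale" parameter, which selects the language environment used in a given region. </P>
<P>The part of internationalization that deals only with language is called "multilingualization". A multilingualized program can, for example, handle English, French, Chinese, Japanese, Korean, Arabic, and other languages at the same time. </P>
<P>In English, Internationalization is abbreviated I18N: the first and last letters are kept, and 18 letters lie between them. Likewise, Localization is abbreviated L10N and Multilingualization M17N. </P>
<P>Today the Internet connects computers all over the world, and sharing information and technology is an inevitable trend and need. It is therefore increasingly important that computer systems in different regions can communicate with one another. As Linux spreads to the desktop, Linux software also needs internationalization and localization. </P>
<LI>Chinese enablement&nbsp;</LI>
<P></P>
<P>"Chinese enablement" (中文化) is a vague concept. On Linux it covers both internationalizing software or systems and localizing them. In other words, it is not as simple as localizing software; more importantly, because so little Linux software supports Chinese directly, internationalization has to come first. </P>
<P>For historical reasons, Chinese as used today is divided into Simplified and Traditional Chinese, which also use different encodings. Software that supports Chinese should support both, which places higher demands on internationalization. </P>
<P>1999 was the most important year in the development and spread of Linux in China, and many companies producing Chinese Linux distributions emerged. Their technology all took a shortcut to Chinese enablement: the "Chinese platform" (中文平台). Although all of them were Chinese platforms, their implementations differed in technique, which showed how attractive the approach was. Chinese platforms played a huge role in the short term, speeding up the Chinese enablement of Linux and promoting Linux in China. </P>
<P>The main technical feature of a Chinese platform is that it can display and input Chinese without modifying Western application software (although it fails in some cases). Specifically, it modifies the low-level functions of the X system according to its own conventions. By the level at which the modification is made, there are two approaches. (1) Modifying the libX11.so library. This is a dynamic modification, also called the "plug-in" approach; it can be implemented by changing the X11 library directly or by using LD_PRELOAD to load a dynamic library that overrides it. (2) Modifying the X server, also called the "embedded" approach. It too can be implemented in two ways: changing the X server directly, or setting up a virtual Display (a partial proxy for the X protocol). </P>
<LI>History and levels of X11 internationalization&nbsp; </LI>
<P></P>
<P>The early X11R4 release contained only functions that supported single-byte and double-byte fonts, so it cannot be counted as an internationalized library. Afterwards, an X Consortium group called "mltalk" was formed and began to study the internationalization of the X Window System; many X Window System vendors took part. Because research on internationalization had only just begun, mltalk raised a basic question: what is internationalization of the X Window System? The answers differed. In fact, even now people disagree about the definition of internationalization, and the disagreement centers on how far software or a system must go before it counts as truly internationalized. </P>
<P>Graded by level, all of the following count as internationalization: </P>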
<OL>
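<LI>The language can be switched: a language can be selected when the system starts</LI>
<LI>Software using different languages can run at the same time, and a language can be selected when an application starts</LI>
<LI>Software using different languages can run at the same time, and an application's language can be switched dynamically</LI>
<LI>Software using different languages can run at the same time, and different languages can be used simultaneously within one application</LI></OL>Clearly, level (4) is the most complete form of internationalization, followed by (3), (2), and (1). mltalk finally settled on (3), because only a few X Window System vendors needed support for (4). Judging by the state of internationalization on Linux today, software at levels (2) and (3) is the most common, while level (4) software is rare and of limited practical value. 
<P>On that basis, the goal of X11R5 was to create a development platform for applications that adapt to the language environment without recompiling their source code. More precisely, the internationalization architecture is based on the standard C function setlocale. X11R5 established the following specifications: </P>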
<OL>
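<LI>A mechanism for switching languages</LI>
<LI>A language-independent output interface</LI>
<LI>A language-independent input interface</LI>
<LI>Internationalization of resource files</LI>
<LI>Internationalization of the X Toolkit (Xt)</LI></OL>Later, building on X11R5, OSF/Motif completed its own internationalization and became a widely accepted high-level graphics library. To this day some large programs, such as Java and Netscape, still use Motif as their base library. While the X11R5 specification was being drawn up, two sample implementations, Xsi and Ximp, were developed to test how practical it was. The two differ in both their input protocol and their locale support, which caused inconvenience for developers. 
<P>X11R6 solved the problems of X11R5. The main changes were: </P>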
<OL>
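<LI>A standard input protocol was defined</LI>
<LI>The locale data format was defined</LI>
<LI>Only one sample implementation of the internationalization tools was adopted</LI></OL>On the output side, X11R6 added right-to-left writing to support Arabic, Hebrew, and similar languages, and top-to-bottom writing to support Chinese, Japanese, and similar languages. 
<LI>Internationalization standards bodies&nbsp; </LI>
<P></P>
<P>The internationalization standards discussed here are standards set by international standards organizations or related bodies, and they are updated frequently over time. They cover character sets, encodings, font handling, printing, text rendering, user interfaces, language input methods, data exchange, cultural conventions, and more. </P>
<P>Some of the bodies that set internationalization standards are listed below: </P>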
<UL>
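<LI>Li18nux (Linux I18n)</LI>
<LI>ANSI (American National Standards Institute)</LI>
<LI>POSIX (Portable Operating System Interface for Computer Environments)</LI>
<LI>ISO (International Organization for Standardization)</LI>
<LI>IEEE (Institute of Electrical and Electronics Engineers)</LI>
<LI>Unicode Consortium</LI>
<LI>Open Group (X Consortium and OSF)</LI>
<LI>X/Open and XPG</LI></UL>Of these, ANSI/ISO defined the common interfaces for writing internationalized software in the C programming language. ISO defined the character-set standards and other standards that affect locale names. IEEE provided common internationalization library functions and the user commands for setting and managing locales. The Open Group sets the internationalization standards for Unix and the X Window System. Li18nux is an organization dedicated to internationalization specifications for software on Linux. 
<LI>The significance of internationalization&nbsp;</LI>
<P></P>
<P>Internationalization, and especially the standards defined for it, is essential to developing internationalized software today, and it is an inevitable trend in software development. Following the standards makes it more efficient to develop, debug, and port software, lowers development cost, and makes software easier to use. Internationally, newly developed base libraries all support the internationalization standards, so applications built on them support the standards as a matter of course. At the same time, many Linux enthusiasts have reworked older, non-conforming software so that it conforms to some degree. Adopting standards-conforming software and phasing out non-conforming software has become a trend. </P>
<P>Historically, Japanese commercial organizations took part in many of these standards, and more and more software supports Japanese. Porting Japanese software to Chinese is far easier than porting Western software directly, and sometimes requires no changes at all, which saves a great deal of needless work. Conversely, standards-conforming Chinese software benefits Japanese and Korean software, and progress snowballs. In addition, software vendors favor the Japanese market within Asia, and Japanese software and operating systems on Unix/Linux generally conform to the internationalization standards, so compatibility with those standards is essential. Of course, the current standards also have shortcomings, particularly in the handling of Chinese (which has two encodings, GB and Big5, that cannot coexist); Chinese developers should extend the existing base accordingly. </P>
<P>For Chinese Linux, following the internationalization standards is likewise inevitable. On Chinese Linux systems built on Chinese platforms, porting software has become a problem that must be solved, and the ultimate solution is to follow one common standard; at present, the internationalization standards are the only option. Given the current confusion among Chinese platforms on Chinese Linux, the internationalization standards are the necessary route from disorder to order. </P>
<P>Standardized internationalization also greatly benefits end users: Simplified and Traditional Chinese are supported at the same time, Chinese characters are handled as double-byte units, and Chinese input can take fuller advantage of the standard input interface, for example in interactive operations such as positioning the input server's window. </P>
<P>Another characteristic of internationalization is that it works at the application level, so it does not make the X Window System less stable. </P>
<LI>References:</LI>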
<UL>
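<LI>Linux I18N: http://www.li18nux.org/</LI></UL></OL>
<H2>2. Locale</H2>
<OL>
<LI>The concept of a locale</LI><BR>
<P></P>
<P>The locale is the most basic marker of internationalization support in ANSI C. For a Chinese Linux system that claims to support internationalization, supporting a Chinese locale is the minimum requirement. </P>
<P>A locale is the language environment in which software runs. It comprises a language, a territory, and a codeset, written in the form language[_territory[.codeset]]. For the Chinese GBK character set, for example, the locale is zh_CN.GBK. Chinese locales on Linux are still incomplete, and many of the locale-related C functions in glibc 2.1.x do not yet work correctly. Users who need to install a Chinese GBK locale can use the following directly from TLC6.0: </P>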
<UL>
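<LI>glibc-2.1.2 (includes the GBK module)</LI>
<LI>localedata-zh-0.07</LI>
<LI>/usr/X11R6/lib/X11/locale/zh_CN.GBK/XLC_LOCALE (the GBK locale for X)</LI></UL>A locale contains the following categories: 
<OL>
<LI>LC_COLLATE, for comparison and sorting. Sorting matters for Chinese too, but the locale support in the current glibc has some problems with Chinese. Chinese characters can be sorted in many ways; ordering by pronunciation (Hanyu Pinyin) or by stroke count is the most readily accepted.</LI>
<LI>LC_CTYPE, for character classification</LI>
<LI>LC_MONETARY, for currency units</LI>
<LI>LC_NUMERIC, for number display formats. The following shows how currency symbols and number formats differ between countries:</LI>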
<UL>
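<LI>Mainland China: 1,234.56RMB</LI>
<LI>United States: $1,234.56</LI>
<LI>Germany: 1.234,56DM</LI></UL>
<LI>LC_TIME, for times and dates. Time can be expressed in 12-hour or 24-hour format, and hours and minutes can be separated by a period or a colon. Here are the time and date formats of some locales:</LI>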
<UL>
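<LI>China: 14点20分 2000年三月十四号</LI>
<LI>United Kingdom: 02:20pm 14/03/2000</LI>
<LI>United States: 02:20pm 03/14/2000</LI>
<LI>Finland: 14.20 14.03.2000</LI></UL>
<LI>LC_MESSAGES, for internationalized messages, mainly prompts, error messages, status messages, titles, labels, buttons, and menus.</LI></OL>Locale data is initialized with the ANSI C function setlocale(category, locale). When locale is an empty string, the value is taken from the system's environment variables. For the convenience of applications, all categories can be set at once as follows: 
<DD>setlocale(LC_ALL, "");&nbsp; </DD>
<P></P>
<P>If the call fails, the function returns NULL, and the program should fall back to setlocale(LC_ALL, "C"). </P>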
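<LI>Using locales in X&nbsp;</LI>
<P></P>
<P>X client programs use locales in the same way as standard C programs do. In addition, Xlib defines two further functions, one to check whether X supports a locale and one to set the locale modifiers (XModifier). The basic steps for using a locale with libX11 are: </P>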
<UL>
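<LI>setlocale(): sets the current locale</LI>
<LI>XSupportsLocale(): checks whether X supports the locale currently set.</LI>
<LI>XSetLocaleModifiers(): specifies a list of locale modifiers. Its argument has the form @category=value. The only category currently available is "im", the name of the input server. If the argument is empty, the value is looked up in the environment variable XMODIFIERS. For example, if the following environment variable is set on the system:&nbsp; </LI>
<P></P>
<P>% setenv XMODIFIERS @im=Chinput (csh) or <BR>% export XMODIFIERS=@im=Chinput (bash) </P>
<P>then the client program will find the input server Chinput; "Chinput" is the name under which the input server registered itself.&nbsp;&nbsp;</P></UL>
<LI>Differences in cultural conventions&nbsp; </LI>
<P></P>
<P>The following points come up often during internationalization and localization and deserve attention. When developing internationalized software, pay close attention to the culture and habits of each region so that the software is general; when localizing, choose the conventions that match the target region. </P>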
<UL>
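<LI>Names, addresses, and other special information</LI><BR>The order of family name and given name, the order in which an address is written, the length of telephone numbers, and so on 
<LI>Universality of icons</LI><BR>Icons make for an approachable user interface. Design them with regional habits in mind and keep text out of the artwork; otherwise local icons have to be redrawn and the text on them translated. 
<LI>Use of sound</LI><BR>Inappropriate sounds or prompts may cause offense. In addition, the gender of a voice is a sensitive matter in some countries. 
<LI>Use of color</LI><BR>Colors and tones are tied to local custom; red, for example, signals danger in the United States and celebration in China. 
<LI>Paper size</LI><BR>Paper sizes differ by region, so take care when choosing the default size. 
<LI>Keyboard differences</LI><BR>The keys on a keyboard may differ from country to country, and so may the number of keys. 
<LI>Political factors</LI><BR>As far as possible, avoid politically sensitive elements in product design.</UL>
<LI>References:</LI>
<UL>
<LI>Locales on Linux</LI><BR>http://www.ping.be/linux/locales/index.shtml 
<LI>GBK Locale</LI><BR>ftp://ftp.turbolinux.com.cn/pub/turbolinux/TurboLinuxC-6.0/SRPMS/SRPMS/localedata-zh-0.07-1.src.rpm</UL></OL>
<H2>3. Internationalization of the X Window System</H2>Internationalization of the X Window System, and Chinese enablement in particular, shows up mainly in three areas: display, input, and printing. 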
<OL>
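<LI>Internationalization of display</LI>
<OL>
<LI>Character sets and encodings&nbsp; </LI>
<P></P>
<P>The character sets most commonly used on Linux are those of the ISO 8859 series, a family of multilingual single-byte coded character sets. They are: </P>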
<CENTER>
<TABLE border=2>
<TBODY>
<TR>
<TD width=200>Character set</TD>
<TD width=400>Languages covered</TD></TR>
<TR>
<TD>ISO 8859-1 (Latin1)</TD>
<TD width=400>The Latin-1 character set, which covers most European languages, such as French (fr), Spanish (es), Catalan (ca), Basque (eu), Portuguese (pt), Italian (it), Albanian (sq), Rhaeto-Romanic (rm), Dutch (nl), German (de), Danish (da), Swedish (sv), Norwegian (no), Finnish (fi), Faroese (fo), Icelandic (is), Irish (ga), Scottish (gd), English (en), Afrikaans (af), and Swahili (sw). Its reach extends to the Americas, Australia, and Africa.&nbsp;</TD></TR>
<TR>
<TD>ISO 8859-2 (Latin2)</TD>
<TD width=400>The Latin-2 character set, which covers the languages of Central and Eastern Europe: Czech (cs), Hungarian (hu), Polish (pl), Romanian (ro), Croatian (hr), Slovak (sk), Slovenian (sl), Sorbian.&nbsp;</TD></TR>
<TR>
<TD>ISO 8859-3 (Latin3)</TD>
<TD width=400>The Latin-3 character set, which includes Esperanto (eo) and Maltese (mt)</TD></TR>
<TR>
<TD>ISO 8859-4 (Latin4)</TD>
<TD width=400>The Latin-4 character set, which includes Estonian (et), the Baltic languages Latvian (lv) and Lithuanian (lt), Greenlandic (kl), and Lappish.&nbsp;</TD></TR>
<TR>
<TD>ISO 8859-5 (Cyrillic)</TD>
<TD width=400>Bulgarian (bg), Byelorussian (be), Macedonian (mk), Russian (ru), Serbian (sr)&nbsp;</TD></TR>
<TR>
<TD>ISO 8859-6 (Arabic)</TD>
<TD>Arabic (ar)</TD></TR>
<TR>
<TD>ISO 8859-7 (Greek)</TD>
<TD>Greek (el)</TD></TR>
<TR>
<TD>ISO 8859-8 (Hebrew)</TD>
<TD>Hebrew (iw) and Yiddish (ji)</TD></TR>
<TR>
<TD>ISO 8859-9 (Latin5)</TD>
<TD>A rearrangement of Latin1 in which a few letters are replaced by Turkish ones</TD></TR>
<TR>
<TD>ISO 8859-10 (Latin6)</TD>
<TD>A rearrangement of Latin4 with some symbols removed and Inuit and other letters added</TD></TR>
<TR>
<TD>ISO 8859-11 (Thai)</TD>
<TD>Thai (th)</TD></TR>
<TR>
<TD>ISO 8859-12</TD>
<TD>Celtic</TD></TR>
<TR>
<TD>ISO 8859-13 (Latin7)</TD>
<TD>Baltic Rim and Latvian (lv)</TD></TR>
<TR>
<TD>ISO 8859-14 (Latin8)</TD>
<TD>Gaelic and Welsh (cy)</TD></TR>
<TR>
<TD>ISO 8859-15 (Latin9)</TD>
<TD>A variant of Latin1 with some letters changed</TD></TR></TBODY></TABLE></CENTER>
<P>Double-byte character sets mainly cover Chinese, Japanese, and Korean. Each character consists of a lead byte and a trail byte. Because one character takes two bytes, internationalization becomes somewhat harder. On display, the cursor must never land between the two bytes of a character, and deletion and cursor movement must operate on whole characters. On input, a pre-edit server is generally needed to enter Chinese characters. The table below lists encoding information for Chinese, Japanese, and Korean: </P>
<CENTER>
<TABLE border=2>
<TBODY>
<TR>
<TD>Language</TD>
<TD>Character set</TD>
<TD>Code page</TD>
<TD>Lead byte range</TD>
<TD>Trail byte range</TD></TR>
<TR>
<TD rowSpan=2>Simplified Chinese</TD>
<TD>GB2312-1980</TD>
<TD>CP936</TD>
<TD>0xA1-0xF7</TD>
<TD>0xA1-0xFE</TD></TR>
<TR>
<TD>GBK</TD>
<TD>None</TD>
<TD>0x81-0xFE</TD>
<TD>0x40-0x7E, 0x80-0xFE</TD></TR>
<TR>
<TD>Traditional Chinese</TD>
<TD>BIG-5</TD>
<TD>CP950</TD>
<TD>0x81-0xFE</TD>
<TD>0x40-0x7E, 0xA1-0xFE</TD></TR>
<TR>
<TD>Japanese</TD>
<TD>Shift-JIS</TD>
<TD>CP932</TD>
<TD>0x81-0x9F, 0xE0-0xFC</TD>
<TD>0x40-0xFC (except 0x7F)</TD></TR>
<TR>
<TD rowSpan=2>Korean</TD>
<TD>KSC-5601-1987</TD>
<TD>CP949</TD>
<TD>0x81-0xFE</TD>
<TD>0x41-0x5A, 0x61-0x7A, 0x81-0xFE</TD></TR>
<TR>
<TD>KSC-5601-1992</TD>
<TD>CP1361</TD>
<TD>0x84-0xD3 <BR>0xD8 <BR>0xD9-0xDE <BR>0xE0-0xF9 <BR>0x41, 0xFE</TD>
<TD>0x41-0x7E <BR>0x81-0xFE <BR>0x31-0x7E</TD></TR></TBODY></TABLE></CENTER>
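<P>Recently, the Ministry of Information Industry and the State Bureau of Quality and Technical Supervision jointly issued two new basic national standards for Chinese information processing, which offer a solution for entering rare and uncommon Chinese characters. One of them, GB18030-2000, "Information technology: Chinese ideograms coded character set for information interchange: Extension for the basic set", is a mandatory national standard. It contains more than 27,000 Chinese characters, and its total code space exceeds 1.5 million code points. It addresses the pressing need for characters used in personal and place names in postal services, household registration, finance, geographic information systems, and similar areas, and it provides a unified basis for fields such as Chinese character research and the collation of ancient texts. The standard also includes the major minority scripts, such as Tibetan, Mongolian, and Uyghur. The code ranges of the character set are: </P>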
<CENTER>
<TABLE border=2>
<TBODY>
<TR>
<TH>Bytes</TH>
<TH>Code space</TH>
<TH>Number of code points</TH></TR>
<TR>
<TD>One byte</TD>
<TD>0x00-0x80</TD>
<TD>129</TD></TR>
<TR>
<TD>Two bytes</TD>
<TD>First byte: 0x81-0xFE <BR>Second byte: 0x40-0x7E, 0x80-0xFE</TD>
<TD>23940</TD></TR>
<TR>
<TD>Four bytes</TD>
<TD>The four bytes range over, in order: <BR>0x81-0xFE, 0x30-0x39, 0x81-0xFE, 0x30-0x39</TD>
<TD>1587600</TD></TR></TBODY></TABLE></CENTER>
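<P>The Hong Kong Special Administrative Region has likewise proposed the "Hong Kong Supplementary Character Set" for the Big5 encoding. Its purpose is to include the characters that the Hong Kong SAR government and the public need in Chinese electronic communication, supplementing the characters missing from the current Big5 and ISO 10646 coding standards, so that it can serve as a common Chinese-language interface and let everyone communicate electronically in Chinese accurately. The Hong Kong Supplementary Character Set has two coding schemes, one for Big5 systems and one for ISO 10646 platforms. The Big5 version is in effect an updated edition of the Government Common Character Set. ISO 10646 does not yet contain every character in the Hong Kong Supplementary Character Set; the characters not yet included have been submitted to the Ideographic Rapporteur Group under the International Organization for Standardization to be considered for inclusion in a future edition of ISO 10646. </P>
<P>Future Chinese Linux systems should follow the standards and drafts described above. </P>
<LI>Using multibyte characters and wide characters&nbsp; </LI>
<P></P>
<P>The characters we usually see stored as text are multibyte characters, which are used mainly for file storage and for transmission as streams over the network. A GB-encoded Chinese character takes two bytes. The drawback of multibyte characters is that they are awkward for Chinese text processing: deleting a character or moving the cursor can leave half a character behind. To make text processing easier, programs usually convert a mixed Chinese and English string into a string of equal-width characters, that is, wide characters, for their internal operations. </P>
<P>The conversion between multibyte strings and wide-character strings in glibc 2.1.x sometimes misbehaves. Under X, the conversion can also be done another way, by using XmbTextListToTextProperty() together with XwcTextPropertyToTextList(). </P>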
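<LI>Unicode&nbsp; </LI>
<P></P>
<P>The Unicode in use today is a 16-bit character encoding, maintained and improved by the Unicode Consortium, a non-profit computer-industry organization. It grew out of joint research between Xerox and Apple. Several companies formed an informal forum, and IBM, Microsoft, and others quickly joined. The Unicode Consortium published version 1 of the Unicode standard in 1990, while the International Organization for Standardization completed a similar encoding, ISO 10646. Because there was no need for two standards, the Unicode Consortium and ISO merged their work between 1991 and 1992. In 1994, China and Japan began work on national standards based on ISO 10646. Unicode is now being used in many products. </P>
<P>Unicode contains all the characters widely used in computing today: most of the world's written languages, typographic characters, digits and technical symbols, geometric shapes, and punctuation. Thanks to its uniformity, Unicode can simplify the internationalization of software in most cases. It removes the need to handle multiple code pages, and because it is a 16-bit encoding, the extra processing caused by double-byte character sets is no longer necessary either. </P>
<P>As an encoding, however, Unicode also has its shortcomings; for example, code positions bear no relation to sort order. Making software support Unicode is therefore only the first step of internationalization, and in practice language-specific information and rules are still required. Unicode is thus generally used as a program's internal processing encoding, and two-way conversion tables to and from other encodings must be provided. </P>
<P>Finally, although Unicode doubles the size of ordinary English text, a system that uses Unicode throughout does not grow much, because most of the files stored on a system are in binary formats. Moreover, with compression designed for Unicode, a file can be compressed to the same size as its 8-bit equivalent. </P>
<LI>Fonts and font sets&nbsp; </LI>
<P></P>
<P>Every font used under the X Window System must be registered with the X server under an X Logical Font Description (XLFD) name, which carries a good deal of information about the font. Here are two examples, one Western font and one Chinese font. </P>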
<OL>
<LI>-adobe-times-medium-r-normal--14-140-75-75-p-74-iso8859-1</LI>
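<LI>-tlc-song-medium-r-normal--24-240-75-75-c-240-gbk-0</LI></OL>For convenience, users can also give each font one or more aliases. The alias file, fonts.alias, lives in the font directory and can be edited by hand. After the font directory or the aliases change, run "xset fp rehash" or restart X for the change to take effect. 
<P>X fonts can also be loaded through a font server, which is especially useful for systems that keep no fonts locally and for X terminals. The transport can be TCP or DECnet. </P>
<P>The X server keeps only one copy of each font; when no program is using a font any longer, its memory is released automatically. </P>
<P>A font name encodes the foundry, the family, the weight, the size, the character set, and other information. A name can also be abbreviated by replacing the omitted parts with asterisks; the Chinese font above, for example, can be shortened to: </P>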
<DD>-*-song-*-24-*-gbk-0</DD><BR>
<P></P>
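<P>In practice, a string often mixes Chinese and English, so it has to be drawn with two fonts. A description that names two or more fonts in this way is a font set. A font set is usually written as a list of fonts separated by commas; for example: </P>
<DD>"-adobe-helvetica-medium-r-normal--14-*-*-*-*-*-iso8859-*,\</DD>
<DD>-tlc-song-medium-r-normal--14-*-*-*-*-*-gbk-0"&nbsp; </DD>
<P></P>
<P>Unfortunately, the GB and Big5 encodings for Chinese overlap and cannot be told apart, so a font set cannot specify GB and Big5 fonts at the same time. </P>
<P>Which fonts of a font set are actually loaded depends on the locale. </P>
<P>Many internationalized programs and graphics libraries let the user specify the font set through a resource file; for example, the Simplified Chinese resource file for gtk is /etc/gtk/gtkrc.zh_CN, and the resource file for the internationalized qt-1.44 is ~/.qti18nrc.&nbsp;&nbsp;</P></OL>
<LI>Internationalization of messages</LI><BR>
<P></P>
<P>Message internationalization is an important part of software internationalization. If a program is to support several languages, message internationalization should be taken into account at design time. The great majority of software today uses GNU gettext as the basic tool. The basic steps are: </P>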
<UL>
<LI>Call setlocale() during program initialization to set the locale</LI>
<LI>Define gettext macros to keep the source tidy</LI>
<LI>Specify where the message catalogs live</LI>
<LI>Mark the strings to be translated: _("Some Strings");</LI>
<LI>When the program is finished, extract the messages with xgettext and translate them</LI>
<LI>Use msgfmt to convert the message file into a .mo file and install it under the locale directory</LI></UL><PRE>        /* file this_app.c */<BR>        #include &lt;stdio.h&gt;<BR>        #include &lt;locale.h&gt;<BR>        #include &lt;libintl.h&gt;<BR>        #define _(String)  gettext(String)<BR>        #define N_(String) (String)  /* mark only, translate later */<BR><BR>        int main(void) {<BR>                /* take the locale from the environment */<BR>                setlocale(LC_ALL, "");<BR><BR>                /* set the location and name of the message catalog */<BR>                bindtextdomain("this_app", "/usr/share/locale");<BR>                textdomain("this_app");<BR><BR>                printf(_("Some String"));<BR>                return 0;<BR>        }<BR><BR><BR>     This completes the internationalization of the program. Compile and link it into the executable this_app:</PRE>
<DD>gcc -o this_app this_app.c</DD><BR>
<P></P>
<P>&nbsp;The localization steps follow.&nbsp;</P>
<UL>
<LI>Extract the messages to translate: xgettext -a -o this_app.po this_app.c</LI>
<LI>Translate the messages&nbsp; </LI>
<P></P>
<P>&nbsp;The file this_app.po contains "Some String":&nbsp;</P>
<DD>msgid "Some String"</DD>
<DD>msgstr ""</DD><BR>
<P></P>
<P>&nbsp;Translate it to:&nbsp;</P>
<DD>msgid "Some String"</DD>
<DD>msgstr "一些字符串"</DD>&nbsp; 
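<LI>Compile the message file: msgfmt -o this_app.mo this_app.po</LI>
<LI>Copy the message file to the locale directory; for Chinese (zh_CN), for example: cp this_app.mo /usr/share/locale/zh_CN/LC_MESSAGES</LI>
<LI>Run the program: LC_ALL=zh_CN ./this_app</LI></UL></OL>
<LI>Internationalization of input</LI><BR>
<P></P>
<P>There are three main ways to enter text under the X Window System: </P>
<OL>
<LI>A single keystroke enters a single character</LI>
<LI>A combination of two or more keys enters a single character</LI>
<LI>Besides keystrokes, a conversion server is required</LI></OL>The first two are used for Western characters. The special characters of European languages, for example, are usually entered by remapping the keyboard or by using dead keys, special keys that do not advance the cursor when pressed. 
<P>On Linux, the xkeycaps program can remap the keyboard and save the resulting mapping table for the whole keyboard, and the xmodmap command loads the mapping table. </P>
<P>Chinese is entered mainly by the third method. Taking all languages into account, the X Window System defines the following areas for input: </P>
<OL>
<LI>The preedit area, which shows the input in progress; characters should appear here as soon as the user types them</LI>
<LI>The status area, which shows the input state; for Chinese, it shows the input method, the full-width/half-width state, and the Chinese/Western punctuation state.</LI>
<LI>The auxiliary area, which shows the list of candidates to choose from; it is also called the selection area and is controlled by the input server.</LI></OL>Depending on how the preedit area and the status area are combined, the X Window System defines four input styles: 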
<OL>
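<LI>Root style: the preedit area and the selection area are outside the application. Both are handled by the input server, whose interface is a child of the root window, as in a standalone input bar like that of 中文之星 (Chinese Star).</LI>
<LI>OffTheSpot style: the preedit area and the selection area are inside the application, usually in a fixed area at the bottom of the window, as in the default input mode of XEmacs.</LI>
<LI>OverTheSpot style: the preedit area is at the current insertion point, and the status area is in a fixed area of the application. This is often called cursor-following mode and resembles the 智能ABC (Intelligent ABC) input method on Windows</LI>
<LI>OnTheSpot style: the preedit area and the selection area are inside the application; their content is sent by the input server, and the application is responsible for displaying it.</LI></OL>For Chinese input, the best styles are (3), (4), and (1). Most Chinese input methods have to pop up an auxiliary area for the user to choose from; only a few, such as Wubi (五笔字型), are well suited to (4). For the status area, Chinese input mostly places it somewhere in a Root-style window or uses a dedicated control bar. The cursor-following mode common on MS Windows can be implemented with (3) or (4). Because some Linux users configure X Window with a virtual screen, none of these styles is entirely satisfactory. 
<P>For an application, the simplest input interface is the Root style, which leaves the display to the input server. It takes the least code and is the best choice for software that is taking its first steps with the internationalization standards. For the convenience of users, applications, and high-level libraries in particular, should support all four input styles. Unfortunately, most software supports only two or three of them, and as a result few current input servers (IM servers) support all four either, which has become something of a chicken-and-egg problem. </P>
<P>The table below lists the XIM support in several common programs and graphics libraries: </P>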
<CENTER>
<TABLE border=1>
<TBODY>
<TR>
<TD>Netscape</TD>
<TD>Root, OffTheSpot, OverTheSpot</TD></TR>
<TR>
<TD>Java&nbsp;</TD>
<TD>Root, OnTheSpot</TD></TR>
<TR>
<TD>Qt&nbsp;</TD>
<TD>Root, OverTheSpot</TD></TR>
<TR>
<TD>gtk+&nbsp;</TD>
<TD>Root, OverTheSpot</TD></TR>
<TR>
<TD>rxvt&nbsp;</TD>
<TD>Root, OffTheSpot, OverTheSpot</TD></TR></TBODY></TABLE></CENTER>
<P>Chinese input requires close cooperation between client software and server software, which communicate through the XIM (X Input Method) protocol. The input server starts first and registers itself, together with its name, with the X server. When a client starts, it asks the X server for an input server that matches its locale (if a server name is given through XMODIFIERS, both the locale and the name are used for matching). Once a server is found, the client picks the most suitable style among those the input server offers. The client then creates an IC (Input Context) for each window that needs input; the IC holds the client's information, and all later communication uses it. </P>
<P>The following shows how a client connects to the input server directly through Xlib; higher-level libraries hide this process: </P><PRE>
    XIM im;
    XIC ic;
    ...
    /* Open the input method selected by the current locale and XMODIFIERS. */
    if ((im = XOpenIM(display, NULL, NULL, NULL)) == NULL) {
        fprintf(stderr, "Error: XOpenIM failed\n");
        exit(1);
    }

    /* Specify the pre-edit type and so on. */
    if ((ic = XCreateIC(im,
                        XNInputStyle, XIMPreeditPosition | XIMStatusNothing,
                        XNClientWindow, window,
                        NULL)) == NULL) {
        fprintf(stderr, "Error: XCreateIC failed\n");
        XCloseIM(im);
        exit(1);
    }
    ...

    for (;;) {
        XNextEvent(display, &amp;event);

        /* If the input server consumed the event, go on to the next one. */
        if (XFilterEvent(&amp;event, None))
            continue;
        switch (event.type) {
        case Expose:
            XmbDrawString(...);
            break;
        case KeyPress:
            count = XmbLookupString(ic,
                                    (XKeyPressedEvent *) &amp;event,
                                    string, len, &amp;keysym, &amp;status);
            ...
            break;
        }
    }
</PRE>The most widely used XIM input servers at present are Chinput (Simplified Chinese, with Traditional Chinese support), xcin (Traditional Chinese), kinput2 (Japanese), and hanIM/ami (Korean). 
<P>The Chinese input server Chinput uses OverTheSpot as its default input mode. It differs slightly from the standard style: the pre-edit area is offset from the input position so that the input area also serves as the status area, which largely matches users' input habits. It also uses an auxiliary toolbar to display and change the input state. Chinput further addresses the simultaneous use of GB and Big5 encodings, passive input (Passive Input), and related issues. Besides the keyboard, ordinary users can also use handwriting recognition and speech recognition, and the current input architecture can basically meet their requirements. The author experimented with handwriting-recognition input and found that the great majority of software can accept passive input. </P>
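<LI>Internationalization of printing</LI><BR>
<P></P>
<P>Printing under the X Window System is a hard problem, and so far no unified printing standard has emerged. One reason is that X, by design, separates display from printing completely. </P>
<P>The file formats most commonly printed on Linux are plain text and PostScript. Chinese plain-text files generally have to be converted to PostScript before printing. For a PostScript file, printing is fairly easy if the application embedded Chinese font information when generating it; otherwise printing is hard or even impossible. </P>
<P>Chinese text files are currently printed by converting them to PS with tools such as gb2ps/bg2ps/cnprint, a conversion that uses Chinese bitmap fonts. An existing PS file that contains no Chinese fonts prints as garbage if sent to the printer directly; the usual approach is to filter such files so that they use Chinese fonts and then print them. For example, Mr. Chen Xiangyang's filter ps2cps can print files saved by Netscape. The drawback is that Chinese and English strings in the resulting PS sometimes fail to line up. The best approach is to implement Chinese printing at the PostScript level: Mr. Chen Xiangyang localized ghostscript for Chinese so that it can use TTF fonts directly to print PS files produced by Netscape, Qt/KDE, lyx, and other software. This low-level approach is also the one used for Japanese and Korean. </P>
<P>Printing with CID (Adobe) fonts is also being tried. </P>
<P>In short, Chinese printing currently lacks a unified standard, and most applications ignore double-byte languages when they generate PS output, which complicates printing further. As a result, most current Chinese Linux distributions do not support Chinese printing. </P>
<LI>Internationalization of inter-client communication</LI><BR>
<P></P>
<P>Inter-client communication (Interclient Communications Conventions, ICCC for short) is one of the ways clients share resources. Its most common uses are copying and pasting text and communicating with the window manager. If two applications use different character sets, however, pasting goes wrong and the pasted content may even be lost, so clients must use an internationalized communication protocol. </P>
<P>Communication between applications and the window manager is also inter-client communication. </P>
<P>If clients use the same character set but different encodings, no data is lost; in that case compound text (COMPOUND TEXT) should be used for the transfer. X defines the COMPOUND_TEXT atom (Atom) for transferring strings that mix Chinese and English. For 7-bit encodings, ASCII, or the ISO8859-1 character set, clients can transfer data directly with the XA_STRING atom, with no conversion. </P>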
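<H2>4. Developing Software That Conforms to Internationalization Standards</H2>Software developed for the X Window System should conform to internationalization standards as far as possible. That includes setting a suitable locale (see the earlier discussion of using locales under X), choosing character sets and font sets with care, handling localized text, input methods, and so on. Developers are encouraged to use high-level GUI libraries whose internationalization is already fairly mature, such as Qt, gtk+, or Java, which avoids having to deal with these issues directly. With Motif, resource internationalization and FontList also need attention. 
<OL>
<LI>Developing internationalized software&nbsp; </LI>
<P></P>
<P>Software built on a high-level GUI library that already supports internationalization needs little extra attention to it. Input in particular just works: Chinese characters can be entered in the standard input widgets (single-line and multi-line). For fonts, take care to use font sets. Many programs specify fonts and font sets in resource files, so the software should ship a resource file that supports font sets by default. </P>
<P>The approach described below is based on libX11. Besides calling the locale functions at initialization, as mentioned earlier, pay attention to the following when programming: </P>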
<OL>
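<LI>Font loading: when handling strings, use a FontSet rather than a Font:&nbsp; </LI>
<P></P>
<P>XCreateFontSet() - create a font set <BR>XFreeFontSet() - free the font set's memory <BR>XFontsOfFontSet() - return the XFontStructs and font names <BR>XBaseFontNameListOfFontSet() - return the base font name list of the font set <BR>XLocaleOfFontSet() - return the locale name of the XFontSet <BR>XExtentsOfFontSet() - get the maximum extents of the FontSet </P>
<LI>Measuring the on-screen size of strings and drawing them:</LI><BR>
<P></P>
<P>XmbDrawString()/XwcDrawString() - draw only the foreground of the glyphs <BR>XmbDrawImageString()/XwcDrawImageString() - draw foreground and background <BR>XmbDrawText()/XwcDrawText() - complex spacing and font sets <BR>XmbTextEscapement()/XwcTextEscapement() - pixels advanced in the X direction <BR>XmbTextExtents()/XwcTextExtents() - bounding extents of the string </P>
<LI>Inter-client communication:&nbsp;</LI>
<P></P>
<P>XmbTextListToTextProperty()/XwcTextListToTextProperty() - locale-dependent text conversion <BR>XmbTextPropertyToTextList()/XwcTextPropertyToTextList() - locale-dependent text conversion <BR>XFreeStringList()/XwcFreeStringList() - free a string list <BR>XSetWMProperties() - set window manager properties <BR>XSetWMName() - set the window name <BR>XSetWMIconName() - set the window icon name </P>
<LI>Input:&nbsp; </LI>
<P></P>
<P>XOpenIM()/XCloseIM() - open/close the connection to the input server <BR>XDisplayOfIM()/XLocaleOfIM() <BR>XSetIMValues()/XGetIMValues() - set/get input server attributes <BR>XCreateIC()/XDestroyIC() - create/destroy an IC <BR>XIMOfIC() <BR>XSetICValues()/XGetICValues() - set/get IC values <BR>XSetICFocus()/XUnsetICFocus() - set/unset focus <BR>XmbResetIC()/XwcResetIC() - reset the IC <BR>XFilterEvent() - filter events <BR>XmbLookupString()/XwcLookupString() - look up the string <BR>XRegisterIMInstantiateCallback()/XUnregisterIMInstantiateCallback() - register/unregister the callback&nbsp;</P></OL>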
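<LI>Internationalizing existing software&nbsp; </LI>
<P></P>
<P>Patching existing non-internationalized software depends on the case at hand. The modified software must stay compatible with the original and must not affect its existing support for Western and other languages. The locale should be the pivot for language switching. Some of the author's experience from modifying software follows, for reference only. </P>
<UL>
<LI>Set the locale when the software initializes.</LI>
<LI>Define a macro for gettext and bind it to the message catalog.</LI>
<LI>Use gettext for all static messages.</LI>
<LI>Use font sets instead of fonts for drawing text.</LI>
<LI>Use the multibyte or wide-character X functions for drawing.</LI>
<LI>Initialize the connection to the XIM server.</LI>
<LI>In the event loop, use XFilterEvent() to pass events to the XIM server.</LI>
<LI>Use XmbLookupString()/XwcLookupString() to look up strings.</LI></UL></OL>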
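<H2>5. Open Problems in Chinese Localization</H2>The existing internationalization standards have many problems, mostly rooted in the current internationalization architecture. For Chinese localization these problems stand out even more. 
<OL>
<LI>Dynamic encoding switching&nbsp; </LI>
<P></P>
<P>For Chinese software, supporting multiple encodings (GB and Big5) at once is a mark of maturity. Switching encodings dynamically, however, especially for the user interface (menu items and so on), is limited by gettext, which handles message internationalization. Generally, once a program is loaded, all text messages are initialized and are never reloaded. Even if they were reloaded, the lengths of the messages would change, and re-laying out the interface would be very hard. </P>
<P>Existing software therefore implements dynamic encoding switching only in parts; Netscape is an example. Unfortunately, Netscape's switching is incomplete: only the display switches, and input still has problems. For instance, start Netscape in a zh_CN.GBK environment and switch to a Traditional Chinese page with an input field. If the input software detects the encoding automatically from the Input Context, it still takes Netscape for GB-encoded and the input result is wrong. To enter Big5, the output encoding must be locked to Big5. Chinput made some attempts here; the conclusion is that Big5 can be entered, but the display in the input bar is incorrect. </P>
<P>Generally, dynamic encoding switching is easier with a Chinese platform. Several Chinese Linux distributions can switch encodings this way. The principle is simple: the application or the X server remembers the encoding state of a window, and all later text display and input follow that encoding. The drawback is that Chinese text on the application's initial interface turns into garbage because its encoding has been converted. </P>
<LI>Automatic detection of Chinese encodings&nbsp;</LI>
<P></P>
<P>Text browsing, web browsing, and web page translation often need to detect the encoding of Chinese text automatically. But the GB and Big5 code ranges overlap, so the two are hard to tell apart. Few open-source detectors exist, and their results are unsatisfactory, far below the level of current commercial software. </P>
<LI>Transition from Chinese platforms on Linux to internationalization&nbsp;</LI>
<P></P>
<P>In the long run, because Chinese platforms differ greatly from the internationalization standards in how they display and input Chinese, a transitional path from Chinese platforms to the standards is urgently needed. During the transition, Chinese platforms and the internationalization standards may coexist for some time. </P>
<P>Take CLE and TurboLinux as examples. Their early versions used a Chinese platform to support Chinese display and input. As more software came to support the internationalization standards, they moved to transitional versions in which the Chinese platform and the standards coexisted. By now they no longer use the Chinese platform by default; it is included in the distribution only as a leftover. </P>
<LI>Chinese localization of Linux documentation&nbsp;</LI>
<P></P>
<P>Linux documentation here mainly means command help (man pages), software manuals and notes, and the software's own message files (po). Work in this area still lacks unified coordination and broad participation from Linux enthusiasts.</P></OL>References 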
<UL>
<LI>Unicode: http://www.unicode.org/</LI>
<LI>Hong Kong Supplementary Character Set: http://www.digital21.gov.hk/chi/hkscs/introduction.html</LI>
<LI>CJK information: ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf</LI>
<LI>Linux internationalization resources: http://i18n.linux.org.tw/</LI>
<LI>Linux internationalization standards: http://www.li18nux.org/</LI>
<LI>Microsoft internationalization: http://www.microsoft.com/globaldev/</LI></UL>
<H2>6. Appendix</H2>
<OL>
<LI>Wide-character functions and their ordinary C counterparts&nbsp; </LI>
<P></P>
<P>Character classification: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=100>Wide-character function</TH>
<TH align=left width=100>Ordinary C function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>iswalnum()</TD>
<TD>isalnum()</TD>
<TD width=300>Test whether the character is a letter or a digit</TD></TR>
<TR vAlign=top>
<TD>iswalpha()</TD>
<TD>isalpha()</TD>
<TD width=300>Test whether the character is a letter</TD></TR>
<TR vAlign=top>
<TD>iswcntrl()</TD>
<TD>iscntrl()</TD>
<TD width=300>Test whether the character is a control character</TD></TR>
<TR vAlign=top>
<TD>iswdigit()</TD>
<TD>isdigit()</TD>
<TD width=300>Test whether the character is a digit</TD></TR>
<TR vAlign=top>
<TD>iswgraph()</TD>
<TD>isgraph()</TD>
<TD width=300>Test whether the character is a visible (graphic) character</TD></TR>
<TR vAlign=top>
<TD>iswlower()</TD>
<TD>islower()</TD>
<TD width=300>Test whether the character is lowercase</TD></TR>
<TR vAlign=top>
<TD>iswprint()</TD>
<TD>isprint()</TD>
<TD width=300>Test whether the character is printable</TD></TR>
<TR vAlign=top>
<TD>iswpunct()</TD>
<TD>ispunct()</TD>
<TD width=300>Test whether the character is punctuation</TD></TR>
<TR vAlign=top>
<TD>iswspace()</TD>
<TD>isspace()</TD>
<TD width=300>Test whether the character is white space</TD></TR>
<TR vAlign=top>
<TD>iswupper()</TD>
<TD>isupper()</TD>
<TD width=300>Test whether the character is uppercase</TD></TR>
<TR vAlign=top>
<TD>iswxdigit()</TD>
<TD>isxdigit()</TD>
<TD width=300>Test whether the character is a hexadecimal digit</TD></TR></TBODY></TABLE></P>
<P>Case conversion: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=100>Wide-character function</TH>
<TH align=left width=100>Ordinary C function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>towlower()</TD>
<TD>tolower()</TD>
<TD width=300>Convert the character to lowercase</TD></TR>
<TR vAlign=top>
<TD>towupper()</TD>
<TD>toupper()</TD>
<TD width=300>Convert the character to uppercase</TD></TR></TBODY></TABLE></P>
<P>Collation: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=100>Wide-character function</TH>
<TH align=left width=100>Ordinary C function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>wcscoll()</TD>
<TD>strcoll()</TD>
<TD width=300>Compare strings according to the locale's collation order</TD></TR></TBODY></TABLE></P>
<P>Date and time conversion: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=200>Function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>strftime()</TD>
<TD width=300>Format a date and time according to the given format string and the locale settings</TD></TR>
<TR vAlign=top>
<TD>wcsftime()</TD>
<TD width=300>Format a date and time according to the given format string and the locale settings, producing a wide string</TD></TR>
<TR vAlign=top>
<TD>strptime()</TD>
<TD width=300>Convert a string to a time value according to the given format; the inverse of strftime</TD></TR></TBODY></TABLE></P>
<P>Formatted printing and scanning: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=200>Function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>fprintf()/fwprintf()</TD>
<TD width=300>Formatted output to a stream using a variable argument list</TD></TR>
<TR vAlign=top>
<TD>fscanf()/fwscanf()</TD>
<TD width=300>Formatted input from a stream</TD></TR>
<TR vAlign=top>
<TD>printf()</TD>
<TD width=300>Formatted output to standard output using a variable argument list</TD></TR>
<TR vAlign=top>
<TD>scanf()</TD>
<TD width=300>Formatted input from standard input</TD></TR>
<TR vAlign=top>
<TD>sprintf()/swprintf()</TD>
<TD width=300>Format a variable argument list into a string</TD></TR>
<TR vAlign=top>
<TD>sscanf()</TD>
<TD width=300>Formatted input from a string</TD></TR>
<TR vAlign=top>
<TD>vfprintf()/vfwprintf()</TD>
<TD width=300>Formatted output to a file using a stdarg argument list</TD></TR>
<TR vAlign=top>
<TD>vprintf()</TD>
<TD width=300>Formatted output to standard output using a stdarg argument list</TD></TR>
<TR vAlign=top>
<TD>vsprintf()/vswprintf()</TD>
<TD width=300>Format a stdarg argument list and write the result to a string</TD></TR></TBODY></TABLE></P>
<P>Number conversion: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=100>Wide-character function</TH>
<TH align=left width=100>Ordinary C function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>wcstod()</TD>
<TD>strtod()</TD>
<TD width=300>Convert the initial part of a wide string to a double-precision floating-point number</TD></TR>
<TR vAlign=top>
<TD>wcstol()</TD>
<TD>strtol()</TD>
<TD width=300>Convert the initial part of a wide string to a long integer</TD></TR>
<TR vAlign=top>
<TD>wcstoul()</TD>
<TD>strtoul()</TD>
<TD width=300>Convert the initial part of a wide string to an unsigned long integer</TD></TR></TBODY></TABLE></P>
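<P>As a small illustration of the number-conversion functions in the table above, the sketch below parses a decimal integer from the start of a wide string with wcstol(). The helper name parse_wide_long is hypothetical and not part of any library; only wcstol() itself comes from the standard.</P>

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <wchar.h>

/* Hypothetical helper: parse a base-10 integer from the initial part of a
 * wide string. Returns true on success and stores the value in *out. */
bool parse_wide_long(const wchar_t *text, long *out)
{
    assert(text != NULL && out != NULL);

    wchar_t *end = NULL;
    errno = 0;
    long value = wcstol(text, &end, 10);
    if (end == text || errno == ERANGE)
        return false;   /* no digits consumed, or value out of range */
    *out = value;
    return true;
}
```

<P>Like strtol(), wcstol() stops at the first character that cannot be part of the number, so trailing text such as a unit suffix is left unconverted.</P>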
<P>Multibyte and wide-character conversion and handling: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=200>Function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>mblen()</TD>
<TD width=300>Determine the number of bytes in a character according to the locale settings</TD></TR>
<TR vAlign=top>
<TD>mbstowcs()</TD>
<TD width=300>Convert a multibyte string to a wide string</TD></TR>
<TR vAlign=top>
<TD>mbtowc()/btowc()</TD>
<TD width=300>Convert a multibyte character to a wide character</TD></TR>
<TR vAlign=top>
<TD>wcstombs()</TD>
<TD width=300>Convert a wide string to a multibyte string</TD></TR>
<TR vAlign=top>
<TD>wctomb()/wctob()</TD>
<TD width=300>Convert a wide character to a multibyte character</TD></TR></TBODY></TABLE>Input and output: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=100>Wide-character function</TH>
<TH align=left width=100>Ordinary C function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>fgetwc()</TD>
<TD>fgetc()</TD>
<TD width=300>Read a character from a stream and convert it to a wide character</TD></TR>
<TR vAlign=top>
<TD>fgetws()</TD>
<TD>fgets()</TD>
<TD width=300>Read a string from a stream and convert it to a wide string</TD></TR>
<TR vAlign=top>
<TD>fputwc()</TD>
<TD>fputc()</TD>
<TD width=300>Convert a wide character to a multibyte character and write it to a stream</TD></TR>
<TR vAlign=top>
<TD>fputws()</TD>
<TD>fputs()</TD>
<TD width=300>Convert a wide string to a multibyte string and write it to a stream</TD></TR>
<TR vAlign=top>
<TD>getwc()</TD>
<TD>getc()</TD>
<TD width=300>Read a character from a stream and convert it to a wide character</TD></TR>
<TR vAlign=top>
<TD>getwchar()</TD>
<TD>getchar()</TD>
<TD width=300>Read a character from standard input and convert it to a wide character</TD></TR>
<TR vAlign=top>
<TD>None</TD>
<TD>gets()</TD>
<TD width=300>Use fgetws()</TD></TR>
<TR vAlign=top>
<TD>putwc()</TD>
<TD>putc()</TD>
<TD width=300>Convert a wide character to a multibyte character and write it to a stream</TD></TR>
<TR vAlign=top>
<TD>putwchar()</TD>
<TD>putchar()</TD>
<TD width=300>Convert a wide character to a multibyte character and write it to standard output</TD></TR>
<TR vAlign=top>
<TD>None</TD>
<TD>puts()</TD>
<TD width=300>Use fputws()</TD></TR>
<TR vAlign=top>
<TD>ungetwc()</TD>
<TD>ungetc()</TD>
<TD width=300>Push a wide character back onto the input stream</TD></TR></TBODY></TABLE></P>
<P>String operations: 
<TABLE cellPadding=2 border=2>
<TBODY>
<TR vAlign=top>
<TH align=left width=100>Wide-character function</TH>
<TH align=left width=100>Ordinary C function</TH>
<TH align=left width=300>Description</TH></TR>
<TR vAlign=top>
<TD>wcscat()</TD>
<TD>strcat()</TD>
<TD width=300>Append one string to the end of another</TD></TR>
<TR vAlign=top>
<TD>wcsncat()</TD>
<TD>strncat()</TD>
<TD width=300>Like wcscat(), but with a limit on the number of characters appended</TD></TR>
<TR>
<TD>wcschr()</TD>
<TD>strchr()</TD>
<TD width=300>Find the first occurrence of a character in a string</TD></TR>
<TR vAlign=top>
<TD>wcsrchr()</TD>
<TD>strrchr()</TD>
<TD width=300>Find the last occurrence of a character in a string</TD></TR>
<TR vAlign=top>
<TD>wcspbrk()</TD>
<TD>strpbrk()</TD>
<TD width=300>Find the first position in one string of any character from another string</TD></TR>
<TR vAlign=top>
<TD>wcswcs()/wcsstr()</TD>
<TD>strstr()</TD>
<TD width=300>Find the first occurrence of one string inside another</TD></TR>
<TR vAlign=top>
<TD>wcscspn()</TD>
<TD>strcspn()</TD>
<TD width=300>Return the length of the initial segment that contains no characters from the second string</TD></TR>
<TR vAlign=top>
<TD>wcsspn()</TD>
<TD>strspn()</TD>
<TD width=300>Return the length of the initial segment that consists only of characters from the second string</TD></TR>
<TR>
<TD>wcscpy()</TD>
<TD>strcpy()</TD>
<TD width=300>Copy a string</TD></TR>
<TR vAlign=top>
<TD>wcsncpy()</TD>
<TD>strncpy()</TD>
<TD width=300>Like wcscpy(), but with a limit on the number of characters copied</TD></TR>
<TR>
<TD>wcscmp()</TD>
<TD>strcmp()</TD>
<TD width=300>Compare two wide strings</TD></TR>
<TR vAlign=top>
<TD>wcsncmp()</TD>
<TD>strncmp()</TD>
<TD width=300>Like wcscmp(), but with a limit on the number of characters compared</TD></TR>
<TR>
<TD>wcslen()</TD>
<TD>strlen()</TD>
<TD width=300>Return the number of wide characters in a wide string</TD></TR>
<TR>
<TD>wcstok()</TD>
<TD>strtok()</TD>
<TD width=300>Split a wide string into a series of tokens at the given delimiters</TD></TR>
<TR>
<TD>wcswidth()</TD>
<TD>None</TD>
<TD width=300>Return the display width of a wide string</TD></TR>
<TR vAlign=top>
<TD>wcwidth()</TD>
<TD>None</TD>
<TD width=300>Return the display width of a wide character</TD></TR></TBODY></TABLE></P>
<P>There are also wide-character counterparts of the memory functions: wmemcpy(), wmemchr(), wmemcmp(), wmemmove(), and wmemset(). </P>
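<P>The wide-character functions in the tables above are used just like their char counterparts. As a minimal sketch, the hypothetical helper wide_upper below combines wcslen() and towupper() to upper-case a wide string in place; the helper name is an assumption made for illustration only.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: upper-case a wide string in place and return its
 * length in wide characters (not bytes). */
size_t wide_upper(wchar_t *s)
{
    assert(s != NULL);

    size_t n = wcslen(s);
    for (size_t i = 0; i < n; i++)
        s[i] = (wchar_t)towupper((wint_t)s[i]);
    return n;
}
```

<P>Which characters towupper() changes depends on the LC_CTYPE category of the current locale; in the default "C" locale only the ASCII letters are affected.</P>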
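<LI>X Window System functions that support Chinese</LI>
<TABLE border=2>
<TBODY>
<TR>
<TD>Function for Western text</TD>
<TD>Function that supports Chinese</TD>
<TD>Description</TD></TR>
<TR>
<TD>XLoadFont</TD>
<TD>XCreateFontSet</TD>
<TD>Load a font / create a font set</TD></TR>
<TR>
<TD>XTextExtents(16)</TD>
<TD>XmbTextExtents/XwcTextExtents <BR>XmbTextPerCharExtents/XwcTextPerCharExtents</TD>
<TD>Return the bounding box of the text</TD></TR>
<TR>
<TD>XDrawString</TD>
<TD>XmbDrawString/XwcDrawString</TD>
<TD>Draw a string in a window (foreground only)</TD></TR>
<TR>
<TD>XDrawImageString</TD>
<TD>XmbDrawImageString/XwcDrawImageString</TD>
<TD>Draw a string in a window, filling the background</TD></TR>
<TR>
<TD>XDrawText</TD>
<TD>XmbDrawText/XwcDrawText</TD>
<TD>Draw text items in a window</TD></TR>
<TR>
<TD>XLookupString</TD>
<TD>XmbLookupString/XwcLookupString</TD>
<TD>Look up the string for a key event</TD></TR></TBODY></TABLE><BR>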
<LI>High-level libraries that support internationalization</LI>
<UL>
<LI>OSF/Motif</LI>
<LI>Qt/kdelib</LI>
<LI>gtk+/gnome-lib</LI>
<LI>Perl</LI>
<LI>Java</LI></UL>
<LI>Typical multilingual software</LI>
<UL>
<LI>Browser: Netscape</LI>
<LI>Editor: XEmacs</LI>
<LI>Editor: Mule</LI>
<LI>Editor: vim</LI>
<LI>Terminal: rxvt</LI>
<LI>Typesetting: LaTeX/lyx</LI>
<LI>PostScript/PDF: gs/acroread</LI>
<LI>Image processing: gimp</LI>
<LI>Presentations: mgp</LI>
<LI>Nearing completion: StarOffice, KOffice</LI></UL>
<LI>Software that supports Unicode</LI>
<UL>
<LI>High-level GUI library: Qt 2.x</LI>
<LI>Java development kit: JDK</LI>
<LI>Editor: yudit</LI>
<LI>Dedicated Unicode-capable X terminals</LI>
<LI>GTK+-based text processor: GScript</LI></UL></OL><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-15 11:39 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4522.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Linux Unicode Programming</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4523.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Tue, 15 Nov 2005 03:39:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4523.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4523.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4523.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4523.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4523.html</trackback:ping><description><![CDATA[
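<P align=center><STRONG><SPAN class=atitle2><A href="http://www-128.ibm.com/developerworks/cn/linux/i18n/unicode/linuni/">How to add and use Unicode in your programs for foreign-language support</A></SPAN><BR></STRONG></P>
<P><NAME><STRONG>Author:&nbsp;&nbsp;&nbsp;&nbsp;Thomas W. Burger</STRONG></NAME><STRONG><BR></STRONG><BR><BR>Unicode is a multi-byte character representation system for computers that supports encoding and conversion for all the world's languages. This article explains why international language support matters for Linux applications and outlines how to plan for Unicode support and build it into them.</P>
<P>Unicode is not just a programming tool; it is also a political and economic one. Applications without support for the world's languages can generally be used only by people who read and write a language that ASCII supports. That cuts ASCII-based computing off from most of the world's people. Unicode lets programs use any of the world's character sets, and so it supports all languages.</P>
<P>Unicode lets programmers give ordinary people software they can use in their own language. Nobody has to learn a foreign language first, and the social and financial benefits of computing become easier to realize. It is easy to imagine that if users had to learn Urdu to use an Internet browser, computers would hardly be used in the United States, and the Web would never have taken off.</P>
<P>Linux has committed to extensive Unicode support. That support is built into the kernel and the development libraries, and for the most part a few simple calls in a program are enough to bring it into the code.</P>
<P>The basis of all modern character sets is the American Standard Code for Information Interchange (ASCII), published in 1968 as ANSI X3.4. A notable exception is IBM's Extended Binary Coded Decimal Interchange Code (EBCDIC), which was defined before ASCII. ASCII is a coded character set (CCS), that is, a mapping from integers to character representations. ASCII itself is a 7-bit code with 128 characters; an eight-bit byte (2^8 = 256 values) lets its 8-bit extensions represent up to 256 characters. This is a severely limited coded character set: it cannot represent all the characters of many languages (such as Chinese and Japanese), nor scientific symbols, let alone ancient scripts (runes and hieroglyphs) or musical notation. Changing the size of the byte to allow a larger character set might seem effective but is entirely impractical, because all computers are based on eight-bit bytes. The solution is a character encoding scheme (CES), which uses fixed-length or variable-length multi-byte sequences to represent numbers greater than 256. These values are then mapped through the coded character set to the characters they represent.</P>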
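<P><A name=1><SPAN class=atitle2><STRONG>Definition of Unicode</STRONG></SPAN></A><BR>Unicode is often used as a generic term for a double-byte character encoding scheme. The official name of the Unicode CCS 3.1 is ISO 10646-1 Universal Multiple-Octet Coded Character Set (UCS). Unicode 3.1 added 44,946 new coded characters; together with the 49,194 characters already in Unicode 3.0, that makes 94,140.</P>
<P>The Unicode coded character set uses a four-dimensional coding space made of 128 three-dimensional groups. Each group contains 256 two-dimensional planes, each plane consists of 256 one-dimensional rows, and each row has 256 cells. Each cell either encodes one character in this coding space or is declared unused. This coding concept is called UCS-4: four octets specify the group, plane, row, and cell of each character.</P>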
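<P>The group/plane/row/cell layout described above is plain bit arithmetic on a 32-bit value. The sketch below splits a UCS-4 code into its four octets; the struct and function names (ucs4_parts, ucs4_split) are hypothetical and chosen only for this illustration.</P>

```c
#include <assert.h>
#include <stdint.h>

/* The four octets of a UCS-4 value, most significant first. */
struct ucs4_parts {
    uint8_t group;  /* 0..127: only 128 groups exist */
    uint8_t plane;
    uint8_t row;
    uint8_t cell;
};

/* Split a UCS-4 code value into group, plane, row, and cell. */
struct ucs4_parts ucs4_split(uint32_t code)
{
    assert(code <= 0x7FFFFFFFu);   /* the top bit is never used */

    struct ucs4_parts p;
    p.group = (uint8_t)(code >> 24);
    p.plane = (uint8_t)((code >> 16) & 0xFF);
    p.row   = (uint8_t)((code >> 8) & 0xFF);
    p.cell  = (uint8_t)(code & 0xFF);
    return p;
}
```

<P>For a BMP character such as 0x2260, group and plane are both 0, which is why the row and cell octets alone are enough to identify it.</P>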
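<P>The first plane (plane 00 of group 00) is the Basic Multilingual Plane (BMP). The BMP defines characters in general use, organized as alphabetic, syllabic, and ideographic characters plus assorted symbols and digits. Later planes are for supplementary characters or other coded entities not yet invented. The full range is needed to handle all the world's languages, in particular some East Asian languages with nearly 64,000 characters.</P>
<P>The BMP is used as a double-byte coded character set, identified as the ISO 10646 UCS-2 form. ISO 10646 UCS-2 is what is usually meant by Unicode (the two are the same). Like every UCS plane, the BMP contains 256 rows of 256 cells each, and a character is encoded by just the row and cell octets within the BMP. This allows 16-bit coded characters to write most commercially important languages. UCS-2 needs no code-page switching, code extension, or code states. It is a simple way to build Unicode into software, but it is limited to the Unicode BMP.</P>
<P>Representing a coded character set (CCS) of more than 2^8 = 256 characters with 8-bit bytes requires a character encoding scheme (CES).</P>
<P><A name=2><SPAN class=atitle2><STRONG>Unicode transformation</STRONG></SPAN></A><BR>On UNIX, the most widely used character encoding scheme is UTF-8. It supports the whole of Unicode, all pages and planes, while still reading ASCII correctly. Alternatives to UTF-8 include UCS-4, UTF-16, UTF-7.5, UTF-7, SCSU, HTML, and Java.</P>
<P>Unicode Transformation Formats (UTFs) are character encoding schemes that support Unicode by mapping its values onto multi-byte encodings. This article looks at the most popular one, UTF-8.</P>
<P><A name=N1006A><SPAN class=atitle3><STRONG>UTF-8</STRONG></SPAN></A><BR>The UTF-8 transformation format is becoming the dominant way to exchange international text, because it supports all the world's languages and is also compatible with ASCII. UTF-8 is a variable-length encoding. Characters from 0 to 0x7F (127) are encoded as themselves in a single byte, and characters with larger values are encoded in 2 to 6 bytes. (RFC 3629 later restricted UTF-8 to at most 4 bytes, covering code points up to U+10FFFF.)</P>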
<P><A name=N10073><SPAN class=atitle3><STRONG>Table 1. UTF-8 encoding</STRONG></SPAN></A><BR>
<TABLE border=0>
<TBODY>
<TR>
<TD>0x00000000 - 0x0000007F:</TD>
<TD width=10><BR></TD>
<TD>0<I>xxxxxxx</I> </TD></TR>
<TR>
<TD>0x00000080 - 0x000007FF:</TD>
<TD width=10><BR></TD>
<TD>110<I>xxxxx</I> 10<I>xxxxxx</I> </TD></TR>
<TR>
<TD>0x00000800 - 0x0000FFFF:</TD>
<TD width=10><BR></TD>
<TD>1110<I>xxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> </TD></TR>
<TR>
<TD>0x00010000 - 0x001FFFFF:</TD>
<TD width=10><BR></TD>
<TD>11110<I>xxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> </TD></TR>
<TR>
<TD>0x00200000 - 0x03FFFFFF:</TD>
<TD width=10><BR></TD>
<TD>111110<I>xx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> </TD></TR>
<TR>
<TD>0x04000000 - 0x7FFFFFFF:</TD>
<TD width=10><BR></TD>
<TD>1111110<I>x</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> 10<I>xxxxxx</I> </TD></TR></TBODY></TABLE></P>
<P>A byte of the form 10<I>xxxxxx</I> is a continuation byte; its <I>xxxxxx</I> bit positions are filled with bits of the character's code number in binary. Only the shortest possible multi-byte sequence that can represent the code may be used. </P>
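<P>The bit layout in Table 1 translates directly into shifts and masks. The following is a minimal sketch of an encoder that follows the table as printed, including the historical 5- and 6-byte forms; the function name utf8_encode is hypothetical, and a production encoder would normally stop at 4 bytes and reject surrogate values.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code value following Table 1. Writes up to 6 bytes to out and
 * returns the number of bytes written, or 0 if code is above 0x7FFFFFFF. */
size_t utf8_encode(uint32_t code, unsigned char out[6])
{
    assert(out != NULL);

    size_t len;
    if (code <= 0x7F) {
        out[0] = (unsigned char)code;   /* ASCII encodes as itself */
        return 1;
    } else if (code <= 0x7FF) {
        len = 2;
    } else if (code <= 0xFFFF) {
        len = 3;
    } else if (code <= 0x1FFFFF) {
        len = 4;
    } else if (code <= 0x3FFFFFF) {
        len = 5;
    } else if (code <= 0x7FFFFFFF) {
        len = 6;
    } else {
        return 0;
    }

    /* Fill continuation bytes from the end: 10xxxxxx, six payload bits each. */
    for (size_t i = len - 1; i > 0; i--) {
        out[i] = (unsigned char)(0x80 | (code & 0x3F));
        code >>= 6;
    }
    /* Lead byte: len one-bits, a zero bit, then the remaining payload bits. */
    out[0] = (unsigned char)(((0xFFu << (8 - len)) & 0xFFu) | code);
    return len;
}
```

<P>Because each range in the table is tested in ascending order, the function always picks the shortest sequence that can hold the value.</P>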
<P><A name=N10110><SPAN class=atitle3><STRONG>UTF-8 encoding examples</STRONG></SPAN></A><BR>The Unicode copyright sign, character 0xA9 = 1010 1001, is encoded in UTF-8 as:</P>
<BLOCKQUOTE><FONT face="Times New Roman"><CODE>11000010 10101001 = 0xC2 0xA9</CODE> </FONT></BLOCKQUOTE>
<P>The "not equal to" sign, character 0x2260 = 0010 0010 0110 0000, is encoded as:</P>
<BLOCKQUOTE><FONT face="Times New Roman"><CODE>11100010 10001001 10100000 = 0xE2 0x89 0xA0</CODE> </FONT></BLOCKQUOTE>
<P>The original value can be recovered by taking the payload bits of the lead byte and of each <FONT face="Times New Roman"><CODE>continuation byte</CODE>: </FONT></P>
<BLOCKQUOTE><CODE><FONT face="Times New Roman">[1110]0010 [10]001001 [10]100000 <BR>0010 001001 100000 <BR>0010 0010 0110 0000 = 0x2260 </FONT></CODE></BLOCKQUOTE>
<P>The first byte defines the number of octets that follow; if it is 7F or less, it is the equivalent ASCII value. Every following octet has the form 10<I>xxxxxx</I>, which ensures that these bytes cannot be confused with ASCII values. </P>
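<P>The recovery steps shown above can be sketched in C as follows. The function name utf8_decode is hypothetical; the sketch handles sequences of 1 to 4 bytes only and, as an acknowledged simplification, does not reject overlong (non-shortest) forms.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at s, with n bytes available.
 * Stores the code value in *code and returns the sequence length,
 * or 0 if the bytes are truncated or malformed. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *code)
{
    assert(s != NULL && code != NULL);
    if (n == 0)
        return 0;

    size_t len;
    uint32_t value;
    if (s[0] < 0x80) {                    /* 0xxxxxxx: plain ASCII */
        *code = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx */
        len = 2; value = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx */
        len = 3; value = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx */
        len = 4; value = s[0] & 0x07;
    } else {
        return 0;   /* stray continuation byte or unsupported lead byte */
    }

    if (n < len)
        return 0;
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)        /* must be 10xxxxxx */
            return 0;
        value = (value << 6) | (s[i] & 0x3F);
    }
    *code = value;
    return len;
}
```

<P>The lead byte alone tells the decoder how many bytes to read, so a stream can be resynchronized after an error simply by skipping bytes of the form 10xxxxxx.</P>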
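<P><A name=3><SPAN class=atitle2><STRONG>UTF support</STRONG></SPAN></A><BR>Before using UTF-8 on Linux, make sure the distribution ships glibc 2.2 and XFree86 4.0 or later. Earlier versions lack UTF-8 locale support and ISO10646-1 X11 fonts.</P>
<P>Before UTF-8, Linux users relied on various language-specific extensions of ASCII: ISO 8859-1 or ISO 8859-2 in Europe, ISO 8859-7 in Greece, and KOI-8 / ISO 8859-5 / CP1251 (Cyrillic) in Russia. This caused many problems in data exchange and forced applications to be written around the differences among these encodings. Such language support was incomplete, and data exchange went untested. The major Linux distributors and application developers are working to make Unicode, mainly in UTF-8 form, the standard on Linux.</P>
<P>To identify Unicode files, Microsoft recommends that every Unicode file start with the ZERO WIDTH NO-BREAK SPACE (U+FEFF) character. It acts as a "signature" or "byte-order mark (BOM)" that identifies the encoding and byte order used in the file. Linux/UNIX, however, does not use the BOM, because it would break the syntax conventions of existing ASCII files. On POSIX systems, the selected locale already identifies the encoding expected of all input and output files of a process.</P>
<P>There are two ways to add UTF-8 support to a Linux application. In the first, data is kept in UTF-8 everywhere, so the software changes very little (the passive approach). In the second, UTF-8 data that is read in is converted into wide-character arrays with standard C library functions (the converting approach). On output, the strings are converted back to UTF-8 with the <FONT face="Times New Roman"><CODE>wcsrtombs()</CODE> function: </FONT></P>
<P><A name=listing1><B>Listing 1. wcsrtombs()</B></A><BR>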
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">#include &lt;wchar.h&gt; <BR>size_t wcsrtombs (char *dest, const wchar_t **src, size_t len, mbstate_t *ps);</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
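<P>As a hedged usage sketch for the prototype in Listing 1, the hypothetical wrapper below converts a wide string into a caller-supplied buffer in the multibyte encoding of the current locale. The wrapper name and its error convention are assumptions made for illustration; a real program would first call setlocale(LC_ALL, "") so that a UTF-8 locale, if selected, takes effect.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide string to the multibyte encoding of
 * the current locale. Returns the number of bytes written, not counting
 * the terminating NUL, or (size_t)-1 if a character cannot be converted
 * or the result (including the NUL) does not fit in dest. */
size_t wide_to_multibyte(const wchar_t *src, char *dest, size_t size)
{
    assert(src != NULL && dest != NULL && size > 0);

    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    const wchar_t *p = src;            /* wcsrtombs() advances this pointer */
    size_t n = wcsrtombs(dest, &p, size, &state);

    /* p is set to NULL only when the terminating NUL was converted too. */
    if (n == (size_t)-1 || p != NULL)
        return (size_t)-1;
    return n;
}
```

<P>In the default "C" locale this works for ASCII text only; with LANG set to a UTF-8 locale the same call produces UTF-8 output.</P>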
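<P>The choice of method depends on the nature of the application. Most applications can work with the passive approach, which is why UTF-8 is so popular on UNIX. Programs like <FONT face="Times New Roman"><CODE>cat</CODE> and <CODE>echo</CODE> need no modification: a byte stream is still just a byte stream, and nothing is done to it. ASCII characters and control codes do not change in a UTF-8 locale. </FONT></P>
<P>Programs that count characters by counting bytes need small changes. In UTF-8, the application must not count continuation bytes. If a UTF-8 locale is selected, calls to the C library's <FONT face="Times New Roman"><CODE>strlen(s)</CODE> function need to be replaced with the <CODE>mbstowcs()</CODE> function, which returns the number of characters when called as <CODE>mbstowcs(NULL, s, 0)</CODE>: </FONT></P>
<P><A name=N10178><B>Listing 2. The mbstowcs() function</B></A><BR>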
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">#include &lt;stdlib.h&gt;<BR>size_t mbstowcs(wchar_t *pwcs, const char *s, size_t n);</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
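<P>When the data is known to be UTF-8, the rule "do not count continuation bytes" can also be applied directly, without going through the locale machinery. The sketch below is one such locale-independent counter; utf8_strlen is a hypothetical name, and the function assumes its input is valid UTF-8.</P>

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in a NUL-terminated UTF-8 string by skipping
 * continuation bytes (10xxxxxx). Assumes the input is valid UTF-8. */
size_t utf8_strlen(const char *s)
{
    assert(s != NULL);

    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;    /* ASCII byte or lead byte starts a new character */
    }
    return count;
}
```

<P>Unlike mbstowcs(), this counter does not validate the input or depend on LC_CTYPE, so it suits data whose encoding is fixed by a file format or protocol rather than by the user's locale.</P>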
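<P><FONT face="Times New Roman">A common use of <CODE>strlen</CODE> is to estimate display width. Chinese and other ideographic characters occupy two columns. The <CODE>wcwidth()</CODE> function tests the display width of each character: </FONT></P>
<P><A name=listing3><B>Listing 3. The wcwidth() function</B></A><BR>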
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">#include &lt;wchar.h&gt; <BR>int wcwidth(wchar_t wc);</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
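<P><A name=4><SPAN class=atitle2><STRONG>C language support for Unicode</STRONG></SPAN></A><BR>Formally, starting with GNU glibc 2.2, the wchar_t type is used only for 32-bit ISO 10646 values, independent of the current locale. Applications are told this through the definition of the __STDC_ISO_10646__ macro, as required by ISO C99. The definition of __STDC_ISO_10646__ indicates that wchar_t is Unicode. Its exact value is a decimal constant of the form yyyymmL. For example, the definition:</P>
<P><A name=listing4><B>Listing 4. Indicating that wchar_t is Unicode</B></A><BR>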
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">#define __STDC_ISO_10646__ 200104L</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
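<P>indicates that values of type wchar_t are the coded representations of the characters defined by ISO/IEC 10646, along with all amendments and technical corrigenda as of the specified year and month.</P>
<P>The following example shows wchar_t in use: it relies on the macro to decide how to write a double quotation mark in portable ISO C99 code.</P>
<P><A name=listing5><B>Listing 5. Deciding how to write a double quotation mark</B></A><BR>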
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">#if __STDC_ISO_10646__  <BR>   printf("%lc", (wint_t)0x201c);  <BR>#else  <BR>   putchar('"');  <BR>#endif</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
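<P><A name=N101BE><SPAN class=atitle3><STRONG>Locales</STRONG></SPAN></A><BR>The proper way to activate UTF-8 is the POSIX locale mechanism. A locale is a configuration setting that describes the culture-specific conventions of software behavior. It covers the character encoding, date/time notation, sorting rules, and the measurement system. A locale name usually consists of an ISO 639-1 language code, an ISO 3166-1 country or region code, and an optional encoding name and other qualifiers. You can list all locales installed on the system (usually under /usr/lib/locale/) with the command <FONT face="Times New Roman"><CODE>locale -a</CODE>. </FONT></P>
<P>If no UTF-8 locale is preinstalled, you can generate one with the <FONT face="Times New Roman"><CODE>localedef</CODE> command. To generate and activate a German UTF-8 locale for one particular user, use the following statements: </FONT></P>
<P><A name=listing6><B>Listing 6. Generating a locale for a particular user</B></A><BR>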
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">localedef -v -c -i de_DE -f UTF-8 $HOME/local/locale/de_DE.UTF-8<BR>export LOCPATH=$HOME/local/locale<BR>export LANG=de_DE.UTF-8</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
<P>Sometimes it is useful to add a UTF-8 locale for all users. The root user can do this with the following command:</P>
<P><A name=listing7><B>Listing 7. Generating a locale for every user</B></A><BR>
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">localedef -v -c -i de_DE -f UTF-8 /usr/share/locale/de_DE.UTF-8</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
<P>To make this locale the default for every user, add the following line to the /etc/profile file:</P>
<P><A name=listing8><B>Listing 8. Setting the default locale for all users</B></A><BR>
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">export LANG=de_DE.UTF-8</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
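<P>The behavior of the functions that handle multibyte character code sequences depends on the LC_CTYPE category of the current locale, which determines the locale-dependent multibyte encoding. The value LANG=de_DE (German) causes output to be formatted as ISO 8859-1. The value LANG=de_DE.UTF-8 causes output to be formatted as UTF-8. The locale setting causes the <FONT face="Times New Roman"><CODE>%ls</CODE> format specifier in <CODE>printf</CODE> to call the <CODE>wcsrtombs()</CODE> function in order to convert the wide-character argument string into the locale-dependent multibyte encoding. Locales that differ only in their country or region identifier, such as LC_CTYPE=en_GB (British English) and LC_CTYPE=en_AU (Australian English), differ only in the LC_MONETARY category, because the currency names and the rules for printing monetary amounts are different. </FONT></P>
<P>Set the environment variable LANG to your preferred locale. When a C program executes the <FONT face="Times New Roman"><CODE>setlocale()</CODE> function: </FONT></P>
<P><A name=listing9><B>Listing 9. The setlocale() function</B></A><BR>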
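<P>The conversion that <CODE>%ls</CODE> performs can also be requested directly. The sketch below wraps <CODE>wcsrtombs()</CODE> in a hypothetical helper, <CODE>wide_to_multibyte</CODE>, that treats a truncated result as an error. It assumes the caller has already selected a locale; in the default "C" locale only ASCII can be expected to convert.</P>

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Convert a wide string to the locale's multibyte encoding.
 * Returns the number of bytes written (excluding the terminator),
 * or (size_t)-1 if a character cannot be represented or the result,
 * including its terminator, does not fit in dst.
 */
size_t
wide_to_multibyte(char *dst, size_t dstsize, const wchar_t *src)
{
	mbstate_t state;
	size_t n;

	memset(&state, 0, sizeof(state));
	n = wcsrtombs(dst, &src, dstsize, &state);
	/* wcsrtombs() sets src to NULL only after writing the terminator. */
	if (n == (size_t)-1 || src != NULL)
		return (size_t)-1;
	return n;
}
```

<P>Rejecting truncated output keeps the caller from ever seeing an unterminated buffer, at the cost of not returning partial results.</P>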
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">#include &lt;stdio.h&gt;<BR>#include &lt;locale.h&gt;<BR>//char *setlocale(int category, const char *locale);<BR>int main(void)<BR>{<BR>  if (!setlocale(LC_CTYPE, "")) <BR>  {<BR>    fprintf(stderr, "Locale not specified. Check LANG, LC_CTYPE, LC_ALL.\n");<BR>    return 1;<BR>  }<BR>  return 0;<BR>}</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
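<P>the C library tests the environment variables LC_ALL, LC_CTYPE, and LANG, in that order. The first of them that has a value determines which locale data is loaded for the LC_CTYPE category. Locale data is split into separate categories: LC_CTYPE defines the character encoding, and LC_COLLATE defines the sort order. The LANG environment variable sets the default locale for all categories, but the LC_* variables can be used to override individual categories.</P>
<P>You can query the name of the character encoding of the current locale with the command <FONT face="Times New Roman"><CODE>locale charmap</CODE>. It prints UTF-8 if you have successfully selected a UTF-8 locale for the LC_CTYPE category. The command <CODE>locale -m</CODE> lists the names of all installed character encodings. </FONT></P>
<P>If you use only the C library's multibyte functions for all conversions between the external character encoding and the wchar_t encoding used internally, the C library takes responsibility for using the right encoding according to LC_CTYPE. The program does not even need to know explicitly what the current multibyte encoding is.</P>
<P>If an application needs to support UTF-8 (or another encoding) explicitly, with its own conversion methods instead of the libc multibyte functions, it must determine whether UTF-8 mode needs to be activated. X/Open-compliant systems that provide the &lt;langinfo.h&gt; header can use the following code:</P>
<P><A name=listing10><B>Listing 10. Detecting whether the current locale uses the UTF-8 encoding</B></A><BR>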
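<P>The lookup order LC_ALL, LC_CTYPE, LANG can be sketched as a small pure function. The names <CODE>pick_ctype_locale</CODE> and <CODE>env_ctype_locale</CODE> are hypothetical, and the fallback to "C" when nothing is set is the POSIX default rather than something taken from the text above.</P>

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Return the first of the three values that is set and non-empty,
 * mirroring the LC_ALL > LC_CTYPE > LANG precedence. Falls back to
 * "C" when none is usable.
 */
const char *
pick_ctype_locale(const char *lc_all, const char *lc_ctype, const char *lang)
{
	if (lc_all != NULL && *lc_all != '\0')
		return lc_all;
	if (lc_ctype != NULL && *lc_ctype != '\0')
		return lc_ctype;
	if (lang != NULL && *lang != '\0')
		return lang;
	return "C";
}

/* Apply the rule to the process environment. */
const char *
env_ctype_locale(void)
{
	return pick_ctype_locale(getenv("LC_ALL"), getenv("LC_CTYPE"),
	    getenv("LANG"));
}
```

<P>This only reports which name would win; loading the locale is still the job of <CODE>setlocale(LC_CTYPE, "")</CODE>.</P>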
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">BOOL utf8_mode = FALSE;<BR><BR>if (!strcmp(nl_langinfo(CODESET), "UTF-8"))<BR>   utf8_mode = TRUE;</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
<P>This code detects whether the current locale uses the UTF-8 encoding. You must first call <FONT face="Times New Roman"><CODE>setlocale(LC_CTYPE, "")</CODE> to set the locale according to the environment variables. The <CODE>locale charmap</CODE> command also calls nl_langinfo(CODESET) to find the encoding name specified by the current locale. </FONT></P>
<P>Another approach is to query the locale environment variables:</P>
<P><A name=listing11><B>Listing 11. Querying the locale environment variables</B></A><BR>
<TABLE cellSpacing=0 cellPadding=5 width="100%" bgColor=#cccccc border=1>
<TBODY>
<TR>
<TD><PRE><CODE><FONT face="Times New Roman">char *s;<BR>BOOL utf8_mode = FALSE;<BR><BR>if ((s = getenv("LC_ALL")) || (s = getenv("LC_CTYPE")) || (s = getenv ("LANG"))) <BR><BR>{<BR>   if (strstr(s, "UTF-8"))<BR>      utf8_mode = TRUE;<BR>}</FONT></CODE></PRE></TD></TR></TBODY></TABLE></P>
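<P>This test assumes that the name of every UTF-8 locale contains the string "UTF-8", which is not always the case, so the <FONT face="Times New Roman"><CODE>nl_langinfo()</CODE> method should be used instead. </FONT></P>
<P><A name=5><SPAN class=atitle2><STRONG>Summary</STRONG></SPAN></A><BR>To support all of the world's languages, a character coding system is needed that uses an eight-bit-byte encoding strategy and offers more characters than the 2^8 = 256 of ASCII (in its extended version, which uses the unsigned byte). Unicode is such a character coding system: it has a four-dimensional coding space made up of 128 three-dimensional groups, with 94,140 defined character values, and it is supported by a number of character encoding schemes, of which the Unicode Transformation Format UTF-8 is the more popular under Linux.</P>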
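<P>A possible way to combine the <CODE>nl_langinfo()</CODE> check with a tolerant name comparison is sketched below. Both function names are hypothetical, and accepting spellings such as "utf8" is an assumption about what some systems report, not something the text above states.</P>

```c
#include <ctype.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/*
 * Compare a codeset name with "UTF-8", ignoring case and hyphens,
 * so that "UTF-8", "utf8" and "UTF8" all match.
 */
int
is_utf8_codeset(const char *name)
{
	static const char want[] = "UTF8";
	size_t i = 0;

	if (name == NULL)
		return 0;
	for (; *name != '\0'; ++name) {
		if (*name == '-')
			continue;
		if (want[i] == '\0' ||
		    toupper((unsigned char)*name) != want[i])
			return 0;
		++i;
	}
	return want[i] == '\0';
}

/* Adopt the environment's locale, then ask libc for its codeset. */
int
locale_is_utf8(void)
{
	if (setlocale(LC_CTYPE, "") == NULL)
		return 0;
	return is_utf8_codeset(nl_langinfo(CODESET));
}
```

<P>Unlike the <CODE>strstr()</CODE> test on environment variables, this asks the C library which encoding it actually selected, so a locale whose name omits the encoding is still classified correctly.</P>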
<P><A name=author1></A><STRONG><SPAN class=atitle2>About the author</SPAN><BR></STRONG>TW Burger has been programming, teaching secondary-level computer courses, and writing about computer technology since 1979. He runs an information technology consulting company. You can contact him at <A href="mailto:twburger@bigfoot.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dw="http://www.ibm.com/developerworks/" xmlns:h="http://www.w3.org/1999/xhtml">twburger@bigfoot.com</A>. </P>
<P></P><img src ="http://www.cnitblog.com/SpiWolf/aggbug/4523.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-15 11:39 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4523.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>NetBSD Code Style</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4521.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Tue, 15 Nov 2005 03:38:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4521.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4521.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4521.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4521.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4521.html</trackback:ping><description><![CDATA[<!--StartFragment -->&nbsp;
<P>/* $NetBSD: style,v 1.36 2005/08/25 17:51:58 briggs Exp $ */</P>
<P>/*<BR>&nbsp;* The revision control tag appears first, with a blank line after it.<BR>&nbsp;* Copyright text appears after the revision control tag.<BR>&nbsp;*/</P>
<P>/*<BR>&nbsp;* The NetBSD source code style guide.<BR>&nbsp;* (Previously known as KNF - Kernel Normal Form).<BR>&nbsp;*<BR>&nbsp;*&nbsp;from: @(#)style&nbsp;1.12 (Berkeley) 3/18/94<BR>&nbsp;*/<BR>/*<BR>&nbsp;* An indent(1) profile approximating the style outlined in<BR>&nbsp;* this document lives in /usr/share/misc/indent.pro.&nbsp; It is a<BR>&nbsp;* useful tool to assist in converting code to KNF, but indent(1)<BR>&nbsp;* output generated using this profile must not be considered to<BR>&nbsp;* be an authoritative reference.<BR>&nbsp;*/</P>
<P>/*<BR>&nbsp;* Source code revision control identifiers appear after any copyright<BR>&nbsp;* text.&nbsp; Use the appropriate macros from &lt;sys/cdefs.h&gt;.&nbsp; Usually only one<BR>&nbsp;* source file per program contains a __COPYRIGHT() section.<BR>&nbsp;* Historic Berkeley code may also have an __SCCSID() section.<BR>&nbsp;* Only one instance of each of these macros can occur in each file.<BR>&nbsp;*/<BR>#include &lt;sys/cdefs.h&gt;<BR>__COPYRIGHT("@(#) Copyright (c) 2000\n\<BR>&nbsp;The NetBSD Foundation, inc. All rights reserved.\n");<BR>__RCSID("$NetBSD: style,v 1.36 2005/08/25 17:51:58 briggs Exp $");</P>
<P>/*<BR>&nbsp;* VERY important single-line comments look like this.<BR>&nbsp;*/</P>
<P>/* Most single-line comments look like this. */</P>
<P>/*<BR>&nbsp;* Multi-line comments look like this.&nbsp; Make them real sentences.&nbsp; Fill<BR>&nbsp;* them so they look like real paragraphs.<BR>&nbsp;*/</P>
<P>/*<BR>&nbsp;* Attempt to wrap lines longer than 80 characters appropriately.<BR>&nbsp;* Refer to the examples below for more information.<BR>&nbsp;*/</P>
<P>/*<BR>&nbsp;* EXAMPLE HEADER FILE:<BR>&nbsp;*<BR>&nbsp;* A header file should protect itself against multiple inclusion.<BR>&nbsp;* E.g, &lt;sys/socket.h&gt; would contain something like:<BR>&nbsp;*/<BR>#ifndef _SYS_SOCKET_H_<BR>#define _SYS_SOCKET_H_<BR>/*<BR>&nbsp;* Contents of #include file go between the #ifndef and the #endif at the end.<BR>&nbsp;*/<BR>#endif /* !_SYS_SOCKET_H_ */<BR>/*<BR>&nbsp;* END OF EXAMPLE HEADER FILE.<BR>&nbsp;*/</P>
<P>/*<BR>&nbsp;* Kernel include files come first.<BR>&nbsp;*/<BR>#include &lt;sys/types.h&gt;&nbsp;&nbsp;/* Non-local includes in brackets. */</P>
<P>/*<BR>&nbsp;* If it's a network program, put the network include files next.<BR>&nbsp;* Group the include files by subdirectory.<BR>&nbsp;*/<BR>#include &lt;net/if.h&gt;<BR>#include &lt;net/if_dl.h&gt;<BR>#include &lt;net/route.h&gt;<BR>#include &lt;netinet/in.h&gt;<BR>#include &lt;protocols/rwhod.h&gt;</P>
<P>/*<BR>&nbsp;* Then there's a blank line, followed by the /usr include files.<BR>&nbsp;* The /usr include files should be sorted!<BR>&nbsp;*/<BR>#include &lt;assert.h&gt;<BR>#include &lt;errno.h&gt;<BR>#include &lt;inttypes.h&gt;<BR>#include &lt;stdio.h&gt;<BR>#include &lt;stdlib.h&gt;</P>
<P>/*<BR>&nbsp;* Global pathnames are defined in /usr/include/paths.h.&nbsp; Pathnames local<BR>&nbsp;* to the program go in pathnames.h in the local directory.<BR>&nbsp;*/<BR>#include &lt;paths.h&gt;</P>
<P>/* Then, there's a blank line, and the user include files. */<BR>#include "pathnames.h"&nbsp;&nbsp;/* Local includes in double quotes. */</P>
<P>/*<BR>&nbsp;* ANSI function declarations for private functions (i.e. functions not used<BR>&nbsp;* elsewhere) and the main() function go at the top of the source module. <BR>&nbsp;* Don't associate a name with the types.&nbsp; I.e. use:<BR>&nbsp;*&nbsp;void function(int);<BR>&nbsp;* Use your discretion on indenting between the return type and the name, and<BR>&nbsp;* how to wrap a prototype too long for a single line.&nbsp; In the latter case,<BR>&nbsp;* lining up under the initial left parenthesis may be more readable.<BR>&nbsp;* In any case, consistency is important!<BR>&nbsp;*/<BR>static char *function(int, int, float, int);<BR>static int dirinfo(const char *, struct stat *, struct dirent *,<BR>&nbsp;&nbsp;&nbsp;&nbsp; struct statfs *, int *, char **[]);<BR>static void usage(void);<BR>int main(int, char *[]);</P>
<P>/*<BR>&nbsp;* Macros are capitalized, parenthesized, and should avoid side-effects.<BR>&nbsp;* Spacing before and after the macro name may be any whitespace, though<BR>&nbsp;* use of TABs should be consistent through a file.<BR>&nbsp;* If they are an inline expansion of a function, the function is defined<BR>&nbsp;* all in lowercase, the macro has the same name all in uppercase.<BR>&nbsp;* If the macro is an expression, wrap the expression in parentheses.<BR>&nbsp;* If the macro is more than a single statement, use ``do { ... } while (0)'',<BR>&nbsp;* so that a trailing semicolon works.&nbsp; Right-justify the backslashes; it<BR>&nbsp;* makes it easier to read. The CONSTCOND comment is to satisfy lint(1).<BR>&nbsp;*/<BR>#define&nbsp;MACRO(v, w, x, y)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\<BR>do {&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\<BR>&nbsp;v = (x) + (y);&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\<BR>&nbsp;w = (y) + 2;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\<BR>} while (/* CONSTCOND */ 0)</P>
<P>#define&nbsp;DOUBLE(x) ((x) * 2)</P>
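<P>A small sketch of why the ``do { ... } while (0)'' wrapper matters: it lets a multi-statement macro sit in an unbraced if/else with its trailing semicolon. SWAP_INT and order_pair are hypothetical names used only for illustration.</P>

```c
/* Multi-statement macro wrapped so it behaves like one statement. */
#define	SWAP_INT(a, b)						\
do {								\
	int swap_tmp_ = (a);					\
	(a) = (b);						\
	(b) = swap_tmp_;					\
} while (/* CONSTCOND */ 0)

/*
 * Put two ints in ascending order.  With a bare { ... } block instead
 * of do/while, the semicolon after the macro call would end the if
 * statement and the else below would not compile.
 */
void
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);
	else
		return;
}
```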
<P>/* Enum types are capitalized.&nbsp; No comma on the last element. */<BR>enum enumtype {<BR>&nbsp;ONE,<BR>&nbsp;TWO<BR>} et;</P>
<P>/*<BR>&nbsp;* When declaring variables in structures, declare them organized by use in<BR>&nbsp;* a manner to attempt to minimize memory wastage because of compiler alignment<BR>&nbsp;* issues, then by size, and then by alphabetical order. E.g, don't use<BR>&nbsp;* ``int a; char *b; int c; char *d''; use ``int a; int b; char *c; char *d''.<BR>&nbsp;* Each variable gets its own type and line, although an exception can be made<BR>&nbsp;* when declaring bitfields (to clarify that it's part of the one bitfield).<BR>&nbsp;* Note that the use of bitfields in general is discouraged.<BR>&nbsp;*<BR>&nbsp;* Major structures should be declared at the top of the file in which they<BR>&nbsp;* are used, or in separate header files, if they are used in multiple<BR>&nbsp;* source files.&nbsp; Use of the structures should be by separate declarations<BR>&nbsp;* and should be "extern" if they are declared in a header file.<BR>&nbsp;*<BR>&nbsp;* It may be useful to use a meaningful prefix for each member name.<BR>&nbsp;* E.g, for ``struct softc'' the prefix could be ``sc_''.<BR>&nbsp;*/<BR>struct foo {<BR>&nbsp;struct foo *next;&nbsp;/* List of active foo */<BR>&nbsp;struct mumble amumble;&nbsp;/* Comment for mumble */<BR>&nbsp;int bar;<BR>&nbsp;unsigned int baz:1,&nbsp;/* Bitfield; line up entries if desired */<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; fuz:5,<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; zap:2;<BR>&nbsp;uint8_t flag;<BR>};<BR>struct foo *foohead;&nbsp;&nbsp;/* Head of global foo list */</P>
<P>/* Make the structure name match the typedef. */<BR>typedef struct BAR {<BR>&nbsp;int level;<BR>} BAR;</P>
<P>/* C99 uintN_t is preferred over u_intN_t. */<BR>uint32_t zero;</P>
<P>/*<BR>&nbsp;* All major routines should have a comment briefly describing what<BR>&nbsp;* they do.&nbsp; The comment before the "main" routine should describe<BR>&nbsp;* what the program does.<BR>&nbsp;*/<BR>int<BR>main(int argc, char *argv[])<BR>{<BR>&nbsp;long num;<BR>&nbsp;int ch;<BR>&nbsp;char *ep;</P>
<P>&nbsp;/*<BR>&nbsp; * At the start of main(), call setprogname() to set the program<BR>&nbsp; * name.&nbsp; This does nothing on NetBSD, but increases portability<BR>&nbsp; * to other systems.<BR>&nbsp; */<BR>&nbsp;setprogname(argv[0]);</P>
<P>&nbsp;/*<BR>&nbsp; * For consistency, getopt should be used to parse options.&nbsp; Options<BR>&nbsp; * should be sorted in the getopt call and the switch statement, unless<BR>&nbsp; * parts of the switch cascade.&nbsp; Elements in a switch statement that<BR>&nbsp; * cascade should have a FALLTHROUGH comment.&nbsp; Numerical arguments<BR>&nbsp; * should be checked for accuracy.&nbsp; Code that cannot be reached should<BR>&nbsp; * have a NOTREACHED comment.<BR>&nbsp; */<BR>&nbsp;while ((ch = getopt(argc, argv, "abn")) != -1) {<BR>&nbsp;&nbsp;switch (ch) {&nbsp;&nbsp;/* Indent the switch. */<BR>&nbsp;&nbsp;case 'a':&nbsp;&nbsp;/* Don't indent the case. */<BR>&nbsp;&nbsp;&nbsp;aflag = 1;<BR>&nbsp;&nbsp;&nbsp;/* FALLTHROUGH */<BR>&nbsp;&nbsp;case 'b':<BR>&nbsp;&nbsp;&nbsp;bflag = 1;<BR>&nbsp;&nbsp;&nbsp;break;<BR>&nbsp;&nbsp;case 'n':<BR>&nbsp;&nbsp;&nbsp;errno = 0;<BR>&nbsp;&nbsp;&nbsp;num = strtol(optarg, &amp;ep, 10);<BR>&nbsp;&nbsp;&nbsp;if (num &lt;= 0 || *ep != '\0' || (errno == ERANGE &amp;&amp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (num == LONG_MAX || num == LONG_MIN)) )<BR>&nbsp;&nbsp;&nbsp;&nbsp;errx(1, "illegal number -- %s", optarg);<BR>&nbsp;&nbsp;&nbsp;break;<BR>&nbsp;&nbsp;case '?':<BR>&nbsp;&nbsp;default:<BR>&nbsp;&nbsp;&nbsp;usage();<BR>&nbsp;&nbsp;&nbsp;/* NOTREACHED */<BR>&nbsp;&nbsp;}<BR>&nbsp;}<BR>&nbsp;argc -= optind;<BR>&nbsp;argv += optind;</P>
<P>&nbsp;/*<BR>&nbsp; * Space after keywords (while, for, return, switch).&nbsp; No braces are<BR>&nbsp; * used for control statements with zero or only a single statement,<BR>&nbsp; * unless it's a long statement.<BR>&nbsp; *<BR>&nbsp; * Forever loops are done with for's, not while's.<BR>&nbsp; */<BR>&nbsp;for (p = buf; *p != '\0'; ++p)<BR>&nbsp;&nbsp;continue;&nbsp;&nbsp;/* Explicit no-op */<BR>&nbsp;for (;;)<BR>&nbsp;&nbsp;stmt;</P>
<P>&nbsp;/*<BR>&nbsp; * Parts of a for loop may be left empty.&nbsp; Don't put declarations<BR>&nbsp; * inside blocks unless the routine is unusually complicated.<BR>&nbsp; */<BR>&nbsp;for (; cnt &lt; 15; cnt++) {<BR>&nbsp;&nbsp;stmt1;<BR>&nbsp;&nbsp;stmt2;<BR>&nbsp;}</P>
<P>&nbsp;/* Second level indents are four spaces. */<BR>&nbsp;while (cnt &lt; 20)<BR>&nbsp;&nbsp;z = a + really + long + statement + that + needs + two lines +<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; gets + indented + four + spaces + on + the + second +<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; and + subsequent + lines;</P>
<P>&nbsp;/*<BR>&nbsp; * Closing and opening braces go on the same line as the else.<BR>&nbsp; * Don't add braces that aren't necessary except in cases where<BR>&nbsp; * there are ambiguity or readability issues.<BR>&nbsp; */<BR>&nbsp;if (test) {<BR>&nbsp;&nbsp;/*<BR>&nbsp;&nbsp; * I have a long comment here.<BR>&nbsp;&nbsp; */<BR>#ifdef zorro<BR>&nbsp;&nbsp;z = 1;<BR>#else<BR>&nbsp;&nbsp;b = 3;<BR>#endif<BR>&nbsp;} else if (bar) {<BR>&nbsp;&nbsp;stmt;<BR>&nbsp;&nbsp;stmt;<BR>&nbsp;} else<BR>&nbsp;&nbsp;stmt;</P>
<P>&nbsp;/* No spaces after function names. */<BR>&nbsp;if ((result = function(a1, a2, a3, a4)) == NULL)<BR>&nbsp;&nbsp;exit(1);</P>
<P>&nbsp;/*<BR>&nbsp; * Unary operators don't require spaces, binary operators do.<BR>&nbsp; * Don't use parentheses excessively, but they should be used if a<BR>&nbsp; * statement is really confusing without them, such as:<BR>&nbsp; * a = b-&gt;c[0] + ~d == (e || f) || g &amp;&amp; h ? i : j &gt;&gt; 1;<BR>&nbsp; */<BR>&nbsp;a = ((b-&gt;c[0] + ~d == (e || f)) || (g &amp;&amp; h)) ? i : (j &gt;&gt; 1);<BR>&nbsp;k = !(l &amp; FLAGS);</P>
<P>&nbsp;/*<BR>&nbsp; * Exits should be EXIT_SUCCESS on success, and EXIT_FAILURE on<BR>&nbsp; * failure.&nbsp; Don't denote all the possible exit points, using the<BR>&nbsp; * integers 1 through 127.&nbsp; Avoid obvious comments such as "Exit<BR>&nbsp; * 0 on success.". Since main is a function that returns an int,<BR>&nbsp; * prefer returning from it to calling exit.<BR>&nbsp; */<BR>&nbsp;return EXIT_SUCCESS;<BR>}</P>
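<P>The numeric-argument check used for the 'n' option in main() above could be factored into a helper along these lines. This is a sketch, not part of the style file; parse_positive_long is a hypothetical name, and the extra test for an empty string is an addition.</P>

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a strictly positive decimal long.  Returns 0 and stores the
 * value in *out on success; returns -1 on empty input, trailing
 * garbage, a non-positive value, or overflow.
 */
int
parse_positive_long(const char *s, long *out)
{
	char *ep;
	long num;

	errno = 0;
	num = strtol(s, &ep, 10);
	if (ep == s || *ep != '\0')
		return -1;
	if (errno == ERANGE && (num == LONG_MAX || num == LONG_MIN))
		return -1;
	if (num <= 0)
		return -1;
	*out = num;
	return 0;
}
```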
<P>/*<BR>&nbsp;* The function type must be declared on a line by itself<BR>&nbsp;* preceding the function.<BR>&nbsp;*/<BR>static char *<BR>function(int a1, int a2, float fl, int a4)<BR>{<BR>&nbsp;/*<BR>&nbsp; * When declaring variables in functions declare them sorted by size,<BR>&nbsp; * then in alphabetical order; multiple ones per line are okay.<BR>&nbsp; * Function prototypes should go in the include file "extern.h".<BR>&nbsp; * If a line overflows reuse the type keyword.<BR>&nbsp; *<BR>&nbsp; * DO NOT initialize variables in the declarations.<BR>&nbsp; */<BR>&nbsp;extern u_char one;<BR>&nbsp;extern char two;<BR>&nbsp;struct foo three, *four;<BR>&nbsp;double five;<BR>&nbsp;int *six, seven;<BR>&nbsp;char *eight, *nine, ten, eleven, twelve, thirteen;<BR>&nbsp;char fourteen, fifteen, sixteen;</P>
<P>&nbsp;/*<BR>&nbsp; * Casts and sizeof's are not followed by a space.&nbsp; NULL is any<BR>&nbsp; * pointer type, and doesn't need to be cast, so use NULL instead<BR>&nbsp; * of (struct foo *)0 or (struct foo *)NULL.&nbsp; Also, test pointers<BR>&nbsp; * against NULL.&nbsp; I.e. use:<BR>&nbsp; *<BR>&nbsp; *&nbsp;(p = f()) == NULL<BR>&nbsp; * not:<BR>&nbsp; *&nbsp;!(p = f())<BR>&nbsp; *<BR>&nbsp; * Don't use `!' for tests unless it's a boolean.<BR>&nbsp; * E.g. use "if (*p == '\0')", not "if (!*p)".<BR>&nbsp; *<BR>&nbsp; * Routines returning ``void *'' should not have their return<BR>&nbsp; * values cast to more specific pointer types.<BR>&nbsp; *<BR>&nbsp; * Use err/warn(3), don't roll your own!<BR>&nbsp; */<BR>&nbsp;if ((four = malloc(sizeof(struct foo))) == NULL)<BR>&nbsp;&nbsp;err(1, NULL);<BR>&nbsp;if ((six = (int *)overflow()) == NULL)<BR>&nbsp;&nbsp;errx(1, "Number overflowed.");</P>
<P>&nbsp;/* No parentheses are needed around the return value. */<BR>&nbsp;return eight;<BR>}</P>
<P>/*<BR>&nbsp;* Use ANSI function declarations.&nbsp; ANSI function braces look like<BR>&nbsp;* old-style (K&amp;R) function braces.<BR>&nbsp;* As per the wrapped prototypes, use your discretion on how to format<BR>&nbsp;* the subsequent lines.<BR>&nbsp;*/<BR>static int<BR>dirinfo(const char *p, struct stat *sb, struct dirent *de, struct statfs *sf,<BR>&nbsp;int *rargc, char **rargv[])<BR>{&nbsp;/* Insert an empty line if the function has no local variables. */</P>
<P>&nbsp;/*<BR>&nbsp; * In system libraries, catch obviously invalid function arguments<BR>&nbsp; * using _DIAGASSERT(3).<BR>&nbsp; */<BR>&nbsp;_DIAGASSERT(p != NULL);<BR>&nbsp;_DIAGASSERT(filedesc != -1);</P>
<P>&nbsp;if (stat(p, sb) &lt; 0)<BR>&nbsp;&nbsp;err(1, "Unable to stat %s", p);</P>
<P>&nbsp;/*<BR>&nbsp; * To printf quantities that might be larger than "long", include<BR>&nbsp; * &lt;inttypes.h&gt;, cast quantities to intmax_t or uintmax_t and use<BR>&nbsp; * PRI?MAX constants, which may be found in &lt;machine/int_fmtio.h&gt;.<BR>&nbsp; */<BR>&nbsp;(void)printf("The size of %s is %" PRIdMAX " (%#" PRIxMAX ")\n", p,<BR>&nbsp;&nbsp;&nbsp;&nbsp; (intmax_t)sb-&gt;st_size, (uintmax_t)sb-&gt;st_size);</P>
<P>&nbsp;/*<BR>&nbsp; * To printf quantities of known bit-width, use the corresponding<BR>&nbsp; * defines (generally only done within NetBSD for quantities that<BR>&nbsp; * exceed 32-bits).<BR>&nbsp; */<BR>&nbsp;(void)printf("%s uses %" PRId64 " blocks and has flags %#" PRIx32 "\n",<BR>&nbsp;&nbsp;&nbsp;&nbsp; p, sb-&gt;st_blocks, sb-&gt;st_flags);</P>
<P>&nbsp;/*<BR>&nbsp; * There are similar constants that should be used with the *scanf(3)<BR>&nbsp; * family of functions: SCN?MAX, SCN?64, etc.<BR>&nbsp; */<BR>}</P>
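<P>A self-contained sketch of the PRI?MAX advice from dirinfo() above, using snprintf(3) so the result can be inspected. format_size is a hypothetical name, not something the style file defines.</P>

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a size as decimal and hexadecimal into buf.  Returns the
 * snprintf(3) result: the length the full text needs, or a negative
 * value on error.
 */
int
format_size(char *buf, size_t buflen, intmax_t size)
{
	return snprintf(buf, buflen, "%" PRIdMAX " (%#" PRIxMAX ")",
	    size, (uintmax_t)size);
}
```

<P>Casting at the call site, as in format_size(buf, sizeof(buf), (intmax_t)sb-&gt;st_size), keeps the format string independent of how wide the underlying type is on a given platform.</P>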
<P>/*<BR>&nbsp;* Functions that support variable numbers of arguments should look like this.<BR>&nbsp;* (With the #include &lt;stdarg.h&gt; appearing at the top of the file with the<BR>&nbsp;* other include files).<BR>&nbsp;*/<BR>#include &lt;stdarg.h&gt;</P>
<P>void<BR>vaf(const char *fmt, ...)<BR>{<BR>&nbsp;va_list ap;</P>
<P>&nbsp;va_start(ap, fmt);<BR>&nbsp;STUFF;<BR>&nbsp;va_end(ap);&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;/* No return needed for void functions. */<BR>}</P>
<P>static void<BR>usage(void)<BR>{</P>
<P>&nbsp;/*<BR>&nbsp; * Use printf(3), not fputs/puts/putchar/whatever, it's faster and<BR>&nbsp; * usually cleaner, not to mention avoiding stupid bugs.<BR>&nbsp; * Use snprintf(3) or strlcpy(3)/strlcat(3) instead of sprintf(3);<BR>&nbsp; * again to avoid stupid bugs.<BR>&nbsp; *<BR>&nbsp; * Usage statements should look like the manual pages.&nbsp; Options w/o<BR>&nbsp; * operands come first, in alphabetical order inside a single set of<BR>&nbsp; * braces.&nbsp; Followed by options with operands, in alphabetical order,<BR>&nbsp; * each in braces.&nbsp; Followed by required arguments in the order they<BR>&nbsp; * are specified, followed by optional arguments in the order they<BR>&nbsp; * are specified.&nbsp; A bar (`|') separates either/or options/arguments,<BR>&nbsp; * and multiple options/arguments which are specified together are<BR>&nbsp; * placed in a single set of braces.<BR>&nbsp; *<BR>&nbsp; * Use getprogname() instead of hardcoding the program name.<BR>&nbsp; *<BR>&nbsp; * "usage: f [-ade] [-b b_arg] [-m m_arg] req1 req2 [opt1 [opt2]]\n"<BR>&nbsp; * "usage: f [-a | -b] [-c [-de] [-n number]]\n"<BR>&nbsp; */<BR>&nbsp;(void)fprintf(stderr, "usage: %s [-ab]\n", getprogname());<BR>&nbsp;exit(EXIT_FAILURE);<BR>}<BR></P><img src ="http://www.cnitblog.com/SpiWolf/aggbug/4521.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-15 11:38 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4521.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>ANSI Escape Sequence</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4519.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Tue, 15 Nov 2005 03:37:00 
GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4519.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4519.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4519.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4519.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4519.html</trackback:ping><description><![CDATA[<!--StartFragment -->&nbsp;
<CENTER>
<H1>ANSI Escape Sequence</H1></CENTER>
<CENTER>
<H2>Clear Display</H2></CENTER>
<TABLE border=1>
<TBODY>
<TR align=middle>
<TD bgColor=#0080ff><FONT color=#ffffff>Function</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>ESC Sequence</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>Description&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Clear Screen</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>ESC[2J</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Clear the whole screen and move the cursor to the top-left corner.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Clear Line</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>ESC[K</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Clear the line from the cursor position to the rightmost position of the line.&nbsp;</FONT></TD></TR></TBODY></TABLE>
<CENTER>
<H2><FONT color=#000000>Cursor Movement</FONT></H2></CENTER>
<TABLE border=1>
<TBODY>
<TR align=middle>
<TD bgColor=#0080ff><FONT color=#ffffff>Function</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>ESC Sequence</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>Description&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Move Up</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>num</I><B>A</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Move the cursor up <I><TT>num</TT></I> positions&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Move Down</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>num</I><B>B</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Move the cursor down <I><TT>num</TT></I> positions&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Move Right</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>num</I><B>C</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Move the cursor right <I><TT>num</TT></I> positions&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Move Left</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>num</I><B>D</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Move the cursor left <I><TT>num</TT></I> positions&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Move to Position</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>row</I>;<I>col</I><B>H</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Move the cursor to the (<I><TT>col</TT></I>, <I><TT>row</TT></I>) position. Note that the row comes before the column; that is, y comes before x. Either <I><TT>col</TT></I> or <I><TT>row</TT></I> can be omitted. Row and column numbering both start at 1, not zero, so (1, 1) corresponds to the top-left corner of the screen.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Move to Position</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>row</I>;<I>col</I><B>f</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Same as above.&nbsp;</FONT></TD></TR></TBODY></TABLE>
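<P>A minimal C sketch of how the cursor-movement sequences in the table above could be built as strings. The helper names <TT>ansi_move_to</TT> and <TT>ansi_move</TT> are hypothetical and not part of any library; ESC is written as the octal escape <TT>\033</TT>.</P>

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes "ESC[row;colH" into buf.
 * row and col are 1-based and row comes first, as in the table.
 * Returns whatever snprintf(3) returns (the formatted length).
 */
int
ansi_move_to(char *buf, size_t size, int row, int col)
{
	return snprintf(buf, size, "\033[%d;%dH", row, col);
}

/*
 * Hypothetical helper: writes "ESC[num<dir>" into buf, where dir is
 * 'A' (up), 'B' (down), 'C' (right) or 'D' (left).
 */
int
ansi_move(char *buf, size_t size, int num, char dir)
{
	return snprintf(buf, size, "\033[%d%c", num, dir);
}
```

<P>Writing the resulting buffer to a terminal that understands these sequences (for example with <TT>fputs(buf, stdout)</TT>) should move the cursor; the helpers themselves only format the bytes.</P>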
<CENTER>
<H2><FONT color=#000000>Save and Restore Cursor Position</FONT></H2></CENTER>
<TABLE border=1>
<TBODY>
<TR align=middle>
<TD bgColor=#0080ff><FONT color=#ffffff>Function</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>ESC Sequence</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>Description&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Save Cursor Position</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>ESC[s</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Save the cursor position for later restoration.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Restore Cursor Position</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>ESC[u</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Restore the cursor position previously saved.&nbsp;</FONT></TD></TR></TBODY></TABLE>
<CENTER>
<H2><FONT color=#000000>Character Mode</FONT></H2></CENTER>
<TABLE border=1>
<TBODY>
<TR align=middle>
<TD bgColor=#0080ff><FONT color=#ffffff>Function</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>ESC Sequence</FONT></TD>
<TD bgColor=#0080ff><FONT color=#ffffff>Description&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Change Character Mode</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>attr</I><B>m</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Change the character mode with attribute <I><TT>attr</TT></I>. The attributes are numbers listed below.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Change Character Mode</FONT></TD>
<TD bgColor=#fffff0><TT><FONT color=#000000><B>ESC[</B><I>attr</I><B>;</B><I>...</I><B>;</B><I>attr</I><B>m</B></FONT></TT></TD>
<TD bgColor=#fffff0><FONT color=#000000>Change the character mode with attributes <TT><I>attr</I><B>;</B><I>...</I><B>;</B><I>attr</I></TT>. The attributes are numbers listed below.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>All Off</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>0</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>All attributes turned off. On most terminals this also resets the foreground and background colors to their defaults.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>High Intensity</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>1</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Bold.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Low Intensity</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>2</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Faint (dim). Works only on some systems.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Italic</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>3</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Works only on some systems.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Underline</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>4</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Underline font.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Blink</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>5</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Blinking font.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Rapid Blink</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>6</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Works only on some systems.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Reverse Video</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>7</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Swaps the foreground and background colors.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Invisible</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>8</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Do not display characters.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>30</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Black.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>31</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Red.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>32</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Green.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>33</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Yellow.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>34</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Blue.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>35</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Magenta.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>36</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Cyan.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Foreground Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>37</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>White.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>40</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Black.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>41</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Red.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>42</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Green.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>43</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Yellow.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>44</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Blue.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>45</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Magenta.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>46</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>Cyan.&nbsp;</FONT></TD></TR>
<TR align=middle>
<TD bgColor=#fffff0><FONT color=#000000>Background Color</FONT></TD>
<TD bgColor=#fffff0><B><TT><FONT color=#000000>47</FONT></TT></B></TD>
<TD bgColor=#fffff0><FONT color=#000000>White.&nbsp;</FONT></TD></TR></TBODY></TABLE>
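<P>The attribute numbers above combine into a single sequence separated by semicolons. A minimal sketch in C, assuming a terminal that honors these codes; <TT>ansi_sgr</TT> is a hypothetical helper name, not a library function.</P>

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes "ESC[attr;fg;bgm" into buf.
 * attr is a text attribute (0-8), fg a foreground color (30-37)
 * and bg a background color (40-47) from the table.
 * Returns whatever snprintf(3) returns (the formatted length).
 */
int
ansi_sgr(char *buf, size_t size, int attr, int fg, int bg)
{
	return snprintf(buf, size, "\033[%d;%d;%dm", attr, fg, bg);
}
```

<P>For example, <TT>ansi_sgr(buf, sizeof buf, 1, 31, 44)</TT> produces the bytes for bold red text on a blue background; printing <TT>"\033[0m"</TT> afterwards turns the attributes off again.</P>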
<CENTER>
<P><FONT color=#000000>thanks to Kenneth Kin Lum.</FONT></P></CENTER><img src ="http://www.cnitblog.com/SpiWolf/aggbug/4519.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-15 11:37 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4519.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>ANSI.SYS</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4520.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Tue, 15 Nov 2005 03:37:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4520.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4520.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4520.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4520.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4520.html</trackback:ping><description><![CDATA[<!--StartFragment -->&nbsp;<PRE> <BR>                                  ANSI.SYS<BR> <BR>Defines functions that change display graphics, control cursor movement, and<BR>reassign keys. The ANSI.SYS device driver supports ANSI terminal emulation<BR>of escape sequences to control your system's screen and keyboard. An ANSI<BR>escape sequence is a sequence of ASCII characters, the first two of which<BR>are the escape character (1Bh) and the left-bracket character (5Bh). 
The<BR>character or characters following the escape and left-bracket characters<BR>specify an alphanumeric code that controls a keyboard or display function.<BR>ANSI escape sequences distinguish between uppercase and lowercase letters;<BR>for example,"A" and "a" have completely different meanings.<BR> <BR>This device driver must be loaded by a &lt; DEVICE &gt; or &lt; DEVICEHIGH &gt; command in<BR>your CONFIG.SYS file.<BR> <BR>Note:  In this topic bold letters in syntax and ANSI escape sequences<BR>       indicate text you must type exactly as it appears.<BR> <BR>Syntax<BR> <BR>    DEVICE=[drive:][path]ANSI.SYS [/X] [/K] [/R]<BR> <BR>Parameter<BR> <BR>[drive:][path]<BR>   Specifies the location of the ANSI.SYS file.<BR> <BR>Switches<BR> <BR>/X<BR>    Remaps extended keys independently on 101-key keyboards.<BR> <BR>/K<BR>    Causes ANSI.SYS to treat a 101-key keyboard like an 84-key<BR>    keyboard. This is equivalent to the command SWITCHES=/K.<BR>    If you usually use the SWITCHES=/K command, you will need<BR>    to use the /K switch with ANSI.SYS.<BR> <BR>/R<BR>     Adjusts line scrolling to improve readability when ANSI.SYS<BR>     is used with screen-reading programs (which make computers<BR>     more accessible to people with disabilities).<BR> <BR>Parameters used in ANSI escape sequences<BR> <BR>Pn<BR>    Numeric parameter. Specifies a decimal number.<BR> <BR>Ps<BR>    Selective parameter. Specifies a decimal number that you use to select<BR>    a function. You can specify more than one function by separating the<BR>    parameters with semicolons.<BR> <BR>PL<BR>    Line parameter. Specifies a decimal number that represents one of the<BR>    lines on your display or on another device.<BR> <BR>Pc<BR>    Column parameter. 
Specifies a decimal number that represents one of the<BR>    columns on your screen or on another device.<BR> <BR>ANSI escape sequences for cursor movement, graphics, and keyboard settings<BR> <BR>In the following list of ANSI escape sequences, the abbreviation ESC<BR>represents the ASCII escape character 27 (1Bh), which appears at the<BR>beginning of each escape sequence.<BR> <BR>ESC[PL;PcH<BR>    Cursor Position: Moves the cursor to the specified position<BR>    (coordinates). If you do not specify a position, the cursor moves to the<BR>    home position, the upper-left corner of the screen (line 0, column<BR>    0). This escape sequence works the same way as the following Cursor<BR>    Position escape sequence.<BR> <BR>ESC[PL;Pcf<BR>    Cursor Position: Works the same way as the preceding Cursor Position<BR>    escape sequence.<BR> <BR>ESC[PnA<BR>    Cursor Up: Moves the cursor up by the specified number of lines without<BR>    changing columns. If the cursor is already on the top line, ANSI.SYS<BR>    ignores this sequence.<BR> <BR>ESC[PnB<BR>    Cursor Down: Moves the cursor down by the specified number of lines<BR>    without changing columns. If the cursor is already on the bottom line,<BR>    ANSI.SYS ignores this sequence.<BR> <BR>ESC[PnC<BR>    Cursor Forward: Moves the cursor forward by the specified number of<BR>    columns without changing lines. If the cursor is already in the<BR>    rightmost column, ANSI.SYS ignores this sequence.<BR> <BR>ESC[PnD<BR>    Cursor Backward: Moves the cursor back by the specified number of<BR>    columns without changing lines. If the cursor is already in the leftmost<BR>    column, ANSI.SYS ignores this sequence.<BR> <BR>ESC[s<BR>    Save Cursor Position: Saves the current cursor position. 
You can move<BR>    the cursor to the saved cursor position by using the Restore Cursor<BR>    Position sequence.<BR> <BR>ESC[u<BR>    Restore Cursor Position: Returns the cursor to the position stored<BR>    by the Save Cursor Position sequence.<BR> <BR>ESC[2J<BR>    Erase Display: Clears the screen and moves the cursor to the home<BR>    position (line 0, column 0).<BR> <BR>ESC[K<BR>    Erase Line: Clears all characters from the cursor position to the<BR>    end of the line (including the character at the cursor position).<BR> <BR>ESC[Ps;...;Psm<BR>    Set Graphics Mode: Calls the graphics functions specified by the<BR>    following values. These specified functions remain active until the next<BR>    occurrence of this escape sequence. Graphics mode changes the colors and<BR>    attributes of text (such as bold and underline) displayed on the<BR>    screen.<BR> <BR>    Text attributes<BR>       0    All attributes off<BR>       1    Bold on<BR>       4    Underscore (on monochrome display adapter only)<BR>       5    Blink on<BR>       7    Reverse video on<BR>       8    Concealed on<BR> <BR>    Foreground colors<BR>       30    Black<BR>       31    Red<BR>       32    Green<BR>       33    Yellow<BR>       34    Blue<BR>       35    Magenta<BR>       36    Cyan<BR>       37    White<BR> <BR>    Background colors<BR>       40    Black<BR>       41    Red<BR>       42    Green<BR>       43    Yellow<BR>       44    Blue<BR>       45    Magenta<BR>       46    Cyan<BR>       47    White<BR> <BR>    Parameters 30 through 47 meet the ISO 6429 standard.<BR> <BR>ESC[=Psh<BR>    Set Mode: Changes the screen width or type to the mode specified<BR>    by one of the following values:<BR> <BR>       0      40 x 25 monochrome (text)<BR>       1      40 x 25 color (text)<BR>       2      80 x 25 monochrome (text)<BR>       3      80 x 25 color (text)<BR>       4      320 x 200 4-color (graphics)<BR>       5      320 x 200 monochrome 
(graphics)<BR>       6      640 x 200 monochrome (graphics)<BR>       7      Enables line wrapping<BR>      13      320 x 200 color (graphics)<BR>      14      640 x 200 color (16-color graphics)<BR>      15      640 x 350 monochrome (2-color graphics)<BR>      16      640 x 350 color (16-color graphics)<BR>      17      640 x 480 monochrome (2-color graphics)<BR>      18      640 x 480 color (16-color graphics)<BR>      19      320 x 200 color (256-color graphics)<BR> <BR>ESC[=Psl<BR>    Reset Mode: Resets the mode by using the same values that Set Mode<BR>    uses, except for 7, which disables line wrapping. The last character<BR>    in this escape sequence is a lowercase L.<BR> <BR>ESC[code;string;...p<BR>    Set Keyboard Strings: Redefines a keyboard key to a specified string.<BR>    The parameters for this escape sequence are defined as follows:<BR> <BR>      Code is one or more of the values listed in the following table.<BR>       These values represent keyboard keys and key combinations. When using<BR>       these values in a command, you must type the semicolons shown in this<BR>       table in addition to the semicolons required by the escape sequence.<BR>       The codes in parentheses are not available on some keyboards.<BR>       ANSI.SYS will not interpret the codes in parentheses for those<BR>       keyboards unless you specify the /X switch in the DEVICE command for<BR>       ANSI.SYS.<BR> <BR>      String is either the ASCII code for a single character or a string<BR>       contained in quotation marks. For example, both 65 and "A" can be<BR>       used to represent an uppercase A.<BR> <BR>IMPORTANT:  Some of the values in the following table are not valid for all<BR>            computers. 
Check your computer's documentation for values that<BR>            are different.<BR> <BR>Key                       Code      SHIFT+code  CTRL+code  ALT+code<BR>---------------------------------------------------------------------------<BR> <BR>F1                        0;59      0;84        0;94       0;104<BR> <BR>F2                        0;60      0;85        0;95       0;105<BR> <BR>F3                        0;61      0;86        0;96       0;106<BR> <BR>F4                        0;62      0;87        0;97       0;107<BR> <BR>F5                        0;63      0;88        0;98       0;108<BR> <BR>F6                        0;64      0;89        0;99       0;109<BR> <BR>F7                        0;65      0;90        0;100      0;110<BR> <BR>F8                        0;66      0;91        0;101      0;111<BR> <BR>F9                        0;67      0;92        0;102      0;112<BR> <BR>F10                       0;68      0;93        0;103      0;113<BR> <BR>F11                       0;133     0;135       0;137      0;139<BR> <BR>F12                       0;134     0;136       0;138      0;140<BR> <BR>HOME (num keypad)         0;71      55          0;119      --<BR> <BR>UP ARROW (num keypad)     0;72      56          (0;141)    --<BR> <BR>PAGE UP (num keypad)      0;73      57          0;132      --<BR> <BR>LEFT ARROW (num keypad)   0;75      52          0;115      --<BR> <BR>RIGHT ARROW (num          0;77      54          0;116      --<BR>keypad)<BR> <BR>END (num keypad)          0;79      49          0;117      --<BR> <BR>DOWN ARROW (num keypad)   0;80      50          (0;145)    --<BR> <BR>PAGE DOWN (num keypad)    0;81      51          0;118      --<BR> <BR>INSERT (num keypad)       0;82      48          (0;146)    --<BR> <BR>DELETE (num keypad)       0;83      46          (0;147)    --<BR> <BR>HOME                      (224;71)  (224;71)    (224;119)  (224;151)<BR> <BR>UP ARROW                  (224;72)  (224;72)    (224;141)  (224;152)<BR> <BR>PAGE UP       
            (224;73)  (224;73)    (224;132)  (224;153)<BR> <BR>LEFT ARROW                (224;75)  (224;75)    (224;115)  (224;155)<BR> <BR>RIGHT ARROW               (224;77)  (224;77)    (224;116)  (224;157)<BR> <BR>END                       (224;79)  (224;79)    (224;117)  (224;159)<BR> <BR>DOWN ARROW                (224;80)  (224;80)    (224;145)  (224;154)<BR> <BR>PAGE DOWN                 (224;81)  (224;81)    (224;118)  (224;161)<BR> <BR>INSERT                    (224;82)  (224;82)    (224;146)  (224;162)<BR> <BR>DELETE                    (224;83)  (224;83)    (224;147)  (224;163)<BR> <BR>PRINT SCREEN              --        --          0;114      --<BR> <BR>PAUSE/BREAK               --        --          0;0        --<BR> <BR>BACKSPACE                 8         8           127        (0)<BR> <BR>ENTER                     13        --          10         (0<BR> <BR>TAB                       9         0;15        (0;148)    (0;165)<BR> <BR>NULL                      0;3       --          --         --<BR> <BR>A                         97        65          1          0;30<BR> <BR>B                         98        66          2          0;48<BR> <BR>C                         99        67          3          0;46<BR> <BR>D                         100       68          4          0;32<BR> <BR>E                         101       69          5          0;18<BR> <BR>F                         102       70          6          0;33<BR> <BR>G                         103       71          7          0;34<BR> <BR>H                         104       72          8          0;35<BR> <BR>I                         105       73          9          0;23<BR> <BR>J                         106       74          10         0;36<BR> <BR>K                         107       75          11         0;37<BR> <BR>L                         108       76          12         0;38<BR> <BR>M                         109       77          13         0;50<BR> <BR>N                         110       
78          14         0;49<BR> <BR>O                         111       79          15         0;24<BR> <BR>P                         112       80          16         0;25<BR> <BR>Q                         113       81          17         0;16<BR> <BR>R                         114       82          18         0;19<BR> <BR>S                         115       83          19         0;31<BR> <BR>T                         116       84          20         0;20<BR> <BR>U                         117       85          21         0;22<BR> <BR>V                         118       86          22         0;47<BR> <BR>W                         119       87          23         0;17<BR> <BR>X                         120       88          24         0;45<BR> <BR>Y                         121       89          25         0;21<BR> <BR>Z                         122       90          26         0;44<BR> <BR>1                         49        33          --         0;120<BR> <BR>2                         50        64          0          0;121<BR> <BR>3                         51        35          --         0;122<BR> <BR>4                         52        36          --         0;123<BR> <BR>5                         53        37          --         0;124<BR> <BR>6                         54        94          30         0;125<BR> <BR>7                         55        38          --         0;126<BR> <BR>8                         56        42          --         0;127<BR> <BR>9                         57        40          --         0;128<BR> <BR>0                         48        41          --         0;129<BR> <BR>-                         45        95          31         0;130<BR> <BR>=                         61        43          --         0;131<BR> <BR>[                         91        123         27         0;26<BR> <BR>]                         93        125         29         0;27<BR> <BR>\                         92        124         28         0;43<BR> <BR>;       
                  59        58          --         0;39<BR> <BR>'                         39        34          --         0;40<BR> <BR>,                         44        60          --         0;51<BR> <BR>.                         46        62          --         0;52<BR> <BR>/                         47        63          --         0;53<BR> <BR>`                         96        126         --         (0;41)<BR> <BR>ENTER (keypad)            13        --          10         (0;166)<BR> <BR>/ (keypad)                47        47          (0;142)    (0;74)<BR> <BR>* (keypad)                42        (0;144)     (0;78)     --<BR> <BR>- (keypad)                45        45          (0;149)    (0;164)<BR> <BR>+ (keypad)                43        43          (0;150)    (0;55)<BR> <BR>5 (keypad)                (0;76)    53          (0;143)    --<BR> <BR><BR><BR></PRE><img src ="http://www.cnitblog.com/SpiWolf/aggbug/4520.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-15 11:37 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/15/4520.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Problem when compiling unpv12e in FreeBSD(no problems with unpv13e)</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4348.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Fri, 11 Nov 2005 07:25:00 
GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4348.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4348.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4348.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4348.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4348.html</trackback:ping><description><![CDATA[<font size="4"><span style="font-weight: bold;">Problem when building lib:</span></font><br>

<br style="color: rgb(255, 0, 0);">

<span style="color: rgb(255, 20, 147);">spiwolf@fb$ cd lib</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">spiwolf@fb$ gmake</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
gcc -g -O2 -Wall -c mcast_leave.c</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
mcast_leave.c: In function `mcast_leave':</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
mcast_leave.c:26: `IPV6_DROP_MEMBERSHIP' undeclared (first use in this function)</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
mcast_leave.c:26: (Each undeclared identifier is reported only once</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
mcast_leave.c:26: for each function it appears in.)</span><br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
*** Error code 1</span><br style="color: rgb(255, 20, 147);">


<br style="color: rgb(255, 20, 147);">

<span style="color: rgb(255, 20, 147);">
Stop in /home/gabriel/unpv12e/lib.</span><br style="color: rgb(255, 20, 147);">


<br>

<br>

<font size="4"><span style="font-weight: bold;">The fix:</span></font><br>

<br>

<br style="color: rgb(0, 0, 255);">


<span style="font-size: 10.5pt; font-family: &quot;Times New Roman&quot;; color: rgb(0, 0, 255);" lang="EN-US">Change the following:<br>
mcast_leave.c: Change IPV6_DROP_MEMBERSHIP to IPV6_LEAVE_GROUP<br>
mcast_join.c: Change IPV6_ADD_MEMBERSHIP to IPV6_JOIN_GROUP<br>
<br>
IIRC these names were changed by a later RFC (2553, which obsoletes 2133 and is
obsoleted by 3493). As a matter of fact, I was grousing about this very issue a
while ago:
http://forums.devshed.com/t53905/s.html?perpage=15&amp;pagenumber=2</span><br>
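<p>An alternative to editing the two source files, sketched here as an assumption rather than as the fix used above: map the old names onto the new ones with preprocessor guards in a header that both files include. The macros are only defined when the system headers lack the old names and provide the new ones.</p>

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Compatibility shim (a sketch, not part of unpv12e): if the system
 * only provides the RFC 2553 names, define the older RFC 2133 names
 * in terms of them so the original code compiles unchanged.
 */
#if !defined(IPV6_ADD_MEMBERSHIP) && defined(IPV6_JOIN_GROUP)
#define IPV6_ADD_MEMBERSHIP	IPV6_JOIN_GROUP
#endif

#if !defined(IPV6_DROP_MEMBERSHIP) && defined(IPV6_LEAVE_GROUP)
#define IPV6_DROP_MEMBERSHIP	IPV6_LEAVE_GROUP
#endif
```

<p>With a shim like this in place, <tt>mcast_join.c</tt> and <tt>mcast_leave.c</tt> can keep passing the old option names to <tt>setsockopt()</tt>.</p>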

<img src ="http://www.cnitblog.com/SpiWolf/aggbug/4348.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-11 15:25 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4348.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Debugging</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4347.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Fri, 11 Nov 2005 07:24:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4347.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4347.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4347.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4347.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4347.html</trackback:ping><description><![CDATA[<div class="SECT1">
<h1 class="SECT1"><a id="DEBUGGING" name="DEBUGGING">2.6 Debugging</a></h1>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN892" name="AEN892">2.6.1 The Debugger</a></h2>

<p>The debugger that comes with FreeBSD is called <tt class="COMMAND">gdb</tt> (<b class="APPLICATION">GNU debugger</b>). You start it up by typing</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">gdb <var class="REPLACEABLE">progname</var></kbd>
</pre>

<p>although most people prefer to run it inside <b class="APPLICATION">Emacs</b>. You can
do this by:</p>

<pre class="SCREEN"><kbd class="USERINPUT">M-x gdb RET <var class="REPLACEABLE">progname</var> RET</kbd>
</pre>

<p>Using a debugger allows you to run the program under more controlled circumstances.
Typically, you can step through the program a line at a time, inspect the value of
variables, change them, tell the debugger to run up to a certain point and then stop, and
so on. You can even attach to a program that is already running, or load a core file to
investigate why the program crashed. It is even possible to debug the kernel, though that
is a little trickier than the user applications we will be discussing in this
section.</p>

<p><tt class="COMMAND">gdb</tt> has quite good on-line help, as well as a set of info
pages, so this section will concentrate on a few of the basic commands.</p>

<p>Finally, if you find its text-based command-prompt style off-putting, there is a
graphical front-end for it (<a href="http://www.freebsd.org/ports/devel.html" target="_top">xxgdb</a>) in the ports collection.</p>

<p>This section is intended to be an introduction to using <tt class="COMMAND">gdb</tt>
and does not cover specialized topics such as debugging the kernel.</p>
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN913" name="AEN913">2.6.2 Running a program in the
debugger</a></h2>

<p>You will need to have compiled the program with the <var class="OPTION">-g</var>
option to get the most out of using <tt class="COMMAND">gdb</tt>. It will work without,
but you will only see the name of the function you are in, instead of the source code. If
you see a line like:</p>

<pre class="SCREEN">... (no debugging symbols found) ...<br></pre>

<p>when <tt class="COMMAND">gdb</tt> starts up, you will know that the program was not
compiled with the <var class="OPTION">-g</var> option.</p>

<p>At the <tt class="COMMAND">gdb</tt> prompt, type <kbd class="USERINPUT">break
main</kbd>. This will tell the debugger to skip over the preliminary set-up code in the
program and start at the beginning of your code. Now type <kbd class="USERINPUT">run</kbd> to start the program--it will start at the beginning of the
set-up code and then get stopped by the debugger when it calls <code class="FUNCTION">main()</code>. (If you have ever wondered where <code class="FUNCTION">main()</code> gets called from, now you know!).</p>

<p>You can now step through the program, a line at a time, by typing <tt class="COMMAND">n</tt>. If you get to a function call, you can step into it by typing
<tt class="COMMAND">s</tt>. Once you are inside a function, you can run until it returns to
its caller by typing <tt class="COMMAND">finish</tt> (<tt class="COMMAND">f</tt> on its own is short for <tt class="COMMAND">frame</tt>). You can also use <tt class="COMMAND">up</tt> and <tt class="COMMAND">down</tt> to take a quick look at the
caller.</p>

<p>Here is a simple example of how to spot a mistake in a program with <tt class="COMMAND">gdb</tt>. This is our program (with a deliberate mistake):</p>

<pre class="PROGRAMLISTING">#include &lt;stdio.h&gt;<br><br>int bazz(int anint);<br><br>int main() {<br>    int i;<br><br>    printf("This is my program\n");<br>    bazz(i);<br>    return 0;<br>}<br><br>int bazz(int anint) {<br>    printf("You gave me %d\n", anint);<br>    return anint;<br>}<br></pre>

<p>This program sets <var class="SYMBOL">i</var> to be <var class="LITERAL">5</var> and
passes it to a function <code class="FUNCTION">bazz()</code> which prints out the number
we gave it.</p>

<p>When we compile and run the program we get</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -g -o temp temp.c</kbd><br><samp class="PROMPT">%</samp> <kbd class="USERINPUT">./temp</kbd><br>This is my program<br>You gave me 4231<br></pre>

<p>That was not what we expected! Time to see what is going on!</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">gdb temp</kbd><br>GDB is free software and you are welcome to distribute copies of it<br> under certain conditions; type "show copying" to see the conditions.<br>There is absolutely no warranty for GDB; type "show warranty" for details.<br>GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc.<br>(gdb) <kbd class="USERINPUT">break main</kbd>               Skip the set-up code<br>Breakpoint 1 at 0x160f: file temp.c, line 9.    <tt class="COMMAND">gdb</tt> puts breakpoint at <code class="FUNCTION">main()</code><br>(gdb) <kbd class="USERINPUT">run</kbd>                   Run as far as <code class="FUNCTION">main()</code><br>Starting program: /home/james/tmp/temp      Program starts running<br><br>Breakpoint 1, main () at temp.c:9       <tt class="COMMAND">gdb</tt> stops at <code class="FUNCTION">main()</code><br>(gdb) <kbd class="USERINPUT">n</kbd>                       Go to next line<br>This is my program              Program prints out<br>(gdb) <kbd class="USERINPUT">s</kbd>                       step into <code class="FUNCTION">bazz()</code><br>bazz (anint=4231) at temp.c:17          <tt class="COMMAND">gdb</tt> displays stack frame<br>(gdb)<br></pre>

<p>Hang on a minute! How did <var class="SYMBOL">anint</var> get to be <var class="LITERAL">4231</var>? Did we not set it to be <var class="LITERAL">5</var> in
<code class="FUNCTION">main()</code>? Let's move up to <code class="FUNCTION">main()</code> and have a look.</p>

<pre class="SCREEN">(gdb) <kbd class="USERINPUT">up</kbd>                   Move up call stack<br>#1  0x1625 in main () at temp.c:11      <tt class="COMMAND">gdb</tt> displays stack frame<br>(gdb) <kbd class="USERINPUT">p i</kbd>                   Show us the value of <var class="SYMBOL">i</var><br>$1 = 4231                   <tt class="COMMAND">gdb</tt> displays <var class="LITERAL">4231</var>
</pre>

<p>Oh dear! Looking at the code, we forgot to initialize <var class="SYMBOL">i</var>. We
meant to put</p>

<pre class="PROGRAMLISTING">...<br>int main() {<br>    int i;<br><br>    i = 5;<br>    printf("This is my program\n");<br>...<br></pre>

<p>but we left the <var class="LITERAL">i=5;</var> line out. As we did not initialize
<var class="SYMBOL">i</var>, it had whatever number happened to be in that area of memory
when the program ran, which in this case happened to be <var class="LITERAL">4231</var>.</p>

<div class="NOTE">
<blockquote class="NOTE">
<p><b>Note:</b> <tt class="COMMAND">gdb</tt> displays the stack frame every time we go
into or out of a function, even if we are using <tt class="COMMAND">up</tt> and <tt class="COMMAND">down</tt> to move around the call stack. This shows the name of the
function and the values of its arguments, which helps us keep track of where we are and
what is going on. (The stack is a storage area where the program stores information about
the arguments passed to functions and where to go when it returns from a function
call).</p>
</blockquote>
</div>
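<p>For reference, here is a minimal sketch of the program with the missing initialization restored. The <code class="FUNCTION">run_example()</code> helper is a hypothetical stand-in for the body of <code class="FUNCTION">main()</code>, introduced only so the fragment can be called from elsewhere; it is not part of the original listing.</p>

```c
#include <stdio.h>

int bazz(int anint);

/* Hypothetical helper standing in for the body of main(). */
int run_example(void) {
    int i = 5;  /* initialized where it is declared, so it cannot hold garbage */

    printf("This is my program\n");
    return bazz(i);
}

/* Prints the value it was given and hands it back to the caller. */
int bazz(int anint) {
    printf("You gave me %d\n", anint);
    return anint;
}
```

<p>Initializing a variable on the line that declares it is one way to make this kind of mistake harder to repeat.</p>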
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN1002" name="AEN1002">2.6.3 Examining a core file</a></h2>

<p>A core file is basically a file which contains the complete state of the process when
it crashed. In “the good old days”, programmers had to print out hex listings
of core files and sweat over machine code manuals, but now life is a bit easier.
Incidentally, under FreeBSD and other 4.4BSD systems, a core file is called <tt class="FILENAME"><var class="REPLACEABLE">progname</var>.core</tt> instead of just <tt class="FILENAME">core</tt>, to make it clearer which program a core file belongs to.</p>

<p>To examine a core file, start up <tt class="COMMAND">gdb</tt> in the usual way.
Instead of typing <tt class="COMMAND">break</tt> or <tt class="COMMAND">run</tt>,
type</p>

<pre class="SCREEN">(gdb) <kbd class="USERINPUT">core <var class="REPLACEABLE">progname</var>.core</kbd>
</pre>

<p>If you are not in the same directory as the core file, you will have to do <kbd class="USERINPUT">dir /path/to/core/file</kbd> first.</p>

<p>You should see something like this:</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">gdb a.out</kbd><br>GDB is free software and you are welcome to distribute copies of it<br> under certain conditions; type "show copying" to see the conditions.<br>There is absolutely no warranty for GDB; type "show warranty" for details.<br>GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc.<br>(gdb) <kbd class="USERINPUT">core a.out.core</kbd><br>Core was generated by `a.out'.<br>Program terminated with signal 11, Segmentation fault.<br>Cannot access memory at address 0x7020796d.<br>#0  0x164a in bazz (anint=0x5) at temp.c:17<br>(gdb)<br></pre>

<p>In this case, the program was called <tt class="FILENAME">a.out</tt>, so the core file
is called <tt class="FILENAME">a.out.core</tt>. We can see that the program crashed due
to trying to access an area in memory that was not available to it in a function called
<code class="FUNCTION">bazz</code>.</p>

<p>Sometimes it is useful to be able to see how a function was called, as the problem
could have occurred a long way up the call stack in a complex program. The <tt class="COMMAND">bt</tt> command causes <tt class="COMMAND">gdb</tt> to print out a
back-trace of the call stack:</p>

<pre class="SCREEN">(gdb) <kbd class="USERINPUT">bt</kbd><br>#0  0x164a in bazz (anint=0x5) at temp.c:17<br>#1  0xefbfd888 in end ()<br>#2  0x162c in main () at temp.c:11<br>(gdb)<br></pre>

<p>The <code class="FUNCTION">end()</code> function is called when a program crashes; in
this case, the <code class="FUNCTION">bazz()</code> function was called from <code class="FUNCTION">main()</code>.</p>
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN1036" name="AEN1036">2.6.4 Attaching to a running
program</a></h2>

<p>One of the neatest features about <tt class="COMMAND">gdb</tt> is that it can attach
to a program that is already running. Of course, that assumes you have sufficient
permissions to do so. A common problem is when you are stepping through a program that
forks, and you want to trace the child, but the debugger will only let you trace the
parent.</p>

<p>What you do is start up another <tt class="COMMAND">gdb</tt>, use <tt class="COMMAND">ps</tt> to find the process ID for the child, and do</p>

<pre class="SCREEN">(gdb) <kbd class="USERINPUT">attach <var class="REPLACEABLE">pid</var></kbd>
</pre>

<p>in <tt class="COMMAND">gdb</tt>, and then debug as usual.</p>

<p>“That is all very well,” you are probably thinking, “but by the time
I have done that, the child process will be over the hill and far away”. Fear not,
gentle reader, here is how to do it (courtesy of the <tt class="COMMAND">gdb</tt> info
pages):</p>

<pre class="SCREEN">...<br>if ((pid = fork()) &lt; 0)     /* _Always_ check this */<br>    error();<br>else if (pid == 0) {        /* child */<br>    int PauseMode = 1;<br><br>    while (PauseMode)<br>        sleep(10);  /* Wait until someone attaches to us */<br>    ...<br>} else {            /* parent */<br>    ...<br></pre>

<p>Now all you have to do is attach to the child, set <var class="SYMBOL">PauseMode</var>
to <var class="LITERAL">0</var>, and wait for the <code class="FUNCTION">sleep()</code>
call to return!</p>
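<p>The waiting loop can be wrapped in a small helper, sketched below under some assumptions. The name <code class="FUNCTION">wait_for_debugger()</code> and its parameters are hypothetical. The flag is declared <var class="LITERAL">volatile</var> so that an optimizing compiler re-reads it on every pass instead of assuming it never changes, since only the debugger ever modifies it.</p>

```c
#include <unistd.h>

/*
 * Hypothetical helper: sleep in poll_seconds intervals until a debugger
 * (or anything else) sets *pause_flag to 0. Returns the number of sleeps.
 */
unsigned wait_for_debugger(volatile int *pause_flag, unsigned poll_seconds) {
    unsigned polls = 0;

    while (*pause_flag) {
        sleep(poll_seconds);  /* wait until someone attaches and clears the flag */
        polls++;
    }
    return polls;
}
```

<p>After attaching, you would use <tt class="COMMAND">up</tt> to reach the frame of this function and then type something like <kbd class="USERINPUT">set var *pause_flag = 0</kbd> at the <tt class="COMMAND">gdb</tt> prompt.</p>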
</div>
</div>






<p align="center"><small>This, and other documents, can be downloaded from <a href="ftp://ftp.freebsd.org/pub/FreeBSD/doc/">ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/</a>.</small></p>



<p align="center"><small>For questions about FreeBSD, read the <a href="http://www.freebsd.org/docs.html">documentation</a> before contacting &lt;<a href="mailto:questions@FreeBSD.org">questions@FreeBSD.org</a>&gt;.<br>
For questions about this documentation, e-mail &lt;<a href="mailto:doc@FreeBSD.org">doc@FreeBSD.org</a>&gt;.</small></p>
<img src ="http://www.cnitblog.com/SpiWolf/aggbug/4347.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-11 15:24 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4347.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>Compiling</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4345.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Fri, 11 Nov 2005 07:23:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4345.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4345.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4345.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4345.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4345.html</trackback:ping><description><![CDATA[&nbsp;
<div class="SECT1">
<h1 class="SECT1"><a id="TOOLS-COMPILING" name="TOOLS-COMPILING">2.4 Compiling with <tt class="COMMAND">cc</tt></a></h1>
<p>This section deals only with the GNU compiler for C and C++, since
that comes with the base FreeBSD system. It can be invoked by either <tt class="COMMAND">cc</tt> or <tt class="COMMAND">gcc</tt>.
The details of producing a program with an interpreter vary
considerably between interpreters, and are usually well covered in the
documentation and on-line help for the interpreter.</p>
<p>Once you have written your masterpiece, the next step is to convert
it into something that will (hopefully!) run on FreeBSD. This usually
involves several steps, each of which is done by a separate program.</p>
<div class="PROCEDURE">
<ol type="1"><li>
<p>Pre-process your source code to remove comments and do other tricks like expanding macros in C.</p></li><li>
<p>Check the syntax of your code to see if you have obeyed the rules of the language. If you have not, it will complain!</p></li><li>
<p>Convert the source code into assembly language--this is very close
to machine code, but still understandable by humans. Allegedly. <a id="AEN333" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#FTN.AEN333" name="AEN333"><span class="footnote">[1]</span></a></p></li><li>
<p>Convert the assembly language into machine code--yep, we are talking bits and bytes, ones and zeros here.</p></li><li>
<p>Check that you have used things like functions and global variables
in a consistent way. For example, if you have called a non-existent
function, it will complain.</p></li><li>
<p>If you are trying to produce an executable from several source code files, work out how to fit them all together.</p></li><li>
<p>Work out how to produce something that the system's run-time loader will be able to load into memory and run.</p></li><li>
<p>Finally, write the executable on the filesystem.</p></li></ol></div>
<p>The word <i class="FIRSTTERM">compiling</i> is often used to refer to just steps 1 to 4--the others are referred to as <i class="FIRSTTERM">linking</i>. Sometimes step 1 is referred to as <i class="FIRSTTERM">pre-processing</i> and steps 3-4 as <i class="FIRSTTERM">assembling</i>.</p>
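<p>As a small illustration of step 1, the sketch below uses a macro that the pre-processor expands before the compiler proper ever sees the code. The names <var class="LITERAL">SQUARE</var> and <code class="FUNCTION">area_of_square()</code> are made up for this example.</p>

```c
/* Expanded textually by the pre-processor; the parentheses keep the
 * expansion safe when the argument is itself an expression. */
#define SQUARE(x) ((x) * (x))

/* After pre-processing, the return statement reads: return ((side) * (side)); */
int area_of_square(int side) {
    return SQUARE(side);
}
```

<p>With the GNU compiler, <kbd class="USERINPUT">cc -E</kbd> stops after pre-processing and <kbd class="USERINPUT">cc -S</kbd> stops after producing assembly language, which lets you inspect the intermediate results of these steps.</p>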
<p>Fortunately, almost all this detail is hidden from you, as <tt class="COMMAND">cc</tt> is a front end that manages calling all these programs with the right arguments for you; simply typing</p><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc foobar.c</kbd>
</pre>
<p>will cause <tt class="FILENAME">foobar.c</tt> to be compiled by all the steps above. If you have more than one file to compile, just do something like</p><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc foo.c bar.c</kbd>
</pre>
<p>Note that the syntax checking is just that--checking the syntax. It
will not check for any logical mistakes you may have made, like putting
the program into an infinite loop, or using a bubble sort when you
meant to use a binary sort. <a id="AEN363" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#FTN.AEN363" name="AEN363"><span class="footnote">[2]</span></a></p>
<p>There are lots and lots of options for <tt class="COMMAND">cc</tt>, which are all in the manual page. Here are a few of the most important ones, with examples of how to use them.</p>
<div class="VARIABLELIST">
<dl><dt><var class="OPTION">-o <var class="REPLACEABLE">filename</var></var></dt><dd>
<p>The output name of the file. If you do not use this option, <tt class="COMMAND">cc</tt> will produce an executable called <tt class="FILENAME">a.out</tt>. <a id="AEN376" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#FTN.AEN376" name="AEN376"><span class="footnote">[3]</span></a></p>
<div class="INFORMALEXAMPLE"><a id="AEN378" name="AEN378"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc foobar.c</kbd>               executable is <tt class="FILENAME">a.out</tt><br><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -o foobar foobar.c</kbd>     executable is <tt class="FILENAME">foobar</tt><br>       <br></pre></div></dd><dt><var class="OPTION">-c</var></dt><dd>
<p>Just compile the file, do not link it. Useful for toy programs where you just want to check the syntax, or if you are using a <tt class="FILENAME">Makefile</tt>.</p>
<div class="INFORMALEXAMPLE"><a id="AEN394" name="AEN394"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -c foobar.c</kbd><br>       <br></pre></div>
<p>This will produce an <i class="FIRSTTERM">object file</i> (not an executable) called <tt class="FILENAME">foobar.o</tt>. This can be linked together with other object files into an executable.</p></dd><dt><var class="OPTION">-g</var></dt><dd>
<p>Create a debug version of the executable. This makes the compiler
put information into the executable about which line of which source
file corresponds to which function call. A debugger can use this
information to show the source code as you step through the program,
which is <span class="emphasis"><i class="EMPHASIS">very</i></span> useful; the disadvantage is that all this extra information makes the program much bigger. Normally, you compile with <var class="OPTION">-g</var> while you are developing a program and then compile a “release version” without <var class="OPTION">-g</var> when you are satisfied it works properly.</p>
<div class="INFORMALEXAMPLE"><a id="AEN410" name="AEN410"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -g foobar.c</kbd><br>       <br></pre></div>
<p>This will produce a debug version of the program. <a id="AEN415" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#FTN.AEN415" name="AEN415"><span class="footnote">[4]</span></a></p></dd><dt><var class="OPTION">-O</var></dt><dd>
<p>Create an optimized version of the executable. The compiler performs
various clever tricks to try to produce an executable that runs faster
than normal. You can add a number after the <var class="OPTION">-O</var>
to specify a higher level of optimization, but this often exposes bugs
in the compiler's optimizer. For instance, the version of <tt class="COMMAND">cc</tt> that comes with the 2.1.0 release of FreeBSD is known to produce bad code with the <var class="OPTION">-O2</var> option in some circumstances.</p>
<p>Optimization is usually only turned on when compiling a release version.</p>
<div class="INFORMALEXAMPLE"><a id="AEN429" name="AEN429"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -O -o foobar foobar.c</kbd><br>       <br></pre></div>
<p>This will produce an optimized version of <tt class="FILENAME">foobar</tt>.</p></dd></dl></div>
<p>The following three flags will force <tt class="COMMAND">cc</tt> to check that your code complies with the relevant international standard, often referred to as the <acronym class="ACRONYM">ANSI</acronym> standard, though strictly speaking it is an <acronym class="ACRONYM">ISO</acronym> standard.</p>
<div class="VARIABLELIST">
<dl><dt><var class="OPTION">-Wall</var></dt><dd>
<p>Enable all the warnings which the authors of <tt class="COMMAND">cc</tt> believe are worthwhile. Despite the name, it will not enable all the warnings <tt class="COMMAND">cc</tt> is capable of.</p></dd><dt><var class="OPTION">-ansi</var></dt><dd>
<p>Turn off most, but not all, of the non-<acronym class="ACRONYM">ANSI</acronym>&nbsp;C features provided by <tt class="COMMAND">cc</tt>. Despite the name, it does not strictly guarantee that your code will comply with the standard.</p></dd><dt><var class="OPTION">-pedantic</var></dt><dd>
<p>Turn off <span class="emphasis"><i class="EMPHASIS">all</i></span> of <tt class="COMMAND">cc</tt>'s non-<acronym class="ACRONYM">ANSI</acronym>&nbsp;C features.</p></dd></dl></div>
<p>Without these flags, <tt class="COMMAND">cc</tt> will allow you to
use some of its own extensions to the standard. Some of these
are very useful, but will not work with other compilers--in fact, one
of the main aims of the standard is to allow people to write code that
will work with any compiler on any system. This is known as <i class="FIRSTTERM">portable code</i>.</p>
<p>Generally, you should try to make your code as portable as possible,
as otherwise you may have to completely rewrite the program later to
get it to work somewhere else--and who knows what you may be using in a
few years' time?</p>
<div class="INFORMALEXAMPLE"><a id="AEN466" name="AEN466"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -Wall -ansi -pedantic -o foobar foobar.c</kbd>
</pre></div>
<p>This will produce an executable <tt class="FILENAME">foobar</tt> after checking <tt class="FILENAME">foobar.c</tt> for standard compliance.</p>
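<p>To give a feel for what these flags care about, here is a small function written to the original <acronym class="ACRONYM">ANSI</acronym>&nbsp;C rules. The name <code class="FUNCTION">sum_to()</code> is made up for this sketch. Later conveniences such as <var class="LITERAL">//</var> comments, or declaring the loop counter inside the <var class="LITERAL">for</var> statement, are not part of that standard, and the GNU compiler rejects them under <var class="OPTION">-ansi</var>.</p>

```c
/* Adds the integers 1..n. Only block comments are used, and every
 * variable is declared at the top of the block, as C89 requires. */
long sum_to(int n) {
    long total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

<p>Code written this way should compile cleanly with the flags shown above, and with most other C compilers as well.</p>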
<div class="VARIABLELIST">
<dl><dt><var class="OPTION">-l<var class="REPLACEABLE">library</var></var></dt><dd>
<p>Specify a function library to be used at link time.</p>
<p>The most common example of this is when compiling a program that
uses some of the mathematical functions in C. Unlike most other
platforms, these are in a separate library from the standard C one and
you have to tell the compiler to add it.</p>
<p>The rule is that if the library is called <tt class="FILENAME">lib<var class="REPLACEABLE">something</var>.a</tt>, you give <tt class="COMMAND">cc</tt> the argument <var class="OPTION">-l<var class="REPLACEABLE">something</var></var>. For example, the math library is <tt class="FILENAME">libm.a</tt>, so you give <tt class="COMMAND">cc</tt> the argument <var class="OPTION">-lm</var>. A common “gotcha” with the math library is that it has to be the last library on the command line.</p>
<div class="INFORMALEXAMPLE"><a id="AEN491" name="AEN491"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -o foobar foobar.c -lm</kbd><br>       <br></pre></div>
<p>This will link the math library functions into <tt class="FILENAME">foobar</tt>.</p>
<p>If you are compiling C++ code, you need to add <var class="OPTION">-lg++</var>, or <var class="OPTION">-lstdc++</var>
if you are using FreeBSD 2.2 or later, to the command line argument to
link the C++ library functions. Alternatively, you can run <tt class="COMMAND">c++</tt> instead of <tt class="COMMAND">cc</tt>, which does this for you. <tt class="COMMAND">c++</tt> can also be invoked as <tt class="COMMAND">g++</tt> on FreeBSD.</p>
<div class="INFORMALEXAMPLE"><a id="AEN504" name="AEN504"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -o foobar foobar.cc -lg++</kbd>     For FreeBSD 2.1.6 and earlier<br><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -o foobar foobar.cc -lstdc++</kbd>  For FreeBSD 2.2 and later<br><samp class="PROMPT">%</samp> <kbd class="USERINPUT">c++ -o foobar foobar.cc</kbd><br>       <br></pre></div>
<p>Each of these will produce an executable <tt class="FILENAME">foobar</tt> from the C++ source file <tt class="FILENAME">foobar.cc</tt>. Note that, on <span class="TRADEMARK">UNIX</span>® systems, C++ source files traditionally end in <tt class="FILENAME">.C</tt>, <tt class="FILENAME">.cxx</tt> or <tt class="FILENAME">.cc</tt>, rather than the <span class="TRADEMARK">MS-DOS</span>® style <tt class="FILENAME">.cpp</tt> (which was already used for something else). <tt class="COMMAND">gcc</tt>
used to rely on this to work out what kind of compiler to use on the
source file; however, this restriction no longer applies, so you may
now call your C++ files <tt class="FILENAME">.cpp</tt> with impunity!</p></dd></dl></div>
<div class="SECT2">
<h2 class="SECT2"><a id="AEN525" name="AEN525">2.4.1 Common <tt class="COMMAND">cc</tt> Queries and Problems</a></h2>
<div class="QANDASET">
<dl><dt>2.4.1.1. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.1.">I am trying to write a program which uses the <code class="FUNCTION">sin()</code> function and I get an error like this. What does it mean?</a></dt><dt>2.4.1.2. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.2.">All right, I wrote this simple program to practice using <var class="OPTION">-lm</var>. All it does is raise 2.1 to the power of 6.</a></dt><dt>2.4.1.3. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.3.">So how do I fix this?</a></dt><dt>2.4.1.4. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.4.">I compiled a file called <tt class="FILENAME">foobar.c</tt> and I cannot find an executable called <tt class="FILENAME">foobar</tt>. Where has it gone?</a></dt><dt>2.4.1.5. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.5.">OK, I have an executable called <tt class="FILENAME">foobar</tt>, I can see it when I run <tt class="COMMAND">ls</tt>, but when I type in <tt class="COMMAND">foobar</tt> at the command prompt it tells me there is no such file. Why can it not find it?</a></dt><dt>2.4.1.6. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.6.">I called my executable <tt class="FILENAME">test</tt>, but nothing happens when I run it. What is going on?</a></dt><dt>2.4.1.7. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.7.">I compiled my program and it seemed to run all right at first, then there was an error and it said something about “<tt class="ERRORNAME">core dumped</tt>”. What does that mean?</a></dt><dt>2.4.1.8. 
<a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.8.">Fascinating stuff, but what am I supposed to do now?</a></dt><dt>2.4.1.9. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.9.">When my program dumped core, it said something about a “<tt class="ERRORNAME">segmentation fault</tt>”. What is that?</a></dt><dt>2.4.1.10. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.10.">Sometimes when I get a core dump it says “<tt class="ERRORNAME">bus error</tt>”. It says in my <span class="TRADEMARK">UNIX</span> book that this means a hardware problem, but the computer still seems to be working. Is this true?</a></dt><dt>2.4.1.11. <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#Q2.4.1.11.">This
dumping core business sounds as though it could be quite useful, if I
can make it happen when I want to. Can I do this, or do I have to wait
until there is an error?</a></dt></dl>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.1." name="Q2.4.1.1."></a><b>2.4.1.1.</b> I am trying to write a program which uses the <code class="FUNCTION">sin()</code> function and I get an error like this. What does it mean?</p>
<div class="INFORMALEXAMPLE"><a id="AEN533" name="AEN533"></a><pre class="SCREEN">/var/tmp/cc0143941.o: Undefined symbol `_sin' referenced from text segment<br>         <br></pre></div></div>
<div class="ANSWER">
<p>When using mathematical functions like <code class="FUNCTION">sin()</code>, you have to tell <tt class="COMMAND">cc</tt> to link in the math library, like so:</p>
<div class="INFORMALEXAMPLE"><a id="AEN539" name="AEN539"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -o foobar foobar.c -lm</kbd><br>         <br></pre></div></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.2." name="Q2.4.1.2."></a><b>2.4.1.2.</b> All right, I wrote this simple program to practice using <var class="OPTION">-lm</var>. All it does is raise 2.1 to the power of 6.</p>
<div class="INFORMALEXAMPLE"><a id="AEN547" name="AEN547"></a><pre class="PROGRAMLISTING">#include &lt;stdio.h&gt;<br><br>int main() {<br>    float f;<br><br>    f = pow(2.1, 6);<br>    printf("2.1 ^ 6 = %f\n", f);<br>    return 0;<br>}<br>         <br></pre></div>
<p>and I compiled it as:</p>
<div class="INFORMALEXAMPLE"><a id="AEN550" name="AEN550"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc temp.c -lm</kbd><br>         <br></pre></div>
<p>like you said I should, but I get this when I run it:</p>
<div class="INFORMALEXAMPLE"><a id="AEN555" name="AEN555"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">./a.out</kbd><br>2.1 ^ 6 = 1023.000000<br>         <br></pre></div>
<p>This is <span class="emphasis"><i class="EMPHASIS">not</i></span> the right answer! What is going on?</p></div>
<div class="ANSWER">
<p>When the compiler sees you call a function, it checks if it
has already seen a prototype for it. If it has not, it assumes the
function returns an <span class="TYPE">int</span>, which is definitely not what you want here.</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.3." name="Q2.4.1.3."></a><b>2.4.1.3.</b> So how do I fix this?</p></div>
<div class="ANSWER">
<p>The prototypes for the mathematical functions are in <tt class="FILENAME">math.h</tt>.
If you include this file, the compiler will be able to find the
prototype and it will stop doing strange things to your calculation!</p>
<div class="INFORMALEXAMPLE"><a id="AEN570" name="AEN570"></a><pre class="PROGRAMLISTING">#include &lt;math.h&gt;<br>#include &lt;stdio.h&gt;<br><br>int main() {<br>...<br>         <br></pre></div>
<p>After recompiling it as you did before, run it:</p>
<div class="INFORMALEXAMPLE"><a id="AEN573" name="AEN573"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">./a.out</kbd><br>2.1 ^ 6 = 85.766121<br>         <br></pre></div>
<p>If you are using any of the mathematical functions, <span class="emphasis"><i class="EMPHASIS">always</i></span> include <tt class="FILENAME">math.h</tt> and remember to link in the math library.</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.4." name="Q2.4.1.4."></a><b>2.4.1.4.</b> I compiled a file called <tt class="FILENAME">foobar.c</tt> and I cannot find an executable called <tt class="FILENAME">foobar</tt>. Where has it gone?</p></div>
<div class="ANSWER">
<p>Remember, <tt class="COMMAND">cc</tt> will call the executable <tt class="FILENAME">a.out</tt> unless you tell it differently. Use the <var class="OPTION">-o&nbsp;<var class="REPLACEABLE">filename</var></var> option:</p>
<div class="INFORMALEXAMPLE"><a id="AEN591" name="AEN591"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc -o foobar foobar.c</kbd><br>         <br></pre></div></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.5." name="Q2.4.1.5."></a><b>2.4.1.5.</b> OK, I have an executable called <tt class="FILENAME">foobar</tt>, I can see it when I run <tt class="COMMAND">ls</tt>, but when I type in <tt class="COMMAND">foobar</tt> at the command prompt it tells me there is no such file. Why can it not find it?</p></div>
<div class="ANSWER">
<p>Unlike <span class="TRADEMARK">MS-DOS</span>, <span class="TRADEMARK">UNIX</span>
does not look in the current directory when it is trying to find out
which executable you want it to run, unless you tell it to. Either type
<tt class="COMMAND">./foobar</tt>, which means “run the file called <tt class="FILENAME">foobar</tt> in the current directory”, or change your <tt class="ENVAR">PATH</tt> environment variable so that it looks something like</p>
<div class="INFORMALEXAMPLE"><a id="AEN609" name="AEN609"></a><pre class="SCREEN">/bin:/usr/bin:/usr/local/bin:.<br>         <br></pre></div>
<p>The dot at the end means “look in the current directory if it is not in any of the others”.</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.6." name="Q2.4.1.6."></a><b>2.4.1.6.</b> I called my executable <tt class="FILENAME">test</tt>, but nothing happens when I run it. What is going on?</p></div>
<div class="ANSWER">
<p>Most <span class="TRADEMARK">UNIX</span> systems have a program called <tt class="COMMAND">test</tt> in <tt class="FILENAME">/usr/bin</tt> and the shell is picking that one up before it gets to checking the current directory. Either type:</p>
<div class="INFORMALEXAMPLE"><a id="AEN622" name="AEN622"></a><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">./test</kbd><br>         <br></pre></div>
<p>or choose a better name for your program!</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.7." name="Q2.4.1.7."></a><b>2.4.1.7.</b> I compiled my program and it seemed to run all right at first, then there was an error and it said something about “<tt class="ERRORNAME">core dumped</tt>”. What does that mean?</p></div>
<div class="ANSWER">
<p>The name <i class="FIRSTTERM">core dump</i> dates back to the very early days of <span class="TRADEMARK">UNIX</span>,
when the machines used core memory for storing data. Basically, if the
program failed under certain conditions, the system would write the
contents of core memory to disk in a file called <tt class="FILENAME">core</tt>, which the programmer could then pore over to find out what went wrong.</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.8." name="Q2.4.1.8."></a><b>2.4.1.8.</b> Fascinating stuff, but what am I supposed to do now?</p></div>
<div class="ANSWER">
<p>Use <tt class="COMMAND">gdb</tt> to analyze the core (see <a href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/debugging.html">Section 2.6</a>).</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.9." name="Q2.4.1.9."></a><b>2.4.1.9.</b> When my program dumped core, it said something about a “<tt class="ERRORNAME">segmentation fault</tt>”. What is that?</p></div>
<div class="ANSWER">
<p>This basically means that your program tried to perform some sort of illegal operation on memory; <span class="TRADEMARK">UNIX</span> is designed to protect the operating system and other programs from rogue programs.</p>
<p>Common causes for this are:</p>
<ul><li>
<p>Trying to write to a <var class="SYMBOL">NULL</var> pointer, eg</p><pre class="PROGRAMLISTING">char *foo = NULL;<br>strcpy(foo, "bang!");<br>       <br></pre></li><li>
<p>Using a pointer that has not been initialized, eg</p><pre class="PROGRAMLISTING">char *foo;<br>strcpy(foo, "bang!");<br>       <br></pre>
<p>The pointer will have some random value that, with luck, will point
into an area of memory that is not available to your program and the
kernel will kill your program before it can do any damage. If you are
unlucky, it will point somewhere inside your own program and corrupt
one of your data structures, causing the program to fail mysteriously.</p></li><li>
<p>Trying to access past the end of an array, eg</p><pre class="PROGRAMLISTING">int bar[20];<br>bar[27] = 6;<br>       <br></pre></li><li>
<p>Trying to store something in read-only memory, eg</p><pre class="PROGRAMLISTING">char *foo = "My string";<br>strcpy(foo, "bang!");<br>       <br></pre>
<p><span class="TRADEMARK">UNIX</span> compilers often put string literals like <var class="LITERAL">"My string"</var> into read-only areas of memory.</p></li><li>
<p>Doing naughty things with <code class="FUNCTION">malloc()</code> and <code class="FUNCTION">free()</code>, eg</p><pre class="PROGRAMLISTING">char bar[80];<br>free(bar);<br>       <br></pre>
<p>or</p><pre class="PROGRAMLISTING">char *foo = malloc(27);<br>free(foo);<br>free(foo);<br>       <br></pre></li></ul>
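<p>As a rough sketch of how some of the mistakes listed above can be avoided, the two helpers below allocate storage before copying a string and check an index before writing to an array. The names <code class="FUNCTION">copy_string()</code> and <code class="FUNCTION">store_at()</code> are made up for this illustration and are not part of any library.</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a writable heap copy of src, or NULL
 * if src is NULL or the allocation fails. The caller must free() the
 * result exactly once. */
char *copy_string(const char *src)
{
    char *dst;
    size_t len;

    if (src == NULL)
        return NULL;
    len = strlen(src) + 1;      /* room for the terminating '\0' */
    dst = malloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}

/* Hypothetical helper: store value in arr[idx] only if idx is inside
 * the array. Returns 0 on success, -1 if the index is out of range. */
int store_at(int *arr, size_t count, size_t idx, int value)
{
    if (arr == NULL || idx >= count)
        return -1;
    arr[idx] = value;
    return 0;
}
```

<p>With these, <code class="FUNCTION">copy_string("My string")</code> returns memory that may be modified and must be freed once, and <code class="FUNCTION">store_at(bar, 20, 27, 6)</code> is rejected instead of writing past the end of <var class="SYMBOL">bar</var>.</p>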
<p>Making one of these mistakes will not always lead to an error, but
they are always bad practice. Some systems and compilers are more
tolerant than others, which is why programs that ran well on one system
can crash when you try them on another.</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.10." name="Q2.4.1.10."></a><b>2.4.1.10.</b> Sometimes when I get a core dump it says “<tt class="ERRORNAME">bus error</tt>”. It says in my <span class="TRADEMARK">UNIX</span> book that this means a hardware problem, but the computer still seems to be working. Is this true?</p></div>
<div class="ANSWER">
<p>No, fortunately not (unless of course you really do have a
hardware problem...). This is usually another way of saying that you
accessed memory in a way you should not have.</p></div></div>
<div class="QANDAENTRY">
<div class="QUESTION">
<p><a id="Q2.4.1.11." name="Q2.4.1.11."></a><b>2.4.1.11.</b> This
dumping core business sounds as though it could be quite useful, if I
can make it happen when I want to. Can I do this, or do I have to wait
until there is an error?</p></div>
<div class="ANSWER">
<p>Yes, just go to another console or xterm, do</p><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">ps</kbd><br>       <br></pre>
<p>to find out the process ID of your program, and do</p><pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">kill -ABRT <var class="REPLACEABLE">pid</var></kbd><br>       <br></pre>
<p>where <var class="PARAMETER"><var class="REPLACEABLE">pid</var></var> is the process ID you looked up.</p>
<p>This is useful if your program has got stuck in an infinite loop, for instance. If your program happens to trap <var class="SYMBOL">SIGABRT</var>, there are several other signals which have a similar effect.</p>
<p>Alternatively, you can create a core dump from inside your program, by calling the <code class="FUNCTION">abort()</code> function. See the manual page of <a href="http://www.freebsd.org/cgi/man.cgi?query=abort&amp;sektion=3"><span class="CITEREFENTRY"><span class="REFENTRYTITLE">abort</span>(3)</span></a> to learn more.</p>
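<p>A minimal sketch of the <code class="FUNCTION">abort()</code> approach follows. The helper name <code class="FUNCTION">require()</code> is an assumption made for this example, and whether a core file is actually written depends on system settings such as the core size limit.</p>

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: if cond is false, report msg on stderr and call
 * abort(), which raises SIGABRT and normally terminates the process.
 * If cond is true the function simply returns. */
void require(int cond, const char *msg)
{
    if (!cond) {
        fprintf(stderr, "invariant violated: %s\n", msg);
        abort();
    }
}
```

<p>A call such as <code class="FUNCTION">require(ptr != NULL, "ptr must be set")</code> then stops the program at the point where the assumption first fails, so the state that <tt class="COMMAND">gdb</tt> shows is close to the cause.</p>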
<p>If you want to create a core dump from outside your program, but do not want the process to terminate, you can use the <tt class="COMMAND">gcore</tt> program. See the manual page of <a href="http://www.freebsd.org/cgi/man.cgi?query=gcore&amp;sektion=1"><span class="CITEREFENTRY"><span class="REFENTRYTITLE">gcore</span>(1)</span></a> for more information.</p></div></div></div></div></div>

<h3 class="FOOTNOTES">Notes</h3>

<table class="FOOTNOTES" border="0" width="100%">

<tbody>
<tr>
<td align="left" valign="top" width="5%"><a id="FTN.AEN333" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#AEN333" name="FTN.AEN333"><span class="footnote">[1]</span></a></td>
<td align="left" valign="top" width="95%">
<p>To be strictly accurate, <tt class="COMMAND">cc</tt> converts the source code into its own, machine-independent <i class="FIRSTTERM">p-code</i> instead of assembly language at this stage.</p></td></tr>
<tr>
<td align="left" valign="top" width="5%"><a id="FTN.AEN363" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#AEN363" name="FTN.AEN363"><span class="footnote">[2]</span></a></td>
<td align="left" valign="top" width="95%">
<p>In case you did not know, a binary sort is an efficient way of sorting things into order and a bubble sort is not.</p></td></tr>
<tr>
<td align="left" valign="top" width="5%"><a id="FTN.AEN376" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#AEN376" name="FTN.AEN376"><span class="footnote">[3]</span></a></td>
<td align="left" valign="top" width="95%">
<p>The reasons for this are buried in the mists of history.</p></td></tr>
<tr>
<td align="left" valign="top" width="5%"><a id="FTN.AEN415" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-compiling.html#AEN415" name="FTN.AEN415"><span class="footnote">[4]</span></a></td>
<td align="left" valign="top" width="95%">
<p>Note, we did not use the <var class="OPTION">-o</var> flag to specify the executable name, so we will get an executable called <tt class="FILENAME">a.out</tt>. Producing a debug version called <tt class="FILENAME">foobar</tt> is left as an exercise for the reader!</p></td></tr></tbody>
</table>


<p align="center"><small>This, and other documents, can be downloaded from <a href="ftp://ftp.freebsd.org/pub/FreeBSD/doc/">ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/</a>.</small></p>

<p align="center"><small>For questions about FreeBSD, read the <a href="http://www.freebsd.org/docs.html">documentation</a> before contacting &lt;<a href="mailto:questions@FreeBSD.org">questions@FreeBSD.org</a>&gt;.<br>For questions about this documentation, e-mail &lt;<a href="mailto:doc@FreeBSD.org">doc@FreeBSD.org</a>&gt;.</small></p>
<br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-11 15:23 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4345.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Make</title><link>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4346.html</link><dc:creator>幽灵狼</dc:creator><author>幽灵狼</author><pubDate>Fri, 11 Nov 2005 07:23:00 GMT</pubDate><guid>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4346.html</guid><wfw:comment>http://www.cnitblog.com/SpiWolf/comments/4346.html</wfw:comment><comments>http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4346.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cnitblog.com/SpiWolf/comments/commentRss/4346.html</wfw:commentRss><trackback:ping>http://www.cnitblog.com/SpiWolf/services/trackbacks/4346.html</trackback:ping><description><![CDATA[<div class="SECT1">
<h1 class="SECT1"><a id="TOOLS-MAKE" name="TOOLS-MAKE">2.5 Make</a></h1>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN714" name="AEN714">2.5.1 What is <tt class="COMMAND">make</tt>?</a></h2>

<p>When you are working on a simple program with only one or two source files, typing
in</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc file1.c file2.c</kbd>
</pre>

<p>is not too bad, but it quickly becomes very tedious when there are several files--and
it can take a while to compile, too.</p>

<p>One way to get around this is to use object files and only recompile the source file
if the source code has changed. So we could have something like:</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">cc file1.o file2.o</kbd> ... <kbd class="USERINPUT">file37.c</kbd> ...<br></pre>

<p>if we had changed <tt class="FILENAME">file37.c</tt>, but not any of the others, since
the last time we compiled. This may speed up the compilation quite a bit, but does not
solve the typing problem.</p>

<p>Or we could write a shell script to solve the typing problem, but it would have to
re-compile everything, making it very inefficient on a large project.</p>

<p>What happens if we have hundreds of source files lying about? What if we are working
in a team with other people who forget to tell us when they have changed one of their
source files that we use?</p>

<p>Perhaps we could put the two solutions together and write something like a shell
script that would contain some kind of magic rule saying when a source file needs
compiling. All we need now is a program that can understand these rules, as it is a
bit too complicated for the shell.</p>

<p>This program is called <tt class="COMMAND">make</tt>. It reads in a file, called a <i class="FIRSTTERM">makefile</i>, that tells it how different files depend on each other,
and works out which files need to be re-compiled and which ones do not. For example, a
rule could say something like “if <tt class="FILENAME">fromboz.o</tt> is older than
<tt class="FILENAME">fromboz.c</tt>, that means someone must have changed <tt class="FILENAME">fromboz.c</tt>, so it needs to be re-compiled.” The makefile also
has rules telling make <span class="emphasis"><i class="EMPHASIS">how</i></span> to
re-compile the source file, making it a much more powerful tool.</p>

<p>Makefiles are typically kept in the same directory as the source they apply to, and
can be called <tt class="FILENAME">makefile</tt>, <tt class="FILENAME">Makefile</tt> or
<tt class="FILENAME">MAKEFILE</tt>. Most programmers use the name <tt class="FILENAME">Makefile</tt>, as this puts it near the top of a directory listing,
where it can easily be seen. <a id="AEN745" name="AEN745" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-make.html#FTN.AEN745"><span class="footnote">[1]</span></a></p>
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN749" name="AEN749">2.5.2 Example of using <tt class="COMMAND">make</tt></a></h2>

<p>Here is a very simple make file:</p>

<pre class="PROGRAMLISTING">foo: foo.c<br>    cc -o foo foo.c<br></pre>

<p>It consists of two lines, a dependency line and a creation line.</p>

<p>The dependency line here consists of the name of the program (known as the <i class="FIRSTTERM">target</i>), followed by a colon, then whitespace, then the name of the
source file. When <tt class="COMMAND">make</tt> reads this line, it looks to see if <tt class="FILENAME">foo</tt> exists; if it exists, it compares the time <tt class="FILENAME">foo</tt> was last modified to the time <tt class="FILENAME">foo.c</tt>
was last modified. If <tt class="FILENAME">foo</tt> does not exist, or is older than <tt class="FILENAME">foo.c</tt>, it then looks at the creation line to find out what to do.
In other words, this is the rule for working out when <tt class="FILENAME">foo.c</tt>
needs to be re-compiled.</p>
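<p>The check that the dependency line describes can be sketched in C as follows. This is only an illustration of the idea, not how <tt class="COMMAND">make</tt> is actually implemented; the function name <code class="FUNCTION">needs_rebuild()</code> is invented here, and the comparison uses whole-second modification times from <code class="FUNCTION">stat()</code>.</p>

```c
#include <sys/stat.h>

/* Hypothetical sketch of the dependency check. Returns 1 if target
 * must be rebuilt (it is missing, or older than source), 0 if it is
 * up to date, and -1 if source itself cannot be examined. */
int needs_rebuild(const char *target, const char *source)
{
    struct stat tgt, src;

    if (stat(source, &src) != 0)
        return -1;              /* nothing to build from */
    if (stat(target, &tgt) != 0)
        return 1;               /* target does not exist yet */
    return tgt.st_mtime < src.st_mtime;
}
```

<p>For the makefile above, <code class="FUNCTION">needs_rebuild("foo", "foo.c")</code> returning 1 corresponds to the case where <tt class="COMMAND">make</tt> goes on to look at the creation line.</p>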

<p>The creation line starts with a <span class="TOKEN">tab</span> (press the <b class="KEYCAP">tab</b> key) and then the command you would type to create <tt class="FILENAME">foo</tt> if you were doing it at a command prompt. If <tt class="FILENAME">foo</tt> is out of date, or does not exist, <tt class="COMMAND">make</tt> then executes this command to create it. In other words, this
is the rule which tells make how to re-compile <tt class="FILENAME">foo.c</tt>.</p>

<p>So, when you type <kbd class="USERINPUT">make</kbd>, it will make sure that <tt class="FILENAME">foo</tt> is up to date with respect to your latest changes to <tt class="FILENAME">foo.c</tt>. This principle can be extended to <tt class="FILENAME">Makefile</tt>s with hundreds of targets--in fact, on FreeBSD, it is
possible to compile the entire operating system just by typing <kbd class="USERINPUT">make world</kbd> in the appropriate directory!</p>

<p>Another useful property of makefiles is that the targets do not have to be programs.
For instance, we could have a make file that looks like this:</p>

<pre class="PROGRAMLISTING">foo: foo.c<br>    cc -o foo foo.c<br><br>install:<br>    cp foo /home/me<br></pre>

<p>We can tell make which target we want to make by typing:</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">make <var class="REPLACEABLE">target</var></kbd>
</pre>

<p><tt class="COMMAND">make</tt> will then only look at that target and ignore any
others. For example, if we type <kbd class="USERINPUT">make foo</kbd> with the makefile
above, make will ignore the <tt class="MAKETARGET">install</tt> target.</p>

<p>If we just type <kbd class="USERINPUT">make</kbd> on its own, make will always look at
the first target and then stop without looking at any others. So if we type <kbd class="USERINPUT">make</kbd> here, it will just go to the <tt class="MAKETARGET">foo</tt>
target, re-compile <tt class="FILENAME">foo</tt> if necessary, and then stop without
going on to the <tt class="MAKETARGET">install</tt> target.</p>
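<p>To make the "first target" rule concrete, here is a simplified sketch that picks the default target out of makefile text held in memory. The function <code class="FUNCTION">first_target()</code> is hypothetical, and it deliberately ignores much of what a real <tt class="COMMAND">make</tt> handles (variables, multiple targets on one line, continuation lines and so on).</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: copy the name of the first target in makefile
 * text into out. Lines that start with a tab (commands), '#'
 * (comments) or '.' (directives such as .include), and lines with no
 * colon, are skipped. Returns 0 on success, -1 if no usable target is
 * found or out is too small. */
int first_target(const char *text, char *out, size_t outsize)
{
    while (*text != '\0') {
        const char *eol = strchr(text, '\n');
        size_t linelen = eol ? (size_t)(eol - text) : strlen(text);
        const char *colon = memchr(text, ':', linelen);

        if (colon != NULL && colon != text &&
            *text != '\t' && *text != '#' && *text != '.') {
            size_t namelen = (size_t)(colon - text);

            /* Drop whitespace between the name and the colon. */
            while (namelen > 0 &&
                   (text[namelen - 1] == ' ' || text[namelen - 1] == '\t'))
                namelen--;
            if (namelen == 0 || namelen >= outsize)
                return -1;
            memcpy(out, text, namelen);
            out[namelen] = '\0';
            return 0;
        }
        text += linelen;        /* move to the end of this line */
        if (*text == '\n')
            text++;             /* and step over the newline */
    }
    return -1;
}
```

<p>Given the two-target makefile shown earlier, this sketch reports <tt class="MAKETARGET">foo</tt>, which is why a plain <kbd class="USERINPUT">make</kbd> never reaches <tt class="MAKETARGET">install</tt>.</p>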

<p>Notice that the <tt class="MAKETARGET">install</tt> target does not actually depend on
anything! This means that the command on the following line is always executed when we
try to make that target by typing <kbd class="USERINPUT">make install</kbd>. In this
case, it will copy <tt class="FILENAME">foo</tt> into the user's home directory. This is
often used by application makefiles, so that the application can be installed in the
correct directory when it has been correctly compiled.</p>

<p>This is a slightly confusing subject to try to explain. If you do not quite understand
how <tt class="COMMAND">make</tt> works, the best thing to do is to write a simple
program like “hello world” and a make file like the one above and experiment.
Then progress to using more than one source file, or having the source file include a
header file. The <tt class="COMMAND">touch</tt> command is very useful here--it changes
the date on a file without you having to edit it.</p>
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN802" name="AEN802">2.5.3 Make and include-files</a></h2>

<p>C code often starts with a list of files to include, for example stdio.h. Some of
these files are system-include files, some of them are from the project you are now
working on:</p>

<pre class="PROGRAMLISTING">#include &lt;stdio.h&gt;<br>#include "foo.h"<br><br>int main(....<br></pre>

<p>To make sure that this file is recompiled the moment <tt class="FILENAME">foo.h</tt>
is changed, you have to add it in your <tt class="FILENAME">Makefile</tt>:</p>

<pre class="PROGRAMLISTING">foo: foo.c foo.h<br></pre>

<p>As your project gets bigger and you have more and more include-files of your own
to maintain, it becomes a pain to keep track of all the include-files and the files that
depend on them. If you change an include-file but forget to recompile all the files
that depend on it, the results can be devastating. <tt class="COMMAND">gcc</tt>
has an option to analyze your files and to produce a list of include-files and their
dependencies: <var class="OPTION">-MM</var>.</p>

<p>If you add this to your Makefile:</p>

<pre class="PROGRAMLISTING">depend:<br>    gcc -E -MM *.c &gt; .depend<br></pre>

<p>and run <kbd class="USERINPUT">make depend</kbd>, the file <tt class="FILENAME">.depend</tt> will appear with a list of object-files, C-files and the
include-files:</p>

<pre class="PROGRAMLISTING">foo.o: foo.c foo.h<br></pre>

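<p>If you change <tt class="FILENAME">foo.h</tt>, next time you run <tt class="COMMAND">make</tt> all files depending on <tt class="FILENAME">foo.h</tt> will be
recompiled. Berkeley make reads <tt class="FILENAME">.depend</tt> automatically if the file exists, so no extra line is needed in the <tt class="FILENAME">Makefile</tt>.</p>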

<p>Do not forget to run <tt class="COMMAND">make depend</tt> each time you add an
include-file to one of your files.</p>
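<p>To give a feel for the kind of dependency line that <var class="OPTION">-MM</var> produces, the sketch below scans C source text for headers included with double quotes and builds a line in the same shape. It is not how <tt class="COMMAND">gcc</tt> does it: the real option runs the preprocessor and follows nested includes, while the hypothetical <code class="FUNCTION">depend_line()</code> here only looks at the top level of one file.</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: write "obj: src header1 header2 ..." into out,
 * listing only headers that text includes with double quotes. System
 * headers in angle brackets are left out, as with -MM. Returns the
 * number of headers found, or -1 if out is too small. */
int depend_line(const char *obj, const char *src, const char *text,
                char *out, size_t outsize)
{
    int count = 0;
    int n = snprintf(out, outsize, "%s: %s", obj, src);
    size_t used;

    if (n < 0 || (size_t)n >= outsize)
        return -1;
    used = (size_t)n;

    while (*text != '\0') {
        const char *p = text + strspn(text, " \t");

        if (*p == '#') {
            p++;
            p += strspn(p, " \t");
            if (strncmp(p, "include", 7) == 0) {
                p += 7;
                p += strspn(p, " \t");
                if (*p == '"') {
                    /* Length of the name, stopping at the closing
                     * quote or the end of the line. */
                    size_t len = strcspn(p + 1, "\"\n");

                    if (len > 0 && p[1 + len] == '"') {
                        if (used + 1 + len >= outsize)
                            return -1;
                        out[used++] = ' ';
                        memcpy(out + used, p + 1, len);
                        used += len;
                        out[used] = '\0';
                        count++;
                    }
                }
            }
        }
        text = strchr(text, '\n');  /* advance to the next line */
        if (text == NULL)
            break;
        text++;
    }
    return count;
}
```

<p>For the small source file shown at the start of this section, the sketch yields <tt class="FILENAME">foo.o: foo.c foo.h</tt>, leaving out <tt class="FILENAME">stdio.h</tt> because it is a system include-file.</p>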
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN825" name="AEN825">2.5.4 FreeBSD Makefiles</a></h2>

<p>Makefiles can be rather complicated to write. Fortunately, BSD-based systems like
FreeBSD come with some very powerful ones as part of the system. One very good example of
this is the FreeBSD ports system. Here is the essential part of a typical ports <tt class="FILENAME">Makefile</tt>:</p>

<pre class="PROGRAMLISTING">MASTER_SITES=   ftp://freefall.cdrom.com/pub/FreeBSD/LOCAL_PORTS/<br>DISTFILES=      scheme-microcode+dist-7.3-freebsd.tgz<br><br>.include &lt;bsd.port.mk&gt;<br></pre>

<p>Now, if we go to the directory for this port and type <kbd class="USERINPUT">make</kbd>, the following happens:</p>

<div class="PROCEDURE">
<ol type="1"><li>
<p>A check is made to see if the source code for this port is already on the system.</p>
</li><li>
<p>If it is not, an FTP connection to the URL in <var class="SYMBOL">MASTER_SITES</var>
is set up to download the source.</p>
</li><li>
<p>The checksum for the source is calculated and compared with one for a known, good,
copy of the source. This is to make sure that the source was not corrupted while in
transit.</p>
</li><li>
<p>Any changes required to make the source work on FreeBSD are applied--this is known as
<i class="FIRSTTERM">patching</i>.</p>
</li><li>
<p>Any special configuration needed for the source is done. (Many <span class="TRADEMARK">UNIX</span>® program distributions try to work out which version of
<span class="TRADEMARK">UNIX</span> they are being compiled on and which optional <span class="TRADEMARK">UNIX</span> features are present--this is where they are given the
information in the FreeBSD ports scenario).</p>
</li><li>
<p>The source code for the program is compiled. In effect, we change to the directory
where the source was unpacked and do <tt class="COMMAND">make</tt>--the program's own
make file has the necessary information to build the program.</p>
</li><li>
<p>We now have a compiled version of the program. If we wish, we can test it now; when we
feel confident about the program, we can type <kbd class="USERINPUT">make install</kbd>.
This will cause the program and any supporting files it needs to be copied into the
correct location; an entry is also made into a <span class="DATABASE">package
database</span>, so that the port can easily be uninstalled later if we change our mind
about it.</p>
</li></ol>
</div>

<p>Now I think you will agree that is rather impressive for a four-line script!</p>

<p>The secret lies in the last line, which tells <tt class="COMMAND">make</tt> to look in
the system makefile called <tt class="FILENAME">bsd.port.mk</tt>. It is easy to overlook
this line, but this is where all the clever stuff comes from--someone has written a
makefile that tells <tt class="COMMAND">make</tt> to do all the things above (plus a
couple of other things I did not mention, including handling any errors that may occur)
and anyone can get access to that just by putting a single line in their own make
file!</p>

<p>If you want to have a look at these system makefiles, they are in <tt class="FILENAME">/usr/share/mk</tt>, but it is probably best to wait until you have had a
bit of practice with makefiles, as they are very complicated (and if you do look at them,
make sure you have a flask of strong coffee handy!)</p>
</div>

<div class="SECT2">
<h2 class="SECT2"><a id="AEN862" name="AEN862">2.5.5 More advanced uses of <tt class="COMMAND">make</tt></a></h2>

<p><tt class="COMMAND">Make</tt> is a very powerful tool, and can do much more than the
simple example above shows. Unfortunately, there are several different versions of <tt class="COMMAND">make</tt>, and they all differ considerably. The best way to learn what
they can do is probably to read the documentation--hopefully this introduction will have
given you a base from which you can do this.</p>

<p>The version of make that comes with FreeBSD is the <b class="APPLICATION">Berkeley
make</b>; there is a tutorial for it in <tt class="FILENAME">/usr/share/doc/psd/12.make</tt>. To view it, do</p>

<pre class="SCREEN"><samp class="PROMPT">%</samp> <kbd class="USERINPUT">zmore paper.ascii.gz</kbd>
</pre>

<p>in that directory.</p>

<p>Many applications in the ports use <b class="APPLICATION">GNU make</b>, which has a
very good set of “info” pages. If you have installed any of these ports, <b class="APPLICATION">GNU make</b> will automatically have been installed as <tt class="COMMAND">gmake</tt>. It is also available as a port and package in its own
right.</p>

<p>To view the info pages for <b class="APPLICATION">GNU make</b>, you will have to edit
the <tt class="FILENAME">dir</tt> file in the <tt class="FILENAME">/usr/local/info</tt>
directory to add an entry for it. This involves adding a line like</p>

<pre class="PROGRAMLISTING"> * Make: (make).                 The GNU Make utility.<br></pre>

<p>to the file. Once you have done this, you can type <kbd class="USERINPUT">info</kbd>
and then select <span class="GUIMENUITEM">make</span> from the menu (or in <b class="APPLICATION">Emacs</b>, do <kbd class="USERINPUT">C-h i</kbd>).</p>
</div>
</div>



<h3 class="FOOTNOTES">Notes</h3>



<table class="FOOTNOTES" border="0" width="100%">


<tbody><tr>
<td align="left" valign="top" width="5%"><a id="FTN.AEN745" name="FTN.AEN745" href="http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/tools-make.html#AEN745"><span class="footnote">[1]</span></a></td>
<td align="left" valign="top" width="95%">
<p>They do not use the <tt class="FILENAME">MAKEFILE</tt> form as block capitals are
often used for documentation files like <tt class="FILENAME">README</tt>.</p>
</td>
</tr>
</tbody>
</table>






<br><br><div align=right><a style="text-decoration:none;" href="http://www.cnitblog.com/SpiWolf/" target="_blank">幽灵狼</a> 2005-11-11 15:23 <a href="http://www.cnitblog.com/SpiWolf/archive/2005/11/11/4346.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>