The Single UNIX Specification
An Introduction
Many names have been applied to the work that has culminated in the Single UNIX Specification and the accompanying UNIX certification program. It began as the Common API Specification, became Spec 1170, and is now, in its latest iteration, the Single UNIX Specification, Version 5 (2024). It is published as a number of The Open Group Technical Standards, the core of which are also POSIX.1-2024.
The Single UNIX Specification uses The Open Group Base Specifications documentation as its core. The documentation is structured as follows:
- The Base Specifications, Issue 8, composed of:
  - Base Definitions, Issue 8 (XBD)
  - System Interfaces, Issue 8 (XSH)
  - Shell and Utilities, Issue 8 (XCU)
  - Rationale, Issue 8 (XRAT) (Informative)
- X/Open Curses, Issue 8
One thing that becomes apparent working with the Single UNIX Specification is its focus on application development. The Single UNIX Specification is similar to the User's and Programmer's Reference Manuals on Berkeley or System V systems. Matters of system management are not part of this specification. Directory organization is not discussed beyond the simple few directories and devices that applications generally use. User management discussions do not appear. There is no discussion of such files as /etc/passwd or /etc/group, since an application's access to the information traditionally kept in these files is through programmatic interfaces such as getpwnam() and getgrnam(). Processes have appropriate privileges, and there is no concept of the ``superuser'' or ``root''.
Standards Alignment
The Single UNIX Specification supports formal standards developed for application portability. The following source code portability standards lie at the core of the specification:
- POSIX.1-2024 (This is technically identical to the Base Specifications; they are one and the same document.)
- ISO/IEC 9899 (ISO C)
The Single UNIX Specification fully aligns with these standards. Functional extensions beyond the required POSIX base are identified by the X/Open System Interfaces option (XSI), and by mandating certain other POSIX options — for example, File Synchronization.
Where there are multiple interfaces to accomplish a task, the standards-based interfaces are clearly identified as the preferred approach to support long-term portability.
Portability Codes
While great care has been taken to align the Single UNIX Specification with formal standards, it is still a superset specification that extends functionality or perhaps presents a more exact (and restrictive) definition. When this occurs, the text in the Single UNIX Specification is clearly marked by shading and a portability code appears in the margin (sometimes referred to as a margin code or option code). In this version, the number of portability codes has been reduced, as the functionality associated with various old options has been mandated.
Programmers need to take care when using functionality that appears in a shaded area if they are developing applications that need to be maximally portable or portable beyond UNIX certified systems. For example, if functionality is marked with XSI in the margin, it will be available on all UNIX certified systems, but may not be available on systems only supporting the base POSIX.1 requirements. Alternatively, an application may depend on the exact format of output from a particular utility whose output format is incompletely specified, as indicated by shading and OF marked in the margin. It is likely that an application is developed with a particular platform, or at least a well-defined set of platforms in mind. These codes are exceptionally useful to warn a developer of areas of potential problems.
There are 45 margin codes defined in total in XBD, Section 2.1.6, Options. Of these, 40 reflect optional features defined within the POSIX base standard, six of which are mandatory in the Single UNIX Specification (FSC, TSA, TSH, TSS, UP, and XSI). One of the codes, MC1, is a shorthand notation for a permutation of certain options. Three of the codes are special codes that denote other portability warnings: OB (obsolescent), OH (optional header), and OF (output incompletely specified).
All the codes including those of the POSIX options are listed below together with an indication of their status within the Single UNIX Specification:
Single UNIX Specification Codes
- ADV (Advisory Information): Identifies interfaces and additional semantics that are optional in the Single UNIX Specification. Part of the Advanced Realtime Option Group.
- CD (C-Language Development Utilities): A set of utilities optional in the Single UNIX Specification. Defined in XCU.
- CPT (Process CPU-Time Clocks): Identifies interfaces and semantics optional in the Single UNIX Specification. Part of the Advanced Realtime Option Group.
- CX (Extension to the ISO C Standard): Extensions beyond the ISO C standard. Mandatory on all systems supporting POSIX.1 and the Single UNIX Specification.
- DC (Device Control): The functionality described is optional. The functionality described is also an extension to the ISO C standard.
- FD (FORTRAN Development Utilities): Utilities optional in the Single UNIX Specification. Defined in XCU.
- FR (FORTRAN Runtime Utilities): Utilities optional in the Single UNIX Specification. Defined in XCU.
- FSC (File Synchronization): Interfaces and semantics optional within POSIX base but mandatory in the Single UNIX Specification.
- IP6 (IPv6): Additional semantics for networking interfaces relating to IP Version 6 support, optional in the Single UNIX Specification.
- MC1 (Mutex Priority Protection Variants): Special margin code shorthand for Non-Robust and Robust Mutex Priority Protection or Inheritance.
- ML (Process Memory Locking): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Option Group.
- MLR (Range Memory Locking): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Option Group.
- MSG (Message Passing): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Option Group.
- MX (IEC 60559 Floating-Point Option): Additional semantics within Math interfaces optional in the Single UNIX Specification. Supports IEC 60559:1989 floating-point standard.
- MXC (IEC 60559 Complex Floating-Point): The functionality described is optional. The functionality described is mandated by the ISO C standard only for implementations that define __STDC_IEC_559_COMPLEX__.
- MXX (IEC 60559 Floating-Point Extension): The functionality described is optional. The functionality described is part of the IEC 60559 Floating-Point option, but is an extension to the ISO C standard.
- OB (Obsolescent): Features portable to all Single UNIX Specification platforms but may be withdrawn in future. Should be avoided.
- OF (Output Format Incompletely Specified): Utility output format is incompletely specified; cannot be processed consistently across platforms.
- OH (Optional Header): Certain headers not required in source modules on XSI-conforming systems, though POSIX may require them.
- PIO (Prioritized Input and Output): Interfaces and semantics optional in the Single UNIX Specification.
- PS (Process Scheduling): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Option Group.
- RPI (Robust Mutex Priority Inheritance): Interfaces and semantics optional in the Single UNIX Specification.
- RPP (Robust Mutex Priority Protection): Interfaces and semantics optional in the Single UNIX Specification.
- RS (Raw Sockets): Additional semantics for sockets optional in the Single UNIX Specification.
- SD (Software Development Utilities): Utilities optional in the Single UNIX Specification. Defined in XCU.
- SHM (Shared Memory Objects): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Option Group.
- SIO (Synchronized Input and Output): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Option Group.
- SPN (Spawn): Interfaces and semantics optional in the Single UNIX Specification. Part of the Advanced Realtime Option Group.
- SS (Process Sporadic Server): Interfaces and semantics optional in the Single UNIX Specification. Part of the Advanced Realtime Option Group.
- TCT (Thread CPU-Time Clocks): Interfaces and semantics optional in the Single UNIX Specification. Part of the Advanced Realtime Threads Option Group.
- TEF (Trace Event Filter): Interfaces and semantics optional in the Single UNIX Specification. Part of the Tracing Option Group.
- TPI (Non-Robust Mutex Priority Inheritance): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Threads Option Group.
- TPP (Non-Robust Mutex Priority Protection): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Threads Option Group.
- TPS (Thread Execution Scheduling): Interfaces and semantics optional in the Single UNIX Specification. Part of the Realtime Threads Option Group.
- TSA (Thread Stack Address Attribute): Semantics optional within POSIX base but mandatory in the Single UNIX Specification.
- TSH (Thread Process-Shared Synchronization): Interfaces and semantics optional within POSIX base but mandatory in the Single UNIX Specification.
- TSP (Thread Sporadic Server): Interfaces and semantics optional in the Single UNIX Specification. Part of the Advanced Realtime Threads Option Group.
- TSS (Thread Stack Size Attribute): Semantics optional within POSIX base but mandatory in the Single UNIX Specification.
- TYM (Typed Memory Objects): Interfaces and semantics optional in the Single UNIX Specification. Part of the Advanced Realtime Option Group.
- UP (User Portability Utilities): Interfaces and semantics in XCU optional within POSIX base but mandatory in the Single UNIX Specification.
- UU (UUCP Utilities): Utilities optional in the Single UNIX Specification. Defined in XCU.
- XSI (X/Open System Interfaces): Interfaces and semantics optional within POSIX base but mandatory in the Single UNIX Specification. Required on all UNIX certified systems.
Where a portability code applies to an entire function or utility, the SYNOPSIS section of the corresponding reference page is shaded and marked with the margin code. Refer to XBD, Section 1.7, Portability for more information.
Option Groups
The Single UNIX Specification includes a set of profiling options, allowing larger profiles of the options of the Base standard. The Option Groups within the Single UNIX Specification are defined within XBD, Section 2.1.5.2, XSI Option Groups.
The Single UNIX Specification contains the following Option Groups, listed with the functional areas they cover:
- Encryption: Covering the functions crypt(), encrypt(), and setkey().
- Realtime: Covering functions from the IEEE Std 1003.1b-1993 Realtime extension.
- Realtime Threads: Covering threads-related functions also related to realtime functionality.
- Advanced Realtime: Covering some of the non-threads-related functions originally from IEEE Std 1003.1d-1999 and IEEE Std 1003.1j-2000.
- Advanced Realtime Threads: Covering some of the threads-related functions originally from IEEE Std 1003.1d-1999 and IEEE Std 1003.1j-2000.
- Tracing: Covering functionality originally from IEEE Std 1003.1q-2000. Marked obsolescent in this version of the Single UNIX Specification.
Common Directories and Devices
The Single UNIX Specification describes an applications portability environment, and as such defines a certain minimal set of directories and devices that applications regularly use. The following directories are defined:
- /: The root directory of the file system.
- /dev: Contains special device files such as /dev/console, /dev/null, and /dev/tty.
- /tmp: A directory where applications can create temporary files.
The directory structure does not cross into such system management issues as where user accounts are organized or software packages are installed. Refer to XBD, Section 10.1, Directory Structure and Files for more information.
XBD, Chapter 10, Directory Structure and Devices also defines the mapping of <control>-char sequences to control character values, and associated requirements on system documentation.
Environment Variables
When a program begins, an environment is made available to it. The environment consists of strings of the form name=value, where name is the name associated with the environment variable, and its value is represented by the characters in value. UNIX systems traditionally pass information to programs through the environment variable mechanism. The Single UNIX Specification uses only uppercase characters, digits, and underscores to name environment variables, reserving for applications the name space of names containing lowercase characters.
A number of utilities and functions defined in the Single UNIX Specification use environment variables to modify their behavior. The ENVIRONMENT VARIABLES section of a utility's reference page describes any appropriate environment variables. Quite a number of environment variables (listed below) modify the behavior of more than a single utility.
ARFLAGS
CC
CDPATH
CFLAGS
CHARSET
COLUMNS
DATEMSK
DEAD
EDITOR
ENV
EXINIT
FC
FCEDIT
FFLAGS
GET
GFLAGS
HISTFILE
HISTORY
HISTSIZE
HOME
IFS
LANG
LC_ALL
LC_COLLATE
LC_CTYPE
LC_MESSAGES
LC_MONETARY
LC_NUMERIC
LC_TIME
LDFLAGS
LEX
LFLAGS
LINENO
LINES
LISTER
LOGNAME
LPDEST
MAIL
MAILCHECK
MAILER
MAILPATH
MAILRC
MAKEFLAGS
MAKESHELL
MANPATH
MBOX
MORE
MSGVERB
NLSPATH
NPROC
OLDPWD
OPTARG
OPTERR
OPTIND
PAGER
PATH
PPID
PRINTER
PROCLANG
PROJECTDIR
PS1
PS2
PS3
PS4
PWD
RANDOM
SECONDS
SHELL
TERM
TERMCAP
TERMINFO
TMPDIR
TZ
USER
VISUAL
YACC
YFLAGS
YACC Grammars as Specifications
The Single UNIX Specification describes certain functionality and ``little'' languages with yacc grammars. The yacc utility (Yet-Another-Compiler-Compiler) is used to build language parsers, and the syntax of the yacc input language is designed for describing languages. Because of this, and its familiarity in the community, it has been used in the Single UNIX Specification to describe the shell (XCU, Chapter 2, Shell Command Language), the awk language (XCU, Chapter 4, Utilities, awk), the bc language (XCU, Chapter 4, Utilities, bc), the locale definition language (XBD, Chapter 7, Locale), basic and extended regular expressions (XBD, Chapter 9, Regular Expressions), and terminfo source format (X/Open Curses, Chapter 7, Terminfo Source Format).
These grammars are representations of the languages they describe, and a number of caveats apply:
- They are not complete input grammars to yacc itself.
- They are not based on any existing implementations, nor have they been tested as such.
- Were these partial grammars to be completed, there are no guarantees that they are the most efficient method of defining the respective languages for yacc to process.
Refer to XCU, Chapter 1 for more information.
Regular Expressions
Both Basic Regular Expressions (BREs) and Extended Regular Expressions (EREs) are described in XBD, Chapter 9, Regular Expressions and all of the utilities and interfaces that use regular expressions refer back to this definition.
Basic regular expressions: csplit, ed, ex, expr, grep, more, nl, pax, sed, vi
Extended regular expressions: awk, grep -E, lex, sed -E
The functions regcomp() and regexec() in XSH, Chapter 3, System Interfaces implement regular expressions as defined in the Single UNIX Specification.
File Access
The Single UNIX Specification describes how symbolic links interact with, and may affect, file access in the file system. The behavior of symbolic links is fully specified with respect to their creation and use through the relevant XSH interfaces, such as lchown(), lstat(), readlink(), realpath(), and symlink(), and pathname resolution (XBD, Chapter 4, General Concepts). The behavior is also specified with respect to the utilities described in XCU.
Programming Environment
The Single UNIX Specification and UNIX certified systems are tools for developing portable applications and for porting existing applications that were originally developed to run on UNIX systems. XSH defines the C-language programming environment, and the syntax and semantics of the interfaces. Feature test macros, name space issues, and the program interaction with the operating system are all described in the opening chapters of XSH. The following introductions to these topics include references and additional explanations to orient an application developer with the information presented in XSH.
C-Language Support
The programming interfaces in XSH are described in C-language syntax, as defined in the ISO C standard, and presume a C-language compilation environment. (Implementations may make the functionality available through other programming languages, but this is not covered by the Single UNIX Specification.)
For an implementation to be a conforming UNIX certified system, it must support the ISO C standard. Implementations may additionally support the X/Open Common Usage C dialect as a migration strategy from previous XPG3 environments. X/Open Common Usage C is defined in Programming Languages, Issue 3, Chapters 1-4, and essentially refers to the C Language before the 1989 ANSI C standard.
XCU defines c17 as the interface to the C compilation environment. The c17 interface is an interface to the standard c17 compiler. The c99, c89 and cc utilities are not defined in this version of the Single UNIX Specification, although implementations may additionally support them for backwards-compatibility.
Feature Test Macros and Name Space Issues
There are a number of tasks that must be done to effectively make the interface environment available to a program. One or more C-language macros, referred to as feature test macros, must be defined before any headers are included. These macros might more accurately be referred to as header configuration macros, as they control what symbols will be exposed by the headers. The macro _XOPEN_SOURCE must be defined to a value of 800 to make available the functionality of the Single UNIX Specification, Version 5. With respect to POSIX base functionality covered by the Single UNIX Specification, this is equivalent to defining the macro _POSIX_C_SOURCE to be 202405L.
Use of the _XOPEN_SOURCE macro should not be confused with indicator macros associated with options and Option Groups, such as _XOPEN_UNIX, which are defined by the implementation in <unistd.h>.
In the first case (feature test macro _XOPEN_SOURCE defined by the application), the application is announcing to the implementation its desire to have certain symbols exposed in standard headers. The implementation tests the value of the macro to determine what features have been requested. In the second case (indicator macro _XOPEN_UNIX defined by the implementation in <unistd.h>), the implementation is announcing to the application what functionality it supports. The application tests the macro value to determine whether the implementation supports the functionality it wants to use (the XSI option in this example).
The UNIX system name space is well defined. All identifiers defined in the Single UNIX Specification appear in the headers described in XBD, Chapter 13, Headers (with the exception of environ). Tables in XSH, Section 2.2, The Compilation Environment clearly describe which identifier prefixes and suffixes are reserved for the implementation, which identifier macro prefixes may be used by the application programmer providing appropriate #undef statements are used to prevent conflicts, and which identifiers are reserved for external linkage.
Error Numbers
Each interface reference page lists in the ERRORS section possible error returns that may be tested either in errno or in the function return value upon the unsuccessful completion of a function call. Each error return has a symbolic name (defined as a manifest constant in <errno.h>) which should always be used by the portable application, as the actual error values are unspecified. All of the error names are listed in XSH, Section 2.3, Error Numbers along with additional relevant information.
Signal Concepts
The signal() function defined by the ISO C standard has shortcomings that make it unreliable for many application uses on some implementations. The Single UNIX Specification defines a reliable signal mechanism which applications should use instead. XSH, Section 2.4, Signal Concepts discusses signal generation and delivery, signal actions, async-signal-safe functions, and interruption of functions by signals.
Standard I/O Streams
When a program starts, it has three I/O streams associated with it, namely standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). These streams are already open for the process, and ready for I/O.
The mechanics of stream I/O, buffering, relationships to file descriptors, and inheritance across process creation are all discussed in XSH, Section 2.5, Standard I/O Streams. New in this version of the Single UNIX Specification are facilities for associating a standard I/O stream with a memory buffer instead of a file.
Realtime
The Single UNIX Specification includes realtime functionality to support the source portability of applications with realtime requirements. Realtime is discussed in XSH, Section 2.8, Realtime. This section includes an overview of the functional areas: semaphores, process memory locking, memory mapped files and shared memory objects, priority scheduling, realtime signals, timers, interprocess communication, synchronous input/output, and asynchronous input/output.
Threads
XSH describes functionality to support multiple flows of control, called threads, within a process. Threads are discussed in XSH, Section 2.9, Threads, which includes an overview of the supported interfaces, threads implementation models, thread mutexes, thread attributes, thread scheduling, thread cancelation, thread read-write locks, and application-managed thread stacks.
Sockets
UNIX certified systems support UNIX domain sockets for process-to-process
communication in a single system and network sockets using Internet
protocols based on IPv4, and may also support raw sockets and network
sockets using Internet protocols based on IPv6.
XSH Section 2.10, Sockets, discusses all aspects of UNIX domain sockets and network sockets, including socket types, addressing, protocols, and socket options.
General Terminal Interface
The general terminal interface is described in the Single UNIX Specification in XBD, Chapter 11, General Terminal Interface, providing a mechanism to control asynchronous communications ports. It is left to implementations as to whether or not they support network connections and synchronous communications ports.
While all of the interface details are contained in XSH, the mechanics of the terminal interface with respect to process groups, controlling terminals, input and output processing, input, output, and control modes, and special characters are described in XBD, Chapter 11, General Terminal Interface.
This interface should not be confused with the X/Open Curses interface that provides a terminal-independent way to update character screens.
How to Read an XSH Reference Page
Each reference page in XSH has a common layout of sections describing the interface. (Function interface descriptions in X/Open Curses follow the same layout.) This layout is similar to the manual page or ``man'' page format shipped with most UNIX systems, and each interface has SYNOPSIS, DESCRIPTION, RETURN VALUE, and ERRORS sections. These are the four sections that relate to conformance.
Additional sections contain considerable extra information for the application developer. The EXAMPLES sections provide source code examples of how to use certain interfaces. The APPLICATION USAGE sections provide additional caveats, issues, and recommendations to the developer. The SEE ALSO sections contain useful pointers to related interfaces and headers that a developer may wish to also read.
The FUTURE DIRECTIONS sections act as pointers to related work that may impact the interface in the future, and often caution the developer to architect the code to account for a change in this area. (A FUTURE DIRECTIONS section expresses current thinking and should not be considered a commitment to adopt the feature or interface in the future.)
The RATIONALE sections include historical information about an interface and why features were included or discarded in the definition.
The CHANGE HISTORY sections describe when the interface was introduced, and how it has changed. This information can be useful when porting existing applications that may reflect earlier implementations of the interface.
Option Group labels in the reference page headers, and portability shading and margin marks are features already described in this document; they appear on the reference pages to guide an application developer when deciding how best an interface should be used. Refer to XSH, Section 1.2, Format of Entries for information on the exact layout.
Commands and Utilities Environment
The Single UNIX Specification describes 174 utilities supported on UNIX certified systems. These utilities provide a rich environment for building shell script applications, supporting program development (the C Language in particular), and providing a user portability environment.
The shell command language, symbolic links, file format notation, utility reference page layouts, and guidelines, are all introduced in this section. The following introductions to these topics include references and additional explanations to orient an application developer with the information presented in XCU.
Shell Command Language
The shell is a powerful and flexible programming language. A considerable number of utilities on early UNIX systems were actually shell programs.
The Single UNIX Specification shell is the standard POSIX shell. This shell is for the most part based on the Bourne shell with features from the KornShell, ksh.
The shell command language is defined in its entirety in XCU, Chapter 2, Shell Command Language. This is a very strict definition of the shell. Token recognition, word expansions, simple and compound commands, the shell grammar, the execution environment, and special built-ins are a few of the topics covered. It is not a guide to writing or porting shell scripts.
Symbolic Links
The Single UNIX Specification includes support for symbolic links in the file system. The programmatic interfaces that manipulate symbolic links (or symlinks) are all well defined in XSH and XCU, and symlink concepts with respect to issues like pathname resolution are discussed in XBD, Chapter 4, General Concepts. A UNIX certified system supports symlinks, and application programs may make use of them.
File Format Notation
Sections in the utility reference pages often require the expected input used by the utility or output it generates to be described. Additionally, information files used or created sometimes require description. The method used throughout XCU is a format description plus its arguments, similar to that used by the printf() function. These file format specifications are presented as:
"<format>"[ ,<arg1>, <arg2>, ..., <argn>]
The format specifier contains the format string a programmer might use to write the data. The specifiers should not be considered format strings that could be directly used in a call to scanf(). The conversion specifications in the format string are what would be expected by a developer familiar with the printf() interface. Refer to XBD, Chapter 5, File Format Notation for more information.
How to Read an XCU Reference Page
Each reference page in XCU has a common layout of sections describing the interface. This layout, while similar to the manual page or ``man'' page format shipped with most UNIX systems, offers a more detailed view of the utility's description.
As well as the SYNOPSIS and DESCRIPTION sections, each interface has OPTIONS, OPERANDS, STDIN (standard input format), INPUT FILES, ENVIRONMENT VARIABLES, ASYNCHRONOUS EVENTS (what signals are caught and the consequence of receiving signals), STDOUT (standard output format), STDERR, and OUTPUT FILES sections.
An EXTENDED DESCRIPTION section is used if the utility has a particularly long description; for example, if it supports its own language (awk).
Utilities generally return 0 upon successful completion and a status greater than 0 upon failure. The EXIT STATUS sections specify this, and also describe any particular values returned in certain circumstances. In general, an application should be written to test for successful completion, rather than specific error returns.
The CONSEQUENCE OF ERRORS section describes what happens to such items as open files, process state, and the environment, should errors occur.
As with the XSH reference pages, additional sections contain considerable extra information for the application developer, and include EXAMPLES, APPLICATION USAGE, FUTURE DIRECTIONS, RATIONALE, SEE ALSO, and CHANGE HISTORY sections. The defaults for these sections, and additional detail about what each section specifies, are covered in XCU, Section 1.4, Utility Description Defaults.
Terminal Interfaces Environment
The Single UNIX Specification includes X/Open Curses. These interfaces provide a terminal-independent character screen update method. The functionality includes:
- Multibyte and wide-character support
- Color terminal support
- Wide and non-spacing character generalizations such that multi-column display characters and non-spacing characters are supported
New interfaces over Issue 3 are marked with ENHANCED CURSES in the reference page header, and shading and a portability code (EC) are used in the introductory chapters and reference pages to indicate the extended features.
Applications using the interfaces from X/Open Curses need to define the _XOPEN_SOURCE macro to be 800 prior to including any headers. This feature test macro ensures the appropriate name space is exposed in the headers.
On UNIX certified systems, the c17 compiler recognizes the additional curses library option-argument for the -l option.
There are four X/Open Curses utilities: infocmp, tic, tput, and untic.
Internationalization
There is a rich set of interfaces in the Single UNIX Specification to
support internationalized applications development. Internationalization
refers to developing an application without prior knowledge of the
language, cultural information, or character set encoding scheme that
will be used in the run-time environment. The application responds
accordingly at run time for cultural or locale-specific conditions. The
term ``internationalization'' is often shortened to simply ``I18N'',
for ``I''-18 letters (nternationalizatio)-``N''.
Localization is the process of establishing a base of cultural and codeset data on a system such that it can be accessed. It is the method by which locales are created in such a way that internationalized programs can access the relevant information.
There are a number of factors that need to be considered when structuring an application to support multiple cultures:
- Cultural-specific data (for example, date, number, and monetary formats) needs to be defined in a locale, with the appropriate tools and interfaces available for creating and accessing the information.
- Provision for handling text strings in programs, and managing catalogs of these strings for multiple natural languages, needs to be made.
- Multibyte and wide-character handling interfaces are required to manage textual data that is not represented in single byte per character formats.
- Codeset conversion tools need to be supported.
All of these requirements are met in the Single UNIX Specification. Utilities exist to support the definition and display of locales (localedef and locale). Programming interfaces exist to access the locale-specific data, and others have been extended such that all the checking and manipulations done with characters and strings can be done with wide characters and wide-character strings. Interfaces to access message string catalogs (catopen(), catgets(), and catclose()) and a tool to define them (gencat) exist. Codeset conversions can be done at the utility (iconv) and program level (iconv_open(), iconv(), and iconv_close()).
New in this version of the Single UNIX Specification are facilities for handling multiple concurrent locales (newlocale(), duplocale(), uselocale(), and variants of existing functions with names ending _l; for example, isalnum_l()).
Much of the I18N information is spread throughout the Single UNIX Specification, with specific interface and utility descriptions appearing in XSH and XCU, respectively. Introductions to character and codeset issues and the locale definition language appear in XBD, Chapter 6, Character Set, and Chapter 7, Locale, respectively.