Mapping of Unicode characters
Unicode’s Universal Character Set potentially supports over 1 million code points (1,114,112 = 2^20 + 2^16, or 17 × 2^16; hexadecimal 110000). As of Unicode 5.0.0, 102,012 (9.2%) of these code points are assigned, with another 137,468 (12.3%) reserved for private use, 2,048 for surrogates, and 66 designated noncharacters, leaving 872,582 (78.3%) unassigned. The number of assigned code points is made up as follows:
2,684 in reserve for designation within a particular block;
98,893 graphical characters;
435 special purpose characters for control, formatting, and glyph/character variation selection.
See more at Wikipedia.org...
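The code-space arithmetic and the Unicode 5.0.0 breakdown quoted above can be checked with a short Python sketch. The names CODE_SPACE and share are illustrative only and do not come from any Unicode library. The figures are copied from the text, not taken from the Unicode Character Database.

```python
# Sanity-check the code-space arithmetic quoted in the text.
CODE_SPACE = 0x110000  # one past the highest code point, U+10FFFF

assert CODE_SPACE == 1_114_112
assert CODE_SPACE == 2**20 + 2**16
assert CODE_SPACE == 17 * 2**16  # 17 planes of 65,536 code points each

# Assigned code points in Unicode 5.0.0, as itemised in the text.
graphical = 98_893
special_purpose = 435
reserved_in_block = 2_684
assigned = graphical + special_purpose + reserved_in_block
assert assigned == 102_012


def share(count: int) -> str:
    """Return count as a percentage of the code space, to one decimal place."""
    return f"{100 * count / CODE_SPACE:.1f}%"


print(share(assigned))  # 9.2%
print(share(137_468))   # 12.3% (private use)
```

The three assertions on CODE_SPACE are the same identity written three ways: sixteen supplementary planes (2^20 code points) plus the one basic plane (2^16 code points) give seventeen planes in total.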
Basic Multilingual Plane
<text, standard> (BMP) The first plane defined in Unicode/ISO 10646, designed to include all scripts in active modern use. The BMP currently includes the Latin, Greek, Cyrillic, Devanagari, hiragana, katakana, and Cherokee scripts, among others, and a large body of mathematical, APL-related, and other miscellaneous characters. Most of the Han ideographs in current use are present in the BMP, but due to the large number of ideographs, many were placed in the Supplementary Ideographic Plane.
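As a rough illustration of what belonging to a plane means, the Python sketch below finds the plane of a code point. It assumes the standard layout of 17 planes of 65,536 code points each, with the BMP as plane 0 (U+0000 to U+FFFF) and the Supplementary Ideographic Plane as plane 2. The helper names plane_of and in_bmp are hypothetical, not part of any library.

```python
def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains the given code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    # Each plane spans 2**16 code points, so the plane is the value above bit 15.
    return code_point >> 16


def in_bmp(ch: str) -> bool:
    """Return True if the single character ch lies in the Basic Multilingual Plane."""
    return plane_of(ord(ch)) == 0


print(in_bmp("A"))        # True: U+0041, Latin
print(in_bmp("\u3042"))   # True: U+3042, hiragana
print(in_bmp("\u13a0"))   # True: U+13A0, Cherokee
print(plane_of(0x20000))  # 2: first code point of the Supplementary Ideographic Plane
```

Shifting right by 16 bits is equivalent to integer division by 65,536, which is why the plane number falls out directly from the code point value.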
(2002-03-19)
(c) Copyright 1993 by Denis Howe