@@ -30,8 +30,8 @@ For a while people just wrote programs that didn't display accents. I remember
looking at Apple ][ BASIC programs, published in French-language publications in
the mid-1980s, that had lines like these::

- PRINT "FICHIER EST COMPLETE."
- PRINT "CARACTERE NON ACCEPTE."
+ PRINT "FICHIER EST COMPLETE."
+ PRINT "CARACTERE NON ACCEPTE."

Those messages should contain accents, and they just look wrong to someone who
can read French.
@@ -89,11 +89,11 @@ standard, a code point is written using the notation U+12ca to mean the
character with value 0x12ca (4810 decimal). The Unicode standard contains a lot
of tables listing characters and their corresponding code points::

- 0061 'a'; LATIN SMALL LETTER A
- 0062 'b'; LATIN SMALL LETTER B
- 0063 'c'; LATIN SMALL LETTER C
- ...
- 007B '{'; LEFT CURLY BRACKET
+ 0061 'a'; LATIN SMALL LETTER A
+ 0062 'b'; LATIN SMALL LETTER B
+ 0063 'c'; LATIN SMALL LETTER C
+ ...
+ 007B '{'; LEFT CURLY BRACKET

Strictly, these definitions imply that it's meaningless to say 'this is
character U+12ca'. U+12ca is a code point, which represents some particular
@@ -597,19 +597,19 @@ encoding and a list of Unicode strings will be returned, while passing an 8-bit
path will return the 8-bit versions of the filenames. For example, assuming the
default filesystem encoding is UTF-8, running the following program::

- fn = u'filename\u4500abc'
- f = open(fn, 'w')
- f.close()
+ fn = u'filename\u4500abc'
+ f = open(fn, 'w')
+ f.close()

- import os
- print os.listdir('.')
- print os.listdir(u'.')
+ import os
+ print os.listdir('.')
+ print os.listdir(u'.')

will produce the following output::

- amk:~$ python t.py
- ['.svn', 'filename\xe4\x94\x80abc', ...]
- [u'.svn', u'filename\u4500abc', ...]
+ amk:~$ python t.py
+ ['.svn', 'filename\xe4\x94\x80abc', ...]
+ [u'.svn', u'filename\u4500abc', ...]

The first list contains UTF-8-encoded filenames, and the second list contains
the Unicode versions.
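
Reviewer note: the example in this hunk uses Python 2 syntax (``print`` statement, ``u''`` literals, 8-bit ``str`` paths). For readers on Python 3, the same behaviour can be sketched roughly as follows. This is an illustrative sketch, not part of the change under review: ``list_both_ways`` is a hypothetical helper name, the scratch directory is an assumption made so the example does not touch the current directory, and ``os.fsencode`` stands in for the "8-bit path" of the original text.

```python
import os
import tempfile


def list_both_ways(name):
    """Create an empty file called `name` in a scratch directory and
    return (text_listing, bytes_listing) for that directory.

    Hypothetical helper, written only to illustrate the hunk above.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Create the file using a text (Unicode) path.
        with open(os.path.join(tmp, name), 'w'):
            pass
        # A str path yields str filenames ...
        text_listing = os.listdir(tmp)
        # ... while a bytes path yields filenames as raw bytes,
        # encoded with the filesystem encoding.
        bytes_listing = os.listdir(os.fsencode(tmp))
    return text_listing, bytes_listing


text_names, byte_names = list_both_ways('filename\u4500abc')
print(text_names)
print(byte_names)
```

When the filesystem encoding is UTF-8, as the text assumes, the bytes listing should contain the same ``\xe4\x94\x80`` sequence shown in the output above, since that is the UTF-8 encoding of U+4500.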
@@ -703,26 +703,26 @@ Version 1.02: posted August 16 2005. Corrects factual errors.
- [ ] Unicode introduction
- [ ] ASCII
- [ ] Terms
- - [ ] Character
- - [ ] Code point
- - [ ] Encodings
- - [ ] Common encodings: ASCII, Latin-1, UTF-8
+ - [ ] Character
+ - [ ] Code point
+ - [ ] Encodings
+ - [ ] Common encodings: ASCII, Latin-1, UTF-8
- [ ] Unicode Python type
- - [ ] Writing unicode literals
- - [ ] Obscurity: -U switch
- - [ ] Built-ins
- - [ ] unichr()
- - [ ] ord()
- - [ ] unicode() constructor
- - [ ] Unicode type
- - [ ] encode(), decode() methods
+ - [ ] Writing unicode literals
+ - [ ] Obscurity: -U switch
+ - [ ] Built-ins
+ - [ ] unichr()
+ - [ ] ord()
+ - [ ] unicode() constructor
+ - [ ] Unicode type
+ - [ ] encode(), decode() methods
- [ ] Unicodedata module for character properties
- [ ] I/O
- - [ ] Reading/writing Unicode data into files
- - [ ] Byte-order marks
- - [ ] Unicode filenames
+ - [ ] Reading/writing Unicode data into files
+ - [ ] Byte-order marks
+ - [ ] Unicode filenames
- [ ] Writing Unicode programs
- - [ ] Do everything in Unicode
- - [ ] Declaring source code encodings (PEP 263)
+ - [ ] Do everything in Unicode
+ - [ ] Declaring source code encodings (PEP 263)
- [ ] Other issues
- - [ ] Building Python (UCS2, UCS4)
+ - [ ] Building Python (UCS2, UCS4)