Changes between Version 3 and Version 4 of ThePerilsOfStr


Timestamp:
Jan 12, 2011, 11:01:06 AM
Author:
flip
Comment:

--

  • ThePerilsOfStr

    v3 v4  
    1616 * We can't avoid using `unicode()` entirely because that's the best way
    1717 to convert our custom objects (experiments, metabs, etc.) to strings. For example,
    18  if you need a string representation of an experiment, use `unicode(experiment)`.
     18 if you need a string representation of an experiment, use
     19 `unicode(the_experiment)`.
    1920
    2021
     
    4546The result is an 8 bit string. The `.decode()` method reverses the process.
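
As a minimal sketch of that round trip (the variable names are illustrative, and UTF-8 is an assumed choice of encoding, not one this page prescribes):

```python
# Illustrative names only; UTF-8 is an assumed encoding.
name = u"Bj\xf6rn"                 # Unicode string, 5 characters

encoded = name.encode("utf-8")     # 8-bit string; the o-umlaut becomes two bytes
decoded = encoded.decode("utf-8")  # .decode() reverses .encode()

print(len(name), len(encoded))     # character count vs. byte count
print(decoded == name)
```

The encoded form is one byte longer than the original because the single non-ASCII character needs two bytes in UTF-8.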
    4647
     48
    4749It's less obvious, but the built-in function `str()` can also convert a
    4850Unicode string into 8 bit. Normally we call `str()` on non-string objects
     
    6870
    6971The string `u"Bj\xf6rn"` is Python's Unicode representation of
    70 "Björn" which is a fairly common Swedish first name.
     72"Björn" which is a fairly common first name in Sweden.
    7173
    7274
     
    102104
    103105
    104 So our code that calls `str()` to convert strings from the GUI or the database
    105 into 8 bit representations in order to make them safe for PyGAMMA or to
    106 display them in a text file will
    107 break as soon as someone passes non-ASCII to it.
     106So the code that we wrote that calls `str()` to convert strings from the
     107GUI or the database into 8 bit representations in order to make those strings
     108safe for PyGAMMA or to display them in a text file would have broken as soon as
     109someone passed non-ASCII to it. All of that code was replaced in r1652.
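
The failure is easy to reproduce. Under Python 2, `str()` on a Unicode string implicitly encodes it with the default ASCII codec. The sketch below spells that step out as an explicit `.encode("ascii")` so that it behaves the same on any Python version. The UTF-8 call at the end is only one possible fix, shown as an assumption, and is not necessarily what r1652 did.

```python
# Illustrative sketch: the explicit ASCII encode stands in for what
# Python 2's str() does implicitly to a Unicode string.
name = u"Bj\xf6rn"

try:
    name.encode("ascii")
    failed = False
except UnicodeEncodeError:
    # The o-umlaut (U+00F6) has no ASCII representation.
    failed = True

# Assumed alternative: name the encoding explicitly instead of relying
# on the ASCII default.
safe = name.encode("utf-8")

print(failed)
```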
    108110
    109111
     
    114116If the object doesn't implement a custom `__str__()` method, standard object
    115117inheritance implies a call to the `__str__()` method on Python's `object`
    116 class (remember that everything inherits from object). That will print
     118class (from which all objects derive). That will print
    117119the reliable-but-boring representation you've surely seen before --
    118120{{{
     
    177179== Technical Notes ==
    178180
     181There's no harm in calling `.encode()` on a non-Unicode string, provided
     182it contains only ASCII. Python returns the same string --
     183{{{
     184#!python
     185>>> assert("foo" == "foo".encode("utf-8"))
     186>>> assert(type("foo") == type("foo".encode("utf-8")))
     187}}}
     188
     189
    179190One can make a string literal Unicode by prefacing it with 'u' as in the
    180191example below.