Learning Python Part-13: Python Strings

Here we are with yet another literal type: strings.

Strings are used in all modern programming languages to store and process textual information. Logically, a string is a sequence of characters.

What is a Character?

In computer science, a character is a unit of information. Characters are represented visually by graphical shapes called graphemes. A grapheme is the fundamental unit of written or printed language, made up of lines, curves, and crossings at certain angles or positions.

ASCII and Unicode:


Unicode is a standard designed to represent any character from any language. Unlike ASCII, Unicode can handle text from any of the world’s writing systems.

ASCII covers basic English text but not languages such as Japanese or Korean, and it lacks the accented letters used in languages such as German. ASCII is restricted to 128 characters, and the various Extended ASCII encodings are limited to 256 characters (one byte per character).

A character in Unicode maps to a code point, an abstract number that identifies the character independently of how it is stored. For example, the character "A" is assigned the code point U+0041.

The "U+" means "Unicode" and the "0041" is the hexadecimal number 0x0041, which is 65 in decimal notation.

Example:

>>> hex(65)
'0x41'
>>> int(0x41)
65
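The mapping between characters and code points can be explored with the built-in ord() and chr() functions. The following is a minimal sketch; the variable names are only illustrative.

```python
# ord() returns the code point of a character; chr() goes the other way.
code_point = ord("A")
print(code_point)          # 65
print(hex(code_point))     # 0x41

letter = chr(0x41)
print(letter)              # A

# Code points are not limited to the ASCII range.
# "\u20ac" is the euro sign, written here as an escape sequence.
euro_label = "U+{:04X}".format(ord("\u20ac"))
print(euro_label)          # U+20AC
```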


Strings, Unicode and Python:


All strings in Python 3 are sequences of Unicode characters (code points); a string itself is not tied to any specific encoding. An encoding such as UTF-8, UTF-16, or UTF-32 only comes into play when a string is converted to bytes.
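As a rough illustration of the difference between a string and its encoded bytes, the sketch below encodes the same text with three encodings and compares the sizes. The sample text is an arbitrary choice, and the "-le" codec variants are used only so that no byte order mark is added to the counts.

```python
# "caf\u00e9" is the word "café"; the last letter is the single code point U+00E9.
text = "caf\u00e9"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
utf32_bytes = text.encode("utf-32-le")

print(len(text))          # 4 characters
print(len(utf8_bytes))    # 5 bytes: the accented letter needs two bytes in UTF-8
print(len(utf16_bytes))   # 8 bytes: two bytes per character here
print(len(utf32_bytes))   # 16 bytes: four bytes per character

# Decoding with the same encoding gives the original string back.
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)  # True
```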


There are different ways to define strings in Python:

We can use single quotes (') or double quotes (") to represent strings.


a = 'This is a string in single quotes.'
b = "This is a string in double quotes."

Multi-line strings can be denoted using triple quotes, ''' or """.

c = '''A string in triple quotes,
extends over multiple lines. Also,
it can contain 'single' and "double" quotes inside it.'''


Strings are immutable in Python, but they can be indexed. Every character in a string has an index in both directions: counting up from 0 at the left end, or counting down from -1 at the right end. Indexing from 0 is the same concept as in the C programming language.


Hence, the index operator [] can be used to access the character at a specific index, and the slice form [start:end] extracts a range of characters, as explained below.

Example:

a = "Hello world"

print(a[4])       # Output is 'o', the character at index 4 (the fifth character)
print(a[6:11])    # Output is 'world', from index 6 up to, but not including, index 11
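Indexing from the right end uses negative numbers. The following sketch shows a few common index and slice forms on the same string; the variable names are only illustrative.

```python
a = "Hello world"

first = a[0]        # index 0 is the first character
last = a[-1]        # index -1 is the last character
tail = a[-5:]       # the last five characters
head = a[:5]        # everything before index 5
backwards = a[::-1] # a step of -1 walks the string in reverse

print(first)      # H
print(last)       # d
print(tail)       # world
print(head)       # Hello
print(backwards)  # dlrow olleH
```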


As we mentioned earlier, strings are immutable in Python, which means we cannot replace characters in strings like we do in lists. 

Example:

a[5] = 'd'    # Raises TypeError because strings are immutable in Python.
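Since a string cannot be changed in place, the usual approach is to build a new string from pieces of the old one. This is a minimal sketch of that idea, not the only way to do it.

```python
a = "Hello world"

# Item assignment on a string fails with a TypeError.
try:
    a[5] = "d"
    assignment_failed = False
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)

# Build a new string from slices of the original instead.
b = a[:5] + "d" + a[6:]

print(b)   # Hellodworld
print(a)   # Hello world (the original is unchanged)
```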


Python String Operations:

  • Concatenation of Two or More Strings
    • Joining of two or more strings into a single one is called concatenation.
    • The + operator does this in Python. 
    • Simply writing two string literals together also concatenates them.
    • The * operator can be used to repeat a string a given number of times.


str1 = 'Hello'
str2 = 'World!'

# using +
print('str1 + str2 =', str1 + str2)
# using *
print('str1 * 3 =', str1 * 3)

    • Writing two string literals next to each other also concatenates them, just like the + operator.

greeting = 'Hello ' 'World!'
print(greeting)


  • Iterating Through String:
    • Using a for loop, we can iterate through a string. Here is an example that counts the number of 'l' characters in a string.

count = 0
for letter in 'Hello World':
    if letter == 'l':
        count += 1
print(count, 'letters found')

  • String Membership Test:
    • We can test whether a substring exists within a string by using the keyword in.

>>> 'a' in 'program'
True
>>> 'a' not in 'program'
False


Built-in Python functions to Work with strings
  • Various built-in functions that work with sequences work with strings as well.
  • Some of the commonly used ones are enumerate() and len()
  • The enumerate() function returns an enumerate object. 
  • It contains the index and value of all the items in the string as pairs. 
  • This can be useful for iteration.
  • Similarly, len() returns the length (number of characters) of the string.

text = 'cold'    # named text rather than str, to avoid shadowing the built-in str type

# enumerate()
list_enumerate = list(enumerate(text))
print('list(enumerate(text)) =', list_enumerate)

# character count
print('len(text) =', len(text))

Output:

list(enumerate(text)) = [(0, 'c'), (1, 'o'), (2, 'l'), (3, 'd')]
len(text) = 4
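To show how enumerate() helps during iteration, the sketch below records the index of every 'l' in a string. The sample string and variable names are only illustrative.

```python
phrase = "Hello World"

# enumerate() yields (index, character) pairs, so the loop
# can use the position and the character at the same time.
positions = []
for index, letter in enumerate(phrase):
    if letter == "l":
        positions.append(index)

print(positions)        # [2, 3, 9]
print(len(positions))   # 3
```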

Python String Formatting:
  • Escape Sequence
  • Raw String to ignore escape sequence
  • The format() Method for Formatting Strings
  • Old style formatting
  • Common Python String Methods



Escape Sequence:

  • If we want to print a text like -He said, "What's there?"- we can use neither single quotes nor double quotes to delimit it.
  • Doing so results in a SyntaxError, as the text itself contains both single and double quotes.
  • One way to get around this problem is to use triple quotes.

print('''He said, "What's there?"''')

  • Alternatively, we can use escape sequences.
  • An escape sequence starts with a backslash and is interpreted differently. 
  • If we use single quotes to delimit a string, all the single quotes inside the string must be escaped. The same applies to double quotes.

print('He said, "What\'s there?"')

print("He said, \"What's there?\"")

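List of the escape sequences supported by Python:

  • \newline: backslash followed by a line break is ignored (line continuation)
  • \\: backslash
  • \': single quote
  • \": double quote
  • \a: ASCII bell
  • \b: ASCII backspace
  • \f: ASCII form feed
  • \n: ASCII linefeed (newline)
  • \r: ASCII carriage return
  • \t: ASCII horizontal tab
  • \v: ASCII vertical tab
  • \ooo: character with octal value ooo
  • \xhh: character with hexadecimal value hh
  • \N{name}, \uxxxx, \Uxxxxxxxx: Unicode character given by name, or by 16-bit or 32-bit hexadecimal value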

Raw String to ignore escape sequence:
  • Sometimes we may wish to ignore the escape sequences inside a string. 
  • To do this we can place r or R in front of the string. 
  • This will imply that it is a raw string and any escape sequence inside it will be ignored.

# Without raw string

print("This is \x61 \nexample")

# Output:
This is a
example

# With raw string

print(r"This is \x61 \ngood example")

# Output:
This is \x61 \ngood example
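One place where raw strings tend to be handy is text that contains many literal backslashes, such as Windows-style paths. The path below is made up purely for illustration.

```python
# Without the r prefix, \n and \t are read as newline and tab.
normal = "C:\new\table.txt"

# With the r prefix, every backslash is kept as a literal character.
raw = r"C:\new\table.txt"

print(len(normal))   # 14: each escape sequence collapsed into one character
print(len(raw))      # 16: both backslashes are kept

# The raw string is the same as escaping each backslash by hand.
same = raw == "C:\\new\\table.txt"
print(same)          # True
```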

The format() Method for Formatting Strings
  • The format() method, available on every string object, is a versatile and powerful way to format strings.
  • A format string contains curly braces {} as placeholders (replacement fields), which are replaced by the arguments passed to format().
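The placeholders can be filled by position, by index, or by keyword, and may carry a format specification after a colon. The following is a small sketch; the sample values and names are made up for illustration.

```python
# Empty braces are filled in order.
greeting = "{}, {}!".format("Hello", "World")

# Numbered braces pick arguments by position.
swapped = "{1}, {0}!".format("World", "Hello")

# Named braces pick keyword arguments.
report = "{name} scored {score}".format(name="Asha", score=91)

# A format specification after the colon controls the presentation.
rounded = "The value of x is {:.2f}".format(12.3456789)
padded = "[{:>8}]".format("cold")   # right-align in a field 8 characters wide

print(greeting)   # Hello, World!
print(swapped)    # Hello, World!
print(report)     # Asha scored 91
print(rounded)    # The value of x is 12.35
print(padded)     # [    cold]
```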


Old style formatting:

  • We can also format strings in the old sprintf() style used in the C programming language.
  • We use the % operator to accomplish this.

x = 12.3456789
print('The value of x is %3.2f' % x)
print('The value of x is %3.4f' % x)


# Output

The value of x is 12.35
The value of x is 12.3457

Common Python String Methods:

  • There are numerous methods available with the string object. 
  • The format() method that we mentioned above is one of them. 
  • Some of the commonly used methods are
    • lower()
    • upper()
    • join()
    • split()
    • find()
    • replace()

Example:

a = "PytHONic".lower()

print(a)

Output:

pythonic

Example:


a = "PytHONic".upper()
print(a)

Output:

PYTHONIC

Example:


a = "This will split all words into a list".split()
print(a)

Output:


['This', 'will', 'split', 'all', 'words', 'into', 'a', 'list']

Example:


a = ''.join(['This', ' will', ' join', ' all', ' words', ' into', ' a', ' string', ' statement'])
print(a)

Output:

This will join all words into a string statement

Example:


a = 'Happy New Year'.find('ew')
print(a)

Output:


7

Example:


a = 'Happy New Year'.replace('Happy', 'Excellent')
print(a)

Output:


Excellent New Year
