Strings are used in all modern programming languages to store and process textual information. Logically, a string is a sequence of characters.
What is a Character?
In computer science, a character is a unit of information. Characters are represented visually by graphical shapes called graphemes. A grapheme is the fundamental unit of a written or printed language and is made up of lines, curves, and crossings at particular angles and positions.
ASCII and Unicode:
Unicode is a standard designed to represent any character from any language. ASCII defines only 128 characters, mostly English letters, digits, and punctuation, whereas Unicode can handle text from any of the world’s writing systems.
A character in Unicode maps to a code point. A code point is an abstract number that identifies the character, independent of how it is stored in memory. For example, the character “A” is assigned the code point U+0041.
The “U+” means “Unicode” and “0041” is the hexadecimal number 0x0041, which is 65 in decimal notation.
Example:
>>> hex(65)
'0x41'
>>> int('0x41', 16)
65
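As a small illustration of the mapping between characters and code points, the sketch below uses Python's built-in ord() and chr() functions. The euro sign is chosen here only as an example character, and the variable names are illustrative.

```python
# ord() returns the code point of a single character as an integer.
euro_code_point = ord("€")

# chr() goes the other way: from a code point to its character.
euro_char = chr(0x20AC)

# Format the code point in the conventional U+XXXX notation.
euro_label = f"U+{euro_code_point:04X}"

print(euro_code_point)  # 8364
print(euro_char)        # €
print(euro_label)       # U+20AC
```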
All strings in Python 3 are sequences of Unicode code points; the str type itself is not tied to any specific encoding. An encoding such as UTF-8, UTF-16, or UTF-32 is applied only when a string is converted to bytes.
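To make the difference between a string and its encoded form concrete, here is a minimal sketch that encodes one character with three Unicode encodings. The little-endian variants are an arbitrary choice made here so that no byte order mark is added to the output.

```python
text = "€"

# Encoding turns the abstract code point into concrete bytes.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark
utf32_bytes = text.encode("utf-32-le")

print(len(text))         # 1 (one character)
print(len(utf8_bytes))   # 3
print(len(utf16_bytes))  # 2
print(len(utf32_bytes))  # 4

# Decoding with the matching encoding recovers the original string.
decoded = utf8_bytes.decode("utf-8")
print(decoded == text)   # True
```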
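We can use single quotes (') or double quotes (") to represent strings. Triple quotes (''' or """) can also be used, typically for strings that span multiple lines.
c = '''A string in triple quotes,
which can span multiple lines'''
a = "Hello world"
print(a[4])     # Output is 'o': indexing starts at 0, so index 4 is the fifth character
print(a[6:11])  # Output is 'world': characters from index 6 up to, but not including, index 11
a[5] = 'd'      # Raises TypeError because strings are immutable in Python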
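Because a string cannot be changed in place, the usual approach is to build a new string from slices of the old one. A minimal sketch, reusing the string from the example above (the names b and assignment_failed are introduced here only for illustration):

```python
a = "Hello world"

# Item assignment is rejected because str objects are immutable.
try:
    a[5] = "d"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new string instead: everything before index 5, the new
# character, then everything after index 5.
b = a[:5] + "d" + a[6:]

print(assignment_failed)  # True
print(b)                  # Hellodworld
print(a)                  # Hello world (the original is unchanged)
```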
Python String Operations:
- Concatenation of Two or More Strings
- Joining of two or more strings into a single one is called concatenation.
- The + operator does this in Python.
- Simply writing two string literals together also concatenates them.
- The * operator can be used to repeat the string for a given number of times.
str1 = 'Hello'
str2 = 'World!'
print('str1 + str2 =', str1 + str2)
print('str1 * 3 =', str1 * 3)
- Writing two string literals next to each other also concatenates them, just like the + operator.
text = 'Hello ' 'World!'
print(text)   # Hello World!
- Iterating Through String:
- Using a for loop, we can iterate through a string. Here is an example that counts the number of 'l' characters in a string.
count = 0
for letter in 'Hello World':
    if letter == 'l':
        count += 1
print(count, 'letters found')
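For comparison, the same count can be obtained without an explicit loop. The sketch below places the loop next to the built-in str.count() method and a generator expression; the variable names are chosen for illustration only.

```python
text = "Hello World"

# Explicit loop, as in the example above.
loop_count = 0
for letter in text:
    if letter == "l":
        loop_count += 1

# str.count() returns the number of non-overlapping occurrences.
method_count = text.count("l")

# sum() over a generator expression is another common idiom.
generator_count = sum(1 for letter in text if letter == "l")

print(loop_count, method_count, generator_count)  # 3 3 3
```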
- String Membership Test:
- We can test whether a substring exists within a string using the keywords in and not in.
>>> 'a' in 'program'
True
>>> 'a' not in 'program'
False
- Various built-in functions that work with sequences work with strings as well.
- Some of the commonly used ones are enumerate() and len().
- The enumerate() function returns an enumerate object.
- It contains the index and value of all the items in the string as pairs.
- This can be useful for iteration.
- Similarly, len() returns the length (number of characters) of the string.
text = 'cold'
print(len(text))   # 4
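The description of enumerate() above can be illustrated with a short sketch. It uses the same four-letter string as the len() example; the variable names are illustrative.

```python
text = "cold"

# enumerate() yields (index, character) pairs lazily; list() collects them.
pairs = list(enumerate(text))
print(pairs)  # [(0, 'c'), (1, 'o'), (2, 'l'), (3, 'd')]

# Typical use in a loop: unpack each pair into two variables.
for index, character in enumerate(text):
    print(index, character)

length = len(text)
print(length)  # 4
```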
The remaining sections cover the following topics:
- Escape Sequence
- Raw String to ignore escape sequence
- The format() Method for Formatting Strings
- Old style formatting
- Common Python String Methods
Escape Sequence:
- If we want to print a text like -He said, "What's there?"- we can use neither single quotes nor double quotes to delimit it.
- Doing so results in a SyntaxError, because the text itself contains both single and double quotes.
- One way to get around this problem is to use triple quotes.
print('''He said, "What's there?"''')
- Alternatively, we can use escape sequences.
- An escape sequence starts with a backslash and is interpreted differently from the literal characters.
- If we use single quotes to delimit a string, all the single quotes inside the string must be escaped. The same applies to double quotes.
print('He said, "What\'s there?"')
Raw String to ignore escape sequence:
- Sometimes we may wish to ignore the escape sequences inside a string.
- To do this, we can place r or R in front of the string.
- This marks it as a raw string, and any escape sequence inside it is left as written.
# Without raw string
print("This is \x61 \ngood example")
# Output:
This is a
good example
# With raw string
print(r"This is \x61 \ngood example")
# Output:
This is \x61 \ngood example
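Raw strings are often convenient for Windows-style paths and regular expressions, where backslashes are meant literally. The sketch below compares a normal and a raw literal; the path is a made-up example, not one taken from this article.

```python
# In a normal literal, \n is a newline and \t is a tab.
normal = "C:\new\table"

# In a raw literal, every backslash is kept as an ordinary character.
raw = r"C:\new\table"

print(len(normal))  # 10
print(len(raw))     # 12

# Doubling each backslash in a normal literal gives the same result.
escaped = "C:\\new\\table"
print(raw == escaped)  # True
```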
The format() Method for Formatting Strings:
- The format() method available on string objects is versatile and powerful for formatting strings.
- Format strings contain curly braces {} as placeholders, or replacement fields, which are replaced by the arguments passed to format().
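The following sketch shows a few common ways to fill the curly-brace placeholders. The names and values are arbitrary examples chosen for illustration.

```python
# Empty placeholders are filled in order.
default_order = "{}, {} and {}".format("John", "Bill", "Sean")

# Numbered placeholders can reorder the arguments.
positional_order = "{1}, {0} and {2}".format("John", "Bill", "Sean")

# Named placeholders are filled from keyword arguments.
keyword_order = "{s}, {b} and {j}".format(j="John", b="Bill", s="Sean")

# A format specification after the colon controls presentation,
# here two digits after the decimal point.
rounded = "The value of x is {:.2f}".format(12.3456789)

print(default_order)     # John, Bill and Sean
print(positional_order)  # Bill, John and Sean
print(keyword_order)     # Sean, Bill and John
print(rounded)           # The value of x is 12.35
```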
Old style formatting:
- We can also format strings in the style of the sprintf() function from the C programming language.
- We use the % operator to accomplish this.
x = 12.3456789
print('The value of x is %3.2f' % x)
print('The value of x is %3.4f' % x)
# Output
The value of x is 12.35
The value of x is 12.3457
Common Python String Methods:
- There are numerous methods available with the string object.
- The format() method that we mentioned above is one of them.
- Some of the commonly used methods are
- lower()
- upper()
- join()
- split()
- find()
- replace()
Example:
a = "PytHONic".lower()
print(a)
Output:
pythonic
Example:
a = "PytHONic".upper()
print(a)
Output:
PYTHONIC
Example:
a = "This will split all words into a list".split()
print(a)
Output:
['This', 'will', 'split', 'all', 'words', 'into', 'a', 'list']
Example:
a = ''.join(['This', ' will', ' join', ' all', ' words', ' into', ' a', ' string', ' statement'])
print(a)
Output:
This will join all words into a string statement
Example:
a = 'Happy New Year'.find('ew')
print(a)
Output:
7
Example:
a = 'Happy New Year'.replace('Happy', 'Excellent')
print(a)
Output:
Excellent New Year