Strings in Python
In Python, a string is a sequence of characters enclosed in quotes. Strings are a fundamental data type used to represent text-based data and are versatile in various applications, including text processing, data manipulation, and user interactions. Python provides several methods and features for creating and storing strings efficiently.
Creating Strings
- Using Single Quotes
  You can create a string by enclosing characters within single quotes (`'`).
- Using Double Quotes
  Similarly, double quotes (`"`) can be used to create strings.
- Using Triple Quotes
  Triple quotes (`'''` or `"""`) are used for creating multi-line strings. They are particularly useful for writing long text or docstrings.
- Single Character String
  A string can also contain a single character.
- Empty Strings
  Strings can be empty, meaning they contain no characters.
- Quotes Within Quotes
  - A string enclosed in double quotes can contain single quotes.
  - A string enclosed in single quotes can contain double quotes.
- Escaping Quotes
  To include the same type of quotation mark inside a string as the one used to enclose it, use a backslash (`\`) to escape it.
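The ways of creating strings listed above can be sketched as follows. The variable names and sample text are illustrative assumptions, not the original examples.

```python
# Sample values are illustrative assumptions.
single = 'Hello'                  # single quotes
double = "Hello"                  # double quotes; same value as above
multi = """This string
spans two lines"""                # triple quotes keep the line break
one_char = 'a'                    # a one-character string is still a str
empty = ''                        # empty string, length 0
inner_single = "It's here"        # single quote inside double quotes
inner_double = 'He said "hi"'     # double quotes inside single quotes
escaped = 'It\'s here'            # escaped quote of the enclosing type

print(single == double)           # True
print(len(empty))                 # 0
print(inner_single == escaped)    # True
```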
Storing Strings
Strings in Python are immutable, meaning once created, their content cannot be changed. As an implementation detail, CPython may keep a single copy of some strings (for example, short identifier-like literals) and reuse it when an equal string appears again. This optimization is called string interning and helps reduce memory usage, but programs should not rely on it and should compare strings with `==` rather than `is`.
Strings are typically stored in variables, allowing you to manipulate or access the text data.
Basic String Operations in Python
Python provides a range of operations to manipulate and work with strings. These include concatenation, repetition, and membership testing. Here’s an overview of these basic string operations:
Concatenation
Concatenation is the process of combining two or more strings into one. This can be achieved using the `+` operator.
- Simple Concatenation
  You can concatenate strings by using the `+` operator.
- Concatenation with Space
  To include a space between concatenated strings, add a string containing a space between them.
- Concatenating Different Data Types
  Python does not support direct concatenation of strings with other data types such as integers. Attempting to do so raises a `TypeError`. To concatenate an integer with a string, convert the integer to a string first with `str()`.
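A minimal sketch of the three concatenation cases described above. The sample words and the `room` value are assumptions chosen for illustration.

```python
first = "face"
second = "book"
print(first + second)              # facebook
print("Hello" + " " + "World")     # Hello World

room = 101                         # an int, chosen for illustration
try:
    label = "Room " + room         # str + int is not allowed
except TypeError as exc:
    label = None
    print("TypeError:", exc)

label = "Room " + str(room)        # convert first, then concatenate
print(label)                       # Room 101
```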
Repetition
You can repeat a string multiple times using the `*` operator. This creates a new string that consists of the original string repeated a specified number of times.
String Repetition
Multiply a string by an integer to repeat it:
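As a short illustration of repetition (the sample strings are assumptions):

```python
laugh = "ha" * 3      # repeat three times
rule = "-" * 10       # handy for drawing a separator line
nothing = "abc" * 0   # multiplying by 0 gives an empty string

print(laugh)          # hahaha
print(rule)           # ----------
```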
Membership Testing
You can check whether a string contains a specific substring using the `in` and `not in` operators. These operators return a Boolean value (`True` or `False`).
- Checking for Presence with `in`
  The `in` operator evaluates to `True` if the substring is found within the string.
- Checking for Absence with `not in`
  The `not in` operator evaluates to `True` if the substring is not found within the string.
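The two membership operators can be tried out as follows. The sample sentence is an assumption; note that the test is case-sensitive.

```python
sentence = "apples and oranges"          # sample text (assumption)

has_apple = "apple" in sentence          # True: substring is present
has_pear = "pear" in sentence            # False: substring is absent
no_pear = "pear" not in sentence         # True
capital_apple = "Apple" in sentence      # False: matching is case-sensitive

print(has_apple, has_pear, no_pear, capital_apple)
```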
Comparison
Strings can be compared using the comparison operators (`>`, `<`, `<=`, `>=`, `==`, `!=`). Python compares strings lexicographically, character by character, using each character's Unicode code point (which equals the ASCII value for ASCII characters).
- String Comparison
  - For example, "january" and "jane" share the prefix "jan"; at the first position where they differ, 'u' has a higher code point than 'e', so `"january"` is considered greater than `"jane"`.
  - Strings can also be compared against an empty string; any non-empty string is greater than the empty string.
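The comparison rules above can be checked with a few expressions. The `"Zebra"`/`"apple"` and `"jan"`/`"jane"` pairs are extra illustrations that are not in the original text.

```python
january_vs_jane = "january" > "jane"     # True: 'u' (117) > 'e' (101)
codes = (ord("u"), ord("e"))             # (117, 101)
non_empty_vs_empty = "a" > ""            # True
upper_vs_lower = "Zebra" < "apple"       # True: 'Z' (90) < 'a' (97)
prefix_is_smaller = "jan" < "jane"       # True: a proper prefix sorts first

print(january_vs_jane, codes, non_empty_vs_empty)
```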
Built-In Functions for Strings
Python provides several built-in functions to perform operations on strings:
- `len()`: Returns the number of characters in a string, including spaces.
- `max()`: Returns the character with the highest Unicode code point in the string.
- `min()`: Returns the character with the lowest Unicode code point in the string.
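A quick sketch of these three functions, using the string "be yourself" that appears later in this article:

```python
phrase = "be yourself"

length = len(phrase)     # 11, the space counts as a character
highest = max(phrase)    # 'y' has the highest code point (121)
lowest = min(phrase)     # ' ' (space) has the lowest code point (32)

print(length, repr(highest), repr(lowest))
```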
Accessing Characters in a String by Index Number
In Python, each character in a string is assigned an index number, which allows you to access individual characters. Indices start from 0 and go up to one less than the length of the string. You can also use negative indices to access characters from the end of the string.
Positive Indexing
Positive indexing starts from 0 and moves to the right. Here’s how you can access characters using positive indices:
- Accessing Characters by Positive Index
  Write the index in square brackets after the string, as in `string[0]` for the first character.
- Index Out of Range
  If you try to access an index beyond the end of the string, Python raises an `IndexError`. Valid positive indices run from 0 to `len(string) - 1`; an index greater than or equal to the length of the string results in an `IndexError`.
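A minimal sketch of positive indexing and the out-of-range case, using the string "be yourself" from the index table in this article:

```python
phrase = "be yourself"

first = phrase[0]                    # 'b'
fourth = phrase[3]                   # 'y'
last = phrase[len(phrase) - 1]       # 'f', the highest valid index is 10

try:
    phrase[len(phrase)]              # index 11 is one past the end
    out_of_range = False
except IndexError:
    out_of_range = True

print(first, fourth, last, out_of_range)
```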
Negative Indexing
Negative indexing allows you to access characters from the end of the string, starting with -1 for the last character, -2 for the second last, and so forth.
Accessing Characters by Negative Index
Negative indices are particularly useful when you need to access characters at the end of a long string.
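Negative indexing can be sketched the same way, again with "be yourself" as the sample string. The last line shows the relationship between the two index forms.

```python
phrase = "be yourself"

last = phrase[-1]                    # 'f'
third_last = phrase[-3]              # 'e'
first = phrase[-len(phrase)]         # 'b', same as phrase[0]

# phrase[-k] refers to the same character as phrase[len(phrase) - k]
same_character = phrase[-4] == phrase[len(phrase) - 4]

print(last, third_last, first, same_character)
```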
Index Breakdown Example
Here's a table that visualizes the positive and negative indices for the string "be yourself":
Character | Positive Index | Negative Index |
---|---|---|
b | 0 | -11 |
e | 1 | -10 |
(space) | 2 | -9 |
y | 3 | -8 |
o | 4 | -7 |
u | 5 | -6 |
r | 6 | -5 |
s | 7 | -4 |
e | 8 | -3 |
l | 9 | -2 |
f | 10 | -1 |
String Slicing and Joining in Python
Strings in Python can be sliced and joined to extract substrings or concatenate sequences of strings. These operations are essential for manipulating and processing textual data effectively.
String Slicing
String slicing allows you to access specific parts of a string using a range of index numbers. The syntax for string slicing is `string_name[start:end:step]`, where:
- `start`: The index where slicing begins (inclusive).
- `end`: The index where slicing ends (exclusive).
- `step`: Optional. The stride between selected indices; a step of 2 selects every second character. Defaults to 1.
Examples:
- Basic Slicing
  In these examples, `healthy_drink` is a string variable:
  - `healthy_drink[0:3]` extracts characters from index 0 up to, but not including, index 3.
  - `healthy_drink[:5]` extracts from the beginning of the string up to, but not including, index 5.
  - `healthy_drink[6:]` extracts from index 6 to the end of the string.
  - `healthy_drink[:]` returns the entire string.
  - `healthy_drink[4:4]` returns an empty string since the start and end indices are the same.
  - `healthy_drink[6:20]` extracts from index 6 up to the end of the string, even though the end index is beyond the string's length. Unlike indexing, slicing does not raise an `IndexError` for out-of-range bounds.
- Negative Indexing
  Negative indexing allows you to start counting from the end of the string. For example:
  - `healthy_drink[-3:-1]` extracts characters starting from the third-last character up to, but not including, the last character.
  - `healthy_drink[6:-1]` extracts characters from index 6 up to, but not including, the last character.
- Specifying Steps
  The `step` argument in slicing allows you to skip characters:
  - `newspaper[0:12:4]` extracts every 4th character starting at index 0 and stopping before index 12 (indices 0, 4, and 8).
  - `newspaper[::4]` extracts every 4th character from the entire string.
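The slices described above can be tried out as follows. The original values of `healthy_drink` and `newspaper` are not shown in the text, so the two strings here are assumptions chosen to fit the descriptions.

```python
healthy_drink = "green tea"        # assumed value, 9 characters
newspaper = "new york times"       # assumed value, 14 characters

print(healthy_drink[0:3])          # gre
print(healthy_drink[:5])           # green
print(healthy_drink[6:])           # tea
print(healthy_drink[:])            # green tea
print(repr(healthy_drink[4:4]))    # ''
print(healthy_drink[6:20])         # tea (end index past the length is fine)

print(healthy_drink[-3:-1])        # te
print(healthy_drink[6:-1])         # te

print(repr(newspaper[0:12:4]))     # 'ny ' (indices 0, 4, 8)
print(repr(newspaper[::4]))        # 'ny e' (indices 0, 4, 8, 12)
```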
Joining Strings
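Joining strings involves concatenating a sequence of strings with a specified separator. The `join()` method is used for this purpose. Its syntax is `string_name.join(sequence)`, where `string_name` is the separator and `sequence` can be a list of strings, another string, or any other iterable whose items are all strings.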
Examples:
- Joining a List of Strings
  - `":".join(date_of_birth)` joins the list items with ":" as the separator.
  - `" ".join(web_app)` joins the list items with a space as the separator.
- Joining Strings with Another String
  - `numbers.join(characters)` inserts "667" between each character of "eniv".
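A sketch of the `join()` calls described above. The contents of `date_of_birth` and `web_app` are not given in the text, so the lists here are assumptions; `numbers` and `characters` use the values the text mentions.

```python
date_of_birth = ["17", "09", "1950"]           # assumed list contents
web_app = ["instagram", "is", "an", "app"]     # assumed list contents
numbers = "667"
characters = "eniv"

print(":".join(date_of_birth))       # 17:09:1950
print(" ".join(web_app))             # instagram is an app
print(numbers.join(characters))      # e667n667i667v

# join() accepts only strings, so convert other types first
print(", ".join(str(n) for n in [1, 2, 3]))    # 1, 2, 3
```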
Splitting Strings
The `split()` method returns a list of strings by breaking up the original string at a delimiter. The syntax is `string_name.split([separator[, maxsplit]])`, where:
- `separator`: Optional. The delimiter string. If not specified, runs of consecutive whitespace are treated as a single delimiter.
- `maxsplit`: Optional. Specifies the maximum number of splits. If not specified or set to `-1`, there is no limit.
Examples:
- Splitting with a Specified Separator
  - `sitename.split(",")` splits the string using "," as the delimiter.
- Splitting with Whitespace
  - `sitename.split()` splits the string using whitespace as the delimiter.
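The two forms of `split()` can be sketched as follows. The value of `sitename` is not shown in the text, so the strings here are assumptions.

```python
sitename = "alpha,beta,gamma"                # assumed value
by_comma = sitename.split(",")               # ['alpha', 'beta', 'gamma']
one_split = sitename.split(",", 1)           # ['alpha', 'beta,gamma']

spaced = "  one   two\tthree \n"             # assumed value with mixed whitespace
by_whitespace = spaced.split()               # ['one', 'two', 'three']

print(by_comma, one_split, by_whitespace)
```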
Strings are Immutable
Strings in Python are immutable, meaning their content cannot be changed once assigned. You can create new strings based on existing ones but cannot modify the original string directly.
Examples:
- Attempting to Modify a String
  - Attempting to change the content of a string directly results in a `TypeError`.
- Creating a New String
  - Creating a new string by concatenating parts of the original string.
  - Assigning a new string to the same variable.
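A minimal sketch of immutability and the two workarounds listed above; the sample word is an assumption.

```python
word = "jello"                       # sample value (assumption)

try:
    word[0] = "h"                    # item assignment is not supported
    modified_in_place = True
except TypeError as exc:
    modified_in_place = False
    print("TypeError:", exc)

new_word = "h" + word[1:]            # build a new string from parts
print(new_word)                      # hello
print(word)                          # jello (the original is unchanged)

word = "hello"                       # rebinding the variable to a new string
```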
String Traversing
Since strings are sequences of characters, you can traverse each character using a `for` loop. A typical example loops through each character in a string and prints the character together with its index.
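One way to write such a traversal is with the built-in `enumerate()`, which yields each index together with its character. The sample word and the `seen` list are assumptions added for illustration.

```python
word = "eniv"                        # sample value (assumption)
seen = []                            # collects (index, character) pairs

for index, char in enumerate(word):
    seen.append((index, char))
    print(f"Character {char!r} is at index {index}")
```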
String Methods in Python
String methods in Python are built-in functions that allow you to perform various operations on strings, such as modifying their content, checking their properties, and more.
Table of String Methods
Method | Syntax | Description |
---|---|---|
capitalize() | string_name.capitalize() | Returns a copy of the string with its first character capitalized and the rest lowercased. |
casefold() | string_name.casefold() | Returns a casefolded copy of the string for caseless matching. |
center() | string_name.center(width[, fillchar]) | Centers the string, padding it with the specified fill character (default is space). |
count() | string_name.count(substring[, start[, end]]) | Returns the number of non-overlapping occurrences of substring in the specified range. |
endswith() | string_name.endswith(suffix[, start[, end]]) | Returns True if the string ends with the specified suffix, otherwise False . |
find() | string_name.find(substring[, start[, end]]) | Returns the lowest index where the substring is found, or -1 if not found. |
isalnum() | string_name.isalnum() | Returns True if all characters in the string are alphanumeric and there is at least one character, else False . |
isalpha() | string_name.isalpha() | Returns True if all characters in the string are alphabetic and there is at least one character, else False . |
isdecimal() | string_name.isdecimal() | Returns True if all characters in the string are decimal characters and there is at least one character, else False . |
isdigit() | string_name.isdigit() | Returns True if all characters in the string are digits and there is at least one character, else False . |
isidentifier() | string_name.isidentifier() | Returns True if the string is a valid identifier, else False . |
islower() | string_name.islower() | Returns True if all cased characters in the string are lowercase and there is at least one cased character, else False . |
isspace() | string_name.isspace() | Returns True if there are only whitespace characters in the string and there is at least one character, else False . |
isnumeric() | string_name.isnumeric() | Returns True if all characters in the string are numeric characters, and there is at least one character, else False . |
istitle() | string_name.istitle() | Returns True if the string is title cased and there is at least one character, else False . |
isupper() | string_name.isupper() | Returns True if all cased characters in the string are uppercase and there is at least one cased character, else False . |
upper() | string_name.upper() | Returns a copy of the string with lowercase letters converted to uppercase. |
lower() | string_name.lower() | Returns a copy of the string with uppercase letters converted to lowercase. |
ljust() | string_name.ljust(width[, fillchar]) | Left-justifies the string, padding it with the specified fill character (default is space). |
rjust() | string_name.rjust(width[, fillchar]) | Right-justifies the string, padding it with the specified fill character (default is space). |
title() | string_name.title() | Returns a title-cased version of the string, with each word’s first character capitalized and the rest lowercased. |
swapcase() | string_name.swapcase() | Returns a copy of the string with uppercase characters converted to lowercase and vice versa. |
splitlines() | string_name.splitlines([keepends]) | Returns a list of the lines in the string, breaking at line boundaries. Line breaks are not included unless keepends is True . |
startswith() | string_name.startswith(prefix[, start[, end]]) | Returns True if the string starts with the specified prefix, otherwise False . |
strip() | string_name.strip([chars]) | Returns a copy of the string with leading and trailing whitespace removed, or with any leading and trailing characters found in chars removed. |
rstrip() | string_name.rstrip([chars]) | Returns a copy of the string with trailing whitespace, or trailing characters found in chars, removed. |
lstrip() | string_name.lstrip([chars]) | Returns a copy of the string with leading whitespace, or leading characters found in chars, removed. |
replace() | string_name.replace(old, new[, count]) | Returns a copy of the string with all occurrences of the old substring replaced by the new substring. If count is specified, only the first count occurrences are replaced. |
zfill() | string_name.zfill(width) | Pads the string on the left with zeros to fill the specified width. |
Examples of Using String Methods
Case Conversion Methods
- `fact.isalnum()`: Returns `False` because the string contains spaces and is therefore not purely alphanumeric.
- `"sailors".isalpha()`: Returns `True` because "sailors" contains only alphabetic characters.
- `"2018".isdigit()`: Returns `True` because "2018" contains only digits.
- `fact.islower()`: Returns `False` because `fact` contains uppercase letters.
- `"TSAR BOMBA".isupper()`: Returns `True` because every cased character in "TSAR BOMBA" is uppercase.
- `"columbus".islower()`: Returns `True` because "columbus" is entirely lowercase.
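The checks above can be reproduced as follows, together with a few case conversions. The value of `fact` is the string quoted later in this article; the conversion inputs are assumptions.

```python
fact = "HPTU Exam Helper is created by Eniv Studios"

# Property checks
print(fact.isalnum())             # False (contains spaces)
print("sailors".isalpha())        # True
print("2018".isdigit())           # True
print(fact.islower())             # False
print("TSAR BOMBA".isupper())     # True
print("columbus".islower())       # True

# Case conversions return new strings; fact itself is unchanged
upper_fact = fact.upper()
swapped = "Eniv".swapcase()               # 'eNIV'
capitalized = "eniv studios".capitalize() # 'Eniv studios'
titled = "eniv studios".title()           # 'Eniv Studios'
```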
Methods for Checking Start and End of Strings
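A brief sketch of `startswith()` and `endswith()`, using the `fact` string from this article. The specific prefixes and suffixes tested are assumptions.

```python
fact = "HPTU Exam Helper is created by Eniv Studios"

starts_hptu = fact.startswith("HPTU")                # True
ends_studios = fact.endswith("Studios")              # True
exam_at_5 = fact.startswith("Exam", 5)               # True: check begins at index 5
any_suffix = fact.endswith(("Labs", "Studios"))      # True: a tuple tests several options
starts_lower = fact.startswith("hptu")               # False: case-sensitive

print(starts_hptu, ends_studios, exam_at_5, any_suffix, starts_lower)
```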
Methods for Finding Substrings
- Finding the Position of a Substring
  - Explanation: The `find()` method searches for the substring "En" within the string `fact`. It returns the index of the first occurrence of "En". In the string "HPTU Exam Helper is created by Eniv Studios", the substring "En" is found starting at index 31.
  - Explanation: The `find()` method searches for the substring "iv" within the string `fact`. It returns the index of the first occurrence of "iv". In the string "HPTU Exam Helper is created by Eniv Studios", the substring "iv" is found starting at index 33.
  - Explanation: The `find()` method searches for the substring "xyz" within the string `fact`. Since "xyz" is not found in the string, the method returns `-1`.
- Counting Occurrences of a Substring
  - Explanation: The `count()` method counts the number of non-overlapping occurrences of the substring "a" in the string `fact`. In the string "HPTU Exam Helper is created by Eniv Studios", the letter "a" appears 2 times.
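The results described above can be verified directly. The last line is an extra illustration of what "non-overlapping" means and is not part of the original examples.

```python
fact = "HPTU Exam Helper is created by Eniv Studios"

print(fact.find("En"))       # 31
print(fact.find("iv"))       # 33
print(fact.find("xyz"))      # -1 (not found)
print(fact.count("a"))       # 2

# Non-overlapping: "aaaa" contains "aa" twice, not three times
print("aaaa".count("aa"))    # 2
```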
Methods for Modifying Strings
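A short sketch of methods that return a modified copy of a string. The inputs are assumptions chosen for illustration; in every case the original string is left unchanged.

```python
fact = "HPTU Exam Helper is created by Eniv Studios"

renamed = fact.replace("Eniv", "ENIV")         # replace every occurrence
limited = "a-b-c-d".replace("-", "+", 2)       # replace only the first 2
padded_number = "7".zfill(3)                   # pad with zeros on the left

print(renamed)         # HPTU Exam Helper is created by ENIV Studios
print(limited)         # a+b+c-d
print(padded_number)   # 007
```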
Methods for Trimming Strings
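The trimming methods can be sketched as follows; the sample strings are assumptions. When an argument is given, it is treated as a set of characters to remove from the ends, not as a prefix or suffix.

```python
padded = "   hello world  \n"            # sample value (assumption)

both = padded.strip()                    # 'hello world'
left = padded.lstrip()                   # 'hello world  \n'
right = padded.rstrip()                  # '   hello world'

# The argument is a set of characters, removed from both ends
trimmed = "www.example.com".strip("cmowz.")   # 'example'

print(repr(both), repr(left), repr(right), trimmed)
```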
Methods for splitting string into lines and centering string with spaces
Splitting the string into lines
- The `splitlines()` method splits a string at line boundaries and returns a list of lines.
- Line boundaries include `\n` (newline), `\r` (carriage return), and `\r\n` (carriage return followed by newline).
- In the string `'ab c\n\nde fg\rkl\r\n'`:
  - `'ab c'` ends at the first `\n`.
  - The second `\n` follows immediately, which produces an empty string `''`.
  - `'de fg'` ends at `\r`.
  - `'kl'` ends at `\r\n`, which counts as a single line break. No empty string is added for a line break at the very end.
- The resulting list of lines is `['ab c', '', 'de fg', 'kl']`.
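The example string above can be checked directly. The comparison with `split('\n')` is an added illustration of how the two methods differ.

```python
text = 'ab c\n\nde fg\rkl\r\n'

lines = text.splitlines()
print(lines)                 # ['ab c', '', 'de fg', 'kl']

with_ends = text.splitlines(keepends=True)
print(with_ends)             # ['ab c\n', '\n', 'de fg\r', 'kl\r\n']

# split('\n') only knows about '\n' and keeps a trailing empty string
naive = text.split('\n')
print(naive)                 # ['ab c', '', 'de fg\rkl\r', '']
```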
Centering the string with spaces
- The `center()` method returns a new string of a specified width, with the original string centered and padded with spaces (or another specified character) on both sides.
- The string `"Eat sleep code repeat"` has a length of 21 characters.
- The `center(40)` method call specifies a total width of 40 characters.
- To center the string within this width, 19 padding spaces are required (`40 - 21 = 19`).
- These padding spaces are divided as evenly as possible on both sides of the original string:
  - There are 9 spaces on the left and 10 spaces on the right.
- The final result is the original string `Eat sleep code repeat` centered, with spaces added to the left and right.
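The padding arithmetic above can be confirmed as follows. The variable names and the `"*"` fill example are assumptions, and the 9/10 split of an odd amount of padding is the behavior of CPython.

```python
slogan = "Eat sleep code repeat"

centered = slogan.center(40)
left_pad = len(centered) - len(centered.lstrip())
right_pad = len(centered) - len(centered.rstrip())

print(len(slogan))           # 21
print(len(centered))         # 40
print(left_pad, right_pad)   # 9 10

# A custom fill character can be passed as the second argument
starred = slogan.center(25, "*")
print(starred)               # **Eat sleep code repeat**
```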
String methods like `capitalize()`, `lower()`, `upper()`, `swapcase()`, and `title()` are used for case conversion, while `count()` counts the occurrences of a substring. Methods like `islower()`, `isupper()`, `isdecimal()`, `isdigit()`, `isnumeric()`, `isalpha()`, and `isalnum()` are used for checking the properties of a string.
Padding methods include `rjust()`, `ljust()`, `zfill()`, and `center()`.
The string method `find()` is used to locate a substring in an existing string. You can use `replace()` to substitute substrings, `join()` to combine strings, and `split()` and `splitlines()` to break a string into parts.
Formatting Strings in Python
Python provides multiple ways to format text strings. The main methods are:
- %-formatting
- `str.format()`
- f-strings (formatted string literals)
Each method has its own strengths and weaknesses, and in recent years, f-strings have become the preferred method due to their simplicity and efficiency.
%-Formatting
%-formatting is an older way to format strings in Python, using placeholders in the string and substituting them with values.
Syntax: `format_string % values`, where `values` is a single value or a tuple of values.
Examples:
-
Single Value
-
Multiple Values
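A minimal sketch of %-formatting with one value and with several values. The names and numbers are assumptions chosen for illustration.

```python
name = "Eniv"        # sample values (assumptions)
count = 3
price = 4.5

single = "Hello, %s" % name
multiple = "%s bought %d items for %.2f" % (name, count, price)

print(single)        # Hello, Eniv
print(multiple)      # Eniv bought 3 items for 4.50

# A tuple used as the value must itself be wrapped in a tuple
point = (1, 2)
wrapped = "%s" % (point,)
print(wrapped)       # (1, 2)
```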
Limitations:
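- Relies on a fixed set of conversion types (such as `%s`, `%d`, and `%f`) and offers less control than the format specifications available in `str.format()` and f-strings.
- Error-prone with multiple values: a tuple or dictionary passed as the value is not displayed correctly unless it is wrapped in another tuple.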
str.format()
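Introduced in Python 2.6, `str.format()` provides a more flexible and powerful way to format strings compared to %-formatting.
Syntax: `template.format(value1, value2, ...)`, where `template` is a string containing replacement fields in curly braces.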
Examples:
-
Basic Usage
-
Named Placeholders
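The basic and named forms can be sketched as follows; the names and values are assumptions.

```python
basic = "{} is {} years old".format("Asha", 30)
by_position = "{1} before {0}".format("b", "a")
named = "{name} scored {score:.1f}".format(name="Asha", score=91.256)

print(basic)         # Asha is 30 years old
print(by_position)   # a before b
print(named)         # Asha scored 91.3
```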
Limitations:
- Verbose and may involve repeated text.
- Placeholder and value can be far apart in the format string.
f-strings (Formatted String Literals)
F-strings are a more recent addition (introduced in Python 3.6) that allow embedding expressions inside string literals, using a minimal and readable syntax.
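Syntax: `f"text {expression} more text"`, where each expression in curly braces is evaluated and its value is inserted into the string.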
Basic Usage
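A short f-string sketch; the variable names and values are assumptions.

```python
name = "Asha"        # sample values (assumptions)
age = 30

plain = f"{name} is {age} years old"
with_expression = f"Next year {name} will be {age + 1}"
with_method = f"{name.upper()}!"
literal_braces = f"{{name}} is replaced by {name}"

print(plain)             # Asha is 30 years old
print(with_expression)   # Next year Asha will be 31
print(with_method)       # ASHA!
print(literal_braces)    # {name} is replaced by Asha
```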
Advantages:
- Concise and readable.
- Can include expressions and variables directly within the string.
- More powerful and flexible formatting capabilities.
Limitations:
- Before Python 3.12, backslashes are not allowed inside the expression part (the curly braces) of an f-string.
Format Specifiers in f-Strings
Format specifiers in f-strings allow you to control the appearance of the values within a formatted string. You can specify details such as the width and precision of numbers. Here's how you can use them:
General Syntax
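The general syntax for format specifiers in f-strings is `{value:{width}.{precision}}`, where:
- `width`: Specifies the minimum width of the formatted value.
- `precision`: For floating-point numbers, specifies the number of digits after the decimal point when a fixed-point type such as `f` is given, or the number of significant digits when no presentation type is given.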
Specifying Width and Precision
In this example, we specify both width and precision for formatting a floating-point number.
Explanation:
- `{value:{width}.{precision}}`: This formats `value` in a field at least 10 characters wide with a precision of 5. Because no presentation type is given, the precision counts significant digits; append `f` (as in `{value:{width}.{precision}f}`) to get 5 digits after the decimal point instead.
- The width covers the integer part, the decimal point, and the fractional digits. If the value has fewer characters than the specified width, numbers are right-aligned by default.
Output:
The result is right-aligned within a field of 10 characters, and the number is rounded to 5 significant digits.
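The effect of width and precision can be sketched as follows. The value 12.34567 is an assumption, since the original number is not shown; the width of 10 and precision of 5 come from the explanation above.

```python
value = 12.34567     # assumed sample value
width = 10
precision = 5

# No presentation type: precision means significant digits
general = f"{value:{width}.{precision}}"
print(repr(general))     # '    12.346'

# With the 'f' type: precision means digits after the decimal point
fixed = f"{value:{width}.{precision}f}"
print(repr(fixed))       # '  12.34567'
```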
Default Precision
If you want to specify a width but not a precision, you can do so as follows:
Explanation:
- `{value:{width}}`: This formats `value` with a total width of 10 characters, but does not specify the precision. Therefore, the default precision is used, which includes as many digits as are needed to represent the value.
Output:
The result is right-aligned within a field of 10 characters, and the number is shown with its full precision.
Default Width
If you specify precision but not width, the value is formatted with the given precision, and no specific width is enforced:
Explanation:
- `{value:.{precision}}`: This formats `value` with a precision of 5 and no width. Because no presentation type is given, the precision counts significant digits; use `{value:.{precision}f}` for 5 digits after the decimal point. The formatted string occupies only as many characters as the number needs.
Output:
The number is rounded to 5 significant digits. The width adjusts automatically to fit the formatted number.
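The width-only and precision-only cases can be compared side by side. As before, the value 12.34567 is an assumed sample, and the string example at the end is an added illustration.

```python
value = 12.34567     # assumed sample value
width = 10
precision = 5

width_only = f"{value:{width}}"              # padded, full precision
precision_only = f"{value:.{precision}}"     # 5 significant digits, no padding
precision_fixed = f"{value:.{precision}f}"   # 5 digits after the decimal point

print(repr(width_only))        # '  12.34567'
print(repr(precision_only))    # '12.346'
print(repr(precision_fixed))   # '12.34567'

# Strings, unlike numbers, are left-aligned by default
text = f"{'abc':{width}}"
print(repr(text))              # 'abc       '
```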
Escape Sequences
Escape sequences allow you to include special characters in strings that would otherwise be difficult to represent. They are useful for formatting text and including characters that have a specific function in Python strings.
Common Escape Sequences:
Escape Sequence | Meaning |
---|---|
\ | Break a line into multiple lines |
\\ | Insert a backslash character |
\' | Insert a single quote character |
\" | Insert a double quote character |
\n | Insert a new line |
\t | Insert a tab |
\r | Insert a carriage return |
\b | Insert a backspace |
\uxxxx | Insert a Unicode character by its four-digit hexadecimal code point |
\ooo | Insert a character based on its octal value |
\xhh | Insert a character based on its hexadecimal value |
- Breaking a Line into Multiple Lines
  By placing a backslash (`\`) at the end of a line, you can break a single line of code into multiple lines while ensuring that the next line is part of the same statement.
  Output: You can break single line to multiple lines
- Inserting a Backslash Character
  To include a literal backslash (`\`) in the string, escape it with another backslash.
  Output: print backslash \ inside a string
- Inserting a Single Quote
  To include a single quote within a single-quoted string, escape it with a backslash.
  Output: print single quote ' within a string
- Inserting a Double Quote
  To include a double quote within a double-quoted string, escape it with a backslash.
  Output: print double quote " within a string
- Inserting a New Line
  The newline character (`\n`) moves the text following it to a new line.
- Inserting a Tab
  The tab character (`\t`) adds horizontal spacing.
- Inserting a Carriage Return
  The carriage return (`\r`) moves the cursor back to the beginning of the line, so on most terminals the text that follows it overwrites what was already printed on that line.
  Output: like
- Inserting a Backspace
  The backspace character (`\b`) moves the cursor back one position, so on most terminals the next character overwrites the one before it.
  Output: Hi (the 'e' is overwritten)
- Inserting a Unicode Character
  The `\uxxxx` escape inserts a character based on its four-digit hexadecimal Unicode code point.
  Output: ₹ (Indian Rupee symbol)
- Inserting a Character Based on Octal Value
  The `\ooo` escape inserts a character based on its octal value.
  Output: &
- Inserting a Character Based on Hexadecimal Value
  The `\xhh` escape inserts a character based on its hexadecimal value.
  Output: $
These escape sequences are essential for formatting strings and including special characters in Python strings.
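Several of the escape sequences above can be sketched in one place. The code points `\u20B9`, `\046`, and `\x24` are the escapes for the ₹, & and $ characters shown in the outputs above; the other sample strings are assumptions where the original code is not shown.

```python
# Backslash at the end of a line continues the statement
continued = "You can break " \
            "single line to " \
            "multiple lines"

backslash = "print backslash \\ inside a string"
single_quote = 'print single quote \' within a string'
double_quote = "print double quote \" within a string"
two_lines = "first line\nsecond line"     # assumed sample text
tabbed = "name\tvalue"                    # assumed sample text

rupee = "\u20B9"       # Unicode code point U+20B9
ampersand = "\046"     # octal 046 is decimal 38
dollar = "\x24"        # hexadecimal 24 is decimal 36

print(continued)
print(backslash)
print(two_lines)
print(rupee, ampersand, dollar)
```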
Raw Strings
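A raw string treats each backslash as a literal character rather than as the start of an escape sequence. Raw strings are created by prefixing the string literal with `r` or `R`.
Syntax: `r"text"` or `R"text"`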
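A minimal comparison of a regular and a raw string literal; the path-like text is an assumption chosen because it contains `\n` and `\t`.

```python
regular = "C:\new\table"     # \n and \t are interpreted as newline and tab
raw = r"C:\new\table"        # backslashes are kept as literal characters

print(len(regular))          # 10
print(len(raw))              # 12
print(raw)                   # C:\new\table
print(raw == "C:\\new\\table")   # True
```

Raw strings are often used for regular expression patterns and Windows-style paths, where backslashes are frequent.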
Unicode Strings
Unicode provides a unique number for every character, enabling consistent representation across different systems.
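Creating Unicode Strings: any string literal can contain Unicode characters, written directly or with a `\uxxxx` escape.
Note: Regular strings in Python 3 are Unicode by default, so the `u` prefix is never required; it is accepted only for compatibility with Python 2 code.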
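A short sketch of Unicode strings in Python 3, using the Indian Rupee symbol from the escape sequence section; the other sample values are assumptions.

```python
prefixed = u"hello"          # the u prefix is allowed but has no effect
plain = "hello"
print(prefixed == plain)     # True

rupee = "\u20B9"
print(rupee)                 # ₹
print(ord(rupee))            # 8377, the code point as a decimal number
print(chr(0x20B9) == rupee)  # True

# len() counts characters, not bytes
encoded = rupee.encode("utf-8")
print(len(rupee), len(encoded))   # 1 3
```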