String Functions¶

import introcs

The purpose of these functions is to allow students to work with strings without having to understand method calls. We do not provide all string methods as functions, only the most popular ones.

Type Checking¶

isalnum¶

introcs.isalnum(text)¶

Checks if all characters in text are alphanumeric and there is at least one character.

A character c is alphanumeric if one of the following returns True: isalpha(), isdecimal(), isdigit(), or isnumeric().

Parameters: text (str) – The string to check
Returns: True if all characters in text are alphanumeric and there is at least one character, False otherwise.
Return type: bool

isalpha¶

introcs.isalpha(text)¶

Checks if all characters in text are alphabetic and there is at least one character.

Alphabetic characters are those characters defined in the Unicode character database as a “Letter”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard.

Parameters: text (str) – The string to check
Returns: True if all characters in text are alphabetic and there is at least one character, False otherwise.
Return type: bool

islower¶

introcs.islower(text)¶

Checks if all cased characters in text are lowercase and there is at least one cased character.

Cased characters are defined by the Unicode standard. All alphabetic characters in the ASCII character set are cased.

Parameters: text (str) – The string to check
Returns: True if all cased characters in text are lowercase and there is at least one cased character, False otherwise.
Return type: bool

isupper¶

introcs.isupper(text)¶

Checks if all cased characters in text are uppercase and there is at least one cased character.

Cased characters are defined by the Unicode standard. All alphabetic characters in the ASCII character set are cased.

Parameters: text (str) – The string to check
Returns: True if all cased characters in text are uppercase and there is at least one cased character, False otherwise.
Return type: bool
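
The role of cased and uncased characters can be seen in a short sketch. It assumes that introcs.islower and introcs.isupper behave like the built-in str methods of the same name, and it calls those methods directly so that it runs without the module installed.

```python
# Assumption: introcs.islower/isupper mirror str.islower/str.isupper.
mixed = 'hello, world 123'   # punctuation and digits are uncased
no_cased = '123'             # contains no cased character at all

lower_ok = mixed.islower()        # True: every cased character is lowercase
lower_none = no_cased.islower()   # False: there is no cased character
upper_none = no_cased.isupper()   # False for the same reason
upper_mixed = 'Hello'.isupper()   # False: 'ello' is lowercase

print(lower_ok, lower_none, upper_none, upper_mixed)
```

Uncased characters are ignored by the check, but a string made up only of uncased characters fails both tests.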

isdecimal¶

introcs.isdecimal(text)¶

Checks if all characters in text are decimal characters and there is at least one character.

Decimal characters are those that can be used to form integer numbers in base 10. For example, ‘10’ has all decimals, but ‘1.0’ does not (since the period is not a decimal). Formally a decimal character is in the Unicode General Category “Nd”.

Parameters: text (str) – The string to check
Returns: True if all characters in text are decimal characters and there is at least one character, False otherwise.
Return type: bool

isdigit¶

introcs.isdigit(text)¶

Checks if all characters in text are digits and there is at least one character.

Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. This covers digits which cannot be used to form numbers in base 10, like the Kharosthi numbers. It is very rare that this function is needed instead of isdecimal().

Parameters: text (str) – The string to check
Returns: True if all characters in text are digits and there is at least one character, False otherwise.
Return type: bool

isnumeric¶

introcs.isnumeric(text)¶

Checks if all characters in text are numeric characters, and there is at least one character.

Numeric characters include digit characters, and all characters that have the Unicode numeric value property. This includes all digit characters as well as vulgar fractions and Roman numeral characters.

Parameters: text (str) – The string to check
Returns: True if all characters in text are numeric characters, and there is at least one character, False otherwise.
Return type: bool
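
The difference between decimal, digit, and numeric characters is easiest to see side by side. The following is a minimal sketch, assuming the introcs functions behave like the built-in str methods of the same name; it calls those methods directly so that it runs without the module installed.

```python
# Assumption: introcs.isdecimal/isdigit/isnumeric mirror the str methods.
# Samples: plain digits, a string with a period, superscript two, one half.
samples = ['10', '1.0', '\u00b2', '\u00bd']

# Map each sample to its (isdecimal, isdigit, isnumeric) results.
results = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}

for s, flags in results.items():
    print(repr(s), flags)
```

Each category contains the previous one: every decimal character is a digit, and every digit is numeric, but not the other way around.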

isspace¶

introcs.isspace(text)¶

Checks if there are only whitespace characters in text and there is at least one character.

A character is whitespace if, in the Unicode character database, either its general category is “Zs” (Separator, space) or its bidirectional class is one of “WS”, “B”, or “S”.

Parameters: text (str) – The string to check
Returns: True if there are only whitespace characters in text and there is at least one character, False otherwise.
Return type: bool

isprintable¶

introcs.isprintable(text)¶

Checks if all characters in text are printable or the string is empty.

Nonprintable characters are those characters defined in the Unicode character database as “Other” or “Separator”, excepting the ASCII space (0x20) which is considered printable. Note that printable characters in this context are those which should not be escaped when repr() is invoked on a string. It has no bearing on the handling of strings written to sys.stdout or sys.stderr.

Parameters: text (str) – The string to check
Returns: True if all characters in text are printable or the string is empty, False otherwise.
Return type: bool
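
Unlike the other checks in this section, isprintable accepts the empty string. A small sketch, assuming introcs.isprintable behaves like the built-in str.isprintable (used here directly so the snippet is self-contained):

```python
# Assumption: introcs.isprintable mirrors str.isprintable.
empty_printable = ''.isprintable()         # True: the empty string is accepted
space_printable = 'a b'.isprintable()      # True: the ASCII space is printable
newline_printable = 'a\nb'.isprintable()   # False: repr() escapes the newline

# Contrast: isalpha requires at least one character.
empty_alpha = ''.isalpha()                 # False

print(empty_printable, space_printable, newline_printable, empty_alpha)
```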

Casing¶

capitalize¶

introcs.capitalize(text)¶

Creates a copy of text with only its first character capitalized.

For 8-bit strings, this function is locale-dependent.

Parameters: text (str) – The string to capitalize
Returns: A copy of text with only its first character capitalized.
Return type: str
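
Note that "only its first character capitalized" also affects the rest of the string. With the built-in str.capitalize, which this sketch assumes introcs.capitalize mirrors, every character after the first is converted to lowercase.

```python
# Assumption: introcs.capitalize mirrors str.capitalize.
original = 'hELLO wORLD'
capitalized = original.capitalize()  # first character upper, the rest lower

print(capitalized)
```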

swapcase¶

introcs.swapcase(text)¶

Creates a copy of text with uppercase characters converted to lowercase and vice versa.

Note that it is not necessarily true that swapcase(swapcase(s)) == s. That is because of how the Unicode Standard defines cases.

Parameters: text (str) – The string to convert
Returns: A copy of text with uppercase characters converted to lowercase and vice versa.
Return type: str
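
One case where swapping twice does not restore the original is the German sharp s, whose uppercase form is two letters long. The sketch below assumes introcs.swapcase behaves like the built-in str.swapcase, which it calls directly.

```python
# Assumption: introcs.swapcase mirrors str.swapcase.
s = '\u00df'              # LATIN SMALL LETTER SHARP S
once = s.swapcase()       # uppercases to the two-letter string 'SS'
twice = once.swapcase()   # lowercases to 'ss', not the original character
round_trips = (twice == s)

print(repr(once), repr(twice), round_trips)
```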

lower¶

introcs.lower(text)¶

Creates a copy of text with all the cased characters converted to lowercase.

The lowercasing algorithm used is described in section 3.13 of the Unicode Standard.

Parameters: text (str) – The string to convert
Returns: A copy of text with all the cased characters converted to lowercase.
Return type: str

upper¶

introcs.upper(text)¶

Creates a copy of text with all the cased characters converted to uppercase.

Note that isupper(upper(s)) might be False if s contains uncased characters or if the Unicode category of the resulting character(s) is not “Lu” (Letter, uppercase).

The uppercasing algorithm used is described in section 3.13 of the Unicode Standard.

Parameters: text (str) – The string to convert
Returns: A copy of text with all the cased characters converted to uppercase.
Return type: str
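
The simplest way to see isupper(upper(s)) return False is a string with no cased characters. A minimal sketch, assuming the introcs functions mirror the built-in str methods used here:

```python
# Assumption: introcs.upper/isupper mirror str.upper/str.isupper.
digits = '123'
digits_check = digits.upper().isupper()   # False: nothing cased to test

words = 'abc 123'
shouted = words.upper()                   # 'ABC 123'
words_check = shouted.isupper()           # True: all cased characters are upper

print(digits_check, shouted, words_check)
```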

Searching¶

count_str¶

introcs.count_str(text, sub, start=None, end=None)¶

Computes the number of non-overlapping occurrences of substring sub in text[start:end].

Optional arguments start and end are interpreted as in slice notation.

Parameters

text (str) – The string to search
sub (str) – The substring to count
start (int) – The start of the search range
end (int) – The end of the search range

Returns

The number of non-overlapping occurrences of substring sub in text[start:end].

Return type

int
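
The sketch below illustrates non-overlapping counting and the effect of start and end. It assumes introcs.count_str behaves like the built-in str.count, which is called directly so the snippet runs on its own.

```python
# Assumption: introcs.count_str(text, sub, start, end) mirrors text.count(sub, start, end).
text = 'banana'

overlap = 'aaaa'.count('aa')     # 2: matches do not overlap
total = text.count('an')         # 2
from_two = text.count('a', 2)    # 2: counts within 'nana'
first_three = text.count('a', 0, 3)  # 1: counts within 'ban'

print(overlap, total, from_two, first_three)
```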

endswith_str¶

introcs.endswith_str(text, suffix, start=None, end=None)¶

Determines if text ends with the specified suffix.

The suffix can also be a tuple of suffixes to look for. With optional parameter start, the test will begin at that position. With optional parameter end, the test will stop comparing at that position.

Parameters

text (str) – The string to search
suffix (str or tuple of str) – The suffix to search for
start (int) – The start of the search range
end (int) – The end of the search range

Returns

True if text ends with the specified suffix, otherwise return False.

Return type

bool

startswith_str¶

introcs.startswith_str(text, prefix, start=None, end=None)¶

Determines if text starts with the specified prefix.

The prefix can also be a tuple of prefixes to look for. With optional parameter start, the test will begin at that position. With optional parameter end, the test will stop comparing at that position.

Parameters

text (str) – The string to search
prefix (str or tuple of str) – The prefix to search for
start (int) – The start of the search range
end (int) – The end of the search range

Returns

True if text starts with the specified prefix, otherwise return False.

Return type

bool
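
The tuple form and the optional positions can be sketched as follows. This assumes startswith_str and endswith_str behave like the built-in str.startswith and str.endswith, which are used here directly; the file name is only an illustration.

```python
# Assumption: the introcs functions mirror str.startswith/str.endswith.
name = 'report.csv'   # hypothetical file name

is_table = name.endswith(('.csv', '.tsv'))   # True: any suffix in the tuple may match
is_text = name.endswith('.txt')              # False
starts_port = name.startswith('port', 2)     # True: the test begins at index 2
ends_port = name.endswith('port', 0, 6)      # True: compares within 'report'

print(is_table, is_text, starts_port, ends_port)
```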

find_str¶

introcs.find_str(text, sub, start=None, end=None)¶

Finds the lowest index of the substring sub within text in the range [start, end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end]. The function returns -1 if sub is not found.

Note: The find_str() function should be used only if you need to know the position of sub. To check if sub is a substring or not, use the in operator:

>>> 'Py' in 'Python'
True

Parameters

text (str) – The string to search
sub (str) – The substring to search for
start (int) – The start of the search range
end (int) – The end of the search range

Returns

The lowest index of the substring sub within text in the range [start, end].

Return type

int

rfind_str¶

introcs.rfind_str(text, sub, start=None, end=None)¶

Finds the highest index of the substring sub within text in the range [start, end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end]. The function returns -1 if sub is not found.

Parameters

text (str) – The string to search
sub (str) – The substring to search for
start (int) – The start of the search range
end (int) – The end of the search range

Returns

The highest index of the substring sub within text in the range [start, end].

Return type

int
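
A short sketch of how the returned index relates to the original string, assuming find_str and rfind_str behave like the built-in str.find and str.rfind (called directly here):

```python
# Assumption: the introcs functions mirror str.find/str.rfind.
text = 'hello world'

first = text.find('o')       # 4
after_five = text.find('o', 5)   # 7: an index into text, not into text[5:]
last = text.rfind('o')       # 7
missing = text.find('z')     # -1: not found

print(first, after_five, last, missing)
```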

index_str¶

introcs.index_str(text, sub, start=None, end=None)¶

Finds the lowest index of the substring sub within text in the range [start, end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end].

This function is like find_str(), except that it raises a ValueError when the substring is not found.

Parameters

text (str) – The string to search
sub (str) – The substring to search for
start (int) – The start of the search range
end (int) – The end of the search range

Returns

The lowest index of the substring sub within text in the range [start, end].

Return type

int

rindex_str¶

introcs.rindex_str(text, sub, start=None, end=None)¶

Finds the highest index of the substring sub within text in the range [start, end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end].

This function is like rfind_str(), except that it raises a ValueError when the substring is not found.

Parameters

text (str) – The string to search
sub (str) – The substring to search for
start (int) – The start of the search range
end (int) – The end of the search range

Returns

The highest index of the substring sub within text in the range [start, end].

Return type

int
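
Because the index functions raise an error instead of returning -1, a missing substring has to be handled with try/except. A minimal sketch, assuming index_str and rindex_str behave like the built-in str.index and str.rindex:

```python
# Assumption: the introcs functions mirror str.index/str.rindex.
text = 'hello world'

lowest = text.index('o')     # 4
highest = text.rindex('o')   # 7

try:
    text.index('z')          # 'z' does not occur in text
    outcome = 'found'
except ValueError:
    outcome = 'not found'

print(lowest, highest, outcome)
```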

replace_str¶

introcs.replace_str(text, old, new, count=-1)¶

Creates a copy of text with all occurrences of substring old replaced by new.

If the optional argument count is given, only the first count occurrences are replaced.

Parameters

text (str) – The string to copy
old (str) – The old string to replace
new (str) – The new string to replace with
count (int) – The number of occurrences to replace

Returns

A copy of text with all occurrences of substring old replaced by new.

Return type

str
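
The effect of count can be sketched as follows, assuming replace_str behaves like the built-in str.replace (used directly here):

```python
# Assumption: introcs.replace_str(text, old, new, count) mirrors text.replace(old, new, count).
text = 'a-b-c-d'

all_replaced = text.replace('-', '+')    # every occurrence is replaced
first_two = text.replace('-', '+', 2)    # only the first two occurrences

print(all_replaced, first_two, text)
```

The original string is unchanged in both cases, since a copy is returned.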

Formatting¶

center¶

introcs.center(text, width, fillchar=' ')¶

Creates a copy of text centered in a string of length width.

Padding is done using the specified fillchar (default is an ASCII space). The original string is returned if width is less than or equal to len(text).

Parameters

text (str) – The string to center
width (int) – The width of the string to produce
fillchar (str) – The padding character used to expand the string to width

Returns

A copy of text centered in a string of length width.

Return type

str

ljust¶

introcs.ljust(text, width, fillchar=' ')¶

Creates a copy of text left justified in a string of length width.

Padding is done using the specified fillchar (default is an ASCII space). The original string is returned if width is less than or equal to len(text).

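Parameters

text (str) – The string to justify
width (int) – The width of the string to produce
fillchar (str) – The padding character used to expand the string to width

Returns

A copy of text left justified in a string of length width.

Return type

str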

rjust¶

introcs.rjust(text, width, fillchar=' ')¶

Creates a copy of text right justified in a string of length width.

Padding is done using the specified fillchar (default is an ASCII space). The original string is returned if width is less than or equal to len(text).

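Parameters

text (str) – The string to justify
width (int) – The width of the string to produce
fillchar (str) – The padding character used to expand the string to width

Returns

A copy of text right justified in a string of length width.

Return type

str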
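
The three padding functions can be compared in one sketch. It assumes center, ljust, and rjust behave like the built-in str methods of the same name, which are called directly here.

```python
# Assumption: introcs.center/ljust/rjust mirror the str methods.
word = 'abc'

centered = word.center(7, '*')   # two fill characters on each side
left = word.ljust(6, '.')        # padding goes on the right
right = word.rjust(6, '.')       # padding goes on the left
unchanged = word.center(2)       # width <= len(word): returned as is

print(centered, left, right, unchanged)
```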

strip¶

introcs.strip(text, chars=None)¶

Creates a copy of text with the leading and trailing characters removed.

The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped:

>>> strip('   spacious   ')
'spacious'
>>> strip('www.example.com','cmowz.')
'example'

The outermost leading and trailing chars argument values are stripped from the string. Characters are removed from the leading end until reaching a string character that is not contained in the set of characters in chars. A similar action takes place on the trailing end. For example:

>>> comment_string = '#....... Section 3.2.1 Issue #32 .......'
>>> strip(comment_string,'.#! ')
'Section 3.2.1 Issue #32'

Parameters

text (str) – The string to copy
chars (str) – The characters to remove from the ends

Returns

A copy of text with the leading and trailing characters removed.

Return type

str

lstrip¶

introcs.lstrip(text, chars=None)¶

Creates a copy of text with leading characters removed.

The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped:

>>> lstrip('   spacious   ')
'spacious   '
>>> lstrip('www.example.com','cmowz.')
'example.com'

Parameters

text (str) – The string to copy
chars (str) – The leading characters to remove

Returns

A copy of text with the leading characters removed.

Return type

str

rstrip¶

introcs.rstrip(text, chars=None)¶

Creates a copy of text with trailing characters removed.

The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a suffix; rather, all combinations of its values are stripped:

>>> rstrip('   spacious   ')
'   spacious'
>>> rstrip('mississippi','ipz')
'mississ'

Parameters

text (str) – The string to copy
chars (str) – The trailing characters to remove

Returns

A copy of text with the trailing characters removed.

Return type

str
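
Because chars is a set of characters rather than a suffix, stripping can remove more than expected. The sketch below assumes rstrip behaves like the built-in str.rstrip; the file name and the suffix-removal idiom are illustrations and not part of introcs.

```python
# Assumption: introcs.rstrip(text, chars) mirrors text.rstrip(chars).
filename = 'happy.py'   # hypothetical file name

# Strips every trailing '.', 'p', or 'y', not just the suffix '.py'.
stripped = filename.rstrip('.py')

# One way to remove an exact suffix instead: test for it, then slice.
suffix = '.py'
trimmed = filename[:-len(suffix)] if filename.endswith(suffix) else filename

print(stripped, trimmed)
```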

Splitting¶

join¶

introcs.join(iterable, sep='')¶

Creates a string by concatenating the strings in iterable.

A TypeError will be raised if there are any non-string values in iterable, including bytes objects. The optional separator is placed between the elements, but by default there is no separator.

Parameters

iterable (iterable) – The iterable of strings to concatenate
sep (str) – The separating string

Returns

A string which is the concatenation of the strings in iterable.

Return type

str
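
A minimal sketch of joining with and without a separator, and of the TypeError case. It uses the built-in str.join, where the separator is the object the method is called on; the assumption is that join(iterable, sep) produces the same result as sep.join(iterable).

```python
# Assumption: introcs.join(iterable, sep) is equivalent to sep.join(iterable).
words = ['a', 'b', 'c']

plain = ''.join(words)     # default: no separator
comma = ', '.join(words)   # separator placed between elements only

try:
    ', '.join(['a', 1])    # 1 is not a string
    error = None
except TypeError as exc:
    error = type(exc).__name__

print(plain, comma, error)
```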

split¶

introcs.split(text, sep=None, maxsplit=-1)¶

Creates a tuple of the words in text, using sep as the delimiter string.

If maxsplit is given, at most maxsplit splits are done (thus, the tuple will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, split('1,,2',',') returns ('1', '', '2')). The sep argument may consist of multiple characters (for example, split('1<>2<>3','<>') returns ('1', '2', '3')). Splitting an empty string with a specified separator returns ('',).

For example:

>>> split('1,2,3',',')
('1', '2', '3')
>>> split('1,2,3',',', maxsplit=1)
('1', '2,3')
>>> split('1,2,,3,',',')
('1', '2', '', '3', '')

If sep is not specified or is None, a different splitting algorithm is applied. In that case runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns ().

For example:

>>> split('1 2 3')
('1', '2', '3')
>>> split('1 2 3',maxsplit=1)
('1', '2 3')
>>> split('   1   2   3   ')
('1', '2', '3')

Parameters

text (str) – The string to split
sep (str) – The separator to split at
maxsplit (int) – The maximum number of splits to perform

Returns

A tuple of the words in text, using sep as the delimiter string.

Return type

tuple of str

rsplit¶

introcs.rsplit(text, sep=None, maxsplit=-1)¶

Creates a tuple of the words in text, using sep as the delimiter string.

If maxsplit is given, at most maxsplit splits are done (thus, the tuple will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, rsplit('1,,2',',') returns ('1', '', '2')). The sep argument may consist of multiple characters (for example, rsplit('1<>2<>3','<>') returns ('1', '2', '3')). Splitting an empty string with a specified separator returns ('',).

This function only differs from split() if maxsplit is given and is less than the possible number of splits. In that case, the splits are favored to the right, and so the remainder is to the left.

Parameters

text (str) – The string to split
sep (str) – The separator to split at
maxsplit (int) – The maximum number of splits to perform

Returns

A tuple of the words in text, using sep as the delimiter string.

Return type

tuple of str
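
The difference between splitting from the left and from the right only shows up when maxsplit limits the number of splits. The sketch below uses the built-in str.split and str.rsplit, which return lists, and wraps the results in tuple() to match the tuples shown on this page; that the introcs functions agree with these results is an assumption.

```python
# Assumption: introcs.split/rsplit agree with tuple(str.split(...)) and tuple(str.rsplit(...)).
text = 'a,b,c'

left = tuple(text.split(',', 1))    # one split from the left, remainder on the right
right = tuple(text.rsplit(',', 1))  # one split from the right, remainder on the left

# Without a limit, both directions give the same pieces.
same = tuple(text.split(',')) == tuple(text.rsplit(','))

print(left, right, same)
```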

partition¶

introcs.partition(text, sep)¶

Splits text at the first occurrence of sep, returning the result as a 3-tuple.

If the separator is not found, this function returns a 3-tuple containing the string itself, followed by two empty strings.

Returns: a 3-tuple containing the part before the separator, the separator itself, and the part after the separator.
Return type: tuple of str

rpartition¶

introcs.rpartition(text, sep)¶

Splits text at the last occurrence of sep, returning the result as a 3-tuple.

If the separator is not found, this function returns a 3-tuple containing two empty strings, followed by the string itself.

Returns: a 3-tuple containing the part before the separator, the separator itself, and the part after the separator.
Return type: tuple of str
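
The two partition functions differ in which occurrence they split at and in where the original string lands when the separator is missing. A minimal sketch, assuming they behave like the built-in str.partition and str.rpartition (called directly here):

```python
# Assumption: introcs.partition/rpartition mirror str.partition/str.rpartition.
text = 'key=value=x'

first = text.partition('=')    # splits at the first '='
last = text.rpartition('=')    # splits at the last '='

# Separator not found: the whole string goes first (partition) or last (rpartition).
none_first = 'abc'.partition('=')
none_last = 'abc'.rpartition('=')

print(first, last, none_first, none_last)
```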