String Functions

import introcs

The purpose of these functions is to allow students to work with strings without having to understand method calls. We do not provide all string methods as functions – just the most popular ones.

Type Checking

isalnum

introcs.isalnum(text)

Checks if all characters in text are alphanumeric and there is at least one character.

A character c is alphanumeric if one of the following returns True: isalpha(), isdecimal(), isdigit(), or isnumeric().

Parameters

text (str) – The string to check

Returns

True if all characters in text are alphanumeric and there is at least one character, False otherwise.

Return type

bool

isalpha

introcs.isalpha(text)

Checks if all characters in text are alphabetic and there is at least one character.

Alphabetic characters are those characters defined in the Unicode character database as a “Letter”. Note that this is different from the “Alphabetic” property defined in the Unicode Standard.

Parameters

text (str) – The string to check

Returns

True if all characters in text are alphabetic and there is at least one character, False otherwise.

Return type

bool

islower

introcs.islower(text)

Checks if all cased characters in text are lowercase and there is at least one cased character.

Cased characters are defined by the Unicode standard. All alphabetic characters in the ASCII character set are cased.

Parameters

text (str) – The string to check

Returns

True if all cased characters in text are lowercase and there is at least one cased character, False otherwise.

Return type

bool

isupper

introcs.isupper(text)

Checks if all cased characters in text are uppercase and there is at least one cased character.

Cased characters are defined by the Unicode standard. All alphabetic characters in the ASCII character set are cased.

Parameters

text (str) – The string to check

Returns

True if all cased characters in text are uppercase and there is at least one cased character, False otherwise.

Return type

bool
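The role of cased characters is easier to see with a few inputs. The sketch below assumes that islower() and isupper() behave like the built-in str methods of the same name; the two stand-in definitions are hypothetical and exist only so that the snippet runs without the introcs package installed.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def islower(text):
    return text.islower()

def isupper(text):
    return text.isupper()

# Digits, spaces, and punctuation are uncased, so they are ignored.
print(islower('hello world 123'))  # True
print(isupper('HELLO, WORLD!'))    # True

# A single character of the wrong case makes the check fail.
print(islower('Hello'))            # False

# With no cased character at all, both checks are False.
print(islower('123'))              # False
print(isupper('123'))              # False
```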

isdecimal

introcs.isdecimal(text)

Checks if all characters in text are decimal characters and there is at least one character.

Decimal characters are those that can be used to form integer numbers in base 10. For example, ‘10’ has all decimals, but ‘1.0’ does not (since the period is not a decimal). Formally a decimal character is in the Unicode General Category “Nd”.

Parameters

text (str) – The string to check

Returns

True if all characters in text are decimal characters and there is at least one character, False otherwise.

Return type

bool

isdigit

introcs.isdigit(text)

Checks if all characters in text are digits and there is at least one character.

Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. This covers digits which cannot be used to form numbers in base 10, like the Kharosthi numbers. It is very rare that this function is needed instead of isdecimal().

Parameters

text (str) – The string to check

Returns

True if all characters in text are digits and there is at least one character, False otherwise.

Return type

bool

isnumeric

introcs.isnumeric(text)

Checks if all characters in text are numeric characters, and there is at least one character.

Numeric characters include all digit characters, as well as every other character that has the Unicode numeric value property, such as vulgar fractions and Roman numeral characters.

Parameters

text (str) – The string to check

Returns

True if all characters in text are numeric characters, and there is at least one character, False otherwise.

Return type

bool
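The three numeric checks form a nested hierarchy: every decimal character is a digit, and every digit is numeric. As a rough illustration, the sketch below compares them on a few inputs. It assumes the introcs functions behave like the built-in str methods; the stand-in definitions and the classify() helper are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def isdecimal(text):
    return text.isdecimal()

def isdigit(text):
    return text.isdigit()

def isnumeric(text):
    return text.isnumeric()

def classify(text):
    """Return the (isdecimal, isdigit, isnumeric) results for text."""
    return (isdecimal(text), isdigit(text), isnumeric(text))

print(classify('10'))      # (True, True, True)
print(classify('1.0'))     # (False, False, False): the period fails all three
print(classify('\u00b2'))  # superscript two: (False, True, True)
print(classify('\u00bd'))  # vulgar fraction one half: (False, False, True)
```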

isspace

introcs.isspace(text)

Checks if there are only whitespace characters in text and there is at least one character.

A character is whitespace if, in the Unicode character database, its general category is “Zs” (Separator, space) or its bidirectional class is one of “WS”, “B”, or “S”.

Parameters

text (str) – The string to check

Returns

True if there are only whitespace characters in text and there is at least one character, False otherwise.

Return type

bool

isprintable

introcs.isprintable(text)

Checks if all characters in text are printable or the string is empty.

Nonprintable characters are those characters defined in the Unicode character database as “Other” or “Separator”, excepting the ASCII space (0x20) which is considered printable. Note that printable characters in this context are those which should not be escaped when repr() is invoked on a string. It has no bearing on the handling of strings written to sys.stdout or sys.stderr.

Parameters

text (str) – The string to check

Returns

True if all characters in text are printable or the string is empty, False otherwise.

Return type

bool
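Unlike the other checks in this section, isprintable() accepts the empty string. The following sketch contrasts it with isspace(), assuming both behave like the corresponding str methods; the stand-in definitions are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def isspace(text):
    return text.isspace()

def isprintable(text):
    return text.isprintable()

print(isprintable('a b'))   # True: the ASCII space counts as printable
print(isprintable('a\nb'))  # False: repr() would escape the newline
print(isprintable(''))      # True: the empty string is accepted
print(isspace(' \t\n'))     # True: only whitespace characters
print(isspace(''))          # False: at least one character is required
```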

Casing

capitalize

introcs.capitalize(text)

Creates a copy of text with only its first character capitalized.

For 8-bit strings, this function is locale-dependent.

Parameters

text (str) – The string to capitalize

Returns

A copy of text with only its first character capitalized.

Return type

str
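If capitalize() follows the built-in str.capitalize(), which is an assumption here, then the remaining characters are lowercased as well as the first being capitalized. The stand-in definition below is hypothetical and is only a sketch of that behavior.

```python
# Hypothetical stand-in, assuming introcs delegates to str.capitalize.
def capitalize(text):
    return text.capitalize()

print(capitalize('hello WORLD'))  # Hello world
print(capitalize('python'))       # Python
```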

swapcase

introcs.swapcase(text)

Creates a copy of text with uppercase characters converted to lowercase and vice versa.

Note that it is not necessarily true that swapcase(swapcase(s)) == s. That is because of how the Unicode Standard defines cases.

Parameters

text (str) – The string to convert

Returns

A copy of text with uppercase characters converted to lowercase and vice versa.

Return type

str
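One case where swapping twice does not give back the original is the German sharp s, whose uppercase form is two characters long. The sketch assumes swapcase() behaves like str.swapcase(); the stand-in definition is hypothetical.

```python
# Hypothetical stand-in, assuming introcs delegates to str.swapcase.
def swapcase(text):
    return text.swapcase()

print(swapcase('Hello World'))  # hELLO wORLD

sharp_s = '\u00df'              # German sharp s, a lowercase letter
once = swapcase(sharp_s)        # 'SS': the uppercase form has two characters
twice = swapcase(once)          # 'ss': not the original single character
print(twice == sharp_s)         # False
```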

lower

introcs.lower(text)

Creates a copy of text with all the cased characters converted to lowercase.

The lowercasing algorithm used is described in section 3.13 of the Unicode Standard.

Parameters

text (str) – The string to convert

Returns

A copy of text with all the cased characters converted to lowercase.

Return type

str

upper

introcs.upper(text)

Creates a copy of text with all the cased characters converted to uppercase.

Note that isupper(upper(s)) might be False if s contains uncased characters or if the Unicode category of the resulting character(s) is not “Lu” (Letter, uppercase).

The uppercasing algorithm used is described in section 3.13 of the Unicode Standard.

Parameters

text (str) – The string to convert

Returns

A copy of text with all the cased characters converted to uppercase.

Return type

str
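The note about isupper(upper(s)) can be checked with a string that has no cased characters. This is a sketch that assumes upper() and isupper() behave like the str methods; the stand-in definitions are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def upper(text):
    return text.upper()

def isupper(text):
    return text.isupper()

print(upper('route 66'))           # ROUTE 66
print(isupper(upper('route 66')))  # True

# '66' has no cased characters, so upper() leaves it alone and isupper() fails.
print(upper('66'))                 # 66
print(isupper(upper('66')))        # False
```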

Searching

count_str

introcs.count_str(text, sub, start=None, end=None)

Computes the number of non-overlapping occurrences of substring sub in text[start:end].

Optional arguments start and end are interpreted as in slice notation.

Parameters
  • text (str) – The string to search

  • sub (str) – The substring to count

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

The number of non-overlapping occurrences of substring sub in text[start:end].

Return type

int
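A short sketch of what "non-overlapping" and the optional range mean in practice. It assumes count_str() forwards its arguments to str.count(); the stand-in definition is hypothetical.

```python
# Hypothetical stand-in, assuming introcs delegates to str.count.
def count_str(text, sub, start=None, end=None):
    return text.count(sub, start, end)

# 'aa' matches at positions 0 and 2; the overlapping match at 1 is skipped.
print(count_str('aaaa', 'aa'))          # 2

print(count_str('banana', 'a'))         # 3
print(count_str('banana', 'a', 2))      # 2: searches 'nana'
print(count_str('banana', 'a', 0, 3))   # 1: searches 'ban'
```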

endswith_str

introcs.endswith_str(text, suffix, start=None, end=None)

Determines if text ends with the specified suffix.

The suffix can also be a tuple of suffixes to look for. With optional parameter start, the test will begin at that position. With optional parameter end, the test will stop comparing at that position.

Parameters
  • text (str) – The string to search

  • suffix (str or tuple of str) – The suffix to search for

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

True if text ends with the specified suffix, False otherwise.

Return type

bool

startswith_str

introcs.startswith_str(text, prefix, start=None, end=None)

Determines if text starts with the specified prefix.

The prefix can also be a tuple of prefixes to look for. With optional parameter start, the test will begin at that position. With optional parameter end, the test will stop comparing at that position.

Parameters
  • text (str) – The string to search

  • prefix (str or tuple of str) – The prefix to search for

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

True if text starts with the specified prefix, False otherwise.

Return type

bool
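The tuple form and the optional range can be illustrated as follows. The sketch assumes startswith_str() and endswith_str() forward to str.startswith() and str.endswith(); the stand-in definitions are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def startswith_str(text, prefix, start=None, end=None):
    return text.startswith(prefix, start, end)

def endswith_str(text, suffix, start=None, end=None):
    return text.endswith(suffix, start, end)

# A tuple matches if any one of its elements matches.
print(endswith_str('report.txt', ('.txt', '.md')))   # True

# start moves the point at which the prefix test begins.
print(startswith_str('hello world', 'world'))        # False
print(startswith_str('hello world', 'world', 6))     # True

# end cuts the string short before the suffix test.
print(endswith_str('hello world', 'hello', 0, 5))    # True
```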

find_str

introcs.find_str(text, sub, start=None, end=None)

Finds the lowest index of the substring sub within the slice text[start:end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end]. The function returns -1 if sub is not found.

Note: The find_str() function should be used only if you need to know the position of sub. To check if sub is a substring or not, use the in operator:

>>> 'Py' in 'Python'
True

Parameters
  • text (str) – The string to search

  • sub (str) – The substring to search for

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

The lowest index of the substring sub within the slice text[start:end], or -1 if sub is not found.

Return type

int
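The sketch below shows that the returned index refers to the original string even when a search range is given. It assumes find_str() forwards to str.find(); the stand-in definition is hypothetical.

```python
# Hypothetical stand-in, assuming introcs delegates to str.find.
def find_str(text, sub, start=None, end=None):
    return text.find(sub, start, end)

print(find_str('banana', 'an'))         # 1
# Searching 'nana' (from index 2) still reports a position in 'banana'.
print(find_str('banana', 'an', 2))      # 3
# The slice 'na' (indices 2 and 3) does not contain 'an'.
print(find_str('banana', 'an', 2, 4))   # -1
print(find_str('banana', 'x'))          # -1
```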

rfind_str

introcs.rfind_str(text, sub, start=None, end=None)

Finds the highest index of the substring sub within the slice text[start:end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end]. The function returns -1 if sub is not found.

Parameters
  • text (str) – The string to search

  • sub (str) – The substring to search for

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

The highest index of the substring sub within the slice text[start:end], or -1 if sub is not found.

Return type

int

index_str

introcs.index_str(text, sub, start=None, end=None)

Finds the lowest index of the substring sub within the slice text[start:end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end].

This function is like find_str(), except that it raises a ValueError when the substring is not found.

Parameters
  • text (str) – The string to search

  • sub (str) – The substring to search for

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

The lowest index of the substring sub within the slice text[start:end].

Return type

int
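The difference from find_str() only shows up when the substring is missing. A minimal sketch, assuming index_str() forwards to str.index(); the stand-in definition is hypothetical.

```python
# Hypothetical stand-in, assuming introcs delegates to str.index.
def index_str(text, sub, start=None, end=None):
    return text.index(sub, start, end)

print(index_str('banana', 'an'))   # 1

# A missing substring raises ValueError instead of returning -1.
try:
    index_str('banana', 'x')
except ValueError:
    print('substring not found')
```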

rindex_str

introcs.rindex_str(text, sub, start=None, end=None)

Finds the highest index of the substring sub within the slice text[start:end].

Optional arguments start and end are interpreted as in slice notation. However, the index returned is relative to the original string text and not the slice text[start:end].

This function is like rfind_str(), except that it raises a ValueError when the substring is not found.

Parameters
  • text (str) – The string to search

  • sub (str) – The substring to search for

  • start (int) – The start of the search range

  • end (int) – The end of the search range

Returns

The highest index of the substring sub within the slice text[start:end].

Return type

int
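For comparison, here is how the right-to-left searches behave on the same input. The sketch assumes rfind_str() and rindex_str() forward to str.rfind() and str.rindex(); the stand-in definitions are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def rfind_str(text, sub, start=None, end=None):
    return text.rfind(sub, start, end)

def rindex_str(text, sub, start=None, end=None):
    return text.rindex(sub, start, end)

# 'an' occurs at indices 1 and 3; the highest one is reported.
print(rfind_str('banana', 'an'))    # 3
print(rindex_str('banana', 'an'))   # 3

# The two differ only when the substring is missing.
print(rfind_str('banana', 'x'))     # -1
try:
    rindex_str('banana', 'x')
except ValueError:
    print('substring not found')
```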

replace_str

introcs.replace_str(text, old, new, count=-1)

Creates a copy of text with all occurrences of substring old replaced by new.

If the optional argument count is given, only the first count occurrences are replaced.

Parameters
  • text (str) – The string to copy

  • old (str) – The old string to replace

  • new (str) – The new string to replace with

  • count (int) – The number of occurrences to replace

Returns

A copy of text with all occurrences of substring old replaced by new.

Return type

str
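A brief sketch of the count argument, assuming replace_str() forwards to str.replace(); the stand-in definition is hypothetical.

```python
# Hypothetical stand-in, assuming introcs delegates to str.replace.
def replace_str(text, old, new, count=-1):
    return text.replace(old, new, count)

original = 'a-b-c-d'
print(replace_str(original, '-', '+'))      # a+b+c+d
print(replace_str(original, '-', '+', 2))   # a+b+c-d

# Strings are immutable, so the original is left unchanged.
print(original)                             # a-b-c-d
```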

Formatting

center

introcs.center(text, width, fillchar=' ')

Creates a copy of text centered in a string of length width.

Padding is done using the specified fillchar (default is an ASCII space). The original string is returned if width is less than or equal to len(text).

Parameters
  • text (str) – The string to center

  • width (int) – The width of the string to produce

  • fillchar (str) – The character used to pad the string to width

Returns

A copy of text centered in a string of length width.

Return type

str
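As a quick illustration, assuming center() forwards to str.center() (the stand-in definition is hypothetical):

```python
# Hypothetical stand-in, assuming introcs delegates to str.center.
def center(text, width, fillchar=' '):
    return text.center(width, fillchar)

print(center('hi', 6, '*'))      # **hi**
print(repr(center('hi', 6)))     # '  hi  '

# A width that is not larger than the text returns the text unchanged.
print(center('hello', 3, '*'))   # hello
```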

ljust

introcs.ljust(text, width, fillchar=' ')

Creates a copy of text left justified in a string of length width.

Padding is done using the specified fillchar (default is an ASCII space). The original string is returned if width is less than or equal to len(text).

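Parameters
  • text (str) – The string to justify

  • width (int) – The width of the string to produce

  • fillchar (str) – The character used to pad the string to width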

Returns

A copy of text left justified in a string of length width.

Return type

str

rjust

introcs.rjust(text, width, fillchar=' ')

Creates a copy of text right justified in a string of length width.

Padding is done using the specified fillchar (default is an ASCII space). The original string is returned if width is less than or equal to len(text).

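Parameters
  • text (str) – The string to justify

  • width (int) – The width of the string to produce

  • fillchar (str) – The character used to pad the string to width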

Returns

A copy of text right justified in a string of length width.

Return type

str
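The two justification functions differ only in which side receives the padding. A minimal sketch, assuming ljust() and rjust() forward to the str methods of the same name; the stand-in definitions are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def ljust(text, width, fillchar=' '):
    return text.ljust(width, fillchar)

def rjust(text, width, fillchar=' '):
    return text.rjust(width, fillchar)

print(ljust('hi', 5, '.'))     # hi...
print(rjust('7', 3, '0'))      # 007

# Text that is already wide enough is returned unchanged.
print(rjust('1234', 3, '0'))   # 1234
```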

strip

introcs.strip(text, chars=None)

Creates a copy of text with the leading and trailing characters removed.

The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped:

>>> strip('   spacious   ')
'spacious'
>>> strip('www.example.com','cmowz.')
'example'

The outermost leading and trailing chars argument values are stripped from the string. Characters are removed from the leading end until reaching a string character that is not contained in the set of characters in chars. A similar action takes place on the trailing end. For example:

>>> comment_string = '#....... Section 3.2.1 Issue #32 .......'
>>> strip(comment_string,'.#! ')
'Section 3.2.1 Issue #32'

Parameters
  • text (str) – The string to copy

  • chars (str) – The characters to remove from the ends

Returns

A copy of text with the leading and trailing characters removed.

Return type

str

lstrip

introcs.lstrip(text, chars=None)

Creates a copy of text with leading characters removed.

The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped:

>>> lstrip('   spacious   ')
'spacious   '
>>> lstrip('www.example.com','cmowz.')
'example.com'

Parameters
  • text (str) – The string to copy

  • chars (str) – The leading characters to remove

Returns

A copy of text with the leading characters removed.

Return type

str

rstrip

introcs.rstrip(text, chars=None)

Creates a copy of text with trailing characters removed.

The chars argument is a string specifying the set of characters to be removed. If omitted or None, the chars argument defaults to removing whitespace. The chars argument is not a suffix; rather, all combinations of its values are stripped:

>>> rstrip('   spacious   ')
'   spacious'
>>> rstrip('mississippi','ipz')
'mississ'

Parameters
  • text (str) – The string to copy

  • chars (str) – The trailing characters to remove

Returns

A copy of text with the trailing characters removed.

Return type

str

Splitting

join

introcs.join(iterable, sep='')

Creates a string by concatenating the strings in iterable.

A TypeError will be raised if there are any non-string values in iterable, including bytes objects. The optional separator is placed between the elements, but by default there is no separator.

Parameters
  • iterable (iterable) – The iterable of strings to concatenate

  • sep (str) – The separating string

Returns

A string which is the concatenation of the strings in iterable.

Return type

str
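Note that the argument order differs from the method call sep.join(iterable): here the iterable comes first. The sketch below assumes join() is a thin wrapper around str.join(); the stand-in definition is hypothetical.

```python
# Hypothetical stand-in, assuming introcs delegates to str.join.
def join(iterable, sep=''):
    return sep.join(iterable)

words = ('a', 'b', 'c')
print(join(words))        # abc
print(join(words, '-'))   # a-b-c

# Any non-string element causes a TypeError.
try:
    join(('a', 1))
except TypeError:
    print('all elements must be strings')
```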

split

introcs.split(text, sep=None, maxsplit=-1)

Creates a tuple of the words in text, using sep as the delimiter string.

If maxsplit is given, at most maxsplit splits are done (thus, the tuple will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, split('1,,2',',') returns ('1', '', '2')). The sep argument may consist of multiple characters (for example, split('1<>2<>3','<>') returns ('1', '2', '3')). Splitting an empty string with a specified separator returns ('',).

For example:

>>> split('1,2,3',',')
('1', '2', '3')
>>> split('1,2,3',',', maxsplit=1)
('1', '2,3')
>>> split('1,2,,3,',',')
('1', '2', '', '3', '')

If sep is not specified or is None, a different splitting algorithm is applied. In that case runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns ().

For example:

>>> split('1 2 3')
('1', '2', '3')
>>> split('1 2 3',maxsplit=1)
('1', '2 3')
>>> split('   1   2   3   ')
('1', '2', '3')

Parameters
  • text (str) – The string to split

  • sep (str) – The separator to split at

  • maxsplit (int) – The maximum number of splits to perform

Returns

A tuple of the words in text, using sep as the delimiter string.

Return type

tuple of str

rsplit

introcs.rsplit(text, sep=None, maxsplit=-1)

Creates a tuple of the words in text, using sep as the delimiter string.

If maxsplit is given, at most maxsplit splits are done (thus, the tuple will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).

If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, rsplit('1,,2',',') returns ('1', '', '2')). The sep argument may consist of multiple characters (for example, rsplit('1<>2<>3','<>') returns ('1', '2', '3')). Splitting an empty string with a specified separator returns ('',).

This function only differs from split() if maxsplit is given and is less than the possible number of splits. In that case, the splits are favored to the right, and so the remainder is to the left.

Parameters
  • text (str) – The string to split

  • sep (str) – The separator to split at

  • maxsplit (int) – The maximum number of splits to perform

Returns

A tuple of the words in text, using sep as the delimiter string.

Return type

tuple of str
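The effect of maxsplit on the two functions can be sketched as follows. The stand-in definitions are hypothetical: they assume each function converts the result of the matching str method to a tuple, in line with the tuples shown in the split() examples.

```python
# Hypothetical stand-ins, assuming introcs wraps the str methods
# and returns the pieces as a tuple.
def split(text, sep=None, maxsplit=-1):
    return tuple(text.split(sep, maxsplit))

def rsplit(text, sep=None, maxsplit=-1):
    return tuple(text.rsplit(sep, maxsplit))

# Without a limit, both produce the same pieces.
print(split('a,b,c', ','))       # ('a', 'b', 'c')
print(rsplit('a,b,c', ','))      # ('a', 'b', 'c')

# With maxsplit=1, split cuts at the first comma and rsplit at the last.
print(split('a,b,c', ',', 1))    # ('a', 'b,c')
print(rsplit('a,b,c', ',', 1))   # ('a,b', 'c')
```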

partition

introcs.partition(text, sep)

Splits text at the first occurrence of sep, returning the result as a 3-tuple.

If the separator is not found, this function returns a 3-tuple containing the string itself, followed by two empty strings.

Returns

A 3-tuple containing the part before the separator, the separator itself, and the part after the separator.

Return type

tuple of str

rpartition

introcs.rpartition(text, sep)

Splits text at the last occurrence of sep, returning the result as a 3-tuple.

If the separator is not found, this function returns a 3-tuple containing two empty strings, followed by the string itself.

Returns

A 3-tuple containing the part before the separator, the separator itself, and the part after the separator.

Return type

tuple of str
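The two partition functions can be compared on a string that contains the separator more than once. This is a sketch that assumes partition() and rpartition() forward to the str methods of the same name; the stand-in definitions are hypothetical.

```python
# Hypothetical stand-ins, assuming introcs delegates to the str methods.
def partition(text, sep):
    return text.partition(sep)

def rpartition(text, sep):
    return text.rpartition(sep)

# partition cuts at the first '=', rpartition at the last one.
print(partition('key=value=x', '='))    # ('key', '=', 'value=x')
print(rpartition('key=value=x', '='))   # ('key=value', '=', 'x')

# When the separator is missing, the whole string lands on opposite ends.
print(partition('abc', '='))            # ('abc', '', '')
print(rpartition('abc', '='))           # ('', '', 'abc')
```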