File Functions

import introcs

These functions are useful for handling simple data science assignments. They read and write popular text-based file formats. The most valuable functions are read_csv() and write_csv(), as these are non-trivial to implement.

Reading

read_txt

introcs.read_txt(filename)

Reads the contents of the text file filename.

This function reads the contents of the file filename and returns the result. This function assumes the file is a text file and not a binary file. If this is not the case, it will raise an error.

Parameters

filename (str) – The file to read

Returns

A string representing the file contents

Return type

str

read_json

introcs.read_json(filename)

Reads the contents of the JSON file filename.

This function reads the contents of the file filename. Assuming it is a properly encoded JSON file, it will convert this into a Python data value, which will either be a dictionary or a list.

If the file does not exist, or is not a proper JSON file, this function will raise an error.

Parameters

filename (str) – The file to read

Returns

A dictionary or list representing the file contents

Return type

dict or list

read_csv

introcs.read_csv(filename)

Reads the contents of the CSV file filename.

This function reads the contents of the file filename. Assuming it is a properly encoded CSV file, it will convert this into a 2-dimensional list, where each element of the list is a row. Cells in a row are all interpreted as strings. It is up to the programmer to interpret this data, since CSV files contain no type information.

If the file does not exist, or is not a proper CSV file, this function will raise an error.

Parameters

filename (str) – The file to read

Returns

A two-dimensional list including the header as the first row

Return type

2d list
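Because every cell comes back as a string, numeric columns have to be converted by the caller. The sketch below does not call introcs. It imitates the documented return shape with the standard-library csv module, and the helper name rows_from_csv_text and the sample data are assumptions made for illustration.

```python
import csv
import io

def rows_from_csv_text(text):
    """Return a 2D list of strings with the header row first.

    Hypothetical stand-in for the shape read_csv() is documented to return.
    """
    return [row for row in csv.reader(io.StringIO(text))]

sample = "name,score\nAda,91\nGrace,88\n"
table = rows_from_csv_text(sample)

# Split off the header, then convert the score column explicitly,
# since every cell is a string.
header, body = table[0], table[1:]
scores = [int(row[1]) for row in body]
```

The same conversion step would apply to the list returned by read_csv() itself, assuming it matches the description above.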

read_package

introcs.read_package(folder)

Reads the contents of the given directory.

A package is a directory with a file index.json inside of it. This JSON is a one-level dictionary mapping keys to file names. These files should all be in the directory. Each file can be either a text, JSON, or CSV file, or another directory (which should also be a package).

This function returns a dictionary mapping the keys to the contents of each file, as defined recursively by read_txt(), read_json(), read_csv(), and read_package().

Packages are mainly used as a way of gathering data files from multiple sources. Any package can actually be represented as a single JSON file, but it is not always easy to construct this file.

If the directory does not exist, or is not a proper package, this function will raise an error.

Parameters

folder (str) – The directory to read

Returns

A dictionary containing the contents of each file in the package

Return type

dict
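The recursive definition above can be sketched with the standard library alone. This is not the introcs implementation. The helper name read_package_sketch is hypothetical, and choosing the reader by file extension is an assumption, since the documentation does not say how each file type is detected.

```python
import csv
import json
import os
import tempfile

def read_package_sketch(folder):
    """Recursively load a package directory (illustrative only).

    Assumption: the reader for each file is chosen by its extension.
    """
    with open(os.path.join(folder, "index.json"), encoding="utf-8") as f:
        index = json.load(f)  # one-level dictionary: key -> file name

    result = {}
    for key, name in index.items():
        path = os.path.join(folder, name)
        if os.path.isdir(path):
            result[key] = read_package_sketch(path)  # nested package
        elif name.endswith(".json"):
            with open(path, encoding="utf-8") as f:
                result[key] = json.load(f)
        elif name.endswith(".csv"):
            with open(path, newline="", encoding="utf-8") as f:
                result[key] = [row for row in csv.reader(f)]
        else:
            with open(path, encoding="utf-8") as f:
                result[key] = f.read()
    return result

# Build a tiny package in a temporary directory and read it back.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "inner"))
    with open(os.path.join(root, "index.json"), "w", encoding="utf-8") as f:
        json.dump({"notes": "notes.txt", "table": "table.csv", "sub": "inner"}, f)
    with open(os.path.join(root, "notes.txt"), "w", encoding="utf-8") as f:
        f.write("hello")
    with open(os.path.join(root, "table.csv"), "w", encoding="utf-8", newline="") as f:
        f.write("x,y\n1,2\n")
    with open(os.path.join(root, "inner", "index.json"), "w", encoding="utf-8") as f:
        json.dump({"config": "config.json"}, f)
    with open(os.path.join(root, "inner", "config.json"), "w", encoding="utf-8") as f:
        json.dump([1, 2, 3], f)

    package = read_package_sketch(root)
```

In this sketch the key sub maps to another dictionary, because the entry inner is itself a directory containing its own index.json.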

Writing

There is no write counterpart to read_package(). Any data value that could be written to a package can be written to a single JSON file more efficiently.
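To illustrate why a single JSON file is sufficient, the sketch below takes a dictionary shaped like a read_package() result and round-trips it through the standard-library json module. The keys and values are made up for the example.

```python
import json

# Hypothetical package contents: a text value, a CSV-style table
# (a 2D list of strings), and a nested package.
package = {
    "notes": "hello",
    "table": [["x", "y"], ["1", "2"]],
    "sub": {"config": [1, 2, 3]},
}

encoded = json.dumps(package)   # one JSON document holds everything
restored = json.loads(encoded)  # decoding recovers the same structure
```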

write_txt

introcs.write_txt(data, filename)

Writes the given data out as a text file filename.

The data should be a string. If it is not a string, this function will raise an error. The filename may have any extension.

Parameters
  • data (str) – The text to write to a file

  • filename (str) – The file to write

write_json

introcs.write_json(data, filename)

Writes the given data out as a JSON file filename.

The data should be a JSON-encodable value (e.g. a primitive such as an int, float, bool, or string, or a list or dictionary of JSON-encodable values). If it is not proper data, this function will raise an error.

The JSON filename must either have no extension, or the extension .json. Any other extension will cause an error.

Parameters
  • data (JSON-encodable value) – The Python value to encode as JSON

  • filename (str) – The file to write
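The two documented preconditions (JSON-encodable data, and a filename with either no extension or .json) can be checked before writing. The sketch below is a hypothetical pre-check built on the standard library, not part of introcs. The name json_write_problem is an assumption, and so is treating the extension as case-sensitive.

```python
import json
import os

def json_write_problem(data, filename):
    """Return None if the documented preconditions hold, else a message.

    Hypothetical helper; introcs itself raises an error instead.
    """
    ext = os.path.splitext(filename)[1]
    if ext not in ("", ".json"):
        return f"unsupported extension: {ext!r}"
    try:
        json.dumps(data)  # raises TypeError for values such as sets
    except (TypeError, ValueError) as exc:
        return f"data is not JSON-encodable: {exc}"
    return None

ok_result = json_write_problem({"a": [1, 2.5, True, "x"]}, "out.json")
no_ext_result = json_write_problem([1, 2], "out")
bad_ext_result = json_write_problem([1, 2], "out.txt")
bad_data_result = json_write_problem({1, 2}, "out.json")
```

Here the first two calls return None, while the last two return a message describing the failed precondition.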

write_csv

introcs.write_csv(data, filename)

Writes the given data out as a CSV file filename.

To be written as a proper CSV file, the data must be a 2-dimensional list with the first row containing only strings. Cells in all other rows may be any Python value. Dates are converted using ISO formatting. All other objects are converted to their string representation.

The CSV filename must either have no extension, or the extension .csv. Any other extension will cause an error.

Parameters
  • data (2d list) – The Python value to encode as a CSV file

  • filename (str) – The file to write
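The conversion rules above can be sketched with the standard-library csv module. This is an approximation of the described behavior, not the introcs source. The helper names to_cell and csv_text are hypothetical, and the sketch builds the CSV text in memory rather than writing a file.

```python
import csv
import datetime
import io

def to_cell(value):
    """Convert one value to text: dates via isoformat(), others via str()."""
    if isinstance(value, datetime.date):  # also covers datetime.datetime
        return value.isoformat()
    return str(value)

def csv_text(data):
    """Encode a 2D list as CSV text, checking the header row first."""
    if not all(isinstance(cell, str) for cell in data[0]):
        raise ValueError("header row must contain only strings")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in data:
        writer.writerow([to_cell(value) for value in row])
    return buffer.getvalue()

data = [
    ["day", "count", "ok"],
    [datetime.date(2024, 1, 31), 3, True],
]
text = csv_text(data)
```

Reading such a file back with read_csv() would yield strings in every cell, so the date and the integer would need to be parsed again by the caller.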

Errors

class introcs.FileToolError

A simple error class to unify error responses for the filetools package.