lex2.textio

Components of textstreams.

Classes

BaseTextstream

Abstract base class of an ITextstream implementation.

ITextIO

Interface to a class implementing TextIO functionality.

ITextstream

Common interface to a Textstream object instance.

TextIO

Abstract base class implementing ITextIO, providing TextIO functionality.

TextPosition

Struct that holds data about the position in a textstream.

TextstreamDisk

Textstream using disk streaming.

TextstreamMemory

Textstream using memory streaming.

TextstreamType

Textstream type.

class lex2.textio.TextPosition

Struct that holds data about the position in a textstream.

__init__(pos=0, col=0, ln=0)

TextPosition object instance initializer.

Parameters
  • pos (int, optional) – Absolute position in a text file. Note that multi-byte characters are counted as one position.

  • col (int, optional) – Column at a position in a text file. Counting starts from 0.

  • ln (int, optional) – Line at a position in a text file. Counting starts from 0.

pos: int

Absolute position in a textstream. Counting starts from 0. Note that multi-byte characters are counted as one position.

col: int

Column of a position in a textstream. Counting starts from 0.

ln: int

Line of a position in a textstream. Counting starts from 0.
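
The three counters can be illustrated with a minimal sketch. The `Position` dataclass and the `advance` helper below are hypothetical stand-ins, not part of lex2, and the rule that a newline increments `ln` and resets `col` is an assumption made for illustration. The sketch also shows that a multi-byte character such as `é` counts as a single position.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """Hypothetical stand-in for TextPosition; all counters start from 0."""
    pos: int = 0
    col: int = 0
    ln: int = 0


def advance(position: Position, text: str) -> Position:
    """Advance the counters over `text`, one position per character."""
    for ch in text:
        position.pos += 1
        if ch == "\n":
            # Assumption: a newline starts a new line and resets the column.
            position.ln += 1
            position.col = 0
        else:
            position.col += 1
    return position


# 'é' is two bytes in UTF-8 but is counted as one position.
p = advance(Position(), "héllo\nwor")
print(p)  # Position(pos=9, col=3, ln=1)
```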

class lex2.textio.ITextstream

Bases: ABC

Common interface to a Textstream object instance.

abstract close()

Closes and deletes textstream resources.

abstract update(n)

Updates the textstream’s buffer.

Parameters

n (int) – Number of characters to read/update. Must be a positive number.

abstract is_eof()

Evaluates whether the textstream has reached the end of data.

abstract get_textstream_type()

Gets the textstream's enum type.

abstract get_text_position()

Gets the TextPosition object instance.

abstract get_string_buffer()

Gets the currently buffered string value.

abstract get_string_buffer_size()

Gets the length of the currently buffered string (in characters).

abstract get_string_buffer_position()

Gets the index of the current position in the buffered string.
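
How the interface methods might fit together can be sketched with a small self-contained class. `TinyMemoryStream` is a hypothetical stand-in that only mirrors some of the method names above; it does not inherit from lex2, and treating `update(n)` as "advance the buffer position by `n` characters" is an assumption about the intended semantics.

```python
class TinyMemoryStream:
    """Hypothetical stand-in mirroring part of the ITextstream interface."""

    def __init__(self, data: str) -> None:
        self._buffer = data
        self._index = 0

    def update(self, n: int) -> None:
        # Assumption: update(n) moves the current position forward by n characters.
        if n <= 0:
            raise ValueError("n must be a positive number")
        self._index = min(self._index + n, len(self._buffer))

    def is_eof(self) -> bool:
        return self._index >= len(self._buffer)

    def get_string_buffer(self) -> str:
        return self._buffer

    def get_string_buffer_size(self) -> int:
        return len(self._buffer)

    def get_string_buffer_position(self) -> int:
        return self._index

    def close(self) -> None:
        # Release the buffered data.
        self._buffer = ""
        self._index = 0


# Consume the stream four characters at a time until the end of data.
stream = TinyMemoryStream("let x = 42;")
chunks = []
while not stream.is_eof():
    start = stream.get_string_buffer_position()
    chunks.append(stream.get_string_buffer()[start:start + 4])
    stream.update(4)
stream.close()
print(chunks)  # ['let ', 'x = ', '42;']
```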

class lex2.textio.BaseTextstream

Bases: ITextstream, ABC

Abstract base class of an ITextstream implementation.

abstract __init__(textstream_type)

is_eof()

Evaluates whether the textstream has reached the end of data.

get_textstream_type()

Gets the textstream's enum type.

get_text_position()

Gets the TextPosition object instance.

get_string_buffer()

Gets the currently buffered string value.

get_string_buffer_size()

Gets the length of the currently buffered string (in characters).

get_string_buffer_position()

Gets the index of the current position in the buffered string.

class lex2.textio.TextstreamType

Bases: Enum

Textstream type.

Values

MEMORY

When a textstream has all data in working memory at its disposal.

DISK

When a textstream has only part of the data available in working memory at any one time, and has to dynamically read and swap data in chunks from disk.
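
The difference between the two modes can be illustrated with a generic chunked read. This is a sketch of the general technique only, not lex2's actual mechanism; `read_in_chunks` is a hypothetical helper, and `io.StringIO` stands in for a file on disk.

```python
import io
from typing import Iterator, TextIO


def read_in_chunks(handle: TextIO, chunk_chars: int) -> Iterator[str]:
    """Yield successive chunks of at most `chunk_chars` characters."""
    while True:
        chunk = handle.read(chunk_chars)
        if not chunk:
            # An empty string signals the end of data.
            break
        yield chunk


source = "abcdefgh"

# MEMORY-like: the whole text is available at once.
whole = source

# DISK-like: only one chunk is held in working memory at a time.
parts = list(read_in_chunks(io.StringIO(source), 3))
print(parts)  # ['abc', 'def', 'gh']
```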

class lex2.textio.TextstreamDisk

Bases: BaseTextstream, ITextstream

Textstream using disk streaming.

__init__(fp, buffer_size, encoding, convert_line_endings)

TextstreamDisk object instance initializer.

Parameters
  • fp (Union[str, Path]) – String or Path object of a text file to open.

  • buffer_size (int) – Size of the buffer in kilobytes (kB). A size of zero (0) loads the whole file into memory. In order to completely capture a token, its length must be less than or equal to half the buffer size value. Note that the buffer size will be rounded down to the nearest even number.

  • encoding (str) – Encoding of the text file.

  • convert_line_endings (bool) – Convert line-endings from Windows style to UNIX style.
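
The arithmetic behind the `buffer_size` rules can be sketched as follows. Both helpers are hypothetical and not part of lex2; they work in whatever unit the buffer size is expressed in, and how lex2 converts kilobytes to characters internally is not assumed here.

```python
def effective_buffer_size(buffer_size: int) -> int:
    """Round the requested size down to the nearest even number."""
    if buffer_size < 0:
        raise ValueError("buffer_size must not be negative")
    return buffer_size - (buffer_size % 2)


def max_token_length(buffer_size: int) -> int:
    """Largest token length that fits in half of the effective buffer."""
    return effective_buffer_size(buffer_size) // 2


print(effective_buffer_size(513))  # 512
print(max_token_length(513))       # 256
```

A size of zero is a special case in the parameter description above (the whole file is loaded), so the half-buffer limit computed here only applies to non-zero sizes.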

close()

Closes and deletes textstream resources.

update(n)

Updates the textstream’s buffer.

Parameters

n (int) – Number of characters to read/update. Must be a positive number.

class lex2.textio.TextstreamMemory

Bases: BaseTextstream, ITextstream

Textstream using memory streaming.

__init__(str_data, convert_line_endings)

TextstreamMemory object instance initializer.

Parameters
  • str_data (str) – String data to directly load. Note that encoding depends on the system-wide encoding.

  • convert_line_endings (bool) – Convert line-endings from Windows style to UNIX style.
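
What the `convert_line_endings` option implies can be sketched with a plain string replacement. `to_unix_line_endings` is a hypothetical helper, not a lex2 function, and leaving lone `\r` characters untouched is an assumption.

```python
def to_unix_line_endings(text: str) -> str:
    """Replace Windows-style CRLF line endings with UNIX-style LF."""
    # Assumption: only "\r\n" pairs are converted; a lone "\r" is kept as-is.
    return text.replace("\r\n", "\n")


converted = to_unix_line_endings("a\r\nb\nc\r\n")
print(repr(converted))  # 'a\nb\nc\n'
```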

close()

Closes and deletes textstream resources.

update(n)

Updates the textstream’s buffer.

Parameters

n (int) – Number of characters to read/update. Must be a positive number.

class lex2.textio.ITextIO

Bases: ABC

Interface to a class implementing TextIO functionality.

abstract open(fp, buffer_size=512, encoding='UTF-8', convert_line_endings=True)

Opens a text file.

Parameters
  • fp (str | Path) – String or Path object of a text file to open.

  • buffer_size (int, optional) – Size of the buffer in kilobytes (kB). A size of zero (0) loads the whole file into memory. In order to completely capture a token, its length must be less than or equal to half the buffer size value. Note that the buffer size will be rounded down to the nearest even number.

  • encoding (str, optional) – Encoding of the text file.

  • convert_line_endings (bool, optional) – Convert line-endings from Windows style to UNIX style.

abstract load(str_data, convert_line_endings=False)

Loads string data directly.

Parameters
  • str_data (str) – String data to directly load. Note that encoding depends on the system-wide encoding.

  • convert_line_endings (bool, optional) – Convert line-endings from Windows style to UNIX style.

abstract close()

Closes and deletes textstream resources.

class lex2.textio.TextIO

Bases: ITextIO, ABC

Abstract base class implementing ITextIO, providing TextIO functionality.

abstract __init__()

TextIO object instance initializer.

open(fp, buffer_size=512, encoding='UTF-8', convert_line_endings=True)

Opens a text file.

Parameters
  • fp (str | Path) – String or Path object of a text file to open.

  • buffer_size (int, optional) – Size of the buffer in kilobytes (kB). A size of zero (0) loads the whole file into memory. In order to completely capture a token, its length must be less than or equal to half the buffer size value. Note that the buffer size will be rounded down to the nearest even number.

  • encoding (str, optional) – Encoding of the text file.

  • convert_line_endings (bool, optional) – Convert line-endings from Windows style to UNIX style.

load(str_data, convert_line_endings=False)

Loads string data directly.

Parameters
  • str_data (str) – String data to directly load. Note that encoding depends on the system-wide encoding.

  • convert_line_endings (bool, optional) – Convert line-endings from Windows style to UNIX style.

close()

Closes and deletes textstream resources.