Overview

Path PyPoE/poe/file/dat.py
Version 1.0.0a0
Revision $Id: c11f73e5589cfaf6074cfe29f46f717fb14a193c $
Author Omega_K2

Description

Support for .dat file format.

.dat files can be found in Data/ and are read by DatFile. Unfortunately, there is no magic keyword for identifying the GGG .dat format, so caution is advised when trying to read .dat files.

The GGG .dat format consists of a fixed-width table and a variable-length data section. The number of rows in the table is defined in the file; the row format, however, is not stored in the file itself and has to be reverse-engineered. The data section is a contiguous block of binary data; values are generally read from it via pointers (int) or list pointers (size, int) stored in the table data.
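
The table/data split can be illustrated with a small, self-contained sketch. The layout used here (a 4-byte row count, rows made of one 32-bit integer and one 32-bit pointer, and NUL-terminated UTF-8 strings in the data section) is an assumption made purely for illustration and is not the actual GGG layout; build_toy_dat and read_toy_dat are hypothetical helpers, not part of PyPoE.

```python
import struct

# Assumed row layout for this toy format: one int32 value, one uint32 pointer.
ROW_FORMAT = "<iI"
ROW_SIZE = struct.calcsize(ROW_FORMAT)


def build_toy_dat(rows):
    """Pack (int, str) rows into a toy 'table + data section' blob."""
    table = b""
    data = b""
    for number, text in rows:
        # The pointer is the offset of the string inside the data section.
        table += struct.pack(ROW_FORMAT, number, len(data))
        data += text.encode("utf-8") + b"\x00"
    return struct.pack("<I", len(rows)) + table + data


def read_toy_dat(blob):
    """Read the toy blob back, following each pointer into the data section."""
    (row_count,) = struct.unpack_from("<I", blob, 0)
    table_offset = 4
    data_offset = table_offset + row_count * ROW_SIZE
    rows = []
    for i in range(row_count):
        number, pointer = struct.unpack_from(
            ROW_FORMAT, blob, table_offset + i * ROW_SIZE
        )
        start = data_offset + pointer
        end = blob.index(b"\x00", start)  # strings are NUL-terminated here
        rows.append((number, blob[start:end].decode("utf-8")))
    return rows


blob = build_toy_dat([(1, "Sword"), (2, "Shield")])
rows = read_toy_dat(blob)
```

Because the row width is fixed, the start of the data section follows directly from the row count; the variable-length strings are only reachable through the pointers stored in the table.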

A set of default specifications is included with PyPoE; set_default_spec() may be used to select the correct version.

Agreement

See PyPoE/LICENSE

Todo

  • DatValue.get_value might hit the Python recursion limit, but this is not a problem for any of the actual dat files.
  • Update RR with the new indexing

Documentation

Public API

class PyPoE.poe.file.dat.DatFile(file_name)[source]

Bases: PyPoE.poe.file.shared.AbstractFileReadOnly

Representation of a .dat file.

Variables:reader (DatReader) – reference to the DatReader instance once read() has been called
__init__(file_name)[source]
Parameters:file_name (str) – Name of the .dat file
get_read_buffer(file_path_or_raw, function, *args, **kwargs)

Will attempt to open the given file_path_or_raw in read mode and pass the buffer to the specified function. The function must accept at least one keyword argument called ‘buffer’.

Parameters:
  • file_path_or_raw (BytesIO | bytes | str) – file path, bytes or buffer to read from
  • args – Additional positional arguments to pass to the specified function
  • kwargs – Additional keyword arguments to pass to the specified function
Returns:

Result of the function

Return type:

object

Raises:

TypeError – if file_path_or_raw has an invalid type
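
The dispatch described above can be sketched as follows. This is a minimal stand-in written for illustration, not PyPoE's implementation; get_read_buffer_sketch and count_bytes are hypothetical names.

```python
from io import BytesIO


def get_read_buffer_sketch(file_path_or_raw, function, *args, **kwargs):
    """Turn the input into a binary buffer and pass it on as 'buffer'."""
    if isinstance(file_path_or_raw, BytesIO):
        return function(*args, buffer=file_path_or_raw, **kwargs)
    if isinstance(file_path_or_raw, bytes):
        return function(*args, buffer=BytesIO(file_path_or_raw), **kwargs)
    if isinstance(file_path_or_raw, str):
        # A str is treated as a file path and opened in binary read mode.
        with open(file_path_or_raw, "rb") as handle:
            return function(*args, buffer=handle, **kwargs)
    raise TypeError("file_path_or_raw must be a BytesIO, bytes or str")


def count_bytes(buffer, skip=0):
    """Example callback: number of bytes in the buffer minus 'skip'."""
    return len(buffer.read()) - skip


from_bytes = get_read_buffer_sketch(b"\x01\x02\x03", count_bytes)
from_buffer = get_read_buffer_sketch(BytesIO(b"\x01\x02"), count_bytes, skip=1)
```

The callback only ever sees a file-like object, so the same parsing function works for paths, raw bytes and in-memory buffers.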

read(file_path_or_raw, *args, **kwargs)

Reads the file contents from the specified path or buffer. This will also reset any existing contents of the file.

If a buffer or bytes was given, the data will be read from the buffer or bytes object.

If a file path was given, the resulting data will be read from the specified file.

Parameters:
  • file_path_or_raw (BytesIO | bytes | str) – file path, bytes or buffer to read from
  • args – Additional positional arguments
  • kwargs – Additional keyword arguments
Returns:

result of the read operation, if any

Return type:

object

Raises:

TypeError – if file_path_or_raw has an invalid type

class PyPoE.poe.file.dat.RelationalReader(raise_error_on_missing_relation=False, language=None, *args, **kwargs)[source]

Bases: PyPoE.poe.file.shared.cache.AbstractFileCache

Read dat files in a relational manner and cache them for further use.

The relational reader will process all relations upon accessing a dat file; this means any field marked as relation or enum in the specification will be processed and the pointer will be replaced with the actual value.

For example, if a field “OtherKey” points to another file “OtherDatFile.dat”, the contents of “OtherKey” will no longer be a reference like 0, but instead the actual row from the file “OtherDatFile.dat”.

As a result you have equivalence of:

  • rr[“DatFile.dat”][“OtherKey”][“OtherDatFileValue”]
  • rr[“OtherDatFile.dat”][0][“OtherDatFileValue”]

Enums are processed in a similar fashion, except they’ll be replaced with the corresponding enum instance from PyPoE.poe.constants for the specific value.
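
The replacement of row references by the referenced rows can be sketched with plain dictionaries. The tables, the relations mapping and resolve_relations below are hypothetical stand-ins for illustration and do not reflect how RelationalReader or the specification are actually implemented; the sketch also indexes the row explicitly.

```python
# Toy tables standing in for parsed .dat files.
raw_tables = {
    "OtherDatFile.dat": [
        {"OtherDatFileValue": "first"},
        {"OtherDatFileValue": "second"},
    ],
    "DatFile.dat": [
        {"OtherKey": 0},     # reference to row 0 of OtherDatFile.dat
        {"OtherKey": None},  # no key specified
    ],
}

# Hypothetical relation spec: (file, column) -> file the column points to.
relations = {("DatFile.dat", "OtherKey"): "OtherDatFile.dat"}


def resolve_relations(tables, relations):
    """Replace row indexes with the referenced row objects, in place."""
    for (file_name, column), target in relations.items():
        for row in tables[file_name]:
            key = row[column]
            row[column] = None if key is None else tables[target][key]
    return tables


rr = resolve_relations(raw_tables, relations)
value_via_relation = rr["DatFile.dat"][0]["OtherKey"]["OtherDatFileValue"]
value_direct = rr["OtherDatFile.dat"][0]["OtherDatFileValue"]
```

After resolution both lookups reach the same row object, which is the equivalence listed above.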

Variables:
FILE_TYPE

alias of DatFile

__init__(raise_error_on_missing_relation=False, language=None, *args, **kwargs)[source]
Parameters:
  • raise_error_on_missing_relation (bool) – Raises an error instead of issuing a warning when a relation is broken
  • language (str) – language subdirectory in the data directory
  • path_or_ggpk (str | GGPKFile) – The root path (i.e. relative to content.ggpk) where the files are stored or a PyPoE.poe.file.ggpk.GGPKFile instance
  • files (Iterable) – Iterable of files that will be loaded right away
  • files_shortcut (bool) – Whether to use the shortcut function, i.e. self.__getitem__
  • instance_options (dict[str, object]) – options to pass to the file’s __init__ method
  • read_options (dict[str, object]) – options to pass to the file instance’s read method
Raises:
  • TypeError – if path_or_ggpk is not specified or has an invalid type
  • ValueError – if a PyPoE.poe.file.ggpk.GGPKFile was passed, but it was not parsed
get_file(file_name)[source]

Attempts to return a dat file from the cache and if it isn’t available, reads it in.

During the process any relations (i.e. fields that have a “key” to other .dat files specified) will be read. This will result in the appropriate fields being replaced by the related row. Note that a related row may be “None” if no key was specified in the read dat file.

Parameters:file_name (str) – The name of the .dat to read. Extension is required.
Returns:Returns the given DatFile instance
Return type:DatFile
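
The read-on-miss behaviour of get_file can be sketched with a small cache class. FileCacheSketch and its loader argument are hypothetical and only illustrate the caching pattern; they are not the AbstractFileCache API.

```python
class FileCacheSketch:
    """Return cached files and load them on first access only."""

    def __init__(self, loader):
        self._loader = loader  # callable: file name -> parsed file object
        self._files = {}
        self.load_count = 0    # counts actual reads, for demonstration

    def get_file(self, file_name):
        if file_name not in self._files:
            self.load_count += 1
            self._files[file_name] = self._loader(file_name)
        return self._files[file_name]

    def __getitem__(self, file_name):
        # Shortcut so that cache["Name.dat"] behaves like get_file().
        return self.get_file(file_name)


cache = FileCacheSketch(loader=lambda name: {"name": name, "rows": []})
first = cache["DatFile.dat"]
second = cache.get_file("DatFile.dat")
```

The second access returns the object created by the first one, so the file is read only once.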
path_or_ggpk

The path or PyPoE.poe.file.ggpk.GGPKFile instance the cache was created with

PyPoE.poe.file.dat.set_default_spec(version=<VERSION.STABLE: 1>, reload=False)[source]

Sets the default specification to use for the dat reader.

See PyPoE.poe.file.specification.__init__ for more info

Parameters:
  • version (constants.VERSION) – Version of the game to load the default specification for.
  • reload (bool) – Whether to reload the version.

Internal API

class PyPoE.poe.file.dat.DatReader(file_name, *args, use_dat_value=True, specification=None, auto_build_index=False)[source]

Bases: PyPoE.shared.mixins.ReprMixin

Variables:
  • _table_offset (int) – Starting offset of table data in bytes
  • _cast_table (dict[str, list[str, int]]) – Mapping of cast type to the corresponding struct type and the size of the cast in bytes
  • auto_build_index (bool) – Whether the index is automatically built after reading
  • file_name (str) – File name
  • file_length (int) – File length in bytes
  • table_data (list[DatRecord[object]]) – List of rows containing DatRecord entries.
  • table_length (int) – Length of table in bytes
  • table_record_length (int) – Length of each record in bytes
  • table_rows (int) – Number of rows in table
  • data_offset (int) – Data section offset
  • columns (OrderedDict) – Shortened list of columns excluding intermediate columns
  • columns_zip (OrderedDict) – Shortened list of columns excluding zipped columns
  • columns_all (OrderedDict) – Complete list of columns, including all intermediate and virtual columns
  • columns_data (OrderedDict) – List of all columns directly derived from the data
  • columns_unique (OrderedDict) – List of all unique columns (which are also considered indexable)
  • table_columns (OrderedDict) – Used for mapping columns to indexes
__init__(file_name, *args, use_dat_value=True, specification=None, auto_build_index=False)[source]
Parameters:
  • file_name (str) – Name of the dat file
  • use_dat_value (bool) – Whether to use DatValue instances or values
  • specification (Specification) – Specification to use
  • auto_build_index (bool) – Whether to automatically build the index for unique columns after reading.
Raises:

errors.SpecificationError – if the dat file is not in the specification

build_index(column=None)[source]

Builds or rebuilds the index for the specified column.

Indexed columns can be accessed through the instance variable index and will return a single value for unique columns and a list for non-unique columns.

For example: self.index[column_name][indexed_value]

Warning

This method only works for columns that are marked as unique in the specification.

Parameters:column (str or Iterable or None) – if specified, the index will be built for the specified column or iterable of columns; if not specified, the index will be built for all ‘unique’ columns by default
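
The index[column_name][indexed_value] access pattern can be sketched with nested dictionaries. The rows and build_index_sketch below are hypothetical and cover only the unique-column case; they are not the DatReader implementation.

```python
# Toy rows; "Id" is assumed to be marked as unique.
rows = [
    {"Id": "a", "Name": "Alpha"},
    {"Id": "b", "Name": "Beta"},
]


def build_index_sketch(rows, columns):
    """Map each value of the given unique columns to its row."""
    index = {}
    for column in columns:
        column_index = index.setdefault(column, {})
        for row in rows:
            column_index[row[column]] = row
    return index


index = build_index_sketch(rows, ["Id"])
beta_row = index["Id"]["b"]
```

A lookup through the index is a dictionary access instead of a scan over all rows.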
column_iter()[source]

Iterates over the columns

Yields:list – Values per column
export_to_html(export_table=True, export_data=False)[source]

DEPRECATED. Will be removed in PyPoE unknown version

print_data()[source]

For debugging. Prints out data.

row_iter()[source]
Returns:Iterator over the rows
Return type:iter
class PyPoE.poe.file.dat.DatValue(value=None, offset=None, size=None, parent=None, specification=None)[source]

Bases: object

Representation of a value found in a dat file.

DatValue instances are created by reading or writing a DatValue and should not be created directly. The purpose of DatValues is to keep information regarding the placement of the value in the respective DatFile intact.

Support for built-ins:

DatValue instances support comparison; however, it is performed on the dereferenced value they hold, not on the DatValue itself.

This means DatValues can generally be compared to anything; whether the comparison succeeds, however, depends on the data type.

Example 1: dat_value < 0

  • works if the dat_value holds an integer
  • raises TypeError if it holds a list

Example 2: dat_value1 < dat_value2

  • works if both dat values have the same or comparable types
  • raises TypeError if one holds a list, and the other an integer
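
The comparison behaviour in both examples can be reproduced with a minimal wrapper class. ValueSketch is a hypothetical stand-in that simply delegates to the wrapped value; it is not the DatValue implementation.

```python
class ValueSketch:
    """Wrapper whose comparisons are delegated to the value it holds."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def _unwrap(other):
        # Compare against the held value if the other side is a wrapper too.
        return other.value if isinstance(other, ValueSketch) else other

    def __eq__(self, other):
        return self.value == self._unwrap(other)

    def __lt__(self, other):
        return self.value < self._unwrap(other)


negative_is_less = ValueSketch(-1) < 0
wrapped_is_less = ValueSketch(1) < ValueSketch(2)

try:
    ValueSketch([1]) < 0
    list_vs_int = "ok"
except TypeError:
    list_vs_int = "TypeError"
```

The TypeError is raised by Python's own comparison of a list with an integer; the wrapper adds no type checks of its own.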

Dev notes: Must keep the init

data_end_offset

Retrieves the end offset of the data held by the current instance in the data section.

Returns:end offset of data
Return type:int
Raises:TypeError – If performed on DatValue instances without data
data_size

Retrieves size of the data held by the current instance in the data section.

Returns:size of data
Return type:int
Raises:TypeError – If performed on DatValue instances without data
data_start_offset

Retrieves the start offset of the data held by the current instance in the data section.

Returns:start offset of data
Return type:int
Raises:TypeError – If performed on DatValue instances without data
get_value()[source]

Returns the value that is held by the DatValue instance. This is done recursively, i.e. pointers will be dereferenced accordingly.

This means if you want the actual value of the DatValue, you should probably access the value attribute instead.

If this DatValue instance is a list, this means a python list of items will be returned. If this DatValue instance is a pointer, this means whatever value the child of this instance holds will be returned. Otherwise the value of the DatValue instance itself will be returned.

Note that values may be nested, i.e. if a list contains a list, a nested list will be returned accordingly.

Returns:the dereferenced value
Return type:object
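
The recursive dereferencing described for get_value can be sketched as follows. NodeSketch and its child/children attributes are hypothetical names chosen for illustration; the real DatValue attributes may differ.

```python
class NodeSketch:
    """Toy value node: a plain value, a pointer, or a list of nodes."""

    def __init__(self, value=None, child=None, children=None):
        self.value = value        # raw value, e.g. an offset for pointers
        self.child = child        # set for pointers
        self.children = children  # set for lists

    def get_value(self):
        if self.children is not None:
            # Lists: dereference every item, keeping the nesting.
            return [item.get_value() for item in self.children]
        if self.child is not None:
            # Pointers: return whatever the child holds.
            return self.child.get_value()
        return self.value


# A pointer (raw offset 16) to a list holding an int and a pointer to a string.
pointer = NodeSketch(
    value=16,
    child=NodeSketch(children=[
        NodeSketch(1),
        NodeSketch(value=24, child=NodeSketch("abc")),
    ]),
)
dereferenced = pointer.get_value()
raw = pointer.value
nested = NodeSketch(children=[NodeSketch(children=[NodeSketch(1)])]).get_value()
```

Each level of pointer or list nesting adds one level of recursion, which is why very deep nesting could in principle reach Python's recursion limit.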
has_data

Whether this DatValue instance has data or not; this applies to types that hold a pointer.

Returns:
Return type:bool
is_data

Whether this DatValue instance is data or not.

Returns:
Return type:bool
is_list

Whether this DatValue instance is a list.

Returns:
Return type:bool
is_parsed

Whether this DatValue instance is parsed (i.e. non bytes).

Returns:
Return type:bool
is_pointer

Whether this DatValue instance is a pointer.

Returns:
Return type:bool