"""
Overview
===============================================================================
+----------+------------------------------------------------------------------+
| Path | PyPoE/poe/file/specification/fields.py |
+----------+------------------------------------------------------------------+
| Version | 1.0.0a0 |
+----------+------------------------------------------------------------------+
| Revision | $Id: 403117de6bea46d1b8023dfb3553ae0a381b5a48 $ |
+----------+------------------------------------------------------------------+
| Author | Omega_K2 |
+----------+------------------------------------------------------------------+
Description
===============================================================================
Data fields to be used in the specifications
Agreement
===============================================================================
See PyPoE/LICENSE
General Information
===============================================================================
About .dat files.
-------------------------------------------------------------------------------
.dat files are basically a binary database. They have a fixed width data
section and a variable width data section.
The header of the file contains the number of rows and the size of the rows,
so we can determine with certainty how large it would be. Together this is
fixed size data section.
The variable size section starts with a magic keyword 0xBBbbBBbbBBbbBB and
then is just filled with the data (which can not be read properly without
understanding the fixed width section - all data is referenced).
Note that some of the .dat files have been purposefully stripped of their
rows and data so they can not be read.
Style Guide
-------------------------------------------------------------------------------
Fields should be named with a proper name. CamelCase should be used on the all
the keys.
Related values should be have a prefix which is separated by a underscore.
For example, if there are two columns one of which provides a list of Ids and
the other provides a list of Values, you'd do something like:
* MyValue_Ids
* MyValue_Values
If it is unknown what the field does, name it after the datatype and replace N:
Index<N>
for unknown strings (ref|string)
Unknown<N>
for unknown values (byte, short, int, long & unsigned variants)
Data<N>
for unknown data (ref|list)
Flag<N>
for unknown boolean values (bool)
Key<N>
for unknown keys (likely to be a key, ulong type)
If it is known what the field does
<WhatAmI>
Use a good name. For example ItemLevel if the key is item level
<OtherDat>Key
If you know this field references another dat file use this naming;
for example WorldAreasKey
<OtherDat>Keys
Similar to above, but use this for a list of keys
<EXT>File
Use this for a ref|string that contains a file path
Replace <EXT> with the extension of the file, for example:
DDSFile for a file with the .dds extension
Id
Use this for primary key value. Usually also the first value in a row
Editing guide
-------------------------------------------------------------------------------
New file:
Use int or uint types if you have a new .dat and fill it up until the data
size (pad with short/byte if necessary).
Finding out the proper type:
- an int field followed by a all 0 field is usually an ulong reference key
- this also applies to lists, i.e. a ref|list|uint with each entry followed
by a zero is probably a ref|list|ulong instead (and a key).
- an int field with ever increasing numbers not larger then data section is
a pointer to the data section. If preceeded by a value > 0, it may be a
list; otherwise it may be a string
- if the value is increasing but out of bounds, the int may be at the wrong
position, i.e. preceeded by byte(s) or short.
- list and strings may be empty
- multiple empty lists may point at the same position in the data
- empty strings will still take up 4 bytes of space (the zero terminator)
- if you see there are gaps or overlapping values in data section, considering
increasing/decreasing the type accordingly (i.e. from ref|uint to ref|ulong)
- if that doesn't help, the key might not be a reference
Finding out the proper meaning:
- First of all, mind the game! A lot can be deducted from knowing the game
well.
- Keep the name of the file in mind;
- it's common for files to have references to other related files (i.e. a
xxxMasterMission is most likely to contain a reference to Master.dat
somewhere)
- the values will usually relate the file name obviously; i.e. will contain
stats/mods for the items, their visuals, and so on
- often these are supplied as keys (or Key1, Key2, Value1, Value2 ...)
- Look at the minimum and maximum of the values; often they only have a
specific range which can hint at their meaning
- 0 to 100 can often be Level related
- Values with a base line of 1000 (or more rarely 100) above 0 are often
spawn chance or weighting related.
- Values often appear as pairs, for example:
- Spawn Weight
- ref|list|ulong -> Tags.dat keys
- ref|list|int -> Values
- Stats
- ulong -> Stat key
- int -> Value (sometimes 2x for min/max rollable range)
Regarding references/keys to other files:
- Generally for their type:
- ulong if referencing another file
- uint if referencing the same file
- None (0xFEFEFEFE) is a pretty solid giveaway
- Offsets:
- Usually the other dat file, starting at 0 (offset not needed, default)
- If the file has been blanked or if it referencing a specific column,
often it uses offset 1
- Finding out what they reference to:
- if the keys is very small it's likely to refer to a file with little
entries (like difficulty, master, etc), like wise for big keys.
- based on what the file does related files can often be deducted
- references to Tags.dat and Mods.dat are very common
- if possible, test the references out and see if they make sense
Lastly, I suppose you could also try to reverse engineer the PathOfExile.exe
and see whether you find any structs for the files.
Documentation
===============================================================================
.. autoclass:: Specification
.. autoclass:: File
.. autoclass:: Field
.. autoclass:: VirtualField
"""
# =============================================================================
# Imports
# =============================================================================
# Python
from collections import OrderedDict
# 3rd-party
# self
from PyPoE.poe import constants
from PyPoE.poe.file.specification.errors import SpecificationError
# =============================================================================
# Globals
# =============================================================================
__all__ = ['Specification', 'File', 'Field', 'VirtualField']
# =============================================================================
# Classes
# =============================================================================
class _Common(object):
def as_dict(self):
"""
Returns
-------
dict
Returns itself as dictionary without any class references
"""
return {k: getattr(self, k) for k in self.__slots__}
[docs]class Specification(dict):
"""
Specification file
"""
def __init__(self, *args, **kwargs):
super(Specification, self).__init__(*args, **kwargs)
[docs] def validate(self):
"""
Performs validation on the current data in the specification
Raises
------
SpecificationError
Raised when any errors occur.
See :py:mod:`PyPoE.poe.file.specification.errors` for details
on errors and error codes.
"""
for file_name, file in self.items():
for field_name, field in file.fields.items():
# Validation
if field.key:
if field.key not in self:
raise SpecificationError(
SpecificationError.ERRORS.INVALID_FOREIGN_KEY_FILE,
'%(dat_file)s->%(field)s->key: %(other)s is not in '
'specification' % {
'dat_file': file_name,
'field': field_name,
'other': field.key,
}
)
other_key = field['key_id']
if other_key and other_key not in self[field.key]['fields']:
raise SpecificationError(
SpecificationError.ERRORS.INVALID_FOREIGN_KEY_ID,
'%(dat_file)s->%(field)s->key_id: %(other)s->'
'%(other_key)s not in specification' % {
'dat_file': file_name,
'field': field_name,
'other': field.key,
'other_key': other_key,
}
)
if field.enum:
if field.key:
raise SpecificationError(
SpecificationError.ERRORS.INVALID_ARGUMENT_COMBINATION,
'%(dat_file)s->%(field)s->enum: Either key or enum can '
'be specified but never both.' % {
'dat_file': file_name,
'field': field_name,
}
)
if not hasattr(constants, field.enum):
raise SpecificationError(
SpecificationError.ERRORS.INVALID_ENUM_NAME,
'%(dat_file)s->%(field)s->enum: Invalid constant enum '
'""%(enum)s" specified'
% {
'dat_file': file_name,
'field': field_name,
'enum': field.enum,
}
)
for field_name, virtual_field in file.virtual_fields.items():
# Validation
if field_name in file.fields:
raise SpecificationError(
SpecificationError.ERRORS.VIRTUAL_KEY_DUPLICATE,
'%(dat_file)s->virtual_fields->%(field)s use the same name '
'as a key specified in %(dat_file)s->fields' %
{
'dat_file': file_name,
'field': field_name,
}
)
if not virtual_field.fields:
raise SpecificationError(
SpecificationError.ERRORS.VIRTUAL_KEY_EMPTY,
'%(dat_file)s->virtual_fields->%(field)s->fields is empty' %
{
'dat_file': file_name,
'field': field_name,
}
)
for other_field in virtual_field.fields:
if other_field not in file.fields and \
other_field not in file.virtual_fields:
raise SpecificationError(
SpecificationError.ERRORS.VIRTUAL_KEY_INVALID_KEY,
'%(dat_file)s->virtual_fields->%(field)s->fields: '
'Field "%(other_field)s" does not exist' %
{
'dat_file': file_name,
'field': field_name,
'other_field': other_field,
}
)
if virtual_field.zip and other_field in file.fields and \
not file.fields[other_field].type.startswith(
'ref|list'):
raise SpecificationError(
SpecificationError.ERRORS.VIRTUAL_KEY_INVALID_DATA_TYPE,
'%(dat_file)s->virtual_fields->%(field)s->zip: The zip '
'option requires "%(other_field)s" to be a list' %
{
'dat_file': file_name,
'field': field_name,
'other_field': other_field,
}
)
[docs] def as_dict(self):
"""
Returns
-------
dict
Returns itself as dictionary without any class references
"""
return {
k: v.as_dict() for k, v in self.items()
}
[docs]class File(object):
"""
Represents a single file in the specification.
Parameters
----------
fields : OrderedDict
OrderedDict containing the field name as key and a :class:`Field`
instance as value
virtual_fields : OrderedDict
OrderedDict containing the field name as key and a
:class:`VirtualField` instance as value
columns : OrderedDict
Shortened list of columns excluding intermediate columns
columns_zip : OrderedDict
Shortened list of columns excluding zipped columns
columns_all : OrderedDict
Complete list of columns, including all intermediate and virtual columns
columns_data : OrderedDict
List of all columns directly derived from the data
columns_unique: OrderedDict
List of all unique columns (which are also considered indexable)
"""
__slots__ = [
'fields',
'virtual_fields',
'columns',
'columns_all',
'columns_data',
'columns_unique',
'columns_zip',
]
[docs] def __init__(self, fields=None, virtual_fields=None):
"""
Parameters
----------
fields : OrderedDict
OrderedDict containing the field name as key and a :class:`Field`
instance as value
virtual_fields : OrderedDict
OrderedDict containing the field name as key and a
:class:`VirtualField` instance as value
"""
if fields is None:
fields = OrderedDict()
self.fields = fields
if virtual_fields is None:
virtual_fields = OrderedDict()
self.virtual_fields = virtual_fields
# Set utility columns from the given data
self.columns = OrderedDict()
self.columns_unique = OrderedDict()
for field_name, field in fields.items():
self.columns[field_name] = None
if field.unique:
self.columns_unique[field_name] = None
self.columns_all = OrderedDict(self.columns)
self.columns_data = OrderedDict(self.columns)
self.columns_zip = OrderedDict(self.columns)
if virtual_fields:
delete = set()
delete_zip = set()
for field_name, virtual_field in virtual_fields.items():
self.columns[field_name] = None
self.columns_all[field_name] = None
self.columns_zip[field_name] = None
if virtual_field.zip:
delete_zip.update(virtual_field.fields)
delete.update(virtual_field.fields)
for item in delete:
try:
del self.columns[item]
# TODO: This can happen when virtual keys are invalid, move from validator to here?
except KeyError:
pass
for item in delete_zip:
del self.columns_zip[item]
def __getitem__(self, item):
return getattr(self, item)
[docs] def as_dict(self):
"""
Returns
-------
dict
Returns itself as dictionary without any class references
"""
out = {}
for k in self.__slots__:
v = getattr(self, k)
if k in ('fields', 'virtual_fields'):
out[k] = OrderedDict([(ok, ov.as_dict()) for ok, ov in v.items()])
else:
out[k] = v
return out
[docs]class Field(_Common):
"""
Fields instances are used to tie a specific set of information to a
column field.
**Type Syntax**
I've mostly adapted the Syntax from VisualGGPK2, but it may be subject to
change to the python struct data types; for now they'll stay since bool is
most certainly more readable then ?.
Base types:
bool
8 bit integer value, first bit is 1 or 0 (cocered to True/False)
byte
8 bit integer value, signed
ubyte
8 bit integer value, unsigned
short
16 bit integer value, signed
ushort
16 bit integer value, unsigned
int
32 bit integer value, signed
uint
32 bit integer value, unsigned
long
64 bit integer value, signed
ulong
64 bit integer value, unsigned
float
32 bit floating point value, single precision
double
64 bit floating point value, double precision
Variable/Pointer types:
ref|<other>
32 bit value, unsigned
a pointer to the data section
ref|list|<other>
two 32 bit values, unsigned
first value determines the size of the list
second value is the pointer to the data section
ref|string
just like a normal reference, but it will parse as null terminated
utf16_le encoded string
"""
__slots__ = [
'type', 'key', 'key_id', 'key_offset', 'enum', 'unique', 'file_path',
'file_ext', 'display', 'display_type', 'description'
]
[docs] def __init__(self, type, key=None, key_id=None, key_offset=0, enum=None,
unique=False, file_path=False, file_ext=None, display=None,
display_type='{0}', description=None):
"""
All parameters except type are optional.
Parameters
----------
type : str
Required. The type. See Type Syntax above
key : str
Name of the .dat file that is reference. Must exist in the
specification.
key_id : str
Name of the column in the other dat file that is referenced.
This should be specified together with key, alone it does nothing.
If the column is indexed (i.e. unique), this is fast.
key_offset : int
Offset at which the key of the other file starts.
This generally useful when it's a key_id value, but the keys are
numbered rowid+1 for example (so offset would be 1).
This should be specified together with key, alone it does nothing.
enum : str
Enum from :py:mod:`PyPoE.poe.constants` to use for this field
unique : bool
Whether each value contained in this file is unique.
file_path : bool
Whether the entry is a file path. Please note this should also be
set if there is no file extension.
file_ext : str
The extension of the file, if any.
Most of the time this should be set together with file_path, unless
there is no path given.
display : str
String to show instead of the field id when displaying this
display_type : str
Python formatter syntax for outputting the value
description : str
Description of what this field does
"""
self.type = type
self.key = key
self.key_id = key_id
self.key_offset = key_offset
self.enum = enum
self.unique = unique
self.file_path = file_path
self.file_ext = file_ext
self.display = display
self.display_type = display_type
self.description = description
def __getitem__(self, item):
return getattr(self, item)
[docs]class VirtualField(_Common):
"""
Virtual fields are based off other Field instances and provide additional
convenience options such as grouping certain fields together.
"""
__slots__ = ['fields', 'zip']
[docs] def __init__(self, fields, zip=False):
"""
Parameters
----------
fields : list[str]
List of fields to coerce into one field.
All fields must be exist, but they can be either a regular or
virtual field or combination of.
zip : bool
Whether to zip the fields together.
This option requires each of the referenced fields to be a list.
"""
self.fields = fields
self.zip = zip
def __getitem__(self, item):
return getattr(self, item)