ocxtools

clients

console

CLI console

class ocxtools.console.console.CliConsole[source]

Bases: Console

CLI console

error(msg)[source]

Console error print method

Parameters:

msg – Output message

html_page(url)[source]

Display a web page in a browser window.

Parameters:

url – The address to the web page

info(msg)[source]

Console info print method

Parameters:

msg – Output message

man_page(sub_command)[source]

Display the sub_command html file in a browser.

Parameters:

sub_command – The sub_command name

print_table(table, justify=Justify.CENTER)[source]

Console table print method

Parameters:
  • justify – Justify the table in the console. Default = center

  • table – A Rich Table to output.

print_table_row(table, cells, justify=Justify.CENTER)[source]

Console table print method

Parameters:
  • justify – Justify the table in the console. Default = center

  • table – A Rich Table to output.

readme(sub_command)[source]

Print the sub_command readme file in the console window.

Parameters:

sub_command – The sub_command name

run_sub_process(command, silent=False)[source]

Execute the command in a python subprocess.

Parameters:
  • command – The command to execute.

  • silent – If True, don’t output anything to the console.

Return type:

str

section(title, separator='=', style=Style(color=Color('blue', ColorType.STANDARD, number=4), bold=True))[source]
Parameters:
  • style – The rule style

  • separator – The rule characters

  • title – The section title

warning(msg)[source]

Console warning print method

Parameters:

msg – Output message

class ocxtools.console.console.Justify(value)[source]

Bases: Enum

Justify enum

CENTER = 'center'
FULL = 'full'
LEFT = 'left'
RIGHT = 'right'

context

interfaces

Interfaces module.

class ocxtools.interfaces.interfaces.IModuleDeclaration[source]

Bases: ABC

Abstract module import declaration Interface

abstract static get_declaration()[source]

Abstract Method: Return the module declaration string.

Return type:

str

class ocxtools.interfaces.interfaces.IObservable[source]

Bases: ABC

Interface. The observable object.

abstract subscribe(observer)[source]

subscription

abstract unsubscribe(observer)[source]
abstract update(event, message)[source]

update method.

Parameters:
  • event – The event type

  • message – The event message

class ocxtools.interfaces.interfaces.IObserver[source]

Bases: ABC

The observer interface

abstract update(event, payload)[source]

Interface update method

class ocxtools.interfaces.interfaces.IParser[source]

Bases: ABC

Abstract IParser interface.

abstract iterator(model)[source]

Abstract method for iterating a data model.

Parameters:

model – the data model to iterate on.

Return type:

Iterator

Returns:

An iterator

abstract parse(model)[source]

Abstract method for parsing a data model.

Parameters:

model – the data model source

Return type:

dataclass

Returns:

the root dataclass of the parsed data model.

class ocxtools.interfaces.interfaces.IRule(latest)[source]

Bases: IObserver, ABC

Abstract rule interface

abstract convert(source_params, target)[source]

Abstract Method: Return the mapped parameters.

Return type:

Dict

get_latest_version()[source]

Returns the latest supported version.

Return type:

str

listen_to_events()[source]

Default is to subscribe to no events

Return type:

List

update(event, message)[source]

Default update is to do nothing

validate_version(target)[source]
Parameters:

target

Return type:

bool

Returns:

True if the conversion is implemented for the target version.

class ocxtools.interfaces.interfaces.ISerializer(model)[source]

Bases: ABC

OcxSerializer interface

abstract serialize_to_file(to_file)[source]

Abstract XML serialize to file method

Return type:

bool

abstract serialize_to_string()[source]

Abstract XML serialize to string method

Return type:

str

class ocxtools.interfaces.interfaces.ObservableEvent(value)[source]

Bases: Enum

Events that can be listened to and broadcast.

DATACLASS = 'dataclass'
REPORT = 'report'
SERIALIZE = 'serialize'
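
A minimal sketch of how the observer interfaces above could fit together. The local `IObserver`, `IObservable` and `ObservableEvent` definitions only mirror the documented signatures and are not imported from ocxtools; `Broadcaster` and `ElementCounter` are hypothetical classes written for illustration.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict, List


class ObservableEvent(Enum):
    """Local mirror of the documented event enum."""

    DATACLASS = "dataclass"
    REPORT = "report"
    SERIALIZE = "serialize"


class IObserver(ABC):
    @abstractmethod
    def update(self, event: ObservableEvent, payload: Dict[str, Any]) -> None:
        """Receive an event and its payload."""


class IObservable(ABC):
    @abstractmethod
    def subscribe(self, observer: IObserver) -> None: ...

    @abstractmethod
    def unsubscribe(self, observer: IObserver) -> None: ...

    @abstractmethod
    def update(self, event: ObservableEvent, message: Dict[str, Any]) -> None: ...


class Broadcaster(IObservable):
    """Hypothetical observable that forwards every event to its subscribers."""

    def __init__(self) -> None:
        self._observers: List[IObserver] = []

    def subscribe(self, observer: IObserver) -> None:
        if observer not in self._observers:
            self._observers.append(observer)

    def unsubscribe(self, observer: IObserver) -> None:
        if observer in self._observers:
            self._observers.remove(observer)

    def update(self, event: ObservableEvent, message: Dict[str, Any]) -> None:
        # Iterate over a copy so observers may unsubscribe while being notified.
        for observer in list(self._observers):
            observer.update(event, message)


class ElementCounter(IObserver):
    """Hypothetical observer that counts DATACLASS events only."""

    def __init__(self) -> None:
        self.count = 0

    def update(self, event: ObservableEvent, payload: Dict[str, Any]) -> None:
        if event is ObservableEvent.DATACLASS:
            self.count += 1


broadcaster = Broadcaster()
counter = ElementCounter()
broadcaster.subscribe(counter)
broadcaster.update(ObservableEvent.DATACLASS, {"name": "Vessel"})
broadcaster.update(ObservableEvent.REPORT, {"name": "ignored"})
broadcaster.unsubscribe(counter)
broadcaster.update(ObservableEvent.DATACLASS, {"name": "Panel"})
print(counter.count)
```

Only the first DATACLASS event is counted: the REPORT event is filtered by the observer, and the last event arrives after the observer has unsubscribed.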

loader

Dynamically load a python module.

class ocxtools.loader.loader.DeclarationOfOcxImport(name, version)[source]

Bases: IModuleDeclaration

Declaration of the ocx module.

get_declaration()[source]

Return the module import declaration.

Return type:

str

get_name()[source]

Return the declared module name.

Return type:

str

get_version()[source]

Return the OCX module version.

Return type:

str

class ocxtools.loader.loader.DynamicLoader[source]

Bases: object

Dynamically loads modules, classes or functions from a module declaration.

classmethod get_all_class_names(module_name, version)[source]

Return all class names in the module by the __all__ variable.

Parameters:
  • module_name – The module name

  • version – The module version

Return type:

List

Returns:

The list of available module class names.

classmethod import_class(module_declaration, class_name)[source]

The module import declaration.

Parameters:
  • class_name – The class name to load from the declared module

  • module_declaration – The declaration of the python module to be loaded

Return type:

Any

Returns:

Return the loaded class, None if failed.

classmethod import_module(module_declaration)[source]
Parameters:

module_declaration – The declaration of the Python module to load

Return type:

ModuleType

Returns:

Return the loaded module, None if failed.

exception ocxtools.loader.loader.DynamicLoaderError[source]

Bases: AttributeError

Dynamic import errors.

class ocxtools.loader.loader.ModuleDeclaration(package, sub_module, name)[source]

Bases: IModuleDeclaration, ABC

General module declaration

Parameters:
  • package – the package name

  • sub_module – The submodule name

  • name – The method name

get_declaration()[source]

Return the module import declaration.

Return type:

str
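
The loading behaviour described in this module can be sketched with the standard `importlib` machinery. This is an illustration under assumptions, not the ocxtools implementation: the functions below take a plain dotted module name instead of an `IModuleDeclaration`, and the `_sketch` names are hypothetical.

```python
import importlib
from types import ModuleType
from typing import Any, List, Optional


def import_module_sketch(module_name: str) -> Optional[ModuleType]:
    """Import a module by dotted name; return None if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def import_class_sketch(module_name: str, class_name: str) -> Any:
    """Return the named attribute of the module, or None if either is missing."""
    module = import_module_sketch(module_name)
    if module is None:
        return None
    return getattr(module, class_name, None)


def all_class_names_sketch(module_name: str) -> List[str]:
    """Return the names listed in the module's __all__ variable, if any."""
    module = import_module_sketch(module_name)
    if module is None:
        return []
    return list(getattr(module, "__all__", []))


# Usage with a standard library module as a stand-in for an OCX module.
ordered_dict_class = import_class_sketch("collections", "OrderedDict")
print(ordered_dict_class)
print(import_module_sketch("no_such_module_for_this_sketch"))
```

Returning `None` on failure follows the "None if failed" wording of the documented return values; a caller that prefers exceptions could raise a dedicated error instead.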

parser

Module for parsing a 3Docx model.

class ocxtools.parser.parser.MetaData[source]

Bases: object

Dataclass metadata.

static class_name(data_class)[source]

Return the name of the class

Return type:

str

static meta_class_fields(data_class)[source]

Return the dataclass metadata.

Parameters:

data_class – The dataclass instance

Return type:

Dict

Returns:

The metadata of the class

static name(data_class)[source]

Get the OCX name

Parameters:

data_class – The dataclass instance

Return type:

str

Returns:

The name of the OCX type

static namespace(data_class)[source]

Get the OCX namespace

Parameters:

data_class – The dataclass instance

Return type:

str

Returns:

The namespace of the dataclass
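
A sketch of how such metadata lookups could work, assuming the dataclasses carry an inner `Meta` class with `name` and `namespace` attributes plus per-field metadata, as XML binding generators commonly emit. The `Vessel` class, its namespace URL and the helper function names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class Vessel:
    """Hypothetical dataclass in the style of generated XML bindings."""

    class Meta:
        name = "Vessel"
        namespace = "https://example.org/hypothetical-schema"

    id: Optional[str] = field(
        default=None, metadata={"name": "id", "type": "Attribute"}
    )


def class_name(data_class: Any) -> str:
    """Return the Python class name of the instance."""
    return type(data_class).__name__


def ocx_name(data_class: Any) -> str:
    """Return Meta.name, falling back to the class name."""
    return getattr(data_class.Meta, "name", class_name(data_class))


def namespace(data_class: Any) -> str:
    """Return Meta.namespace, or an empty string if it is not declared."""
    return getattr(data_class.Meta, "namespace", "")


def meta_class_fields(data_class: Any) -> Dict[str, Dict[str, Any]]:
    """Map each dataclass field name to a copy of its metadata."""
    return {f.name: dict(f.metadata) for f in fields(data_class)}


vessel = Vessel(id="v1")
print(class_name(vessel), ocx_name(vessel), namespace(vessel))
print(meta_class_fields(vessel))
```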

class ocxtools.parser.parser.OcxNotifyParser(fail_on_unknown_properties=False, fail_on_unknown_attributes=False, fail_on_converter_warnings=True)[source]

Bases: IObservable, ABC

Ocx notification parser class for 3Docx XML files.

Parameters:
  • fail_on_unknown_properties – If True, fail on unknown properties. Default = False (don’t bail out).

  • fail_on_unknown_attributes – If True, fail on unknown attributes. Default = False (don’t bail out).

  • fail_on_converter_warnings – If True, convert warnings to exceptions. Default = True.

class_factory(clazz, params)[source]

Custom class factory method

parse(xml_file)[source]

Parse a 3Docx XML model and return the root dataclass.

Parameters:

xml_file – The 3Docx xml file or url to parse.

Return type:

dataclass

Returns:

The root dataclass instance of the parsed 3Docx XML.

parse_element(element, ocx_module)[source]

Parse a 3Docx XML element and return the dataclass.

Parameters:

element – The 3Docx XML Element to parse.

Return type:

dataclass

Returns:

The element dataclass instance.

subscribe(observer)[source]

subscription

unsubscribe(observer)[source]
update(event, payload)[source]

update method.

Parameters:
  • event – The event type

  • payload – The event message

class ocxtools.parser.parser.OcxParser(fail_on_unknown_properties=False, fail_on_unknown_attributes=False, fail_on_converter_warnings=True)[source]

Bases: IParser, ABC

OcxParser class for 3Docx XML files.

Parameters:
  • fail_on_unknown_properties – If True, fail on unknown properties. Default = False (don’t bail out).

  • fail_on_unknown_attributes – If True, fail on unknown attributes. Default = False (don’t bail out).

  • fail_on_converter_warnings – If True, convert warnings to exceptions. Default = True.

iterator(ocx_model)[source]

Abstract method for iterating a data model.

Parameters:

ocx_model – the data model to iterate on.

Return type:

Iterator

Returns:

An iterator

parse(xml_file)[source]

Parse a 3Docx XML model and return the root dataclass.

Parameters:

xml_file – The 3Docx xml file or url to parse.

Return type:

dataclass

Returns:

The root dataclass instance of the parsed 3Docx XML.

renderer

Render classes

exception ocxtools.renderer.renderer.RenderError[source]

Bases: ValueError

Render errors.

class ocxtools.renderer.renderer.ReportType(value)[source]

Bases: Enum

Validator report types

OCX = 'ocx'
SCHEMATRON = 'schematron'

class ocxtools.renderer.renderer.RichTable[source]

Bases: object

Build a Rich table.

classmethod build_rich_tree(tree, parent)[source]

Recursively builds a tree structure from a dictionary.

Parameters:
  • tree – The dictionary representing the tree structure.

  • parent – The parent node of the current tree level. Defaults to None for the root node.

Return type:

Tree

Returns:

TreeNode – The root node of the built tree.

classmethod df_to_table(pandas_dataframe, rich_table, show_index=True, index_name=None)[source]

Convert a pandas.DataFrame obj into a rich.Table obj.

Parameters:
  • pandas_dataframe (DataFrame) – A Pandas DataFrame to be converted to a rich Table.

  • rich_table (Table) – A rich Table that should be populated by the DataFrame values.

  • show_index (bool) – Add a column with a row count to the table. Defaults to True.

  • index_name (str, optional) – The column name to give to the index column. Defaults to None, showing no value.

Return type:

Table

Returns:

Table – The rich Table instance passed, populated with the DataFrame values.

classmethod render(title, data, show_header=True, caption=None)[source]

Render a rich table.

Parameters:
  • show_header – If True render the table header.

  • title – The table title rendered above.

  • data – The table content. List of dictionaries where each dictionary represents a row in the table, and the keys represent column headers.

  • caption – The table caption rendered below.

Returns:

The table

classmethod render_rich_tree(node, tree=None, depth=0)[source]

Renders a rich tree structure based on the provided node.

Parameters:
  • node – The root node of the tree structure.

  • tree – Optional existing Tree object to append the rendered tree. Defaults to None.

  • depth – The current depth of the tree. Defaults to 0.

Returns:

Tree – The rendered rich tree structure.

classmethod tree(data, title)[source]

Builds and renders a rich tree structure based on the provided data.

Parameters:
  • data – The dictionary representing the tree structure.

  • title – The title of the tree.

Return type:

Tree

Returns:

Tree – The rendered rich tree structure.

class ocxtools.renderer.renderer.TableRender[source]

Bases: object

static render(data)[source]
Parameters:

data

Returns:

class ocxtools.renderer.renderer.TreeNode(name)[source]

Bases: object

Represents a node in a tree structure.

Parameters:

name – The name of the node.

Variables:
  • name – The name of the node.

  • children – A list of child nodes.
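
The recursive tree building described for `build_rich_tree` and `TreeNode` can be sketched without Rich. This is a plain-text illustration under assumptions: the local `TreeNode` mirrors the documented `name` and `children` variables, while `build_tree` and `render_lines` are hypothetical helpers that return indented strings instead of a Rich `Tree`.

```python
from typing import Any, Dict, List


class TreeNode:
    """Local mirror of the documented node: a name and a list of children."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List["TreeNode"] = []


def build_tree(tree: Dict[str, Any], parent: TreeNode) -> TreeNode:
    """Attach one child per key and recurse into values that are dictionaries."""
    for key, value in tree.items():
        child = TreeNode(str(key))
        parent.children.append(child)
        if isinstance(value, dict):
            build_tree(value, child)
    return parent


def render_lines(node: TreeNode, depth: int = 0) -> List[str]:
    """Render the node and its descendants as lines indented by depth."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render_lines(child, depth + 1))
    return lines


root = build_tree({"Vessel": {"Panel": {}, "Plate": {}}}, TreeNode("model"))
lines = render_lines(root)
print("\n".join(lines))
```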

class ocxtools.renderer.renderer.XsltTransformer(xslt_file)[source]

Bases: object

Transform an XML file using an xslt stylesheet.

render(data, source_file, output_folder, report_type=ReportType.SCHEMATRON)[source]
Parameters:
  • report_type – The report type. OCX or SCHEMATRON.

  • output_folder – The report folder.

  • data – the xml data as a string

  • source_file – The source file

Return type:

str

Returns:

The path to the output file name

reporter

OCX reporter module

class ocxtools.reporter.reporter.OcxObserver(observable)[source]

Bases: IObserver, ABC

OCX reporter observer class

Parameters:
  • event (ObservableEvent) – The event that triggered the update.

  • payload (Dict) – The payload associated with the event.

Returns:

None

get_elements()[source]

Return all parsed elements.

Return type:

Dict

Returns:

Dict of all OCX objects from the parsed XML document.

get_number_of_elements()[source]

Returns the number of elements in the parsed 3Docx model.

Return type:

int

Returns:

int – The number of elements.

header(model)[source]

Return the 3Docx header data.

Return type:

OcxHeader

Returns:

The header dataclass

update(event, payload)[source]

Interface update method

class ocxtools.reporter.reporter.OcxReportFactory[source]

Bases: object

Reporter factory class

static create_header(root, ocx_model)[source]

Create the OcxHeader dataclass from the XML content.

Parameters:
  • root – The XML root

  • ocx_model – The 3Docx file path.

Return type:

OcxHeader

Returns:

The 3Docx header dataclass

static datframe_types(model, report_type, data)[source]
Return type:

ReportDataFrame

static element_count(model, objects)[source]

Element count report.

Parameters:
  • model – The source XML file

  • objects – List of tuples (tag, count) of 3Docx objects.

Return type:

ReportElementCount

Returns:

The element count report.

static element_count_2(model, objects)[source]

Element count report.

Parameters:
  • model – The source XML file

  • objects – List of tuples (tag, count) of 3Docx objects.

Return type:

ReportElementCount

Returns:

The element count report.

static element_primitives(ocx_element)[source]

Returns a dictionary containing the attributes of the given OCX element.

Parameters:

ocx_element – The OCX element to generate the report for.

Return type:

dict[Any, Any]

Returns:

dict[Any, Any] – A dictionary containing the attributes of the OCX element, excluding non-primitive types.

static element_to_dataframe(model, report_type, data)[source]

Converts a list of dataclass objects into a pandas DataFrame by flattening the data.

Parameters:
  • model – The 3Docx source file

  • report_type – The report type

  • data (List[dataclass]) – The list of dataclass objects to be converted.

  • depth (int) – The maximum depth to flatten the data.

Return type:

ReportDataFrame

Returns:

DataFrame – The flattened data as a pandas DataFrame.

Examples

>>> data = [DataClass(a=1, b=2), DataClass(a=3, b=4)]
>>> element_to_dataframe(data, depth=2)
   a  b
0  1  2
1  3  4

class ocxtools.reporter.reporter.OcxReporter[source]

Bases: object

3Docx attribute reporter

static dataframe(model, ocx_type)[source]
Parameters:
  • model – The 3Docx source

  • ocx_type – The 3Docx type to parse

Return type:

Optional[ReportDataFrame]

Returns:

The dataclass ReportDataFrame containing the flattened data frame of the parsed OCX element

element_count(selection='All')[source]
Return type:

ReportElementCount

static element_count_2(model)[source]

Return the count of a list of OCX elements in a model. This method is slow due to the OcxNotifyParser. Don’t use.

Parameters:

model – The 3Docx model source

Return type:

ReportElementCount

get_header()[source]

Returns the header of the OCX report.

Return type:

OcxHeader

Returns:

OcxHeader – The header of the OCX report.

get_model()[source]

Return the path to the parsed model

Return type:

str

get_root()[source]

Return the XML model root.

Return type:

Element

parse_model(model)[source]

Parse the 3Docx model and return the root Element.

Parameters:

model – The 3Docx source file

Return type:

Element

Returns:

The XML root element after parsing.

ocxtools.reporter.reporter.all_empty_array(column)[source]

Check if all elements in a column are empty arrays.

Parameters:

column – A pandas Series representing a column of data.

Return type:

bool

Returns:

bool – True if all elements in the column are empty arrays, False otherwise.

ocxtools.reporter.reporter.duplicate_values(df, col_name)[source]

Find duplicates in a dataframe column.

Parameters:
  • df – The DataFrame

  • col_name – The specified column.

Return type:

List

ocxtools.reporter.reporter.flatten_data(data, parent_key='', sep='.')[source]

Flattens nested data structures into a dictionary.

Parameters:
  • data (Any) – The data to be flattened.

  • parent_key (str) – The parent key to be prepended to the flattened keys. Defaults to an empty string.

  • sep (str) – The separator to use between keys. Defaults to ‘.’.

Return type:

Dict[str, Any]

Returns:

Dict[str, Any] – The flattened data as a dictionary.

Examples

>>> data = {'a': {'b': {'c': 1}}, 'd': [2, 3]}
>>> flatten_data(data)
{'a.b.c': 1, 'd': [2, 3]}
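
The flattening shown in the example above can be sketched as a short recursion over nested dictionaries. This is an illustration only, not the ocxtools implementation: the name `flatten_dict_sketch` is hypothetical, and the sketch handles dictionaries only, leaving lists and other values untouched as in the documented output.

```python
from typing import Any, Dict


def flatten_dict_sketch(
    data: Dict[str, Any], parent_key: str = "", sep: str = "."
) -> Dict[str, Any]:
    """Flatten nested dictionaries into one level with joined keys."""
    flat: Dict[str, Any] = {}
    for key, value in data.items():
        # Prefix the key with the parent key unless we are at the top level.
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            flat.update(flatten_dict_sketch(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


nested = {"a": {"b": {"c": 1}}, "d": [2, 3]}
print(flatten_dict_sketch(nested))
print(flatten_dict_sketch(nested, sep="/"))
```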
ocxtools.reporter.reporter.get_guid(element)[source]

Return the ocx:GUIDRef.

Parameters:

element – The element instance

Return type:

Optional[str]

Returns:

The GUIDRef value if present, else None

ocxtools.reporter.reporter.get_guid_ref(element)[source]

Return the guid or localref reference of an element.

Parameters:

element – The element instance

Return type:

Optional[str]

Returns:

The GUIDRef value if present, else None

serializer

OcxSerializer module.

class ocxtools.serializer.serializer.OcxSerializer(ocx_model, pretty_print=True, pretty_print_indent='  ', encoding='utf-8')[source]

Bases: object

OcxSerializer class for 3Docx XML models.

Parameters:
  • ocx_model – The dataclass to serialize.

  • pretty_print – True to pretty print, False otherwise.

  • pretty_print_indent – Pretty print indentation.

  • encoding – The encoding code.

Variables:
  • _model – The dataclass to serialize.

  • _config – The serializer configuration.

serialize_json()[source]

Serialize a 3Docx XML model to json with proper indentations.

Return type:

str

Returns:

The dataclass JSON serialisation.

Raises:

SerializeError if failing

serialize_xml(global_ns='ocx')[source]

Serialize a 3Docx XML file with proper indentations.

Return type:

str

Returns:

The dataclass xml serialisation.

Raises:

SerializeError if failing

class ocxtools.serializer.serializer.ReportFormat(value)[source]

Bases: Enum

Serialisation formats

CSV = 'csv'
PARQUET = 'parquet'

class ocxtools.serializer.serializer.Serializer[source]

Bases: object

A general serializer for dict type data structures

static serialize_to_csv(table, file_name)[source]

Serialize a list of dictionaries to a csv file. Each dictionary is a row with key:value pairs where the key is the column header and the value is the data value.

Parameters:
  • table – The table to serialize

  • file_name – the output file name
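
The row layout described above (one dictionary per row, keys as column headers) maps directly onto `csv.DictWriter` from the standard library. The sketch below is an assumption about how such a serializer could look, not the ocxtools code; the function name `rows_to_csv` is hypothetical, and taking the column headers from the first row is a simplification.

```python
import csv
from typing import Any, Dict, List


def rows_to_csv(table: List[Dict[str, Any]], file_name: str) -> None:
    """Write a list of row dictionaries to a CSV file with a header line."""
    if not table:
        return  # Nothing to write; no file is created.
    # Assumption: every row has the same keys as the first row.
    headers = list(table[0].keys())
    with open(file_name, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=headers)
        writer.writeheader()
        writer.writerows(table)
```

A call such as `rows_to_csv([{"Tag": "Plate", "Count": 12}], "report.csv")` would write a header line followed by one data line. Values are written as text, so numbers read back from the file are strings.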

static serialize_to_parquet(report, report_folder)[source]

Serialize a dataframe report to a parquet file.

Parameters:
  • report – Dataclass containing the dataframe to serialize

  • report_folder – the output directory

exception ocxtools.serializer.serializer.SerializerError[source]

Bases: ValueError

OCX Serializing errors.

validator

The validator report class.

class ocxtools.validator.validator_report.ValidatorReportFactory[source]

Bases: object

Validator report.

static create_info_report(response)[source]

The validator information about supported domains and validation types.

Parameters:

response – The input data

Return type:

List[ValidationInformation]

Returns:

A list of the ValidationInformation objects

static create_report(source, report_data, header)[source]

Create the validation report.

Parameters:
  • source – The 3Docx model source file name.

  • report_data – The validation result.

  • header – The 3Docx Header information

Return type:

ValidationReport

Returns:

The report dataclass

utils

Shared utility classes and functions

class ocxtools.utils.utilities.OcxNamespace[source]

Find the schema namespace of the 3Docx XML model.

static ocx_namespace(model)[source]

Return the OCX schema namespace of the model.

Parameters:

model – The source path or uri

Return type:

str

Returns:

The OCX schema namespace of the model.

class ocxtools.utils.utilities.OcxVersion[source]

Find the schema version of a 3Docx XML model.

static get_version(model)[source]

The schema version of the model.

Parameters:

model – The source file path or uri

Return type:

str

Returns:

The schema version of the 3Docx XML model.

class ocxtools.utils.utilities.SourceValidator[source]

Methods for validating the existence of a data source.

static filter_files(directory, filter_str)[source]

Return an iterator over the filtered files in the directory.

Return type:

Generator

static is_directory(source)[source]

Return True if the source is a directory, False otherwise

Return type:

bool

static is_url(source)[source]

Return True if the source is a valid url, False otherwise

Return type:

bool

static mkdir(source)[source]

Create the directory and any parent folders if missing.

Parameters:

source – The folder name

Return type:

str

Returns:

The folder name

static validate(source)[source]

Validate the existence of a data source.

Parameters:

source – The source file path or url.

Return type:

str

Returns:

Returns the uri or full path if the source is valid.

Raises:

Raises a ValueError if source is invalid
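
The validation flow described above (accept a url, otherwise require an existing path, otherwise raise `ValueError`) can be sketched with `urllib.parse` and `pathlib`. This is an illustration under assumptions: the `_sketch` function names are hypothetical, only `http` and `https` schemes are treated as urls, and the sketch does not check that a url is reachable.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_url_sketch(source: str) -> bool:
    """Return True if the source parses as an http(s) url with a host."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def is_directory_sketch(source: str) -> bool:
    """Return True if the source is an existing directory."""
    return Path(source).is_dir()


def validate_sketch(source: str) -> str:
    """Return the url or the resolved full path; raise ValueError otherwise."""
    if is_url_sketch(source):
        return source
    path = Path(source)
    if path.exists():
        return str(path.resolve())
    raise ValueError(f"The source {source!r} does not exist.")


print(is_url_sketch("https://example.org/model.3docx"))
print(is_url_sketch("model.3docx"))
```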

ocxtools.utils.utilities.all_equal(iterable)[source]

Verify that all items in a list are equal.

Parameters:

iterable

Return type:

bool

Returns:

True if all are equal, False otherwise.

ocxtools.utils.utilities.camel_case_split(str)[source]

Split camel case string to individual strings.

Return type:

List
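
One common way to split a camel case string is a regular expression that matches capitalised words and runs of capitals. The sketch below is an assumption about the behaviour, not the ocxtools implementation; the name `camel_case_split_sketch` is hypothetical, and the parameter is called `text` to avoid shadowing the built-in `str`.

```python
import re
from typing import List


def camel_case_split_sketch(text: str) -> List[str]:
    """Split on case boundaries, keeping runs of capitals together."""
    # First alternative: an optional capital followed by lowercase letters or digits.
    # Second alternative: a run of capitals not followed by a lowercase letter.
    return re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", text)


print(camel_case_split_sketch("CoordinateSystem"))
print(camel_case_split_sketch("crossSection"))
print(camel_case_split_sketch("XMLParser"))
```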

ocxtools.utils.utilities.current_dir(file)[source]

The full path to the folder containing the file

Parameters:

file – The name of an existing file

Return type:

str

ocxtools.utils.utilities.default_to_grid(d)[source]

Converts defaultdicts to a data grid with unique row ids.

Parameters:

d – The dict to be converted

Return type:

Dict

ocxtools.utils.utilities.default_to_regular(d)[source]

Converts defaultdicts of defaultdicts to dict of dicts.

Parameters:

d – The dict to be converted

Return type:

Dict

ocxtools.utils.utilities.dromedary_case_split(str)[source]

Split a dromedary case (lower camel case) string to individual strings.

Return type:

List

ocxtools.utils.utilities.find_replace_multi(string, dictionary)[source]

Substitute every value in a dict if it matches.

Return type:

str

ocxtools.utils.utilities.get_file_path(file_name)[source]

Get the correct file path also when called within a one-file executable.

ocxtools.utils.utilities.is_substring_in_list(substring, string_list)[source]
Parameters:
  • substring – The search string

  • string_list – List of strings

Returns:

True if the substring is found, False otherwise.

ocxtools.utils.utilities.list_files_in_directory(directory, file_ext='.3docx')[source]

Utility function to list files in a directory.

Parameters:
  • directory – the name of the directory.

  • file_ext – Only files with matching extension will be listed.

Return type:

list

Returns:

list of matching files.

ocxtools.utils.utilities.load_yaml_config(config)[source]

Safely read a yaml config file and return the content as a dict.

Parameters:

config – Path to yaml file

Raises:

Raise errno.ENOENT if yaml file does not exist

Return type:

dict

ocxtools.utils.utilities.logging_level(loglevel)[source]

Utility function to return the logging level.

Parameters:

loglevel – One of INFO, WARNING, ERROR or DEBUG

Return type:

int
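
A lookup from level name to the integer constants of the standard `logging` module could look as follows. This is a sketch, not the ocxtools code: the name `logging_level_sketch` is hypothetical, and both the case-insensitive match and the fallback to `logging.INFO` for unknown names are assumptions.

```python
import logging

# The four level names listed in the parameter description above.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def logging_level_sketch(loglevel: str) -> int:
    """Return the logging constant for the name, defaulting to INFO."""
    return _LEVELS.get(loglevel.upper(), logging.INFO)


print(logging_level_sketch("DEBUG"))
print(logging_level_sketch("warning"))
```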

ocxtools.utils.utilities.nested_dict()[source]

A recursive function that creates a default dictionary where each value is another default dictionary.
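
The usual idiom for such a recursive default dictionary, together with a conversion back to plain dictionaries as described for `default_to_regular`, can be sketched as below. The `_sketch` function names are hypothetical and the code is an illustration, not the ocxtools source.

```python
from collections import defaultdict
from typing import Any


def nested_dict_sketch() -> defaultdict:
    """Create a defaultdict whose missing values are nested defaultdicts."""
    return defaultdict(nested_dict_sketch)


def default_to_regular_sketch(d: Any) -> Any:
    """Recursively convert defaultdicts to plain dicts; leave other values as-is."""
    if isinstance(d, defaultdict):
        return {key: default_to_regular_sketch(value) for key, value in d.items()}
    return d


grid = nested_dict_sketch()
grid["Vessel"]["Panel"]["count"] = 3  # Intermediate levels are created on access.
regular = default_to_regular_sketch(grid)
print(regular)
```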

ocxtools.utils.utilities.number_table_rows(table, first_index=0)[source]

Utility function to add row numbers to the first column of a table stored as a dict.

Parameters:
  • table – The input table dict

  • first_index – The first row index value. Default = 0

Return type:

Dict

Returns:

a table (dict) with numbered rows in the first column

ocxtools.utils.utilities.resource_path(relative_path)[source]

Get absolute path to resource, works for dev and for PyInstaller

ocxtools.utils.utilities.root_dir()[source]

Path to the directory of the parent module.

Return type:

str

ocxtools.utils.utilities.tree(paths, prefix='')[source]

A recursive generator that, given a directory Path object, yields a visual tree structure line by line, with each line prefixed by the same characters.
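
A widely used form of such a generator is sketched below with `pathlib`. It is an illustration under assumptions rather than the ocxtools source: the name `tree_sketch`, the box-drawing prefixes and the alphabetical ordering of entries are choices made for this example.

```python
from pathlib import Path
from typing import Iterator

# Prefix components: padding, a vertical branch, and the two entry markers.
SPACE, BRANCH, TEE, LAST = "    ", "│   ", "├── ", "└── "


def tree_sketch(path: Path, prefix: str = "") -> Iterator[str]:
    """Yield one line per entry below path, recursing into subdirectories."""
    entries = sorted(path.iterdir(), key=lambda entry: entry.name)
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        yield prefix + (LAST if is_last else TEE) + entry.name
        if entry.is_dir():
            # Children of the last entry get padding instead of a branch line.
            yield from tree_sketch(entry, prefix + (SPACE if is_last else BRANCH))
```

Calling `print("\n".join(tree_sketch(Path("."))))` prints the tree of the current directory.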