transformer classes

The ‘’Transformer’’ class

parser classes

The ‘’OcxParser’’ class

class ocx_schema_parser.ocxparser.OcxParser[source]

Bases: object

The OcxParser class provides functionality for parsing the OCX xsd schema and storing all the elements.

Variables:
  • _schema_namespaces – All namespaces on the form (prefix, namespace) key-value pairs resulting from parsing all schema files, W3C.

  • _is_parsed – True if a schema has been parsed, False otherwise

  • _schema_version – The version of the parsed schema

  • _schema_changes – A list of all schema changes described by the tag SchemaChange contained in the xsd file.

  • _schema_types – The list of xsd types to be parsed. Only these types will be stored.

  • _substitution_groups – Collection of all substitution groups with their members.

  • _schema_enumerators – All schema enumerators

  • _builtin_xs_types – W3C primitive data types (www.w3.org). Defined in config.py

  • _schema_ns – The schema target namespace with the schema version as key

element_iterator()[source]

Iterator of the parsed schema elements.

Return type:

Iterator

Returns:

Element iterator

get_element_from_tag(tag)[source]

Return the etree.Element with the key tag.

Return type:

Optional[Element]

Returns:

The schema element instance

get_element_from_type(schema_type)[source]

Retrieve the schema element etree.Element with the key schema_type.

Parameters:

schema_type – The schema type to retrieve on the form ns_prefix:name

Return type:

Tuple[Any, Any]

Returns:

A tuple of the element unique tag and the element (tag, Element)

get_lookup_table()[source]

Return the lookup table of parsed schema types.

Return type:

Dict

Returns:

The lookup table.

get_namespaces()[source]

The parsed namespaces.

Return type:

Dict

Returns:

The dict of namespaces as (namespace,prefix) key-value pairs

get_prefix_from_namespace(namespace)[source]

Find the namespace prefix.

Return type:

str

Returns:

The namespace prefix
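A reverse lookup of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: it assumes the namespaces are held in a plain dict mapping prefix to namespace, and the function name `prefix_from_namespace`, the example mapping, and the `None` result for an unknown namespace are all hypothetical.

```python
from typing import Dict, Optional


def prefix_from_namespace(namespaces: Dict[str, str], namespace: str) -> Optional[str]:
    """Return the first prefix mapped to ``namespace``, or None if there is none."""
    for prefix, uri in namespaces.items():
        if uri == namespace:
            return prefix
    return None


# Hypothetical prefix -> namespace mapping, for illustration only.
NAMESPACES = {
    "xs": "http://www.w3.org/2001/XMLSchema",
    "ex": "http://example.com/schema",
}

found = prefix_from_namespace(NAMESPACES, "http://www.w3.org/2001/XMLSchema")
missing = prefix_from_namespace(NAMESPACES, "http://unknown.example")
```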

get_schema_attribute_group_types()[source]

All schema elements of type attributeGroup.

Return type:

List[str]

Returns:

The list of all etree.Element of type attributeGroup

get_schema_attribute_types()[source]

All schema elements of type attribute.

Return type:

List[str]

Returns:

The list of unique tags for all etree.Element of type attribute

get_schema_complex_types()[source]

All tags for schema elements of type complexType.

Return type:

List[str]

Returns:

The list of tags of all etree.Element of type complexType

get_schema_element_types()[source]

All schema elements of type element.

Return type:

List

Returns:

The list of all etree.Element of type element

get_schema_enumerations()[source]

All schema elements of type enumeration.

Return type:

List[str]

Returns:

The list of tags of all etree.Element of type enumeration

get_schema_namespace(version)[source]

The schema namespace of the schema with version.

Return type:

str

Returns:

The target namespace

get_schema_simple_types()[source]

All schema elements of type simpleType.

Return type:

List[str]

Returns:

The list of tags of all etree.Element of type simpleType

get_schema_version()[source]

The OCX schema version.

Return type:

str

Returns:

The coded version string of the OCX schema

get_substitution_groups()[source]

The collection of the schema substitutionGroup.

Return type:

Dict

Returns:

Substitution groups with members
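To make "substitution groups with members" concrete, the sketch below collects them from a tiny inline schema. It uses the standard library `xml.etree.ElementTree` as a stand-in for lxml, and the helper name `substitution_groups`, the sample schema, and the shape of the returned dict (group head mapped to a list of member names) are assumptions made for illustration; the structure the parser actually stores may differ.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# Hypothetical schema fragment, for illustration only.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Shape" abstract="true"/>
  <xs:element name="Circle" substitutionGroup="Shape"/>
  <xs:element name="Square" substitutionGroup="Shape"/>
</xs:schema>"""

XS = "{http://www.w3.org/2001/XMLSchema}"


def substitution_groups(root: ET.Element) -> Dict[str, List[str]]:
    """Map each substitutionGroup head to the names of its member elements."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for element in root.iter(f"{XS}element"):
        head = element.get("substitutionGroup")
        if head is not None:
            groups[head].append(element.get("name"))
    return dict(groups)


GROUPS = substitution_groups(ET.fromstring(XSD))
```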

get_target_namespace()[source]

Return the target namespace of the parsed schema.

Return type:

str

Returns:

The target namespace.

get_xs_types()[source]

All builtin xs types.

Return type:

Dict

Returns:

The list of all defined xs types

process_xsd_from_file(file)[source]

Process the xsd with file name file.

Parameters:

file – The file name of the xsd.

Return type:

bool

Returns:

True if processed, False otherwise.

tbl_attribute_groups()[source]

All parsed attributeGroup types in the schema and any referenced schemas.

Return type:

Dict

Returns:

List of SchemaType data class holding attributeGroup attributes.

tbl_attribute_types()[source]

The table of all parsed attribute elements in the schema and any referenced schemas.

Return type:

Dict

Returns:

The SchemaType data class attributes of attributeType

tbl_complex_types()[source]

The table of all parsed complexType elements in the schema and any referenced schemas.

Return type:

Dict

Returns:

The SchemaType data class attributes of complexType

tbl_element_types()[source]

The table of all parsed elements of type element in the schema and any referenced schemas.

Return type:

Dict

Returns:

The SchemaType data class attributes of element

tbl_enumerators()[source]

The table of all parsed enumerator elements in the schema and any referenced schemas.

Return type:

Dict

Returns:

The SchemaType data class attributes of simpleType

tbl_simple_types()[source]

The table of all parsed simpleType elements in the schema and any referenced schemas.

Return type:

Dict

Returns:

The SchemaType data class attributes of simpleType

tbl_summary(short=True)[source]

The summary of the parsed schema and any referenced schemas.

Parameters:

short – If true, only report number of schema types, otherwise report names of types.

Return type:

Dict

Returns:

The schema summary content dataclasses

data_classes classes

The ‘’BaseDataClass’’ class

class ocx_schema_parser.data_classes.BaseDataClass[source]

Bases: object

Base class for OCX dataclasses.

Each subclass has to implement a field metadata with name header for each of its attributes, for example:

name : str = field(metadata={'header': '<User friendly field name>'})

to_dict()[source]

Output the data class as a dict with field names as keys.

Return type:

Dict
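The header convention can be illustrated with a plain standard-library dataclass. This is a minimal sketch, not the library's code: `ExampleType`, its fields, and the `headers()` helper are hypothetical, and `to_dict()` is assumed here to behave like `dataclasses.asdict`.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import Dict


@dataclass
class ExampleType:
    """Hypothetical data class; each field carries a 'header' metadata entry."""

    name: str = field(metadata={"header": "Name"})
    source_line: int = field(metadata={"header": "Source line"})

    def to_dict(self) -> Dict:
        """Return the field values keyed by field name."""
        return asdict(self)

    def headers(self) -> Dict[str, str]:
        """Return the user-friendly header of each field, keyed by field name."""
        return {f.name: f.metadata["header"] for f in fields(self)}


example = ExampleType(name="Vessel", source_line=42)
```

Keeping the display header in the field metadata lets a report or table writer look up a user-friendly column title without hard-coding it next to the field name.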

The ‘’SchemaType’’ class

class ocx_schema_parser.data_classes.SchemaType(prefix, name, tag, source_line)[source]

Bases: BaseDataClass

Class for xsd schema type information.

Parameters:
  • name – The schema type name

  • prefix – The schema type namespace prefix

  • source_line – The line number in the schema file where the type is defined

  • tag – The schema type tag

name: str
prefix: str
source_line: int
tag: str

The ‘’SchemaSummary’’ class

class ocx_schema_parser.data_classes.SchemaSummary(schema_version, schema_types, schema_namespaces)[source]

Bases: BaseDataClass

Class for schema summary information.

Parameters:
  • schema_version – The schema version

  • schema_types – Tuples of the number of schema types

  • schema_namespaces – Tuples of namespace prefixes

schema_namespaces: List[Tuple]
schema_types: List[Tuple]
schema_version: List[Tuple]

The ‘’SchemaChange’’ class

class ocx_schema_parser.data_classes.SchemaChange(version, author, date, description='')[source]

Bases: BaseDataClass

Class for keeping track of OCX schema changes.

Parameters:
  • version – The schema version the change applies to

  • author – The author of the schema change

  • date – The date of the schema change

  • description – A description of the change

author: str
date: str
description: str = ''
version: str

The ‘’OcxEnumerator’’ class

class ocx_schema_parser.data_classes.OcxEnumerator(prefix, name, tag, values=<factory>, descriptions=<factory>)[source]

Bases: object

Enumerator class.

Parameters:
  • name – The name of the xs:attribute enumerator

  • values – Enumeration values

  • descriptions – Enumeration descriptions

descriptions: List[str]
name: str
prefix: str
tag: str
to_dict()[source]

Output the enumerator values and annotations.

Return type:

Dict

values: List[str]

elements classes

The ‘’OcxGlobalElement’’ class

class ocx_schema_parser.elements.OcxGlobalElement(xsd_element, unique_tag, namespaces)[source]

Bases: object

Global schema element class capturing the xsd schema definition of a global xs:element.

Parameters:

xsd_element – The lxml.etree.Element class

Variables:
  • _element – The lxml.Element instance

  • _attributes – The attributes of the global element including the attributes of all schema supertypes

  • _tag – The unique global tag of the OcxGlobalElement

  • _parents – Hash table of references to all parent schema types with tag as key

  • _children – List of references to all children schema types with tag as key. Includes also children of all super-types.

  • _assertions – List of any assertions associated with the xs:element

add_assertion(test)[source]

Add an assertion test associated to me

Parameters:

test – The definition of the assertion represented as a string

Returns:

Nothing

add_attribute(attribute)[source]

Add attributes to the global element.

Parameters:

attribute – The attribute instance to be added

add_child(child)[source]

Add a child of an OCX global element

Parameters:

child – The added child instance

Returns:

Nothing

attributes_to_dict()[source]

A dictionary of all OcxGlobalElement attribute values

Return type:

Dict

Returns:

A dictionary of attribute values with heading keys

Heading keys: Attribute, Type, Use, Default, Fixed, Description

children_to_dict()[source]

A dictionary of all OcxGlobalElement children values

Return type:

Dict

Returns:

A dictionary of children values with heading keys.

Heading keys: Child, Type, Use, Cardinality, Description

get_annotation()[source]

The global element annotation or description

Return type:

str

Returns:

The annotation string of the element

get_assertion_tests()[source]

Get all my assertions

Return type:

List

Returns:

Assertion tests in a list

get_attributes()[source]

The global element attributes including also parent attributes

Return type:

List

Returns:

A dict of all attributes including also parent attributes

get_cardinality()[source]

Get the cardinality of the OcxGlobalElement

Return type:

str

Returns:

The cardinality as a string represented by [lower, upper]

get_children()[source]

Get all my children XSD types.

Return type:

List

Returns:

Return all children as a dict of key-value pairs (tag, OCXChildElement)

get_name()[source]

The global element name

Return type:

str

Returns:

The name of the global schema element as a str

get_namespace()[source]

The element namespace

Return type:

str

Returns:

The namespace of the global schema element as a str

get_parent_names()[source]

Get all my parent names

Return type:

List

Returns:

Return all parents names in a list

get_parents()[source]

Return all my parents

Return type:

dict

Returns:

Return all parents as a dict of key-value pairs (tag, Element)

get_prefix()[source]

The global element namespace prefix

Return type:

str

Returns:

The namespace prefix of the global schema element

get_properties()[source]

A dictionary of all OcxGlobalElement property values

Return type:

Dict

Returns:

A dictionary of property values with heading keys.

Heading keys: Name, Type, Use, Cardinality, Fixed, Description

get_schema_element()[source]

Get the schema xsd element of the OcxGlobalElement object

Return type:

Element

Returns:

My xsd schema element

get_substitution_group()[source]

Return the name of the substitutionGroup

Return type:

Optional[str]

Returns:

The name of the substitutionGroup, None otherwise

get_tag()[source]

The global schema element unique tag

Return type:

str

Returns:

The element tag on the form {prefix}name

get_type()[source]

The global element type

Return type:

str

Returns:

The type of the global schema element as a str

get_use()[source]

The element’s use, required or optional

Return type:

str

Returns:

The element use: req. if mandatory, else opt.

has_assertion()[source]

Whether the element has assertions or not

Return type:

bool

Returns:

True if the global element has assertions, False otherwise

is_abstract()[source]

Whether the element is abstract

Return type:

bool

Returns:

True if the element is abstract, False otherwise

is_choice()[source]

Whether the element is a choice or not

Return type:

bool

Returns:

True if the element is a choice, False otherwise

is_mandatory()[source]

Whether the element is mandatory or not

Return type:

bool

Returns:

Returns True if the element is mandatory, False otherwise

is_reference()[source]

Whether the element has a reference or not

Return type:

bool

Returns:

True if the element has a reference, False otherwise

is_substitution_group()[source]

Whether the element is part of a substitutionGroup

Return type:

bool

Returns:

True if the element is a substitutionGroup, False otherwise

put_cardinality(element)[source]

Override the cardinality of the OcxGlobalElement

Parameters:

element – the etree.Element node

put_parent(tag, parent)[source]

Add a parent element

Parameters:
  • tag – The unique tag of the parent element

  • parent – The parent xsd schema element

Returns:

None

xelement classes

The ‘’LxmlElement’’ class

class ocx_schema_parser.xelement.LxmlElement[source]

Bases: object

A wrapper class for the lxml etree.Element class main functions.

static cardinality(element)[source]

Establish the cardinality of the Element

Parameters:

element – the etree.Element instance

Return type:

tuple

Returns:

The tuple of the element lower and upper bounds (lower, upper)
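In XSD, cardinality comes from the `minOccurs` and `maxOccurs` attributes, both of which default to 1 when absent. The sketch below shows one way to derive the bounds, using the standard library `xml.etree.ElementTree` as a stand-in for lxml. The names `occurrence_bounds` and `bounds_as_string`, the choice to keep the bounds as strings (so that `unbounded` survives), and the `[lower, upper]` formatting are assumptions for illustration, not the library's confirmed behaviour.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def occurrence_bounds(element: ET.Element) -> Tuple[str, str]:
    """Return (lower, upper) from minOccurs/maxOccurs; XSD defaults both to 1."""
    lower = element.get("minOccurs", "1")
    upper = element.get("maxOccurs", "1")
    return lower, upper


def bounds_as_string(element: ET.Element) -> str:
    """Format the bounds as '[lower, upper]'."""
    lower, upper = occurrence_bounds(element)
    return f"[{lower}, {upper}]"


# Hypothetical element declarations, for illustration only.
repeated = ET.fromstring('<element name="Panel" minOccurs="0" maxOccurs="unbounded"/>')
default = ET.fromstring('<element name="Header"/>')
```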

classmethod cardinality_string(element)[source]

Return the element cardinality formatted string.

Return type:

str

static find_all_children_with_attribute_value(element, name, attrib_name, attrib_value, namespace='*')[source]

Find all the XML elements with the attribute name ‘attrib_name’ having a given value ‘attrib_value’

Parameters:
  • element – The XML parent node to search from

  • name – The name of the element with attrib_name and attrib_value

  • attrib_name – The name of the attribute

  • attrib_value – The value of the attribute

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

List

Returns:

All children having attributes with name attrib_name and value attrib_value. Empty list if no children can be found

static find_all_children_with_name(element, child_name, namespace='*')[source]

Find all the XML element’s children with name child_name

Parameters:
  • element – The XML parent node to search from

  • child_name – The name of the child

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

List

Returns:

A list of elements. Empty list if no children can be found
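A namespace-wildcard search like this can be sketched with the standard library, which accepts the `{*}name` form in `findall` from Python 3.8 onwards. The helper name `descendants_with_name` and the sample schema are hypothetical, and the sketch searches all descendants; whether the library method searches all descendants or only direct children is not stated here.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical schema fragment, for illustration only.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="A"><xs:attribute name="id"/></xs:complexType>
  <xs:complexType name="B"/>
  <xs:simpleType name="C"/>
</xs:schema>"""


def descendants_with_name(
    element: ET.Element, child_name: str, namespace: str = "*"
) -> List[ET.Element]:
    """Return all descendants named child_name in the given namespace ('*' = any)."""
    return element.findall(f".//{{{namespace}}}{child_name}")


root = ET.fromstring(XSD)
names = [e.get("name") for e in descendants_with_name(root, "complexType")]
none_found = descendants_with_name(root, "group")
```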

static find_all_children_with_name_and_attribute(element, child_name, attrib_name, namespace='*')[source]

Find all the XML elements with name child_name and attribute name attrib_name

Parameters:
  • element – The XML parent node to search from

  • child_name – The name of the children elements

  • attrib_name – The name of the attribute

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

List

Returns:

All elements having attributes with name attrib_name. An empty list if no children can be found

static find_assertion(element, namespace='*')[source]

Find any assertions under the element

Parameters:
  • element – The XML parent node to search from

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

Optional[str]

Returns:

The assertion test as a string. None if no assertion tag is found

static find_attribute_groups(element, namespace='*')[source]

Find all sub elements of type xs:attributeGroup

Parameters:
  • element – The XML parent node to search from

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

List[Element]

Returns:

Attribute groups

static find_attributes(element, namespace='*')[source]

Find all sub elements of type xs:attribute

Parameters:
  • element – The XML parent node to search from

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

List[Element]

Returns:

The list of xs:attribute elements found

static find_child_with_name(element, child_name, namespace='*')[source]

Find the first direct child of the XML element with name child_name

Parameters:
  • element – The XML parent node to search from

  • child_name – The name of the child

  • namespace – The search namespace. Default is the wildcard ‘*’ matching any namespace

Return type:

Element

Returns:

The child element as etree.Element. None if no child can be found

static get_base(element)[source]

The base URI of the document holding the Element (the location of the document)

Parameters:

element – The current etree.Element node

Return type:

str

Returns:

The base URI of the document or empty string if unknown

static get_children(element)[source]

Returns all direct children of the xml element

Parameters:

element – The XML parent node

Return type:

List[Element]

Returns:

A List of all children of ‘element’ excluding the parent itself.

static get_element_text(element)[source]

The text between the element’s start and end tags without any tail text.

Parameters:

element – the etree.Element instance

Return type:

str

Returns:

The element text stripped of any special characters

static get_localname(element)[source]

The local name (type) of an XML element

Parameters:

element – The XML parent node

Return type:

str

Returns:

The element local name

static get_name(element)[source]

The name of an XML element defined by the attribute name

Parameters:

element – The XML parent node

Return type:

Any

Returns:

The element local name or None if no name

static get_namespace(element)[source]

The namespace of an XML element

Parameters:

element – The XML parent node

Return type:

str

Returns:

The element namespace

static get_parent(element)[source]

Returns the parent of this element or None for the root element.

Parameters:

element – The current etree.Element node

Return type:

Element

Returns:

The parent Element node of the document for this element or None if element is the root

static get_reference(element)[source]

The referenced XML element defined by the attribute ref

Parameters:

element – The XML parent node

Return type:

Any

Returns:

The referenced element including namespace prefix if any. None if there is no reference

static get_restriction(element)[source]

Return the element restriction

Parameters:

element – the etree.Element instance

Return type:

str

Returns:

restriction type

static get_root(element)[source]

Return Element root node of the document that contains this element.

Parameters:

element – The current etree.Element node

Return type:

Element

Returns:

The root Element node of the document for this element

static get_source_line(element)[source]

Original line number as found by the parser or None if unknown.

Parameters:

element – The current etree.Element node

Return type:

int

Returns:

The line number of the element in the source

static get_substitution_group(element)[source]

Return the name of the element’s substitutionGroup

Parameters:

element – the etree.Element instance

Return type:

str

Returns:

name of substitutionGroup, None if no substitutionGroup

static get_use(element)[source]

Whether XML element is required or optional given by the attribute value use

Parameters:

element – The XML parent node

Return type:

Any

Returns:

The element use as a string, either required or optional

static get_xml_attrib(element)[source]

The XML attributes of an element

Parameters:

element – The XML parent node

Return type:

Dict

Returns:

A dictionary of (key,value) pairs of the element attributes

static has_child_with_name(element, child_name, namespace='*')[source]

Check if the element has a child with name ‘child_name’

Parameters:
  • element – The XML parent node to search from

  • child_name – The name of the child

  • namespace – The search namespace. Default is the wildcard * matching any namespace

Return type:

bool

Returns:

True if the element has a child with name child_name, False otherwise

static is_abstract(element)[source]

Return True if the element is abstract

Parameters:

element – the etree.Element instance

Return type:

bool

Returns:

True if the element is abstract, false otherwise

static is_choice(element)[source]

Return True if the element ancestor is an xs:choice, False otherwise

Parameters:

element – the etree.Element instance

Return type:

bool

Returns:

True if the element node is a choice, false otherwise

static is_enumeration(element)[source]

Whether the attribute is an enumeration or not

Parameters:

element – The XML parent node

Return type:

bool

Returns:

true if the attribute is an enumeration, false otherwise

static is_mandatory(element)[source]

The element use. True if required, False otherwise

Parameters:

element – The element node

Return type:

bool

Returns:

True if the element is mandatory, false otherwise

static is_reference(element)[source]

Whether the element is a reference or not

Parameters:

element – The XML parent node

Return type:

bool

Returns:

true if the element is a reference, false otherwise

static is_substitution_group(element)[source]

Return True if the element is a substitution

Parameters:

element – the etree.Element instance

Return type:

bool

Returns:

True if the element is a substitutionGroup, false otherwise

static items(element)[source]

Gets element attributes, as a sequence. The attributes are returned in an arbitrary order.

Parameters:

element – The current etree.Element node

Return type:

List

Returns:

A List of all attributes of element

static iter(element, tag=None, *tags)[source]

Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.

Can be restricted to find only elements with specific tags: pass {ns}localname as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. localname is equivalent to {}localname (i.e. no namespace) but * is {*}* (any or no namespace), not {}*.

Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.

Parameters:
  • element – The etree.Element node to search from

  • tag – The name of the child

Return type:

ElementDepthFirstIterator

Returns:

An iterator filtered by tags if specified.

Example

for element in LxmlElement.iter(root, "{*}complexType"):
    print(element.tag)

will iterate over all complexType elements, starting from the document root, and print the tag of each.

static namespace_prefix(element)[source]

Returns the namespace prefix of an element if any

Parameters:

element – The element name with or without prefix as a string

Return type:

Optional[str]

Returns:

The element prefix string or None if no prefix

static namespaces_decorate(ns)[source]

Decorate a string with curly brackets to form a valid XML namespace

Parameters:

ns – The namespace

Return type:

str

Returns:

The curly decorated namespace

static replace_ns_tag_with_ns_prefix(element, namespaces)[source]

Replace the namespace tag with a mapped namespace prefix.

Parameters:
  • element – The element name with or without prefix as a string

  • namespaces – the namespace tag to prefix mapping

Return type:

str

Returns:

the element with prefix
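The conversion from a namespace tag to a prefix can be sketched as a small string transformation. This is an illustration under assumptions, not the library's implementation: the name `to_prefixed_name` is hypothetical, the mapping is assumed to be a dict of namespace to prefix, and falling back to the bare name when the namespace is not in the mapping is a choice made for this sketch.

```python
from typing import Dict


def to_prefixed_name(element: str, namespaces: Dict[str, str]) -> str:
    """Rewrite '{namespace}name' as 'prefix:name' using a namespace -> prefix map."""
    if not element.startswith("{"):
        return element  # no namespace tag to replace
    namespace, _, name = element[1:].partition("}")
    prefix = namespaces.get(namespace)
    # Assumption: an unmapped namespace yields the bare local name.
    return f"{prefix}:{name}" if prefix else name


# Hypothetical namespace -> prefix mapping, for illustration only.
NS_TO_PREFIX = {"http://www.w3.org/2001/XMLSchema": "xs"}

prefixed = to_prefixed_name("{http://www.w3.org/2001/XMLSchema}string", NS_TO_PREFIX)
```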

static strip_namespace_prefix(element)[source]

Returns the element name without the namespace prefix

Parameters:

element – The element name with or without prefix as a string

Return type:

str

Returns:

The element without namespace prefix

static strip_namespace_tag(element)[source]

Returns the element name without the namespace tag

Parameters:

element – The element name with or without namespace as a string

Return type:

str

Returns:

The element without namespace tag
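The three name helpers above (prefix lookup, prefix stripping, and namespace-tag stripping) amount to simple string splitting. The following sketch shows the idea with hypothetical function names; it assumes prefixed names have the form `prefix:name` and tagged names the form `{namespace}name`, and it is not taken from the library source.

```python
from typing import Optional


def split_prefix(name: str) -> Optional[str]:
    """Return the prefix of 'prefix:name', or None if there is no prefix."""
    prefix, separator, _ = name.partition(":")
    return prefix if separator else None


def local_name_from_prefixed(name: str) -> str:
    """Return 'name' from 'prefix:name'; unprefixed names are returned unchanged."""
    return name.rpartition(":")[2]


def local_name_from_tag(name: str) -> str:
    """Return 'name' from '{namespace}name'; untagged names are returned unchanged."""
    return name.rpartition("}")[2]
```

Note that `split_prefix` is only meaningful for `prefix:name` strings: applied to a `{namespace}name` tag it would split on the colon inside the namespace URI.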

classmethod unique_tag(element)[source]

The unique tag of an XML element: {namespace}name

Parameters:

element – The XML parent node

Return type:

Any

Returns:

The element unique tag

ocxdownloader classes

The ‘’OcxDownloader’’ class

helpers classes

The ‘’SchemaHelper’’ class

class ocx_schema_parser.helpers.SchemaHelper[source]

Bases: object

A utility class for retrieving OCX attributes and information from an OCX xsd element

static find_schema_changes(root)[source]

Find any schema version changes with tag SchemaChange

Parameters:

root – The root element of the schema

Return type:

List[SchemaChange]

Returns:

A list of SchemaChange dataclasses

classmethod get_reference(element)[source]

The element reference

Return type:

Optional[str]

Returns:

The reference to a global element on the form prefix:name. Returns None if the element is not a reference.

static get_schema_version(root)[source]

Get the current OCX schema version

Parameters:

root – The root element of the schema

Return type:

str

Returns:

The version of the OCX schema

static get_type(element)[source]

The element type given by the element attribute or by its complexContent

Return type:

str

Returns:

The global element type on the form prefix:name. If the element has no type, untyped is returned.

classmethod is_reference(element)[source]

Is a reference or not

Return type:

bool

Returns:

True if the element is a reference, False otherwise

classmethod schema_changes_data_grid(root)[source]

A dictionary of the content of all SchemaChange tags

Parameters:

root – The root element of the schema

Return type:

Dict

Returns:

A dict data grid with a unique id as key

static unique_tag(name, namespace)[source]

A unique global tag from the element name and namespace

Parameters:
  • name – The name of the element

  • namespace – The namespace

Return type:

str

Returns:

A unique element tag on the form {namespace}name
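The `{namespace}name` form is the same notation that ElementTree-style parsers use for qualified tags, so a tag built this way can be compared directly with a parsed element's tag. A minimal sketch, with the hypothetical helper name `make_unique_tag` and an example namespace chosen for illustration, using the standard library parser as a stand-in for lxml:

```python
import xml.etree.ElementTree as ET


def make_unique_tag(name: str, namespace: str) -> str:
    """Build a '{namespace}name' tag from an element name and a namespace."""
    return f"{{{namespace}}}{name}"


# Hypothetical namespace and element, for illustration only.
NAMESPACE = "http://example.com/schema"
parsed = ET.fromstring(f'<v:Vessel xmlns:v="{NAMESPACE}"/>')
tag = make_unique_tag("Vessel", NAMESPACE)
```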

xparse classes

The ‘’LxmlParser’’ class

class ocx_schema_parser.xparse.LxmlParser[source]

Bases: object

A wrapper of the lxml etree document tree and parser.

Variables:

_tree – The lxml.etree DOM

doc_encoding()[source]
Return type:

str

Returns:

The XML document encoding

doc_public_id()[source]
Return type:

str

Returns:

The XML document type

doc_root_name()[source]
Return type:

str

Returns:

The XML document root name

doc_system_url()[source]
Return type:

str

Returns:

The XML document system URL

doc_url()[source]
Return type:

str

Returns:

The XML document url

doc_xml_version()[source]
Return type:

str

Returns:

The XML document version

get_namespaces()[source]

The dict of the defined namespaces of (prefix, namespace) as (key,value) pairs.

Return type:

Dict

Returns:

(prefix, namespace) as (key,value) pairs

get_referenced_files()[source]

The XML imports (xs:import tags).

Return type:

Dict

Returns:

A dict of (namespace, location/URL) key-value pairs of all xs:import tags.
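Collecting the imports of a schema can be sketched as follows. The example uses the standard library `xml.etree.ElementTree` rather than lxml, and the helper name `referenced_files` and the sample schema are hypothetical; it relies only on the XSD rule that `xs:import` elements are direct children of `xs:schema` and carry `namespace` and `schemaLocation` attributes.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical schema with two imports, for illustration only.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/schema">
  <xs:import namespace="http://example.com/units" schemaLocation="units.xsd"/>
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>
</xs:schema>"""

XS_IMPORT = "{http://www.w3.org/2001/XMLSchema}import"


def referenced_files(root: ET.Element) -> Dict[str, str]:
    """Map each imported namespace to its schemaLocation."""
    return {
        imported.get("namespace"): imported.get("schemaLocation")
        for imported in root.findall(XS_IMPORT)
    }


IMPORTS = referenced_files(ET.fromstring(XSD))
```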

get_root()[source]

The XML root.

Return type:

Element

Returns:

The XML root node

get_target_namespace()[source]

The target namespace of the schema.

Return type:

str

Returns:

The target namespace as a str

lxml_version()[source]

lxml version tag.

Return type:

str

Returns:

The lxml version tag

parse(file, store_ids=False)[source]

Parses an XML file.

Parameters:
  • file – The file name of the xml document to be parsed. The parser can only parse from a local file.

  • store_ids – If set to True, the parser will create a hash table of the xml IDs

Return type:

bool

Returns:

The return value. True for success, False otherwise.