Seon plugin seonplugin engdatv2 decode

Revision as of 10:13, 30 October 2014 by Admin (talk | contribs)

Purpose

Recognizes the ENGDAT abstract file in a job and analyzes its content. The ENGDAT abstract file is divided into segments; the plugin evaluates the following:

  • UNB: Address code of recipient and sender
  • MID: Output of the document ID
  • EFC:
    • Filename
    • File sequence number
    • Compression
    • Format (Plaintext & ODDC77 encoding)
  • FTX: Free text (comment), supported both for the job and for single files
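
As an illustration of how such segments can be picked out of an EDIFACT-style file, here is a minimal sketch. It assumes the default EDIFACT service characters (`'` ends a segment, `+` separates data elements, `?` is the release character). The function names and the sample string are hypothetical, the sample is not a valid ENGDAT layout, and this is not the plugin's actual implementation.

```python
# Sketch only: split an EDIFACT-style string into segments and keep the
# segment types listed above. Names and sample data are hypothetical.

WANTED = {"UNB", "MID", "EFC", "FTX"}


def split_escaped(text, sep, release="?"):
    """Split text on sep, leaving characters escaped by release untouched."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            # Keep the release character together with the escaped character.
            current.append(text[i:i + 2])
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def parse_segments(abstract):
    """Return a list of (tag, data_elements) tuples."""
    segments = []
    for raw in split_escaped(abstract, "'"):
        raw = raw.strip()
        if not raw:
            continue
        elements = split_escaped(raw, "+")
        segments.append((elements[0], elements[1:]))
    return segments


def relevant_segments(abstract):
    """Keep only the segment types the plugin is documented to evaluate."""
    return [(tag, elems) for tag, elems in parse_segments(abstract) if tag in WANTED]


# Invented sample, for demonstration only.
sample = "UNB+SENDER+RECIPIENT'MID+DOC123'EFC+1+part.dat'FTX+a comment?'s text'UNZ+1'"
for tag, elems in relevant_segments(sample):
    print(tag, elems)
```

Composite data elements (separated by `:`) are not split here; a real parser would also have to honor a leading UNA service string if one is present.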

The plugin marks the ENGDAT abstract file it finds with the XML attribute "type=ENGDAT" so that the remove ENGDAT plugin can delete it.

If a file is marked with the compression type "gzip" in the field EFC4891, the plugin performs the following steps:

  • Is the file really a GZIP compressed file?
  • If yes: does the filename end in ".gz"? If not, the suffix is appended to the filename, provided that no other file with the resulting name exists.
  • Decompression of the file.
  • The filename recorded in the ENGDAT abstract file is changed to the name without the ".gz" suffix, so that the original filename is available.
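
The gzip steps above can be sketched roughly as follows. This is a minimal illustration using Python's standard `gzip` module, not the plugin's code. The function names are hypothetical, and two behaviors are assumptions because the page does not specify them: a name collision raises an error, and the compressed file is kept after decompression.

```python
import gzip
import os
import shutil

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip(path):
    """Check whether the file really starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def unpack_if_gzip(path):
    """Return the path of the decompressed file, or None if path is not gzip."""
    if not is_gzip(path):
        return None
    if not path.endswith(".gz"):
        candidate = path + ".gz"
        if os.path.exists(candidate):
            # Assumption: the page does not say what happens on a collision.
            raise FileExistsError(candidate)
        os.rename(path, candidate)
        path = candidate
    target = path[:-3]  # name without the ".gz" suffix
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Assumption: the compressed file is left in place.
    return target
```

The returned name (without ".gz") is the one that would then be written back into the ENGDAT abstract file. An existing file at the target path is overwritten in this sketch.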

Requirements

  • The Seon configuration file /etc/seon.conf exists, or the file referenced by the environment variable $Seon_CFGFILE exists. This configuration file defines the database to use, from which the temporary directory and the license information are read.
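
A minimal sketch of how a wrapper script might check this requirement before calling the plugin. The lookup order (environment variable first, then /etc/seon.conf) is an assumption, since the page does not state which takes precedence, and `resolve_config_path` is a hypothetical helper name.

```python
import os

DEFAULT_CFG = "/etc/seon.conf"
ENV_VAR = "Seon_CFGFILE"


def resolve_config_path(environ=None):
    """Return the first readable configuration file, or None if there is none."""
    environ = os.environ if environ is None else environ
    # Assumption: the environment variable is consulted before the default path.
    candidates = [environ.get(ENV_VAR), DEFAULT_CFG]
    for path in candidates:
        if path and os.path.isfile(path) and os.access(path, os.R_OK):
            return path
    return None


if __name__ == "__main__":
    cfg = resolve_config_path()
    print(cfg if cfg else "no readable Seon configuration file found")
```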

Configuration

-

Return values

  • 0: everything OK
  • 1: wrong arguments or configuration file not readable
  • 2: database connection error
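
For scripts that call the plugin and need to react to its exit status, the three documented values could be mapped to names as in the sketch below. The enum and function names are hypothetical; only the numeric values and their meanings come from the list above.

```python
from enum import IntEnum


class PluginResult(IntEnum):
    OK = 0                   # everything OK
    BAD_ARGS_OR_CONFIG = 1   # wrong arguments or configuration file not readable
    DB_CONNECTION_ERROR = 2  # database connection error


def describe(code):
    """Translate an exit status into a name; undocumented values map to UNKNOWN."""
    try:
        return PluginResult(code).name
    except ValueError:
        return "UNKNOWN"
```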

Automatic sender learning