I have a number of documents that are very similar in layout. Is there a way of simplifying the data extraction?

When you mark up data fields and extract them, EscapeE creates a file (extension *.EE) with the field name definitions. So rather than redefining each whole document you can reuse the same .EE file and make the few modifications manually to create a new .EE file.

There doesn't seem to be a place to define where the .EE file is. Do I need to put DEFAULT.EE in the same folder as the print files?

You can do this – it will apply to all files in that folder unless they have a .EE file with the same stem. For example if you have the following files:
A.PCL
A.EE
B.PCL
C.PCL
DEFAULT.EE
then file A uses the fields defined in A.EE and the others use DEFAULT.EE. If you want to specify a field definitions file explicitly, you can either type its name into the Fields dialog (the box at the top of the screen) or specify it on the EscapeE command line when calling via a shortcut, from a CMD window or from another program. The syntax for this is:
ESCAPEE /FIELDS filename
You can specify virtually all the EscapeE options on the command line if required – see Command line syntax.
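
The lookup rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this answer, not EscapeE's own code: the function names are hypothetical, and the case-insensitive name comparison is an assumption made to match Windows file naming.

```python
from pathlib import Path
from typing import List, Optional


def find_fields_file(print_file: Path) -> Optional[Path]:
    """Return the .EE file that would apply to print_file, or None if there is none."""
    folder = print_file.parent
    # Index the folder by upper-cased name so that a.ee and A.EE are treated alike
    # (assumption: names are compared without regard to case).
    by_name = {p.name.upper(): p for p in folder.iterdir() if p.is_file()}
    # A .EE file with the same stem takes precedence over DEFAULT.EE.
    for candidate in (print_file.stem + ".EE", "DEFAULT.EE"):
        match = by_name.get(candidate.upper())
        if match is not None:
            return match
    return None


def fields_arguments(fields_file: Path) -> List[str]:
    """Build the explicit form quoted above: ESCAPEE /FIELDS filename."""
    return ["ESCAPEE", "/FIELDS", str(fields_file)]
```

With the five files listed above, find_fields_file returns A.EE for A.PCL and DEFAULT.EE for B.PCL and C.PCL. Any further command line options would have to be taken from the Command line syntax topic; none are assumed here.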

I need text extraction for the whole document, not for fields. Is there a way to get the text contents extracted and TIFFs produced in one pass?

Proceed as follows:
1. Define a field comprising the whole page. The easiest way to do this is to view the whole page, sweep it all out and right-click. Choose the New Field option and (after perhaps changing its name) click on the New Field button in the resulting dialog. By default this creates a field definition file called xxxx.EE, where xxxx is the stem of the name of the file you are viewing. You can change this name to DEFAULT.EE if you like, which avoids having to call up the field definitions explicitly for other documents in the same directory.
2. Request logging of fields in text format via the Options menu or when exporting the file.

When I try to extract text I get an empty file - why is this?

There is no text in your file, just graphics. EscapeE (from version 8.50) can do OCR, but you need one of the following:
1. Microsoft® Office 2003 or 2007 installed on your PC with the Microsoft Office Document Imaging tool (MODI), plus the OCR plugin purchased from RedTitan®.
2. If you do not have MODI because you have a more recent version of Office, the alternative OCR plugin from RedTitan.
Look up OCR in the EscapeE help index for further details, or contact help@redtitan.com.

Alternatively, if the file was produced by a Windows® driver, there may well be a way of persuading the driver to produce text. In the Print Setup dialog click on Properties, then Advanced Graphics options. Make sure you choose either Download TrueType® fonts as outline soft fonts or as bitmap soft fonts.

Tip: if your file is mainly graphics, when you right-click on some text you will find that it doesn't enable the "Text details" or "Font properties" options, only "Graphic details".

When I tried to extract text I obtained rubbish in my file. Why is this?

You are suffering from the non-standard character codes used by some drivers. Most of the problems come from Windows® drivers, since customized software or UNIX® systems tend to drive printers in a fairly straightforward way, so I assume your output was created via a Windows® driver.
When you use fonts which are not resident on the printer and the driver is forced to download the fonts it may not use the standard ASCII or Latin-1 codes: try selecting Options|Configuration and set 'Type' to 'Windows HP Driver'.
Other drivers assign character codes arbitrarily, in order of their occurrence in the text. This means that, for example, if the text began with the word 'Hello' the character 'H' would have code 1, 'e' code 2 etc., which is fine as far as EscapeE's display is concerned but does cause problems in making use of the text. The default configuration is 'Windows Driver' but you can also try 'Other', which applies no code conversion at all.
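
To see why such output looks like rubbish, here is a small Python sketch of order-of-occurrence coding. It is an illustration only, not the behaviour of any particular driver, and the function names are hypothetical. It also shows that the text is recoverable once the code-to-character table is known.

```python
def encode_by_first_occurrence(text):
    """Assign codes 1, 2, 3, ... to characters in the order they first appear."""
    table = {}
    codes = []
    for ch in text:
        if ch not in table:
            table[ch] = len(table) + 1  # next unused code
        codes.append(table[ch])
    return codes, table


def decode(codes, table):
    """Recover the text, given the table that maps characters to codes."""
    reverse = {code: ch for ch, code in table.items()}
    return "".join(reverse[code] for code in codes)


codes, table = encode_by_first_occurrence("Hello")
print(codes)                              # [1, 2, 3, 3, 4]
print(repr("".join(map(chr, codes))))     # control characters if read as plain codes
print(decode(codes, table))               # Hello
```

Read as ordinary character codes, 1 to 4 are control characters, which is why the extracted file is unusable without a conversion step.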

If it is not possible to change the Windows printer-driver then the EEfonts program enables you to set up a character recognition database which can be used by the RedTitan® EscapeE PCL® viewer to convert the text back to a useful form, either in Windows character set or Unicode.

In some of the fields why is there more text included than I marked?

The problem is that there are two overlapping pieces of text in your field, so EscapeE concatenates the two. The solution is to be more specific in the search criteria or perhaps to be more accurate in delimiting the field. For example, if the two pieces of text are in different fonts or sizes then you can specify the attributes of the one you want in the Fields/Searching dialog. You can check for overlapping text by right-clicking on the text and choosing Text Details. You will see a line for each piece of text found at the point where you clicked.

When I try to extract a number of fields from the detail line of an invoice, additional data is picked up on this line.

A line is considered part of the field if any part of it falls within the field and the characters on such lines are included if at least half the character's width is in the field. If the fields are not well aligned with the data, extra lines may become included. It is therefore crucial that the fonts do not change between defining the fields and extracting the data (e.g. if Courier is substituted for a missing font). You can sometimes avoid this by making the fields relative to an explicit tag: e.g. make the description fields use the 'Description' text as a reference, so that their offsets are measured from wherever that text is printed.
Tip: the View menu has a Fields option which if ticked causes all the fields to be shown in yellow, and any selected contents in red.
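
The inclusion rule in this answer can be sketched as follows. This is a simplified Python model, not EscapeE's implementation: the class and function names are hypothetical, "any part of the line falls within the field" is reduced to vertical overlap, and y is assumed to increase down the page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    char: str
    left: float
    right: float


@dataclass
class TextLine:
    top: float
    bottom: float
    glyphs: List[Glyph]


@dataclass
class Field:
    left: float
    top: float
    right: float
    bottom: float


def line_in_field(line: TextLine, field: Field) -> bool:
    # The line counts as soon as any part of its height overlaps the field.
    return line.top < field.bottom and line.bottom > field.top


def glyph_in_field(glyph: Glyph, field: Field) -> bool:
    # A character counts if at least half of its width lies inside the field.
    width = glyph.right - glyph.left
    overlap = min(glyph.right, field.right) - max(glyph.left, field.left)
    return width > 0 and overlap >= width / 2


def extract(lines: List[TextLine], field: Field) -> List[str]:
    return [
        "".join(g.char for g in line.glyphs if glyph_in_field(g, field))
        for line in lines
        if line_in_field(line, field)
    ]
```

In this model a field whose bottom edge reaches only slightly into the next line still picks that line up, which is the effect described above when fields are not well aligned with the data.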

How do I define a field relative to a tag?

To change a field, right-click on it and choose 'Field properties'. You can set up fields which only appear for those pages containing a specified textual tag by making them refer to a tag. You define the tag by right-clicking on the tag text wherever it appears on the page and choosing 'Define tag'. The text that you clicked on will be shown in the Tag box and may be edited if required. Then click on OK. Next, define your field by marking it out (or select an existing field), click on the Reference Field box in the Field Properties dialog and choose the appropriate tag as a reference.
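
Conceptually, a field that refers to a tag is positioned by offsets measured from wherever the tag text is printed, and it does not appear on pages where the tag is absent. A minimal Python sketch of that idea follows; the names, the coordinate units and the data layout are all hypothetical and do not reflect how EscapeE stores fields.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    left: float
    top: float
    width: float
    height: float


def place_field(
    tags_on_page: Dict[str, Tuple[float, float]],  # tag text -> (x, y) where it was found
    tag: str,
    dx: float,
    dy: float,
    width: float,
    height: float,
) -> Optional[Rect]:
    """Return the field rectangle for this page, or None if the tag is not on the page."""
    origin = tags_on_page.get(tag)
    if origin is None:
        return None  # no tag, so the field does not appear on this page
    x, y = origin
    return Rect(x + dx, y + dy, width, height)
```

If 'Description' is printed at (50, 200) on one page and at (50, 260) on the next, the same field definition lands 12 units below the tag on both pages, so the field follows the data rather than a fixed page position.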

Can I define the page field tags, names and positions directly in the .EE file without having to use the page viewer?

You can write the .EE file yourself, since it is XML and therefore just a text file.
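
Before writing a file by hand it helps to look at one that EscapeE has generated. The Python sketch below lists every element and its attributes in an XML document without assuming anything about the .EE schema. The sample string is a made-up placeholder and is NOT the real .EE layout; the element and attribute names in it are hypothetical.

```python
import xml.etree.ElementTree as ET


def outline(xml_data):
    """Return (depth, tag, attributes) for every element, in document order."""
    rows = []

    def walk(element, depth):
        rows.append((depth, element.tag, dict(element.attrib)))
        for child in element:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_data), 0)
    return rows


# Placeholder XML for demonstration only; a real .EE file will look different.
sample = '<root><item name="example" x="1"/><item name="other"/></root>'
for depth, tag, attributes in outline(sample):
    print("  " * depth + tag, attributes)
```

To inspect a real file, pass its contents instead of the sample, for example the bytes returned by pathlib.Path("A.EE").read_bytes().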

When extracting fields, do you have a way of handling different page formats in a single PCL file?

The field extraction can be tailored to each different kind of page by choosing a tag string which is unique to that page and basing a different series of fields on each such tag. You can also define multi-page sets which repeat every n pages (see Field Definitions|Advanced). The starting page can be specified separately, so a field could be defined to start at page 3 and then every 2 pages, or to skip the first page you define a field that starts at page 2 and is then on every page.
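
The start page and repeat interval combine as simple arithmetic. A minimal Python sketch, with hypothetical parameter names (they are not EscapeE option names):

```python
def field_applies(page: int, start: int = 1, every: int = 1) -> bool:
    """True if a field starting at page `start` and repeating every `every` pages is on `page`."""
    if every < 1:
        raise ValueError("every must be at least 1")
    return page >= start and (page - start) % every == 0


# Start at page 3, then every 2 pages.
print([p for p in range(1, 10) if field_applies(p, start=3, every=2)])  # [3, 5, 7, 9]
# Skip the first page: start at page 2, then every page.
print([p for p in range(1, 6) if field_applies(p, start=2, every=1)])   # [2, 3, 4, 5]
```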