Adding a new object type
There are many object types in the PDF specification, and only some of them are defined in the library so far. This page explains in detail how to add a new object type to the library and why you should do so.
Why should an object type be defined?
Why bother to define a type? The usual objects are dictionaries which can be processed and viewed as they are.
The attribute /BS of a /FreeText annotation holds an untyped dictionary.
To make it nice in the PDFExplorer!
Adding an object type allows for a customized presentation with a printString, an icon, attribute documentation and order etc. (see below for all the details).
The attribute /W of a /BorderStyle object shows its documentation.
How to define a new type
Usually, you inspect a PDF with the PDFExplorer and find some object which is not documented. To define an object type it is important to have an example open in the PDFExplorer so that you can see the changes. In our example this is the object in the attribute /BS in a /FreeText annotation object.
In order to find out what the object is about, the relevant piece of documentation should be found in the PDF Specification. In our case this is a border style dictionary described in chapter "12.5.4 Border Styles" on page 386.
Add a new Smalltalk class
The new class can be defined with the ClassCreationDialog:
Choose the package
The package is [PDF Interactive Features], because it is related to /Annot which is defined there.
Choose the namespace
This should be Graphics.PDF, since this is the only namespace for the runtime code of the library.
Choose a class name
As the name for this example, I use BorderStyle. Ideally the name should be the same as the one used in the PDF specification. If the name does not match the name in the specification, either because the name is already defined or for aesthetic reasons, the class method type (or subtype, depending on the type inference mechanism) needs to be implemented.
Choose the superclass
Most often, this will be Dictionary, or TypedDictionary if the object has the common attribute /Type. It can also be something more exotic, such as a PDFArray, a Name or something else (see later).
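For orientation, a class created through the dialog ends up as a VisualWorks class definition message roughly like the sketch below. TypedDictionary is used here because the border style dictionary has a /Type entry. The values given for indexedType:, the instance variables, imports: and category: are assumptions and may differ from what the dialog generates for this library.

```smalltalk
"Sketch only: indexedType:, imports: and category: are assumed values"
Smalltalk.Graphics.PDF defineClass: #BorderStyle
	superclass: #{Graphics.PDF.TypedDictionary}
	indexedType: #none
	private: false
	instanceVariableNames: ''
	classInstanceVariableNames: ''
	imports: ''
	category: ''
```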
Add a class comment
The first line should give the reference to the PDF specification, followed by the first paragraph of the description in the specification. I usually edit this text to add line breaks after sentences and remove any cross references to other parts of the specification:
PDF border style dictionary as defined in PDF 32000_2008.pdf, section 12.5.4, pp. 386.
An annotation may optionally be surrounded by a border when displayed or printed.
If present, the border shall be drawn completely inside the annotation rectangle.
In PDF 1.1, the characteristics of the border shall be specified by the Border entry in the annotation dictionary.
Beginning with PDF 1.2, the border characteristics for some types of annotations may instead be specified in a border style dictionary designated by the annotation’s BS entry.
Such dictionaries may also be used to specify the width and dash pattern for the lines drawn by line, square, circle, and ink annotations.
If neither the Border nor the BS entry is present, the border shall be drawn as a solid line with a width of 1 point.
Add class methods
Two more bits of information should be added as methods on the class side.
The method documentationPlace defines the section in the PDF specification. This is a more recent addition, intended to make it possible to jump directly to the corresponding place in the specification PDF from the code browser or the PDFExplorer. The jump itself has not been implemented yet and most objects don't have this method, but I add it to new objects. Eventually, I will add it to all objects.
documentationPlace
^#(12 5 4)
If the object type was not part of the original PDF specification 1.0, the version should be added. The method version notes the minor part of the PDF version in which this feature first occurred, which allows the minimal version for a PDF to be computed.
version
^2
The version is usually mentioned in the specification of the object. After I add this method, I remove the corresponding text from the class comment.
Reset the object types
Since a new type is defined, the object types have to be reset with
PDF resetObjecttypes.
This clears the cache for all the object types (Smalltalk classes - 137 at the time of writing). On next access, the cache is filled with all known types, including the newly defined ones, so that the new type can be found.
This has to be done only when a new class is defined.
Use the new type for the containing attributes
The new type can now be used. Therefore, the type of the attribute which contains the object should be set to the new type. In the example, in the method BS of class FreeText, the type: pragma should be changed from
BS
	<type: #Dictionary>
	<version: 6>
	<attribute: 9 documentation: 'A border style dictionary specifying the line width and dash pattern that shall be used in drawing the annotation’s border. The annotation dictionary’s AP entry, if present, takes precedence over the BS entry'>
	^self objectAt: #BS ifAbsent: [Dictionary empty]
to
BS
	<type: #BorderStyle>
	<version: 6>
	<attribute: 9 documentation: 'A border style dictionary specifying the line width and dash pattern that shall be used in drawing the annotation’s border. The annotation dictionary’s AP entry, if present, takes precedence over the BS entry'>
	^self objectAt: #BS ifAbsent: [BorderStyle empty]
The result looks like this in the PDFExplorer (after hitting F5 for refresh):
The style is recognized as BorderStyle and it shows the right version (PDF-1.2), but the required field Type is red (error) and the W field is pink (not known).
Add attributes
Attributes are added as methods in the protocol "accessing entries". Each method is named exactly like the key in the definition, including the capital letter, although this is not common Smalltalk style.
The first two attributes (of 4) look like this in the PDF specification:
The corresponding methods look like this:
Type
	<type: #Name>
	<attribute: 1 documentation: 'The type of PDF object that this dictionary describes.'>
	^self objectAt: #Type ifAbsent: [#Border asPDF]

W
	<type: #Number>
	<attribute: 2 documentation: 'The border width in points. If this value is 0, no border shall be drawn.'>
	^self objectAt: #W ifAbsent: [1 asPDF]
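The third attribute of the border style dictionary, /S, could follow the same pattern. The sketch below is not taken from the library. The documentation string is abbreviated from the specification's table, and the default mirrors the specification's default value S (solid). Writing that default as #S asPDF is an assumption by analogy with #Border asPDF above.

```smalltalk
S
	"Hypothetical sketch of the third attribute; not taken from the library"
	<type: #Name>
	<attribute: 3 documentation: 'The border style: S (Solid), D (Dashed), B (Beveled), I (Inset) or U (Underline).'>
	^self objectAt: #S ifAbsent: [#S asPDF]
```

The fourth attribute, /D (the dash array), needs an array type. Its type symbol is not shown on this page, so it is left out here.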
An attribute method consists of a number of describing pragmas and the code for access.
The type: pragma
The <type: aSymbol> pragma is mandatory. It takes the name symbol of the Smalltalk class implementing the PDF type, which is derived from the "Type" column of the definition table. For more information about typing and the possible type pragmas, see typing.
The documentation pragma
The documentation is specified in the <attribute: anInteger documentation: aString> pragma. The first parameter defines the order of the attributes, so that they can be displayed in the same order as they are defined by the PDF specification. The first attribute shall be 1 and the next ones are numbered consecutively.
The documentation is taken directly from the specification and edited to remove all information that is already expressed in the method itself. In our example, the "(Optional)" is removed, because this is implied. If the attribute is required, the <required> pragma is used to express this fact.
The description of the default value is also removed, because this is evident from the access code.
References to other parts of the specification are also removed (there are none in the example).
The version: pragma
Often, new attributes were added with later PDF versions. The version of an attribute, if it is higher than the version of the type, can be noted with the <version: anInteger> pragma, where the argument is the minor version of the PDF (i.e. 2 for version PDF-1.2).
The access code
The access code can be either
^self objectAt: #Type ifAbsent: [#Border asPDF]
for optional attributes with a default value, or
^self objectAt: #Type
for a required attribute. This will raise an error if the attribute is not present in the object.
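Put together, a required attribute might look like the following sketch. It uses the /Subtype entry of an annotation dictionary purely as an illustration. The method is hypothetical, the position of the <required> pragma among the others is an assumption, and the library's actual method may differ.

```smalltalk
Subtype
	"Hypothetical sketch of a required attribute; not taken from the library"
	<type: #Name>
	<required>
	<attribute: 2 documentation: 'The type of annotation that this dictionary describes.'>
	^self objectAt: #Subtype
```

Since there is no ifAbsent: block, accessing the attribute on an object that lacks the key raises the error described above.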
The method returns the object that is the value of the attribute. The attribute may hold the object directly or hold a reference to it; in either case, the resolved object is returned. To access the stored value itself (object or reference), the following methods can be used instead:
^self at: #Type ifAbsent: [#Border asPDF]
^self at: #Type
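The difference can be tried out in a workspace. The snippet below is a sketch which assumes that freeText is a /FreeText annotation object taken from an open PDF; the variable names are made up for illustration.

```smalltalk
"Assumes freeText is a FreeText annotation object taken from a PDF"
| resolved raw |
resolved := freeText objectAt: #BS.	"the border style object itself"
raw := freeText at: #BS.	"the stored value: the object or a reference to it"
```

Both sends fail if the annotation has no /BS entry; the ifAbsent: variants avoid this.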
Customize an object type
Now, the PDF type is sufficiently defined to be usefully displayed in the PDFExplorer. But more can be done by defining some of the following methods.
Optional customization methods
Display name
The method listText returns a short Text used in the treeview of the PDFExplorer. The method titleText is used for the display of the selected object on the right side.
Icon
The method toolListIcon can be defined on the class side to get an icon for the class in the Smalltalk browser. A PDF type class has the method listIcon on the instance side which, by default, returns the toolListIcon of the class. Therefore it is possible to select an icon depending on the object's state.
Excluding attributes
Some attributes clutter the treeview on the left side of the PDFExplorer. For example, every TypedDictionary has the attribute /Type which is usually used as the name of the object itself.
By defining the method displayKeysToOmit, such attributes can be excluded from the children of the object in the treeview. For the class TypedDictionary the method looks like this:
displayKeysToOmit
^super displayKeysToOmit , #(#Type)