Information is increasingly offered and passed on only in digital form, and the reliable, user-friendly Portable Document Format (PDF) has established itself worldwide as the preferred file format. To ensure unrestricted access in every respect, PDF files must meet certain requirements. These are defined in PDF/UA, the ISO standard for accessible PDF documents. The standard ensures that even people with severely impaired vision, a limited command of written language or physical limitations can read and interact with documents without outside help. In addition, content from PDF/UA documents is much easier to read on mobile devices than content from a conventional PDF document. This is added value for everyone.
What step-free access is for wheelchair users, an accessible PDF conforming to the PDF/UA standard is for anyone reading and editing documents and forms. Without limiting the versatility of PDF technology, the standard specifies how unrestricted access to content in PDF files is ensured. To this end, it defines requirements that remove barriers to page content, form fields, annotations, metadata and other elements of PDF files. This allows people who rely on assistive tools - such as screen readers, special mice or voice input and output - to interact with content in PDF documents. In return, PDF/UA gives authors clearly defined criteria to observe when creating accessible documents.
In order to achieve the goal of accessibility, the PDF/UA standard requires, among other things, the following:
- All content must be correctly tagged. Headings, regular paragraphs, lists and tables must be marked as such.
- The levels of the headings must reflect the structure of the document.
- The intended reading sequence for the entire content must be clearly defined.
- Images and other pictorial content must be given an alternative text that describes the content in words.
- The language in which a text is written must be specified.
- Language changes within the text must also be indicated.
- Information must not be conveyed solely by colour or contrast.
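Several of these requirements lend themselves to automated checking. The following is a minimal sketch in Python, assuming a heavily simplified, hypothetical model of a tag tree: the `Tag` class, its fields and the `check` function are illustrative names, not part of PDF/UA or of any PDF library. It covers only three of the points above (document language, heading levels, alternative text) and says nothing about whether the alternative text is actually meaningful.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tag:
    """Hypothetical, simplified stand-in for one element of a PDF tag tree."""
    role: str                      # e.g. "Document", "H1", "P", "Figure"
    text: str = ""
    alt: Optional[str] = None      # alternative text, relevant for figures
    lang: Optional[str] = None     # language code such as "en" or "de"
    children: List["Tag"] = field(default_factory=list)


def walk(tag):
    """Yield the tag and all of its descendants in document order."""
    yield tag
    for child in tag.children:
        yield from walk(child)


def check(root):
    """Return a list of problems found in the simplified tag tree."""
    problems = []
    if not root.lang:
        problems.append("document language is not specified")
    previous_level = 0
    for tag in walk(root):
        if tag.role == "Figure" and not (tag.alt or "").strip():
            problems.append("figure without alternative text")
        match = re.fullmatch(r"H([1-6])", tag.role)
        if match:
            level = int(match.group(1))
            # A heading may go at most one level deeper than its predecessor.
            if level > previous_level + 1:
                problems.append(
                    f"heading level jumps from {previous_level} to {level}"
                )
            previous_level = level
    return problems


sample = Tag("Document", children=[
    Tag("H1", "Title"),
    Tag("H3", "Subsection"),   # skips H2
    Tag("Figure"),             # no alternative text
])
for problem in check(sample):
    print(problem)
```

A real checker would read this structure from the PDF file itself. The sketch only illustrates the kind of rules involved.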
Most of these requirements are met by additional, invisible markers in the PDF page descriptions, known as "tags". Tags give PDF elements additional information about their content, position and type, and integrate all elements into an overall semantic structure. This structure identifies headings, captions and navigation elements, among others. Graphics and images can be given an "alternative text".
Tags are what make a PDF usable in practice despite barriers such as blindness, significantly reduced vision or restricted mobility that necessitate the use of technical aids. Screen readers depend on all content being available as text, which can then be read aloud by a speech synthesiser or output on a Braille display. Special mice, voice control and other tools for effective navigation in and interaction with documents require access to the document's content structure in the form of the tag structure. This is achieved by linking the tags logically, so that every content element is assigned a semantic role. The elements can then be connected in a logical reading order, regardless of the page and position on which they are placed.
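The difference between the logical reading order and the physical placement of content can be illustrated with a small Python sketch. It assumes a hypothetical, simplified tree in which each node carries a role, some text and a page number. The names `Node`, `reading_order` and `page_order` are invented for this illustration and do not correspond to a real PDF API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical, simplified element of a tag tree."""
    role: str
    text: str = ""
    page: int = 0
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Collect the content-bearing leaf nodes in tree (document) order."""
    if not node.children:
        return [node] if node.text else []
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def reading_order(root):
    """Logical order: follow the tag tree, ignoring page placement."""
    return [leaf.text for leaf in leaves(root)]


def page_order(root):
    """Physical order: sort the same content by the page it is printed on."""
    return [leaf.text for leaf in sorted(leaves(root), key=lambda n: n.page)]


document = Node("Document", children=[
    Node("H1", "Annual report", page=1),
    Node("P", "The article starts on page 1", page=1),
    Node("P", "and continues on page 3.", page=3),
    Node("Div", "Sidebar printed on page 2.", page=2),
])

print(reading_order(document))
print(page_order(document))
```

In this sketch the tag tree keeps the two parts of the article together, whereas sorting by page would interrupt the article with the sidebar.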
These tags can also be added later to existing PDFs using programs such as Adobe Acrobat, but this is an extremely tedious and time-consuming task. It is much more effective to do the preparatory work in the authoring application - such as Microsoft Word and PowerPoint, OpenOffice and LibreOffice or, for professional publications, Adobe InDesign. These applications can already export a good-quality tagged PDF directly, even if the result does not yet reach PDF/UA quality in some respects. Additional tools help the user close the remaining gap to PDF/UA conformity.
But how can authors check whether the documents they create really conform to the requirements laid down in PDF/UA? Since 2013, the Matterhorn Protocol, developed by the PDF Association in coordination with the ISO, has been available as a binding test catalogue for accessible PDF documents and forms. It comprises 31 test sections with a total of 136 individual, precisely defined error conditions. This makes it easier for authors to create and verify PDF/UA-compliant PDF files and forms. Each test section represents a specific area of conformity requirements, such as "definition of text language" or "metadata". Each error condition defines a specific test at document, page, object or JavaScript level. Some of the error conditions can be checked programmatically by software, for example with the free PDF Accessibility Checker (PAC 2) from the Swiss foundation "Access for all". PAC 2 is considered the first tool based entirely on the Matterhorn Protocol. A number of other error conditions must be tested interactively, for example "Headings are not marked as headings". The pdfGoHTML plug-in for Adobe Acrobat from callas software supports this. It displays the document structure, including the reading order and alternative texts for images, in an easy-to-understand quick diagnosis view with coloured markings.
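The split between error conditions that software can check and those that need a human reviewer can be pictured as a simple catalogue. The Python sketch below is an illustration only: the `ErrorCondition` class and its fields are invented names, and apart from "Headings are not marked as headings" the entries are illustrative wordings with assumed classifications, not quotations from the Matterhorn Protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorCondition:
    """Hypothetical record for one error condition in a test catalogue."""
    section: str       # test section the condition belongs to
    description: str   # what counts as an error
    method: str        # "machine" or "human"


# Illustrative entries only; wording and classification are assumptions,
# except for the heading example, which the text names as an interactive test.
CATALOGUE = [
    ErrorCondition("Headings", "Headings are not marked as headings", "human"),
    ErrorCondition("Definition of text language",
                   "Text language is not specified", "machine"),
    ErrorCondition("Metadata",
                   "Document title is missing from the metadata", "machine"),
]


def split_by_method(conditions):
    """Separate conditions a program can test from those needing a person."""
    automated = [c for c in conditions if c.method == "machine"]
    interactive = [c for c in conditions if c.method == "human"]
    return automated, interactive


automated, interactive = split_by_method(CATALOGUE)
print("Checked by software:", [c.description for c in automated])
print("Checked interactively:", [c.description for c in interactive])
```

A checking tool would run the first group automatically and leave the second group to the reviewer.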
Accessibility does not only concern people with more or less severe disabilities. A PDF with a clean, correct tag structure is a big step towards the optimal creation, editing, output and use of documents in general. For example, search engines can index accessible PDFs much better, and structured PDF files can be reused more readily than conventional documents. Format conversions - from PDF to HTML, plain text, RTF, Microsoft Word or OpenOffice Writer - produce more reliable results. On mobile devices with comparatively small displays, content from PDF documents can be presented in a more user-friendly way. This plays a particularly important role in a society shaped by mobile devices.