# PDF Server

The Form.io PDF solution comes in two options: PDF Server and PDF Server Plus. The descriptions below explain each option so you can choose the one that best fits the requirements of the application being built.

{% hint style="warning" %}
The Form.io PDF Server is currently not compatible with Apple M1 chips or the ARM64 architecture. The guide below details how to run the server on other systems, such as an Intel Mac or a Linux machine.
{% endhint %}

### PDF Basic

**The Use Case:** A PDF printout of the JSON-driven dynamic webform submissions filled out by end users within your application.

### PDF Plus

**The Use Case:** Forms are required to be on a pixel-perfect PDF background with a dynamic JSON form overlay on top of the PDF.

PDF Plus includes a PDF template designer that allows customization of the data and layout for PDF downloads generated from webform submissions made by end users.

PDF-first forms can also be presented to the user as responsive web forms while still allowing the submission to be printed onto the pixel-perfect PDF background.

{% hint style="info" %}
Contact <sales@form.io> for more information on the PDF Plus Server.
{% endhint %}

### Deploying the PDF Server in a Private Cloud or On-Premise Environment

The recommended configuration is 2 CPU cores and 4 GB of memory.

A common deployment command for the PDF Server looks like the following.

```bash
docker run -itd \
  -e "LICENSE_KEY=YOURLICENSE" \
  -e "MONGO=mongodb://mongo:27017/formio" \
  -e "FORMIO_S3_SERVER=seaweedfs" \
  -e "FORMIO_S3_PORT=8333" \
  -e "FORMIO_S3_BUCKET=formio" \
  -e "FORMIO_S3_KEY=CHANGEME" \
  -e "FORMIO_S3_SECRET=CHANGEME" \
  --network formio \
  --link formio-mongo:mongo \
  --link formio-seaweedfs:seaweedfs \
  --restart unless-stopped \
  --name pdf-server \
  -p 4005:4005 \
  formio/pdf-server;
```

This command uses the environment variables described below.

### Environment Variables for Deployment

#### Required Environment Variables

The following environment variables are always required.

| Environment Variable | Description                                                                                                                                                                                                 |
| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `LICENSE_KEY`        | A Form.io license key                                                                                                                                                                                       |
| MONGO                | The connection string to your MongoDB-compliant database                                                                                                                                                    |
| DB\_SECRET           | The database secret. Although not technically required, it must match your Enterprise Server's [DB\_SECRET environment variable](https://help.form.io/deployments/enterprise-server#environment-variables). |

#### File Storage Environment Variables

The following environment variables involve the PDF Plus Server's file storage integration for storing PDF files. The PDF server currently supports [Amazon S3](https://help.form.io/deployments/cloud-deployment/aws#cloud-file-storage-using-s3), Amazon S3 compatible servers, and [Azure Blob](https://help.form.io/deployments/cloud-deployment/azure#azure-blob-storage-setup).

| Environment Variable              | Description                                                                                                                                                                                                                                                                         |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| FORMIO\_S3\_SERVER                | S3 hostname or IP address. If set, the PDF Server will assume a S3 compatible configuration. If not set AND other FORMIO\_S3\_\* variables are set (in particular, FORMIO\_S3\_REGION, FORMIO\_S3\_KEY, or FORMIO\_S3\_SECRET), the PDF Server will assume an AWS S3 configuration. |
| FORMIO\_S3\_PORT                  | S3 port. Defaults to 8333.                                                                                                                                                                                                                                                          |
| FORMIO\_S3\_BUCKET                | S3 bucket name. Required when using S3                                                                                                                                                                                                                                              |
| FORMIO\_S3\_KEY                   | AWS Access Key Id if using AWS S3, `accessKey` (the user-like id)                                                                                                                                                                                                                   |
| FORMIO\_S3\_SECRET                | AWS Secret Access Key if using AWS S3, `secretKey` (the password-like id)                                                                                                                                                                                                           |
| FORMIO\_AZURE\_CONNECTION\_STRING | The connection string to your Azure Blob deployment. Required when using Azure Blob storage.                                                                                                                                                                                        |
| FORMIO\_AZURE\_CONTAINER          | The Azure Blob container name. Required when using Azure Blob storage.                                                                                                                                                                                                              |
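
The selection rule described for `FORMIO_S3_SERVER` can be sketched as a small preflight check to run before starting the container. This is only an illustration of the documented rule, not the PDF Server's own detection code: the `storage_mode` function name is hypothetical, and the position of the Azure check (after the S3 checks) is an assumption, since the table does not state a precedence between S3 and Azure settings.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of the PDF Server).
# Prints which storage configuration the variables in the current
# environment describe, following the rule in the table above.
storage_mode() {
  if [ -n "${FORMIO_S3_SERVER:-}" ]; then
    # A server hostname or IP is set: S3-compatible configuration.
    echo "s3-compatible"
  elif [ -n "${FORMIO_S3_REGION:-}${FORMIO_S3_KEY:-}${FORMIO_S3_SECRET:-}" ]; then
    # No server, but other FORMIO_S3_* values are set: AWS S3 configuration.
    echo "aws-s3"
  elif [ -n "${FORMIO_AZURE_CONNECTION_STRING:-}" ]; then
    # Assumption: Azure is only considered when no S3 variables are set.
    echo "azure-blob"
  else
    echo "none"
  fi
}

storage_mode
```

With the values from the deployment command above (`FORMIO_S3_SERVER=seaweedfs`), the helper reports an S3-compatible configuration.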

#### Optional Environment Variables

The following environment variables are optional.

| Environment Variable         | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `FORMIO_PDF_PORT`            | The port that the PDF Server listens on. Defaults to 4005.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `FORMIO_PDF_ADMINKEY`        | Set this environment variable to communicate server-to-server with the PDF Server. Its value is passed as the `x-admin-key` header when sending server-to-server API calls to the PDF Server. See <https://apidocs.form.io/#8e817291-38df-4338-889f-14cf38de5fbd> for more information on how to use this header. |
| `MONGO_DB_NAME`              | Mongo database name, if using localhost                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| `MONGO_CA`                   | File path to an SSL certificate authority file. This is usually a file with an extension of ".pem".  For example, AWS DocumentDB may require this to be like the following value:  "/src/certs/rds-combined-ca-bundle.pem"                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `DOCKER_SECRETS`             | Whether to enable using docker secrets.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| `DOCKER_SECRETS_PATH`        | The path to the docker secrets.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `LICENSE_REMOTE`             | Boolean to indicate if the license key provided is an offline remote license key.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `PDF_BROWSER_TIMEOUT`        | Determines how long (in milliseconds) the browser can run and execute before timing out. Default is 120000 milliseconds or 2 minutes.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| `DEBUG`                      | <p>Perform debugging within a PDF server. The following debug commands will do the following. This uses the Debug Module for Node.js so documentation can be found @ <a href="https://github.com/visionmedia/debug"><https://github.com/visionmedia/debug></a>.</p><p> - <code>DEBUG=*</code>:  Debug everything</p><p> - <code>DEBUG=pdf.*</code>:  Debug all PDF related events</p><p> - <code>DEBUG=pdf.create.*</code>: Debug the PDF upload process </p><p> - <code>DEBUG=pdf.get.*</code>: Debug the PDF fetching process</p><p> - <code>DEBUG=pdf.delete.*</code>:  Debug the PDF deleting process</p><p> - <code>DEBUG=pdf.download.*</code>: Debug the PDF submission download process.</p> |
| `MAX_BODY_SIZE`              | The maximum request body size. If this is a number, then the value specifies the number of bytes; if it is a string, the value is parsed by the [bytes](https://www.npmjs.com/package/bytes) library. Defaults to `'16mb'`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `PAGE_POOL_PAGE_TTL`         | Sets the time-to-live for page pool pages, in milliseconds. The default is 3 minutes (180000).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| `DEFAULT_PAGE_POOL_SIZE`     | Overrides the default size of the page pool. Unless set, the default is 5.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `FORMIO_FONT_RENDER_HINTING` | <p>Sets the level of adjustment applied to rendered fonts. Options are <code>none</code>, <code>slight</code>, <code>medium</code>, <code>full</code>, and <code>max</code>.  <br>The default value is <code>full</code>.<br>If fonts are rendering unusually, setting this to <code>none</code> may improve legibility.  </p>                                                                                                                                                                                                                                                                                                                                                                       |
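
When several optional variables are set, it can be easier to keep them in an env file than to pass each one with `-e`. The sketch below writes such a file; the file name `pdf-server.env` is an arbitrary choice, every value except `DEBUG` is the documented default from the table above, and `DEBUG=pdf.download.*` is included only as an example of one of the listed debug scopes.

```bash
# Write the optional settings to an env file. Values are taken literally,
# so they are not quoted here.
cat > pdf-server.env <<'EOF'
FORMIO_PDF_PORT=4005
PDF_BROWSER_TIMEOUT=120000
MAX_BODY_SIZE=16mb
PAGE_POOL_PAGE_TTL=180000
DEFAULT_PAGE_POOL_SIZE=5
FORMIO_FONT_RENDER_HINTING=full
DEBUG=pdf.download.*
EOF

# The file can then be passed to the deployment command with
# "--env-file pdf-server.env", alongside the required -e flags.
```

Because these are the defaults, the file changes nothing until a value is edited; it mainly serves as a starting point for tuning.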

### Configure Portal for PDF Server

Once both the PDF Server and the API Server are running, the next step is to configure the Form.io Portal with the correct DNS URL for the PDF Server. This is necessary because the Enterprise Server configuration uses the "internal" DNS URL, so the Portal needs to be told how to communicate with the PDF Server directly. To set this up, navigate to your Project Settings and open the **PDF Server** configuration page. There, provide the public DNS URL of the PDF Server as follows.

{% hint style="info" %}
The PDF Server URL field shown in Project Settings was deprecated in 8.0.1 and removed from the interface. It is now fully managed by the **PDF\_SERVER** environment variable that is set on the [API server](https://help.form.io/deployments/enterprise-server#environment-variables).
{% endhint %}

![Configure the Public DNS url to the PDF server in your project settings.](https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MPHoF2HwOA0s5HV_AIB%2F-MQL5YEPeSEx3JcIVfTl%2F-MQL5qRUgEqi58tPlEY8%2FScreen%20Shot%202021-01-05%20at%2011.10.48%20PM.png?alt=media\&token=01fe58a4-3ff3-4338-993d-d24f4c224643)

Once this is done, you should be able to upload an existing PDF form, which the PDF Server then converts into a Form.io webform overlaid on top of that PDF.

### PDF Server Fine Tuning

This section covers some aspects of fine-tuning a PDF Server. It is worth reading if you want to speed up your deployment or configure it to handle large PDFs.

#### Mounting /tmp directory to RAM

If you have a virtual machine with a large amount of RAM available, you can take advantage of it by mounting the container's `/tmp` directory in RAM. Do this by adding the `--tmpfs /tmp:rw` parameter to your `docker run` command. This lets the PDF Server speed up I/O by storing files in RAM instead of on disk.

#### Configuring file cache time

The file cache is a disk cache for the HTML files used for PDF downloads. The default cache time is one hour. If you have many PDF-first forms with different background PDFs and you expect all of those forms to be printed to PDF frequently, you may want to adjust the file cache time to avoid running out of disk space. To do this, add the `FILE_CACHE_TIME` environment variable to the PDF Server configuration. The value is a number of milliseconds. A shorter cache time uses less disk space but lowers performance; a longer cache time uses more disk space but improves performance.
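
Since the value is given in milliseconds, it can help to derive it from a more readable unit. The following is a minimal sketch; the 15-minute figure and the `cache_minutes` variable are illustrative assumptions, not recommended settings.

```bash
# Express the cache time in minutes and convert it to milliseconds.
cache_minutes=15   # example value only; the documented default is one hour
FILE_CACHE_TIME=$((cache_minutes * 60 * 1000))
echo "FILE_CACHE_TIME=${FILE_CACHE_TIME}"   # prints FILE_CACHE_TIME=900000
```

The resulting value can be passed to the container with `-e "FILE_CACHE_TIME=900000"`.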

#### Configuring default viewer

If you prefer to use a custom viewer rather than the default viewer for PDF downloads, you can improve PDF download performance by adding the `PAGE_POOL_DEFAULT_VIEWER` environment variable to the PDF Server configuration. Its value should be the URL of your custom viewer.

#### Configuring timeouts

Here is the list of all configurable timeouts and their purposes:

| Environment variable      | Description                                                                                                                                                                                                                                    | Default value |
| ------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| PDF\_BROWSER\_TIMEOUT     | Used for PDF downloads. If you are printing large PDF files or your custom viewer has a lot of contents, you might want to increase this value.                                                                                                | 120000        |
| PDF\_PRINTING\_TIMEOUT    | Used for PDF downloads. If you are printing large submissions and/or large PDF files, consider increasing this value.                                                                                                                          | 30000         |
| HTML\_GENERATION\_TIMEOUT | Primary timeout for html generation. It's used for PDF uploads. If you are going to upload large PDF files, consider increasing this value.                                                                                                    | 7000          |
| HTML\_GENERATION\_BACKOFF | Backoff timeout for HTML generation. If the first try fails, a second try is made after PostScript optimisation, using the sum of this value and the primary timeout. If you are going to upload large PDF files, consider increasing this value as well. | 23000         |
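
As a worked example of how the two HTML generation values combine, the arithmetic below uses the defaults from the table. It assumes these values are in milliseconds, like `PDF_BROWSER_TIMEOUT`; the `first_try` and `second_try` names are only for illustration.

```bash
HTML_GENERATION_TIMEOUT=7000    # documented default (assumed milliseconds)
HTML_GENERATION_BACKOFF=23000   # documented default (assumed milliseconds)

# The first try uses the primary timeout; the second try uses the sum.
first_try=$HTML_GENERATION_TIMEOUT
second_try=$((HTML_GENERATION_TIMEOUT + HTML_GENERATION_BACKOFF))
echo "first try: ${first_try} ms, second try: ${second_try} ms"
# prints: first try: 7000 ms, second try: 30000 ms
```

Raising either variable therefore lengthens the second try, while only `HTML_GENERATION_TIMEOUT` affects the first.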

### PDF Server Plus Auto-Conversion

The PDF Server Plus includes auto-conversion of existing fillable and non-fillable PDF forms. For PDF-first forms that have no readable metadata for conversion, the PDF Server uses AWS Textract to auto-detect field input types, labels, and more, and converts the PDF into a dynamic Form.io form.

### AWS Textract Integration

After setting up the PDF Server, you can integrate it with [AWS Textract](https://aws.amazon.com/textract/) to enable the form field recognition feature.

#### Steps to integrate:

1. Create an IAM role with the **AmazonTextractServiceRole** & **AmazonSNSFullAccess** policies

   * Create a **Custom trust policy** with the following JSON structure.

   ```json
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Sid": "",
               "Effect": "Allow",
               "Principal": {
                   "Service": "textract.amazonaws.com"
               },
               "Action": "sts:AssumeRole"
           }
       ]
   }
   ```

   * Next add your **AmazonTextractServiceRole** & **AmazonSNSFullAccess** permissions.<br>

     <figure><img src="https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MPHoF2HwOA0s5HV_AIB%2Fuploads%2Fk5U87WJtysxe9ciH27Aj%2Fimage.png?alt=media&#x26;token=ed4ecb53-a166-4370-810f-7b6bcaeff3e7" alt=""><figcaption></figcaption></figure>
   * Add a **Role name** and then press **Create Role** in the bottom right corner.<br>

     <figure><img src="https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MPHoF2HwOA0s5HV_AIB%2Fuploads%2FHItrrOsgY6PEZ52QkxyJ%2Fimage.png?alt=media&#x26;token=bf271283-60f8-4f7c-9be8-738afad54b61" alt=""><figcaption></figcaption></figure>
2. Verify that the AWS user has the **textract:StartDocumentAnalysis** & **textract:GetDocumentAnalysis** permissions

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "textract:StartDocumentAnalysis",
                "textract:GetDocumentAnalysis"
            ],
            "Resource": "*"
        }
    ]
}
```

3. Create an AWS Simple Notification Service (SNS) topic and save its ARN
4. Add the following environment variables to the configuration:

<table data-full-width="false"><thead><tr><th>Variable</th><th width="351.3333333333333">Description</th><th>Default value</th></tr></thead><tbody><tr><td>FORMIO_S3_BUCKET</td><td>Name of the S3 bucket where files are stored</td><td>-</td></tr><tr><td>FORMIO_S3_KEY</td><td>AWS public access key</td><td>-</td></tr><tr><td>FORMIO_S3_SECRET</td><td>AWS secret access key</td><td>-</td></tr><tr><td>FORMIO_S3_REGION</td><td>AWS bucket region</td><td>us-east-1</td></tr><tr><td>TEXTRACT_ROLE_ARN</td><td>ARN of IAM role for Textract.</td><td>-</td></tr><tr><td>TEXTRACT_SNS_TOPIC_ARN</td><td>ARN of SNS topic.</td><td>-</td></tr><tr><td>TEXTRACT_OUTPUT_FOLDER</td><td>S3 bucket folder where Textract will store its output.</td><td>textract-output</td></tr><tr><td>TEXTRACT_MAX_SYNC_COUNT</td><td>Configures the timeout for Textract processing to finish. Increase it for larger non-fillable documents. (Optional)</td><td>90</td></tr></tbody></table>

5\. After starting the PDF environment with the new environment variables, go to the previously created SNS topic page and create a subscription:

* Leave **Topic ARN** as-is
* In the **Protocol** field, choose **HTTPS**
* In the **Endpoint** field, enter the public URL of your environment. \
  **Example: <https://my.formioapi.com/pdf/pdf/sns>**

<figure><img src="https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MPHoF2HwOA0s5HV_AIB%2Fuploads%2FNrNV5K8ZFFlHVSQMy8nd%2Fimage.png?alt=media&#x26;token=d78cb5f7-77c9-47d5-9c0b-d63517a45d9f" alt=""><figcaption><p>Creating a subscription for a specified SNS topic</p></figcaption></figure>

6. After configuration, you can check the status of form field recognition by visiting the URL \
   `{{apiServerUrl}}/pdf/pdf/sns/recognizeFormfieldsStatus`

<figure><img src="https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MPHoF2HwOA0s5HV_AIB%2Fuploads%2FKL0kRU12o2FJR9gKA0Qn%2Fimage.png?alt=media&#x26;token=f3616881-d95d-458c-a9ef-85325bd0074f" alt=""><figcaption></figcaption></figure>

7. Validate that a PDF Plus upload works with a non-fillable PDF and that Textract performs field recognition on it. An example non-fillable PDF form is provided below.

{% file src="<https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MPHoF2HwOA0s5HV_AIB%2Fuploads%2FGh0iJFJUabiPfdccXAHE%2Fonboarding_form.pdf?alt=media&token=5c3eb6dc-a56a-448f-912c-721138888f4c>" %}

<figure><img src="https://3305536326-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MPHoF2HwOA0s5HV_AIB%2Fuploads%2Fba4BhUzqam7r5368Ehqz%2Fimage.png?alt=media&#x26;token=7e1ac73b-ff5a-40bc-bd49-991b8ec64a3c" alt=""><figcaption><p>How the onboarding form should look when Textract is enabled</p></figcaption></figure>

In the case of misconfiguration, you will not be able to upload a PDF file.
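
Because a misconfiguration only shows up as a failed upload, it can be useful to check the Textract settings before starting the container. The sketch below is a hypothetical preflight helper, not part of the PDF Server: it lists the variables from the table above that have no default value and are unset or empty in the current shell. The `missing_textract_vars` function name is an assumption made for this example.

```bash
#!/usr/bin/env bash
# Hypothetical preflight check (not part of the PDF Server): prints the
# Textract-related variables that have no default and are unset or empty.
missing_textract_vars() {
  for name in FORMIO_S3_BUCKET FORMIO_S3_KEY FORMIO_S3_SECRET \
              TEXTRACT_ROLE_ARN TEXTRACT_SNS_TOPIC_ARN; do
    # Look up the variable whose name is stored in $name.
    # The names come from the fixed list above, so eval is safe here.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

missing="$(missing_textract_vars)"
if [ -n "$missing" ]; then
  # $missing is intentionally unquoted so the names print on one line.
  echo "Missing Textract settings:" $missing
else
  echo "All Textract settings without defaults are set."
fi
```

The check only confirms that values are present; it does not verify that the IAM role, SNS topic, or bucket actually exist or are reachable.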


