Let’s talk about security. Sometimes you will need to generate PDFs from highly confidential documents and send those documents securely to the HTML PDF API service.
There are a few techniques you can use depending on the level of security you want to achieve.
This article covers general security concepts for sending data to a remote API, so it applies beyond the HTML PDF API service.
Security level none/low
You have seen this type of URL many times in online applications. It is a very primitive type of security: the content is hidden behind a hard-to-guess URL but remains publicly available to anyone who has the link. Use it only when hiding the content is enough.
Example of URL: http://example.com/e58526b98fe3df274fc0e6fa4247d692
For that purpose you can use a digest algorithm, for example MD5.
Let’s say you want to convert a user CV to PDF.
Pseudo code:
user.id = 43
user.username = 'luke_skywalker'
md5(user.id + user.username) //md5('43luke_skywalker')
"4036a9adcc389d4244c983981f68d956" //output of md5
Final URL could look something like this:
http://example.com/users/cv/4036a9adcc389d4244c983981f68d956
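The pseudo code above could be sketched in Python with the standard `hashlib` module. The function name `secret_cv_url` and the `base_url` default are illustrative assumptions, not part of any API; the digest input simply concatenates the id and the username as in the example.

```python
import hashlib


def secret_cv_url(user_id: int, username: str,
                  base_url: str = "http://example.com/users/cv") -> str:
    """Build a hard-to-guess (but still public) URL from an MD5 digest."""
    # Concatenate id and username, e.g. "43luke_skywalker", as in the pseudo code.
    digest_input = f"{user_id}{username}".encode("utf-8")
    digest = hashlib.md5(digest_input).hexdigest()  # 32 hex characters
    return f"{base_url}/{digest}"


print(secret_cv_url(43, "luke_skywalker"))
```

Anyone who learns the pattern (id plus username) can recompute the digest, which is why this approach only hides the document and does not secure it.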
Pros:
Simple and fast
Cons:
Not secured, just hidden.
Usage:
You just want to keep the URL out of public view, and exposing the document would not really matter.
Security level medium
If you have some kind of REST API that requires authentication to fetch private data, you could use this technique. Using the HTTPS protocol on your side is strongly preferred in this case.
Types:
Example of Token based URL:
Pattern: /api/:token/users/:user_id/cv
URL: https://example.com/api/f6dfcda3fd5d4414b5155e9f297e97a0/users/22/cv
Example of URL with basic authentication:
URL: https://username:password@example.com/users/22/cv
... or you could send the following request to the HTML PDF API service when generating the PDF:
curl -H 'Authentication: Token <your token>' \
-d 'url=http://htmlpdfapi.com/examples/example.html' \
-d 'username=luke' \
-d 'password=secret' \
'https://htmlpdfapi.com/api/v1/pdf' > result.pdf
Hey, how is this different from a secret URL? Well, if a secret URL pattern is broken, every URL becomes available to the attacker, whereas if you accidentally expose your token or basic auth credentials you can always generate new ones.
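To illustrate why a token can be replaced after a leak, here is a minimal Python sketch of the server side. The helper names (`new_token`, `cv_url`, `is_valid`) are hypothetical, and the 32-character hex format is only an assumption modeled on the example URL above.

```python
import hmac
import secrets


def new_token() -> str:
    """Return a random 32-character hex token (128 bits of randomness)."""
    return secrets.token_hex(16)


def cv_url(token: str, user_id: int) -> str:
    """Fill the /api/:token/users/:user_id/cv pattern."""
    return f"https://example.com/api/{token}/users/{user_id}/cv"


def is_valid(presented: str, current: str) -> bool:
    """Compare tokens in constant time to avoid timing leaks."""
    return hmac.compare_digest(presented.encode("utf-8"), current.encode("utf-8"))


current_token = new_token()
old_url = cv_url(current_token, 22)

# Rotation: after an accidental leak, issue a new token; the old one stops working.
leaked_token, current_token = current_token, new_token()
```

Unlike the digest in a secret URL, the token is random and not derived from user data, so knowing one token reveals nothing about the next.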
Pros:
You can secure your HTML content and assets.
Cons:
Assets must be embedded to achieve a higher level of security, which results in a bigger HTML file.
Usage:
When the text content of a generated PDF needs to be confidential, and the assets are just part of the design: either they have no value of their own or they are small enough to embed.
Security level high, very high
You need a higher level of security and want to generate HTML on the fly just for conversion to PDF, then discard it immediately. With this technique you don’t have to worry about a URL getting exposed.
Let’s describe it with pseudo code:
//fetch user from database
user = User.find(43)
//render template to variable
user_top_secret_cv_html = render_template_as_string('cv', user)
response = SomeRestClient('https://htmlpdfapi.com/api/v1/pdf')
.headers({'Authentication': 'Token <your token>'})
.parameters({
html: user_top_secret_cv_html
})
pdf = response.body
As you can see, the generated HTML and therefore the PDF document exist only in the memory of that process. You can then email the PDF to the user, send the file to the user's browser, or save it to a private directory on your server.
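The same flow could look roughly like this in Python, using only the standard library. The endpoint and the `Authentication` header come from the examples in this article; the template, the function names, and the form-encoded `html` parameter are assumptions made for the sketch.

```python
import urllib.parse
import urllib.request
from string import Template

API_URL = "https://htmlpdfapi.com/api/v1/pdf"  # endpoint from the article

# Hypothetical template; a real app would use its own template engine.
CV_TEMPLATE = Template("<html><body><h1>CV of $name</h1></body></html>")


def build_pdf_request(html: str, token: str) -> urllib.request.Request:
    """Prepare a POST request carrying the HTML in the form-encoded body."""
    body = urllib.parse.urlencode({"html": html}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,  # supplying data makes urllib issue a POST
        headers={"Authentication": f"Token {token}"},
    )


def convert_to_pdf(html: str, token: str) -> bytes:
    """Send the request and return the PDF bytes; nothing is written to disk."""
    with urllib.request.urlopen(build_pdf_request(html, token)) as response:
        return response.read()


html = CV_TEMPLATE.substitute(name="Luke Skywalker")
request = build_pdf_request(html, "<your token>")
# pdf = convert_to_pdf(html, "<your token>")  # needs a real token and network access
```

The HTML string is built, sent, and dropped within one process, so there is no URL that could be guessed or leaked.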
Pros:
There is no URL. HTML content is only available in memory in your code. Assets that are part of the design can be hosted on the HTML PDF API service. Conversion is faster because less data is downloaded from and uploaded to the HTML PDF API service.
Cons:
Content images, such as a user picture hosted on your server, still need to be public or embedded.
Usage:
Similar to “URL with Authentication”
Security level very high, ultra high
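You need the maximum level of security and want to be in control of every bit of data. In this case you package the HTML together with all of its assets into a single compressed file, for example from a directory like this: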
- top_secret_report/
  - index.html
  - images/
    - subject/
      - luke_skywalker.jpg
    - enemies/
      - darth_vader.jpg
    - associates/
      - princess_leia.jpg
      - chewbacca.jpg
      - han_solo.jpg
  - fonts/
    - confidential.ttf
… and now the pseudo code:
zip_file = Zip.create_from_directory('./top_secret_report')
response = SomeRestClient('https://htmlpdfapi.com/api/v1/pdf')
.headers({'Authentication': 'Token <your token>'})
.parameters({
file: zip_file
})
pdf = response.body
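The `Zip.create_from_directory` step in the pseudo code could be sketched in Python as follows. The function name `zip_directory` is hypothetical, and storing paths relative to the directory root (so that `index.html` sits at the top of the archive) is an assumption about the expected layout.

```python
import io
import zipfile
from pathlib import Path


def zip_directory(directory: str) -> bytes:
    """Pack every file under `directory` into an in-memory ZIP archive."""
    root = Path(directory)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the root so index.html sits at the top level.
                archive.write(path, path.relative_to(root).as_posix())
    return buffer.getvalue()
```

The resulting bytes would then be sent as the `file` parameter from the pseudo code; how exactly the upload has to be encoded is not covered here and should be checked against the API documentation.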
Pros:
All of the data is secured. It can be combined with assets hosted on HTML PDF API service.
Cons:
Bigger file size.
Usage:
You need to securely send a lot of content images while converting to PDF and want both the HTML and the assets secured.
In the end it all depends on how sensitive your data is, so here is a table that can help you decide.
Technique | Type | Security | HTML | Links | Data-URI | Hosted | In file |
---|---|---|---|---|---|---|---|
Secret URL | URL | none, low | P | P | P | - | - |
URL with Authentication | URL | medium, high | S | P | S | - | - |
HTML in post body | HTML | high, very high | S | P | S | S | - |
Compressed file | FILE | very high, ultra high | S | P | S | S | S |
Legend:
PRO TIP:
Whatever level of security you are using, always use an HTTPS URL when sending data to the HTML PDF API service: https://htmlpdfapi.com/api/v1/
Using the HTTPS protocol on your side can also raise the level of security.