pgpdf

pgpdf

pgpdf : PDF type with meta admin & Full-Text Search

Overview

ID Extension Package Version Category License Language
3530
pgpdf
pgpdf
0.1.0
TYPE
GPL-3.0
C
Attribute Has Binary Has Library Need Load Has DDL Relocatable Trusted
--sLdtr
No
Yes
Yes
Yes
yes
yes
Relationships
See Also
pgjq
pgjwt
prefix
semver
unit
pglite_fusion
md5hash
asn1oid

Packages

Type Repo Version PG Major Compatibility Package Pattern Dependencies
EXT
PIGSTY
0.1.0
18
17
16
15
14
pgpdf -
RPM
PIGSTY
0.1.0
18
17
16
15
14
pgpdf_$v -
DEB
PIGSTY
0.1.0
18
17
16
15
14
postgresql-$v-pgpdf -
Linux / PG PG18 PG17 PG16 PG15 PG14
el8.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
el8.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
el9.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
el9.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
el10.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
el10.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
d12.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
d12.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
d13.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
d13.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
u22.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
u22.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
u24.x86_64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
u24.aarch64
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
PIGSTY 0.1.0
Package Version OS ORG SIZE File URL
pgpdf_18 0.1.0 el8.x86_64 pigsty 18.4 KiB pgpdf_18-0.1.0-1PIGSTY.el8.x86_64.rpm
pgpdf_18 0.1.0 el8.x86_64 pgdg 15.4 KiB pgpdf_18-0.1.0-1PGDG.rhel8.x86_64.rpm
pgpdf_18 0.1.0 el8.aarch64 pigsty 18.0 KiB pgpdf_18-0.1.0-1PIGSTY.el8.aarch64.rpm
pgpdf_18 0.1.0 el8.aarch64 pgdg 14.8 KiB pgpdf_18-0.1.0-1PGDG.rhel8.aarch64.rpm
pgpdf_18 0.1.0 el9.x86_64 pigsty 18.2 KiB pgpdf_18-0.1.0-1PIGSTY.el9.x86_64.rpm
pgpdf_18 0.1.0 el9.x86_64 pgdg 15.6 KiB pgpdf_18-0.1.0-1PGDG.rhel9.x86_64.rpm
pgpdf_18 0.1.0 el9.aarch64 pigsty 17.7 KiB pgpdf_18-0.1.0-1PIGSTY.el9.aarch64.rpm
pgpdf_18 0.1.0 el9.aarch64 pgdg 14.9 KiB pgpdf_18-0.1.0-1PGDG.rhel9.aarch64.rpm
pgpdf_18 0.1.0 el10.x86_64 pigsty 18.2 KiB pgpdf_18-0.1.0-1PIGSTY.el10.x86_64.rpm
pgpdf_18 0.1.0 el10.x86_64 pgdg 16.0 KiB pgpdf_18-0.1.0-1PGDG.rhel10.x86_64.rpm
pgpdf_18 0.1.0 el10.aarch64 pigsty 17.9 KiB pgpdf_18-0.1.0-1PIGSTY.el10.aarch64.rpm
pgpdf_18 0.1.0 el10.aarch64 pgdg 15.5 KiB pgpdf_18-0.1.0-1PGDG.rhel10.aarch64.rpm
postgresql-18-pgpdf 0.1.0 d12.x86_64 pigsty 25.3 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~bookworm_amd64.deb
postgresql-18-pgpdf 0.1.0 d12.aarch64 pigsty 24.7 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~bookworm_arm64.deb
postgresql-18-pgpdf 0.1.0 d13.x86_64 pigsty 25.3 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~trixie_amd64.deb
postgresql-18-pgpdf 0.1.0 d13.aarch64 pigsty 24.7 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~trixie_arm64.deb
postgresql-18-pgpdf 0.1.0 u22.x86_64 pigsty 26.7 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~jammy_amd64.deb
postgresql-18-pgpdf 0.1.0 u22.aarch64 pigsty 26.2 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~jammy_arm64.deb
postgresql-18-pgpdf 0.1.0 u24.x86_64 pigsty 26.5 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~noble_amd64.deb
postgresql-18-pgpdf 0.1.0 u24.aarch64 pigsty 25.7 KiB postgresql-18-pgpdf_0.1.0-1PIGSTY~noble_arm64.deb
Package Version OS ORG SIZE File URL
pgpdf_17 0.1.0 el8.x86_64 pigsty 18.4 KiB pgpdf_17-0.1.0-1PIGSTY.el8.x86_64.rpm
pgpdf_17 0.1.0 el8.x86_64 pgdg 15.4 KiB pgpdf_17-0.1.0-1PGDG.rhel8.x86_64.rpm
pgpdf_17 0.1.0 el8.aarch64 pigsty 18.0 KiB pgpdf_17-0.1.0-1PIGSTY.el8.aarch64.rpm
pgpdf_17 0.1.0 el8.aarch64 pgdg 14.8 KiB pgpdf_17-0.1.0-1PGDG.rhel8.aarch64.rpm
pgpdf_17 0.1.0 el9.x86_64 pigsty 18.2 KiB pgpdf_17-0.1.0-1PIGSTY.el9.x86_64.rpm
pgpdf_17 0.1.0 el9.x86_64 pgdg 15.6 KiB pgpdf_17-0.1.0-1PGDG.rhel9.x86_64.rpm
pgpdf_17 0.1.0 el9.aarch64 pigsty 17.8 KiB pgpdf_17-0.1.0-1PIGSTY.el9.aarch64.rpm
pgpdf_17 0.1.0 el9.aarch64 pgdg 15.0 KiB pgpdf_17-0.1.0-1PGDG.rhel9.aarch64.rpm
pgpdf_17 0.1.0 el10.x86_64 pigsty 18.2 KiB pgpdf_17-0.1.0-1PIGSTY.el10.x86_64.rpm
pgpdf_17 0.1.0 el10.x86_64 pgdg 16.0 KiB pgpdf_17-0.1.0-1PGDG.rhel10.x86_64.rpm
pgpdf_17 0.1.0 el10.aarch64 pigsty 17.9 KiB pgpdf_17-0.1.0-1PIGSTY.el10.aarch64.rpm
pgpdf_17 0.1.0 el10.aarch64 pgdg 15.5 KiB pgpdf_17-0.1.0-1PGDG.rhel10.aarch64.rpm
postgresql-17-pgpdf 0.1.0 d12.x86_64 pigsty 25.3 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~bookworm_amd64.deb
postgresql-17-pgpdf 0.1.0 d12.aarch64 pigsty 24.6 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~bookworm_arm64.deb
postgresql-17-pgpdf 0.1.0 d13.x86_64 pigsty 25.2 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~trixie_amd64.deb
postgresql-17-pgpdf 0.1.0 d13.aarch64 pigsty 24.6 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~trixie_arm64.deb
postgresql-17-pgpdf 0.1.0 u22.x86_64 pigsty 27.4 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~jammy_amd64.deb
postgresql-17-pgpdf 0.1.0 u22.aarch64 pigsty 27.0 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~jammy_arm64.deb
postgresql-17-pgpdf 0.1.0 u24.x86_64 pigsty 26.4 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~noble_amd64.deb
postgresql-17-pgpdf 0.1.0 u24.aarch64 pigsty 25.7 KiB postgresql-17-pgpdf_0.1.0-1PIGSTY~noble_arm64.deb
Package Version OS ORG SIZE File URL
pgpdf_16 0.1.0 el8.x86_64 pigsty 18.4 KiB pgpdf_16-0.1.0-1PIGSTY.el8.x86_64.rpm
pgpdf_16 0.1.0 el8.x86_64 pgdg 15.4 KiB pgpdf_16-0.1.0-1PGDG.rhel8.x86_64.rpm
pgpdf_16 0.1.0 el8.aarch64 pigsty 18.0 KiB pgpdf_16-0.1.0-1PIGSTY.el8.aarch64.rpm
pgpdf_16 0.1.0 el8.aarch64 pgdg 14.8 KiB pgpdf_16-0.1.0-1PGDG.rhel8.aarch64.rpm
pgpdf_16 0.1.0 el9.x86_64 pigsty 18.2 KiB pgpdf_16-0.1.0-1PIGSTY.el9.x86_64.rpm
pgpdf_16 0.1.0 el9.x86_64 pgdg 15.6 KiB pgpdf_16-0.1.0-1PGDG.rhel9.x86_64.rpm
pgpdf_16 0.1.0 el9.aarch64 pigsty 17.8 KiB pgpdf_16-0.1.0-1PIGSTY.el9.aarch64.rpm
pgpdf_16 0.1.0 el9.aarch64 pgdg 15.0 KiB pgpdf_16-0.1.0-1PGDG.rhel9.aarch64.rpm
pgpdf_16 0.1.0 el10.x86_64 pigsty 18.2 KiB pgpdf_16-0.1.0-1PIGSTY.el10.x86_64.rpm
pgpdf_16 0.1.0 el10.x86_64 pgdg 16.0 KiB pgpdf_16-0.1.0-1PGDG.rhel10.x86_64.rpm
pgpdf_16 0.1.0 el10.aarch64 pigsty 17.9 KiB pgpdf_16-0.1.0-1PIGSTY.el10.aarch64.rpm
pgpdf_16 0.1.0 el10.aarch64 pgdg 15.5 KiB pgpdf_16-0.1.0-1PGDG.rhel10.aarch64.rpm
postgresql-16-pgpdf 0.1.0 d12.x86_64 pigsty 25.3 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~bookworm_amd64.deb
postgresql-16-pgpdf 0.1.0 d12.aarch64 pigsty 24.6 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~bookworm_arm64.deb
postgresql-16-pgpdf 0.1.0 d13.x86_64 pigsty 25.2 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~trixie_amd64.deb
postgresql-16-pgpdf 0.1.0 d13.aarch64 pigsty 24.7 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~trixie_arm64.deb
postgresql-16-pgpdf 0.1.0 u22.x86_64 pigsty 27.4 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~jammy_amd64.deb
postgresql-16-pgpdf 0.1.0 u22.aarch64 pigsty 27.0 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~jammy_arm64.deb
postgresql-16-pgpdf 0.1.0 u24.x86_64 pigsty 26.4 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~noble_amd64.deb
postgresql-16-pgpdf 0.1.0 u24.aarch64 pigsty 25.6 KiB postgresql-16-pgpdf_0.1.0-1PIGSTY~noble_arm64.deb
Package Version OS ORG SIZE File URL
pgpdf_15 0.1.0 el8.x86_64 pigsty 18.4 KiB pgpdf_15-0.1.0-1PIGSTY.el8.x86_64.rpm
pgpdf_15 0.1.0 el8.x86_64 pgdg 15.4 KiB pgpdf_15-0.1.0-1PGDG.rhel8.x86_64.rpm
pgpdf_15 0.1.0 el8.aarch64 pigsty 17.9 KiB pgpdf_15-0.1.0-1PIGSTY.el8.aarch64.rpm
pgpdf_15 0.1.0 el8.aarch64 pgdg 14.8 KiB pgpdf_15-0.1.0-1PGDG.rhel8.aarch64.rpm
pgpdf_15 0.1.0 el9.x86_64 pigsty 18.2 KiB pgpdf_15-0.1.0-1PIGSTY.el9.x86_64.rpm
pgpdf_15 0.1.0 el9.x86_64 pgdg 15.6 KiB pgpdf_15-0.1.0-1PGDG.rhel9.x86_64.rpm
pgpdf_15 0.1.0 el9.aarch64 pigsty 17.7 KiB pgpdf_15-0.1.0-1PIGSTY.el9.aarch64.rpm
pgpdf_15 0.1.0 el9.aarch64 pgdg 15.0 KiB pgpdf_15-0.1.0-1PGDG.rhel9.aarch64.rpm
pgpdf_15 0.1.0 el10.x86_64 pigsty 18.2 KiB pgpdf_15-0.1.0-1PIGSTY.el10.x86_64.rpm
pgpdf_15 0.1.0 el10.x86_64 pgdg 16.0 KiB pgpdf_15-0.1.0-1PGDG.rhel10.x86_64.rpm
pgpdf_15 0.1.0 el10.aarch64 pigsty 17.9 KiB pgpdf_15-0.1.0-1PIGSTY.el10.aarch64.rpm
pgpdf_15 0.1.0 el10.aarch64 pgdg 15.5 KiB pgpdf_15-0.1.0-1PGDG.rhel10.aarch64.rpm
postgresql-15-pgpdf 0.1.0 d12.x86_64 pigsty 25.3 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~bookworm_amd64.deb
postgresql-15-pgpdf 0.1.0 d12.aarch64 pigsty 24.6 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~bookworm_arm64.deb
postgresql-15-pgpdf 0.1.0 d13.x86_64 pigsty 25.3 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~trixie_amd64.deb
postgresql-15-pgpdf 0.1.0 d13.aarch64 pigsty 24.7 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~trixie_arm64.deb
postgresql-15-pgpdf 0.1.0 u22.x86_64 pigsty 27.5 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~jammy_amd64.deb
postgresql-15-pgpdf 0.1.0 u22.aarch64 pigsty 27.0 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~jammy_arm64.deb
postgresql-15-pgpdf 0.1.0 u24.x86_64 pigsty 26.5 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~noble_amd64.deb
postgresql-15-pgpdf 0.1.0 u24.aarch64 pigsty 25.7 KiB postgresql-15-pgpdf_0.1.0-1PIGSTY~noble_arm64.deb
Package Version OS ORG SIZE File URL
pgpdf_14 0.1.0 el8.x86_64 pigsty 18.4 KiB pgpdf_14-0.1.0-1PIGSTY.el8.x86_64.rpm
pgpdf_14 0.1.0 el8.x86_64 pgdg 15.4 KiB pgpdf_14-0.1.0-1PGDG.rhel8.x86_64.rpm
pgpdf_14 0.1.0 el8.aarch64 pigsty 17.9 KiB pgpdf_14-0.1.0-1PIGSTY.el8.aarch64.rpm
pgpdf_14 0.1.0 el8.aarch64 pgdg 14.8 KiB pgpdf_14-0.1.0-1PGDG.rhel8.aarch64.rpm
pgpdf_14 0.1.0 el9.x86_64 pigsty 18.2 KiB pgpdf_14-0.1.0-1PIGSTY.el9.x86_64.rpm
pgpdf_14 0.1.0 el9.x86_64 pgdg 15.6 KiB pgpdf_14-0.1.0-1PGDG.rhel9.x86_64.rpm
pgpdf_14 0.1.0 el9.aarch64 pigsty 17.7 KiB pgpdf_14-0.1.0-1PIGSTY.el9.aarch64.rpm
pgpdf_14 0.1.0 el9.aarch64 pgdg 15.0 KiB pgpdf_14-0.1.0-1PGDG.rhel9.aarch64.rpm
pgpdf_14 0.1.0 el10.x86_64 pigsty 18.2 KiB pgpdf_14-0.1.0-1PIGSTY.el10.x86_64.rpm
pgpdf_14 0.1.0 el10.x86_64 pgdg 16.0 KiB pgpdf_14-0.1.0-1PGDG.rhel10.x86_64.rpm
pgpdf_14 0.1.0 el10.aarch64 pigsty 17.9 KiB pgpdf_14-0.1.0-1PIGSTY.el10.aarch64.rpm
pgpdf_14 0.1.0 el10.aarch64 pgdg 15.5 KiB pgpdf_14-0.1.0-1PGDG.rhel10.aarch64.rpm
postgresql-14-pgpdf 0.1.0 d12.x86_64 pigsty 25.2 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~bookworm_amd64.deb
postgresql-14-pgpdf 0.1.0 d12.aarch64 pigsty 24.6 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~bookworm_arm64.deb
postgresql-14-pgpdf 0.1.0 d13.x86_64 pigsty 25.2 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~trixie_amd64.deb
postgresql-14-pgpdf 0.1.0 d13.aarch64 pigsty 24.6 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~trixie_arm64.deb
postgresql-14-pgpdf 0.1.0 u22.x86_64 pigsty 27.4 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~jammy_amd64.deb
postgresql-14-pgpdf 0.1.0 u22.aarch64 pigsty 26.9 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~jammy_arm64.deb
postgresql-14-pgpdf 0.1.0 u24.x86_64 pigsty 26.4 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~noble_amd64.deb
postgresql-14-pgpdf 0.1.0 u24.aarch64 pigsty 25.7 KiB postgresql-14-pgpdf_0.1.0-1PIGSTY~noble_arm64.deb

Source

pig build pkg pgpdf;		# build rpm/deb

Install

Make sure PGDG and PIGSTY repo available:

pig repo add pgsql -u   # add both repo and update cache

Install this extension with pig:

pig install pgpdf;		# install via package name, for the active PG version

pig install pgpdf -v 18;   # install for PG 18
pig install pgpdf -v 17;   # install for PG 17
pig install pgpdf -v 16;   # install for PG 16
pig install pgpdf -v 15;   # install for PG 15
pig install pgpdf -v 14;   # install for PG 14

Config this extension to shared_preload_libraries:

shared_preload_libraries = 'pgpdf';

Create this extension with:

CREATE EXTENSION pgpdf;

Usage

The actual PDF parsing is done by poppler.

This allows you to work with PDFs in an ACID-compliant way. The usual alternative relies on external scripts or services which can easily make your data ingestion pipeline brittle and leave your raw data out-of-sync.

Download some PDFs.

wget https://wiki.postgresql.org/images/e/ea/PostgreSQL_Introduction.pdf -O /tmp/pgintro.pdf
wget https://pdfobject.com/pdf/sample.pdf -O /tmp/sample.pdf

You can create a pdf type, by casting either a text filepath or bytea column.

CREATE EXTENSION pgpdf;
SELECT '/tmp/pgintro.pdf'::pdf;
                                       pdf                                        
----------------------------------------------------------------------------------
 PostgreSQL Introduction                                                         +
 Digoal.Zhou                                                                     +
 7/20/2011Catalog                                                                +
  PostgreSQL Origin 

If you don’t have the PDF file in your filesystem, but have already stored its content in a bytea column, you can just cast it to pdf.

SELECT pg_read_binary_file('/tmp/pgintro.pdf')::bytea::pdf;

Examples

Create a table with a pdf column:

CREATE TABLE pdfs(name text primary key, doc pdf);

INSERT INTO pdfs VALUES ('pgintro', '/tmp/pgintro.pdf');
INSERT INTO pdfs VALUES ('pgintro', '/tmp/sample.pdf');

Parsing and validation should happen automatically. The files will be read from the disk only once!

Note

The filepath should be accessible by the postgres process / user! That’s different than the user running psql. If you don’t understand what this means, as your DBA!

String Functions and Operators

Standard Postgres String Functions and Operators should work as usual:

SELECT 'Below is the PDF we received ' || '/tmp/pgintro.pdf'::pdf;
SELECT upper('/tmp/pgintro.pdf'::pdf::text);
SELECT name
FROM pdfs
WHERE doc::text LIKE '%Postgres%';

Full-Text Search (FTS)

You can also perform full-text search (FTS), since you can work on a pdf file like normal text.

SELECT '/tmp/pgintro.pdf'::pdf::text @@ to_tsquery('postgres');
 ?column? 
----------
 t
(1 row)
SELECT '/tmp/pgintro.pdf'::pdf::text @@ to_tsquery('oracle');
 ?column? 
----------
 f
(1 row)

Document similarity with pg_trgm

You can use pg_trgm to get the similarity between two documents:

CREATE EXTENSION pg_trgm;

SELECT similarity('/tmp/pgintro.pdf'::pdf::text, '/tmp/sample.pdf'::pdf::text);

Metadata

The following functions are available:

  • pdf_title(pdf) → text

  • pdf_author(pdf) → text

  • pdf_num_pages(pdf) → integer

    Total number of pages in the document

  • pdf_page(pdf, integer) → text

    Get the i-th page as text

  • pdf_creator(pdf) → text

  • pdf_keywords(pdf) → text

  • pdf_metadata(pdf) → text

  • pdf_version(pdf) → text

  • pdf_subject(pdf) → text

  • pdf_creation(pdf) → timestamp

  • pdf_modification(pdf) → timestamp

SELECT pdf_title('/tmp/pgintro.pdf');
        pdf_title        
-------------------------
 PostgreSQL Introduction
(1 row)
SELECT pdf_author('/tmp/pgintro.pdf');
 pdf_author 
------------
 周正中
(1 row)

Getting a subset of pages

SELECT pdf_num_pages('/tmp/pgintro.pdf');
 pdf_num_pages 
---------------
            24
(1 row)
SELECT pdf_page('/tmp/pgintro.pdf', 1);
           pdf_page           
------------------------------
 Catalog                     +
  PostgreSQL Origin         +
  Layout                    +
  Features                  +
  Enterprise Class Attribute+
  Case
(1 row)
SELECT pdf_subject('/tmp/pgintro.pdf');
 pdf_subject 
-------------
 
(1 row)
SELECT pdf_creation('/tmp/pgintro.pdf');
       pdf_creation       
--------------------------
 Wed Jul 20 11:13:37 2011
(1 row)
SELECT pdf_modification('/tmp/pgintro.pdf');
     pdf_modification     
--------------------------
 Wed Jul 20 11:13:37 2011
(1 row)
SELECT pdf_creator('/tmp/pgintro.pdf');
            pdf_creator             
------------------------------------
 Microsoft® Office PowerPoint® 2007
(1 row)
SELECT pdf_metadata('/tmp/pgintro.pdf');
 pdf_metadata 
--------------
 
(1 row)
SELECT pdf_version('/tmp/pgintro.pdf');
 pdf_version 
-------------
 PDF-1.5
(1 row)

Installation

Install poppler dependencies

Linux

sudo apt install -y libpoppler-glib-dev pkg-config

Homebrew/MacOS

brew install poppler pkgconf
cd /tmp
git clone https://github.com/Florents-Tselai/pgpdf.git
cd pgpdf
make
make install # may need sudo

After the installation, in a session:

CREATE EXTENSION pgpdf;

Docker

Get the Docker image with:

docker pull florents/pgpdf:pg17

This adds pgpdf to the Postgres image (replace 17 with your Postgres server version, and run it the same way).

Run the image in a container.

docker run --name pgpdf -p 5432:5432 -e POSTGRES_PASSWORD=pass florents/pgpdf:pg17

Through another terminal, connect to the running server (container).

PGPASSWORD=pass psql -h localhost -p 5432 -U postgres

Warning

Reading arbitrary binary data (PDF) into your database can pose security risks. Only use this for files you trust.

Last updated on