Document & PDF Glossary
Understand key terms, formats, and technologies used in modern digital document workflows.
PDF (Portable Document Format)
A file format developed by Adobe to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.
OCR (Optical Character Recognition)
A technology that converts scanned paper documents, PDF files, or images into editable and searchable text data.
PDF/A
An ISO-standardized version of the Portable Document Format specialized for long-term archiving and digital preservation of electronic documents.
Metadata
Embedded information inside a PDF (such as author, title, creation date, and keywords) that describes the document for search engines and indexing crawlers.
Redaction
The permanent and irreversible deletion of sensitive text or images from a PDF file to prevent unauthorized access or leakage of confidential details.
256-bit AES Encryption
Advanced Encryption Standard with a 256-bit key length, a strong symmetric cipher commonly used to password-protect PDF documents and restrict access to their contents.
Vector Graphic
A resolution-independent visual element defined by mathematical paths (lines and curves) inside a PDF, so it remains sharp at any zoom level.
Raster Graphic
A pixel-based image (like JPEG or PNG) contained within a PDF that can become blurry or pixelated when scaled up or printed at a large size.
DPI (Dots Per Inch)
A measure of print or scan resolution, expressed as the number of dots per linear inch. Higher values capture more detail in images embedded within a document.
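The relationship between physical size, DPI, and pixel count is simple arithmetic. The sketch below uses a hypothetical helper name, `pixel_dimensions`, and assumes a US Letter page (8.5 x 11 inches) purely for illustration.

```python
def pixel_dimensions(width_in: float, height_in: float, dpi: int) -> tuple:
    """Return the (width, height) in pixels of a page scanned at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# US Letter scanned at a typical print resolution and at a low screen resolution.
print(pixel_dimensions(8.5, 11.0, 300))  # (2550, 3300)
print(pixel_dimensions(8.5, 11.0, 72))   # (612, 792)
```

Scanning the same page at 300 DPI instead of 72 DPI produces roughly 17 times as many pixels, which is why DPI has such a large effect on file size.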
Compression
The process of shrinking a PDF's file size by downsampling high-resolution images, optimizing internal structures, and removing unneeded metadata.
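One part of this process, lossless compression of internal streams, can be illustrated with Python's standard `zlib` module, which implements the same Deflate algorithm that PDF's Flate filter relies on. The sample bytes below are a made-up stand-in for real page content, and the sketch does not cover image downsampling.

```python
import zlib

# Hypothetical, highly repetitive page content standing in for a real stream.
raw = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 200

compressed = zlib.compress(raw, 9)      # 9 = maximum compression effort
restored = zlib.decompress(compressed)

print(len(raw), "->", len(compressed), "bytes")
print(restored == raw)  # True: Deflate is lossless
```

Repetitive text compresses very well. Photographic images usually gain far less from Deflate, which is why image downsampling tends to account for most of the size reduction in scanned documents.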
Linearization (Fast Web View)
A method of organizing a PDF's internal layout so that a web browser can display the first page before the entire file finishes downloading, then fetch the remaining pages as they are needed.
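A linearized file announces itself with a `/Linearized` entry in a dictionary near the very start of the file. The following is a rough heuristic only, using a hypothetical function name and made-up sample bytes. It does not validate that the rest of the file is actually laid out correctly.

```python
def looks_linearized(data: bytes) -> bool:
    """Heuristic: report whether /Linearized appears in the first 1024 bytes."""
    return b"/Linearized" in data[:1024]


# Made-up file prefixes for illustration; real files contain much more.
linearized = b"%PDF-1.7\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n"
plain = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(looks_linearized(linearized))  # True
print(looks_linearized(plain))       # False
```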
Form Flattening
The process of merging interactive form fields (text boxes, drop-downs) directly into the PDF layout, rendering the document static and non-editable.
Digital Signature
A cryptographic scheme used to verify the authenticity and integrity of a PDF document, proving who signed it and that it has not been modified since signing.
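The sign-then-verify idea can be sketched with textbook RSA using only the standard library. This is a toy under loud assumptions: the key is built from two small, publicly known primes, there is no padding and no certificate, and the function names are hypothetical. Real PDF signatures rely on certificate-based schemes and vetted cryptographic libraries.

```python
import hashlib

# Toy key built from two known Mersenne primes; far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent (Python 3.8+)


def digest(data: bytes) -> int:
    """Hash the document and reduce the result into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Sign the document digest with the private exponent."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Check the signature against the document using the public exponent."""
    return pow(signature, E, N) == digest(data)


document = b"%PDF-1.7 hypothetical document bytes"
signature = sign(document)
print(verify(document, signature))         # True
print(verify(document + b"x", signature))  # False: the content changed
```

Changing a single byte changes the digest, so the stored signature no longer matches. That mismatch is how a reader detects tampering.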
PDF Parser
A software engine that reads, analyzes, and extracts text content, images, and layout streams from PDF files.
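The first thing a parser typically inspects is the file header, which starts with `%PDF-` followed by a version number. A minimal sketch of that one step, with a hypothetical function name and made-up input bytes, might look like this:

```python
import re

# Matches a header such as "%PDF-1.7" at the very start of the data.
HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if it is absent."""
    match = HEADER.match(data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(pdf_version(b"%PDF-1.7\n%comment line\n"))  # (1, 7)
print(pdf_version(b"not a pdf"))                  # None
```

A full parser goes much further, locating the cross-reference table and decoding each object, but a header check is a cheap way to reject files that are clearly not PDFs.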
Watermark
A faint text or image overlay (e.g., 'CONFIDENTIAL' or 'DRAFT') placed behind or in front of PDF content to indicate ownership or security rules.
Interactive PDF
A document containing interactive rich-media elements such as clickable form fields, hyperlinks, buttons, audio, or video files.
CMYK Color Model
The Cyan, Magenta, Yellow, and Key (Black) color space used in commercial printing. Print-ready PDFs are typically prepared in CMYK so that colors print accurately.
RGB Color Model
The Red, Green, and Blue color space used for electronic displays (monitors, phones). heyPDF automatically optimizes color profiles based on target use.
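To show how the two color models relate, here is the common naive conversion from 8-bit RGB to CMYK fractions. It is only a sketch with a hypothetical function name: it ignores ICC color profiles, which production print workflows depend on, and it is not a description of how heyPDF converts colors.

```python
def rgb_to_cmyk(r: int, g: int, b: int) -> tuple:
    """Naive, profile-free conversion from 8-bit RGB to CMYK fractions (0.0 to 1.0)."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0  # pure black uses the key channel only
    r_f, g_f, b_f = r / 255, g / 255, b / 255
    k = 1 - max(r_f, g_f, b_f)     # black is whatever the brightest channel lacks
    c = (1 - r_f - k) / (1 - k)
    m = (1 - g_f - k) / (1 - k)
    y = (1 - b_f - k) / (1 - k)
    return c, m, y, k


print(rgb_to_cmyk(255, 0, 0))  # (0.0, 1.0, 1.0, 0.0): red = magenta + yellow
print(rgb_to_cmyk(0, 0, 0))    # (0.0, 0.0, 0.0, 1.0)
```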
Font Embedding
The process of storing font files directly inside the PDF file, ensuring the document renders identically on all devices even if those fonts are not installed on them.
Crop Box
The boundary of a PDF page that defines the visible region displayed by PDF readers or printed by standard document output tools.
Bleed Box
The boundary of a PDF page that defines the region to which content is clipped in professional printing. It extends slightly past the final page edge to allow for paper trimming and edge alignment.
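Page boxes are rectangles measured in points, where 72 points equal one inch. The sketch below grows a finished-size rectangle outward to obtain a bleed rectangle. The 3 mm margin, the roughly A4 page size, and the helper names are all assumptions chosen for illustration.

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points (1/72 inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def expand_box(box: tuple, margin_pt: float) -> tuple:
    """Grow a (left, bottom, right, top) rectangle outward by margin_pt on every side."""
    left, bottom, right, top = box
    return (left - margin_pt, bottom - margin_pt, right + margin_pt, top + margin_pt)


finished = (0.0, 0.0, 595.0, 842.0)  # roughly A4 in points; hypothetical finished size
bleed = expand_box(finished, mm_to_points(3))
print([round(v, 1) for v in bleed])
```

Artwork that is meant to reach the page edge should fill the whole bleed rectangle, so that slight inaccuracies in trimming do not leave a white strip.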