PDF Toolkit — Extract Text · Split · Merge
Pull plain text out of a PDF, split a document into pages or a custom range, or merge multiple PDFs into a single file — all in your browser, no upload.
PDF Toolkit — Extract Text, Split Pages, Merge PDFs Online (Free)
Free, privacy-friendly PDF toolkit. Extract text from any PDF page-by-page, split a PDF into specific pages or one-per-page, and merge multiple PDFs into one file. Everything happens in your browser — your documents are never uploaded.
Features
- Extract plain text from any text-based PDF, page by page, with word and character counts per page
- Split PDFs by page range (1-3,5,7-9) into a single new PDF, or burst every page into its own file packaged as a ZIP
- Merge two or more PDFs into a single file with drag-to-reorder controls before the merge runs
- Copy all extracted text to the clipboard or download as a .txt file ready for Word, Notion, or any text editor
- 100% client-side — PDF content is decoded in your browser; only the PDF.js worker script is fetched from a CDN, never your document
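The per-page word and character counts mentioned above could be computed along these lines. This is a minimal sketch, not the tool's actual code: `PageStats` and `pageStats` are hypothetical names, and it assumes the text of each page has already been extracted into one string per page.

```typescript
// Hypothetical shape for the per-page statistics shown next to each page.
interface PageStats {
  page: number;       // 1-indexed page number
  words: number;      // whitespace-separated tokens
  characters: number; // UTF-16 code units, whitespace included
}

// Assumes pageTexts[i] holds the already-extracted text of page i + 1.
function pageStats(pageTexts: string[]): PageStats[] {
  return pageTexts.map((text, index) => {
    const trimmed = text.trim();
    return {
      page: index + 1,
      // An empty or whitespace-only page has zero words.
      words: trimmed === "" ? 0 : trimmed.split(/\s+/).length,
      characters: text.length,
    };
  });
}
```

Counting `text.length` is the simplest choice; a tool that wants to count emoji or other astral characters as one character each would need to count code points instead.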
How to use
- Pick a mode: Extract text pulls plain text out of a text-based PDF, Split breaks a PDF into one or more new PDFs by page range, and Merge combines multiple PDFs into a single document.
- Drag your PDF (or PDFs, for merge) into the drop zone — or click to pick from disk. The tool reads each file locally; nothing is uploaded.
- For Extract, browse the per-page text and copy or download a .txt. For Split, type a page range like 1-3,5,7-9 or choose to burst into a ZIP. For Merge, reorder the queue with the arrow buttons, then click Merge.
Tips and best practices
- If the extracted text comes out as one giant paragraph, that's because a PDF stores positioned runs of text rather than paragraphs, so breaks cannot always be recovered; you may need to re-break the text manually after copying.
- When splitting, ranges are inclusive and 1-indexed: page 1 is the first page, not page 0.
- Merging keeps page sizes as they are — letter and A4 pages can coexist in the same merged file.
- Reorder merge files with the up/down arrows before running the merge; once it runs, you'd have to start over.
- For very large PDFs, splitting into batches before extracting text often runs faster than trying to read the whole document at once.
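For the batching tip above, the ranges to type into Split can be worked out mechanically. The helper below is an illustrative sketch (`batchPageRanges` is a hypothetical name, not part of the tool): it turns a page count and a batch size into inclusive, 1-indexed range strings in the same `start-end` syntax Split accepts.

```typescript
// Hypothetical helper: produces inclusive, 1-indexed ranges covering
// pages 1..pageCount in chunks of at most batchSize pages.
function batchPageRanges(pageCount: number, batchSize: number): string[] {
  // Assumption: invalid input yields no ranges rather than an error.
  if (
    !Number.isInteger(pageCount) ||
    !Number.isInteger(batchSize) ||
    pageCount < 1 ||
    batchSize < 1
  ) {
    return [];
  }
  const ranges: string[] = [];
  for (let start = 1; start <= pageCount; start += batchSize) {
    // The final batch may be shorter than batchSize.
    const end = Math.min(start + batchSize - 1, pageCount);
    ranges.push(start === end ? `${start}` : `${start}-${end}`);
  }
  return ranges;
}
```

For a 120-page document and batches of 50, this gives `1-50`, `51-100`, and `101-120`, each of which can be split out and then extracted separately.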
Questions and answers
Is my PDF uploaded to a server?
No. The PDF bytes stay in your browser. The only network request made by this tool is fetching the PDF.js worker script from a CDN — that's a public JavaScript file (not your document) that runs the parser in a Web Worker. Your PDF content itself is never sent anywhere.
Why does the extracted text look wrong or have missing letters?
PDFs come in two flavors: text-based (the text is encoded as characters) and image-based (a scanned image with no machine-readable text). PDF.js can extract characters from the first kind but cannot read characters out of an image — for those you need OCR. If your PDF was made by scanning paper, run it through an OCR tool first. Also note: some PDFs encode glyphs with custom font mappings that produce garbled output even when the visual rendering is correct.
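One rough way to spot the image-based case is to check which pages came back with no extracted characters at all. The sketch below is a heuristic under stated assumptions, not something the tool is known to do: `likelyScannedPages` and `minChars` are hypothetical names, and the input is assumed to be one extracted string per page. Genuinely blank pages are flagged too, so treat the result as a hint.

```typescript
// Hypothetical heuristic: returns the 1-indexed numbers of pages whose
// extracted text has fewer than minChars non-whitespace characters.
function likelyScannedPages(pageTexts: string[], minChars: number = 1): number[] {
  const flagged: number[] = [];
  pageTexts.forEach((text, index) => {
    // Ignore whitespace so a page of stray spaces still counts as empty.
    const visible = text.replace(/\s/g, "").length;
    if (visible < minChars) {
      flagged.push(index + 1);
    }
  });
  return flagged;
}
```

If every page of a document is flagged, it is probably a scan and needs OCR before any text can be extracted. This check does not detect the garbled-glyph case, where characters are present but mapped incorrectly.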
What page range syntax does Split accept?
A comma-separated list of single pages and ranges, e.g. `1-3,5,7-9`. Whitespace is ignored. Out-of-range or duplicate pages are silently clamped or dropped. Use the second mode (one-per-page) if you want every page as its own PDF without typing a range.
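As an illustration of that syntax, here is a minimal parser sketch. It is not the tool's actual implementation: `parsePageRange` is a hypothetical name, and the handling of malformed parts (skipped) and reversed ranges such as `3-1` (treated as `1-3`) are assumptions the description above does not settle.

```typescript
// Hypothetical parser for specs like "1-3,5,7-9".
// Returns 1-indexed page numbers in the order given, without duplicates.
function parsePageRange(spec: string, pageCount: number): number[] {
  const seen = new Set<number>();
  const pages: number[] = [];
  // Whitespace is ignored, so strip it before splitting on commas.
  for (const part of spec.replace(/\s+/g, "").split(",")) {
    const match = /^(\d+)(?:-(\d+))?$/.exec(part);
    if (!match) continue; // Assumption: malformed or empty parts are skipped.
    const first = Number(match[1]);
    const last = match[2] === undefined ? first : Number(match[2]);
    // Ranges are inclusive; clamp both ends into 1..pageCount.
    // Assumption: a reversed range like "3-1" is read as "1-3".
    const start = Math.max(1, Math.min(first, last));
    const end = Math.min(pageCount, Math.max(first, last));
    for (let page = start; page <= end; page++) {
      if (!seen.has(page)) {
        seen.add(page); // Duplicates are dropped.
        pages.push(page);
      }
    }
  }
  return pages;
}
```

With a 10-page document, `parsePageRange("1-3,5,7-9", 10)` yields `[1, 2, 3, 5, 7, 8, 9]`. With a 5-page document, `4-8` is clamped to pages 4 and 5, and a lone `99` is dropped.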
Are bookmarks, form fields, or signatures preserved when splitting or merging?
Split and Merge use pdf-lib, which copies the visible page content (text, images, vectors, and annotations attached to pages). The document-level outline (bookmarks), interactive forms, and digital signatures are usually dropped or left incomplete: pdf-lib does its best to keep page-level annotations but is not a full PDF authoring suite, and a digital signature generally stops validating once the file is rewritten. For signed legal documents, verify the output before relying on it.
Can I split or merge password-protected PDFs?
If the password is enforced at the user level (an open password), this tool cannot decrypt the file and will show an error. Some PDFs use only an owner-level password that restricts certain operations such as printing or editing; for those, pdf-lib's `ignoreEncryption` option lets the tool read and re-emit the pages. For PDFs that need a password to open, decrypt them in Acrobat or another reader first, then bring the unprotected file back here.
Is there a file-size or page-count limit?
There is no hard limit. The practical ceiling is your browser's available memory: a 200-page PDF with embedded images may need a couple of hundred megabytes of RAM to render. Most modern laptops handle hundred-page documents comfortably. If processing feels sluggish, work on smaller batches.
Can it OCR scanned PDFs?
No — this tool extracts text that is already encoded in the PDF; it does not run optical character recognition on raster pages. For scans, look for a dedicated OCR tool (Tesseract-based ones run in the browser too).