PDF Toolkit — Extract Text · Split · Merge

Pull plain text out of a PDF, split a document into pages or a custom range, or merge multiple PDFs into a single file — all in your browser, no upload.

Free, privacy-friendly PDF toolkit. Extract text from any PDF page-by-page, split a PDF into specific pages or one-per-page, and merge multiple PDFs into one file. Everything happens in your browser — your documents are never uploaded.

Features

Extract plain text from any text-based PDF, page by page, with word and character counts per page
Split PDFs by page range (1-3,5,7-9) into a single new PDF, or burst every page into its own file packaged as a ZIP
Merge two or more PDFs into a single file with drag-to-reorder controls before the merge runs
Copy all extracted text to the clipboard or download as a .txt file ready for Word, Notion, or any text editor
100% client-side — PDF content is decoded in your browser; only the PDF.js worker script is fetched from a CDN, never your document
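
As a rough illustration of the per-page counts mentioned above, the sketch below takes one string of extracted text per page and returns word and character counts. The name `pageStats` and the `PageStats` shape are hypothetical, not the tool's actual code, and the counting rules (whitespace-separated words, UTF-16 length for characters) are assumptions.

```typescript
// Hypothetical result shape for one page; not taken from the tool itself.
interface PageStats {
  page: number;       // 1-indexed page number
  words: number;      // whitespace-separated tokens
  characters: number; // UTF-16 code units, whitespace included
}

// Hypothetical helper: compute counts for each page's extracted text.
function pageStats(pageTexts: string[]): PageStats[] {
  return pageTexts.map((text, i) => {
    const trimmed = text.trim();
    return {
      page: i + 1,
      // Split on runs of whitespace; an empty page has zero words.
      words: trimmed === "" ? 0 : trimmed.split(/\s+/).length,
      characters: text.length,
    };
  });
}
```

For example, `pageStats(["Hello world", ""])` reports 2 words and 11 characters for page 1, and zeros for page 2.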

How to use

Pick a mode: Extract text pulls plain text out of a text-based PDF, Split breaks a PDF into one or more new PDFs by page range, and Merge combines multiple PDFs into a single document.
Drag your PDF (or PDFs, for merge) into the drop zone — or click to pick from disk. The tool reads each file locally; nothing is uploaded.
For Extract, browse the per-page text and copy or download a .txt. For Split, type a page range like 1-3,5,7-9 or choose to burst into a ZIP. For Merge, reorder the queue with the arrow buttons, then click Merge.
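
The arrow-button reordering in the Merge step can be pictured as a swap between neighbors in the file queue. This is a minimal sketch under that assumption; `moveInQueue` is a hypothetical name, and the real tool may manage its queue differently.

```typescript
// Hypothetical helper: move the file at `index` one slot up or down.
// Returns a new array; the input queue is left untouched.
function moveInQueue<T>(queue: T[], index: number, direction: "up" | "down"): T[] {
  const target = direction === "up" ? index - 1 : index + 1;
  const next = queue.slice();
  // Ignore moves that would fall off either end of the list.
  if (index < 0 || index >= queue.length || target < 0 || target >= queue.length) {
    return next;
  }
  const moved = next[index];
  next[index] = next[target];
  next[target] = moved;
  return next;
}
```

With a queue of `["a.pdf", "b.pdf", "c.pdf"]`, calling `moveInQueue(queue, 2, "up")` yields `["a.pdf", "c.pdf", "b.pdf"]`, while pressing "up" on the first file leaves the order unchanged.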

Tips and best practices

If the extracted text comes out as one giant paragraph, that's because PDFs store positioned runs of text rather than paragraphs, so paragraph breaks cannot be recovered reliably. You may need to add the breaks by hand after copying.
When splitting, ranges are inclusive and 1-indexed: page 1 is the first page, not page 0.
Merging keeps page sizes as they are — letter and A4 pages can coexist in the same merged file.
Reorder merge files with the up/down arrows before running the merge; once it runs, you'd have to start over.
For very large PDFs, splitting into batches before extracting text often runs faster than trying to read the whole document at once.
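
To make the batching tip concrete, the sketch below turns a page count into a list of range strings that can be typed into Split one at a time. `batchRanges` is a hypothetical helper, and both arguments are assumed to be whole numbers; the right batch size depends on your document and machine.

```typescript
// Hypothetical helper: cut a page count into consecutive batches and
// express each batch in the range syntax Split accepts, e.g. "1-50".
// Assumes pageCount and batchSize are whole numbers.
function batchRanges(pageCount: number, batchSize: number): string[] {
  if (pageCount < 1 || batchSize < 1) return [];
  const ranges: string[] = [];
  for (let start = 1; start <= pageCount; start += batchSize) {
    const end = Math.min(start + batchSize - 1, pageCount);
    // A one-page batch is written as a single page number.
    ranges.push(start === end ? String(start) : start + "-" + end);
  }
  return ranges;
}
```

For a 120-page document and a batch size of 50, this gives `["1-50", "51-100", "101-120"]`.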

Questions and answers

Is my PDF uploaded to a server?

No. The PDF bytes stay in your browser. The only network request made by this tool is fetching the PDF.js worker script from a CDN — that's a public JavaScript file (not your document) that runs the parser in a Web Worker. Your PDF content itself is never sent anywhere.

Why does the extracted text look wrong or have missing letters?

PDFs come in two flavors: text-based (the text is encoded as characters) and image-based (a scanned image with no machine-readable text). PDF.js can extract characters from the first kind but cannot read characters out of an image — for those you need OCR. If your PDF was made by scanning paper, run it through an OCR tool first. Also note: some PDFs encode glyphs with custom font mappings that produce garbled output even when the visual rendering is correct.
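
One rough way to spot image-based pages after extraction is to flag pages that produced almost no visible characters. The sketch below is a heuristic only, not something the tool is known to do: `likelyScannedPages` and its default threshold of 10 characters are assumptions, and genuinely blank pages will be flagged too.

```typescript
// Hypothetical heuristic: a page with almost no extracted characters is
// probably an image. The default threshold of 10 is an assumption.
function likelyScannedPages(pageTexts: string[], minChars: number = 10): number[] {
  const suspects: number[] = [];
  pageTexts.forEach((text, i) => {
    // Whitespace does not count as extracted text.
    const visible = text.replace(/\s+/g, "");
    if (visible.length < minChars) suspects.push(i + 1); // 1-indexed page number
  });
  return suspects;
}
```

Pages returned by this function are candidates for an OCR pass before extraction is tried again.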

What page range syntax does Split accept?

A comma-separated list of single pages and ranges, e.g. `1-3,5,7-9`. Whitespace is ignored. Out-of-range or duplicate pages are silently clamped or dropped. Use the second mode (one-per-page) if you want every page as its own PDF without typing a range.
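
The syntax described above could be parsed along these lines. This is one reading of the stated rules, not the tool's actual parser: `parsePageRanges` is a hypothetical name, ranges are clamped to the document, pages entirely outside it are dropped, duplicates keep their first occurrence, and malformed parts are skipped. It returns zero-based indices, which is the form pdf-lib's `copyPages` expects.

```typescript
// Hypothetical parser for a range string such as "1-3,5,7-9".
// Returns zero-based page indices in the order they were requested.
function parsePageRanges(input: string, pageCount: number): number[] {
  const indices: number[] = [];
  const seen: boolean[] = [];
  // Whitespace is ignored, so strip it before splitting on commas.
  const parts = input.replace(/\s+/g, "").split(",");
  for (const part of parts) {
    // Accept "N" or "N-M"; anything else is skipped (an assumption).
    const match = /^(\d+)(?:-(\d+))?$/.exec(part);
    if (!match) continue;
    let first = parseInt(match[1], 10);
    let last = match[2] ? parseInt(match[2], 10) : first;
    if (first > last) {
      // Tolerate reversed ranges such as "9-7" (an assumption).
      const tmp = first;
      first = last;
      last = tmp;
    }
    // Clamp to the document; a range entirely outside it becomes empty.
    first = Math.max(first, 1);
    last = Math.min(last, pageCount);
    for (let page = first; page <= last; page++) {
      if (seen[page]) continue; // drop duplicates, keep the first occurrence
      seen[page] = true;
      indices.push(page - 1);
    }
  }
  return indices;
}
```

For a 10-page document, `parsePageRanges("1-3,5,7-9", 10)` returns `[0, 1, 2, 4, 6, 7, 8]`, and for a 6-page document `parsePageRanges("4-99", 6)` is clamped to `[3, 4, 5]`.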

Are bookmarks, form fields, or signatures preserved when splitting or merging?

Split and Merge use pdf-lib, which copies the visible page content (text, images, vectors, annotations attached to pages). The document-level outline (bookmarks), interactive forms, and digital signatures are usually rebuilt or dropped — pdf-lib does its best to keep page-level annotations but is not a full PDF authoring suite. For signed legal documents you should verify the output before relying on it.

Can I split or merge password-protected PDFs?

If the PDF has a user password (one required to open the file), this tool cannot decrypt it and will show an error. Some PDFs have only an owner password that restricts certain operations; for those, pdf-lib's `ignoreEncryption` option lets us read and re-emit the pages. For PDFs that require a password to open, decrypt them in Acrobat or another reader first, then bring the unprotected file back here.

Is there a file-size or page-count limit?

There is no hard limit. The practical ceiling is your browser's available memory: a 200-page PDF with embedded images may need a couple of hundred megabytes of RAM to render. Most modern laptops handle hundred-page documents comfortably. If processing feels sluggish, work on smaller batches.

Can it OCR scanned PDFs?

No — this tool extracts text that is already encoded in the PDF; it does not run optical character recognition on raster pages. For scans, look for a dedicated OCR tool (Tesseract-based ones run in the browser too).