PDF Toolkit — Extract Text · Split · Merge

Pull plain text out of a PDF, split a document into pages or a custom range, or merge multiple PDFs into a single file — all in your browser, no upload.

PDF Toolkit — Extract Text, Split Pages, Merge PDFs Online (Free)

Free, privacy-friendly PDF toolkit. Extract text from any PDF page-by-page, split a PDF into specific pages or one-per-page, and merge multiple PDFs into one file. Everything happens in your browser — your documents are never uploaded.

Features

  • Extract plain text from any text-based PDF, page by page, with word and character counts per page
  • Split PDFs by page range (1-3,5,7-9) into a single new PDF, or burst every page into its own file packaged as a ZIP
  • Merge two or more PDFs into a single file with drag-to-reorder controls before the merge runs
  • Copy all extracted text to the clipboard or download as a .txt file ready for Word, Notion, or any text editor
  • 100% client-side — PDF content is decoded in your browser; only the PDF.js worker script is fetched from a CDN, never your document
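
The per-page word and character counts can be sketched as a small function over text that has already been extracted. The names `PageStats` and `countPageStats` are hypothetical and are not taken from the tool's source. The rule that a word is any run of non-whitespace characters is also an assumption.

```typescript
// Per-page statistics for text that has already been extracted.
// `PageStats` and `countPageStats` are hypothetical names, not the tool's API.
interface PageStats {
  page: number;       // 1-indexed page number
  characters: number; // length of the page text in UTF-16 code units, whitespace included
  words: number;      // runs of non-whitespace characters (an assumed definition)
}

function countPageStats(pageTexts: string[]): PageStats[] {
  return pageTexts.map((text, index) => {
    const trimmed = text.trim();
    return {
      page: index + 1,
      characters: text.length,
      // Splitting an empty string would yield [""], so handle it explicitly.
      words: trimmed === "" ? 0 : trimmed.split(/\s+/).length,
    };
  });
}
```

Counting `text.length` is cheap but measures UTF-16 code units, so emoji and some rare scripts count as two characters each.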

How to use

  1. Pick a mode: Extract text pulls plain text out of a text-based PDF, Split breaks a PDF into one or more new PDFs by page range, and Merge combines multiple PDFs into a single document.
  2. Drag your PDF (or PDFs, for merge) into the drop zone — or click to pick from disk. The tool reads each file locally; nothing is uploaded.
  3. For Extract, browse the per-page text and copy or download a .txt. For Split, type a page range like 1-3,5,7-9 or choose to burst into a ZIP. For Merge, reorder the queue with the arrow buttons, then click Merge.
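
The arrow-button reordering in step 3 can be pictured as moving one queue entry by a single position. This is a minimal sketch: `moveInQueue` is a hypothetical helper, and the choice to ignore moves past either end of the list is an assumption about the behavior, not something the tool documents.

```typescript
// Move one entry of the merge queue up or down by a single position.
// Returns a new array, so the original queue is left untouched.
// `moveInQueue` is a hypothetical helper, not the tool's actual code.
function moveInQueue<T>(
  queue: readonly T[],
  index: number,
  direction: "up" | "down"
): T[] {
  const target = direction === "up" ? index - 1 : index + 1;
  const copy = queue.slice();
  // Assumed behavior: ignore moves that would fall off either end of the list.
  if (index < 0 || index >= copy.length || target < 0 || target >= copy.length) {
    return copy;
  }
  // Take the entry out, then put it back at its new position.
  const removed = copy.splice(index, 1);
  copy.splice(target, 0, ...removed);
  return copy;
}
```

Returning a fresh array rather than mutating in place keeps the function easy to use with UI frameworks that compare state by reference.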

Tips and best practices

  • If the extracted text comes out as one giant paragraph, that is because a PDF positions text on the page rather than storing paragraph structure, so paragraph breaks often cannot be recovered. You may need to re-break the text manually after copying.
  • When splitting, ranges are inclusive and 1-indexed: page 1 is the first page, not page 0.
  • Merging keeps page sizes as they are — letter and A4 pages can coexist in the same merged file.
  • Reorder merge files with the up/down arrows before running the merge; once the merge has run, changing the order means starting over.
  • For very large PDFs, splitting into batches before extracting text often runs faster than trying to read the whole document at once.
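
For the batching tip, the page ranges to type into Split can be generated rather than worked out by hand. The sketch below assumes a fixed batch size and produces inclusive, 1-indexed range strings. `batchRanges` is a hypothetical helper, not part of the tool.

```typescript
// Build inclusive, 1-indexed range strings such as "1-50" that cover a
// document in fixed-size batches. `batchRanges` is a hypothetical helper.
function batchRanges(totalPages: number, batchSize: number): string[] {
  const valid =
    Number.isInteger(totalPages) &&
    Number.isInteger(batchSize) &&
    totalPages >= 1 &&
    batchSize >= 1;
  if (!valid) {
    return [];
  }
  const ranges: string[] = [];
  for (let start = 1; start <= totalPages; start += batchSize) {
    const end = Math.min(start + batchSize - 1, totalPages);
    // A batch that holds one page is written as a single page number.
    ranges.push(start === end ? String(start) : `${start}-${end}`);
  }
  return ranges;
}
```

For a 120-page document and a batch size of 50, this yields `1-50`, `51-100`, and `101-120`, each of which can be used as a Split range.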

Questions and answers

Is my PDF uploaded to a server?

No. The PDF bytes stay in your browser. The only network request made by this tool is fetching the PDF.js worker script from a CDN — that's a public JavaScript file (not your document) that runs the parser in a Web Worker. Your PDF content itself is never sent anywhere.

Why does the extracted text look wrong or have missing letters?

PDFs come in two flavors: text-based (the text is encoded as characters) and image-based (a scanned image with no machine-readable text). PDF.js can extract characters from the first kind but cannot read characters out of an image — for those you need OCR. If your PDF was made by scanning paper, run it through an OCR tool first. Also note: some PDFs encode glyphs with custom font mappings that produce garbled output even when the visual rendering is correct.
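
One rough way to spot image-based pages is to check how much text extraction returned for each page. The sketch below is a heuristic under stated assumptions, not how the tool decides anything: `likelyScannedPages` is a hypothetical name and the 10-character threshold is arbitrary. Genuinely blank pages are flagged too, and garbled font mappings are not detected at all.

```typescript
// Flag pages whose extracted text is empty or nearly empty, which often
// (but not always) means the page is a scanned image that needs OCR.
// `likelyScannedPages` and the default threshold are illustrative assumptions.
function likelyScannedPages(pageTexts: string[], minChars: number = 10): number[] {
  const flagged: number[] = [];
  pageTexts.forEach((text, index) => {
    // Count only non-whitespace characters so stray spaces do not mask a scan.
    const visible = text.replace(/\s+/g, "").length;
    if (visible < minChars) {
      flagged.push(index + 1); // report 1-indexed page numbers
    }
  });
  return flagged;
}
```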

What page range syntax does Split accept?

A comma-separated list of single pages and inclusive ranges, e.g. `1-3,5,7-9`. Whitespace is ignored. Out-of-range or duplicate pages are silently clamped or dropped. Use the second mode (one page per file, packaged as a ZIP) if you want every page as its own PDF without typing a range.
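
A parser for this syntax might look like the sketch below. It is not the tool's actual code: `parsePageRange` is a hypothetical name, and the specific choices are assumptions (ranges are clamped to the document, pages entirely out of range are dropped, duplicates keep their first occurrence, and malformed or reversed parts are skipped).

```typescript
// Parse a range string such as "1-3,5,7-9" into 1-indexed page numbers.
// `parsePageRange` is a hypothetical helper; the clamping and dropping rules
// below are assumptions about one reasonable behavior.
function parsePageRange(input: string, pageCount: number): number[] {
  const pages: number[] = [];
  const cleaned = input.replace(/\s+/g, ""); // whitespace is ignored
  for (const part of cleaned.split(",")) {
    const match = /^(\d+)(?:-(\d+))?$/.exec(part);
    if (match === null) {
      continue; // skip empty or malformed parts
    }
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    // Clamp the range to [1, pageCount]; a reversed range like "7-5" yields nothing.
    const first = Math.max(start, 1);
    const last = Math.min(end, pageCount);
    for (let page = first; page <= last; page++) {
      if (pages.indexOf(page) === -1) {
        pages.push(page); // keep the first occurrence, drop duplicates
      }
    }
  }
  return pages;
}
```

The result keeps the order in which pages were typed. The output is still 1-indexed, so a library that expects 0-based page indices needs each value reduced by one.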

Are bookmarks, form fields, or signatures preserved when splitting or merging?

Split and Merge use pdf-lib, which copies the visible page content (text, images, vector graphics) along with annotations attached to the pages. Document-level features such as the outline (bookmarks), interactive forms, and digital signatures are usually not carried over intact: pdf-lib does its best to keep page-level annotations but is not a full PDF authoring suite. A digital signature also stops validating once the file it covers has been rewritten. For signed legal documents you should verify the output before relying on it.

Can I split or merge password-protected PDFs?

If the PDF requires a password to open (a user password), this tool cannot decrypt the file and will show an error. Some PDFs carry only an owner password, which restricts certain operations without blocking opening; for those, pdf-lib's `ignoreEncryption` option lets the tool read and re-emit the pages. For PDFs that need a password to open, remove the protection in Acrobat or another PDF application first, then bring the unprotected file back here.

Is there a file-size or page-count limit?

There is no hard limit. The practical ceiling is your browser's available memory: a 200-page PDF with embedded images may need a couple of hundred megabytes of RAM to render. Most modern laptops handle hundred-page documents comfortably. If processing feels sluggish, work on smaller batches.

Can it OCR scanned PDFs?

No — this tool extracts text that is already encoded in the PDF; it does not run optical character recognition on raster pages. For scans, look for a dedicated OCR tool (Tesseract-based ones run in the browser too).