PDF import filter • KDE Community Forums


PDF import filter


racitup Registered Member Posts 3 Karma 0	PDF import filter Wed Sep 29, 2010 11:26 am Hi all, I'm trying to convert a text-based PDF (one with embedded text) to an editable form. I've read around on the internet that KWord used to be able to do this, but from what I've read and seen (I just installed KWord 2.1.2 on Ubuntu 10.04), this functionality was dropped in KOffice 2. Can someone confirm this? Does anyone know if this function is being brought back or worked on? KWord appears to be the only program that could do this, so removing support seems like a crazy idea from a USP point of view! Thanks in advance!
cyrille Moderator Posts 110 Karma 1	Re: PDF import filter Wed Sep 29, 2010 11:35 am Yes, it was removed from KWord. But there is now a PDF importer for Karbon. Cyrille Berger, Krita developer and Calligra release coordinator
google01103 Manager Posts 6668 Karma 25	Re: PDF import filter Wed Sep 29, 2010 1:43 pm There's also an extension for OpenOffice: http://extensions.services.openoffice.o ... /pdfimport Of course, it may or may not work well. OpenSuse Leap 42.1 x64, Plasma 5.x
racitup Registered Member Posts 3 Karma 0	Re: PDF import filter Wed Sep 29, 2010 2:06 pm cyrille wrote: Yes it was removed from kword. But now, there is a PDF importer for karbon. Wow, thanks for the fast response! Unfortunately the new importer isn't much use to me. I have just tried it and tried exporting to every supported file format, but they are all drawing formats. The most promising is the OpenDocument Drawing format, which is essentially XML. But each character (yes, character!) is a separate drawing object. So I would need to write a script to group adjacent characters into paragraphs, and then convert the object type to a document text box or something similar. Does anyone know a good way of doing this?
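The grouping script described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Glyph` record, the tolerance values, and the idea that each character object has already been read out of the drawing file with a position and width are all hypothetical, and y is assumed to increase down the page. It does not parse the OpenDocument Drawing XML itself.

```python
from typing import List, NamedTuple


class Glyph(NamedTuple):
    """Hypothetical record for one per-character drawing object."""
    x: float      # left edge of the character
    y: float      # baseline position (assumed to grow down the page)
    width: float  # horizontal extent of the character
    char: str


def group_into_lines(glyphs: List[Glyph], y_tol: float = 2.0,
                     gap_factor: float = 0.3) -> List[str]:
    """Merge single-character objects into lines of text.

    Characters whose y positions differ by at most y_tol are treated as
    one line; a horizontal gap wider than gap_factor times the previous
    character's width is treated as a word space.
    """
    rows = []  # list of (reference_y, [glyphs on that line])
    for g in sorted(glyphs, key=lambda g: (g.y, g.x)):
        for ref_y, members in rows:
            if abs(ref_y - g.y) <= y_tol:
                members.append(g)
                break
        else:
            rows.append((g.y, [g]))

    lines = []
    for _, members in sorted(rows, key=lambda row: row[0]):
        members.sort(key=lambda g: g.x)
        text = members[0].char
        for prev, cur in zip(members, members[1:]):
            gap = cur.x - (prev.x + prev.width)
            if gap > gap_factor * prev.width:
                text += " "
            text += cur.char
        lines.append(text)
    return lines
```

Joining the resulting lines into paragraphs would need a further heuristic (for example, a larger vertical gap between lines), which is left out here.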
racitup Registered Member Posts 3 Karma 0	Re: PDF import filter Wed Sep 29, 2010 2:56 pm google01103 wrote: there's also an extension for OpenOffice http://extensions.services.openoffice.o ... /pdfimport of course it may/may not work well. Wow again, thanks for the further info! FYI, I just tried the OO import extension and it worked much better than the Karbon one, in that it produced single-line sentences instead of individual characters. The only promising option OO gives is to save as XHTML. This actually writes a text-based HTML file which is editable! The XHTML renders great in IE but not in Firefox. I suspect I can just copy and paste into a document to edit. This is all a bit of a pain though. Can't someone just write an XSLT script to convert odg to odt? Or even better, just add a converter to OO? Thanks all!
google01103 Manager Posts 6668 Karma 25	Re: PDF import filter Wed Sep 29, 2010 4:04 pm There are other possibilities out there: http://en.wikipedia.org/wiki/Pdftotext http://pdfedit.petricek.net/en/index.html and some are web-based (some require an email address): http://www.convertpdftoword.net/ (I tried this one, worked for me) http://www.freepdfconvert.com/ http://www.zamzar.com/ OpenSuse Leap 42.1 x64, Plasma 5.x
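For the pdftotext route, a minimal sketch of calling the command-line tool from a script might look like this. It assumes the `pdftotext` binary (shipped with Xpdf and Poppler) is installed and on the PATH; the helper names `pdftotext_command` and `pdf_to_text` and the file names are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def pdftotext_command(pdf_path: str, txt_path: str,
                      keep_layout: bool = True) -> List[str]:
    """Build the argument list for a pdftotext run (hypothetical helper)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout
        cmd.append("-layout")
    # -enc selects the output text encoding
    cmd += ["-enc", "UTF-8", str(pdf_path), str(txt_path)]
    return cmd


def pdf_to_text(pdf_path: str, txt_path: str,
                keep_layout: bool = True) -> str:
    """Run pdftotext and return the extracted text.

    Raises CalledProcessError if the tool exits with an error and
    FileNotFoundError if pdftotext is not installed.
    """
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout),
                   check=True)
    return Path(txt_path).read_text(encoding="utf-8")
```

As with Okular's text export, this only helps when the PDF actually contains embedded text rather than scanned images.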
panda84 Moderator Posts 376 Karma 1	Re: PDF import filter Wed Sep 29, 2010 4:23 pm Okular can also export PDF to text (File → Export As...). Use the "Accept this answer" button to mark a thread as solved! Blog - LUG - KDE - Work
john_hudson Registered Member Posts 549 Karma 2	Re: PDF import filter Sun Oct 03, 2010 8:10 pm Assuming the text is embedded in the PDF as text, Okular works fine for extracting it. You can also extract images by selecting them and saving them. The only problem you may encounter is that high-quality text may contain ligatures. As they are legitimate Unicode characters, they will be saved with the file, but they cause problems with spell-checkers and with some programs which don't recognise ligatures as valid characters. You can extract text which is embedded in a graphic by running it through OCR software. John Hudson, proud to be a member of KDE forums since 2008-Oct.
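One possible way to deal with the ligature problem after extraction is to expand the ligature characters back into plain letters. The sketch below assumes the ligatures in question are the Latin ones in the Unicode Alphabetic Presentation Forms block (U+FB00 to U+FB06, such as "ﬁ" and "ﬂ"); the function name `expand_ligatures` is made up for illustration.

```python
import unicodedata

# Latin ligatures ff, fi, fl, ffi, ffl, long-s-t and st (U+FB00..U+FB06)
LATIN_LIGATURES = range(0xFB00, 0xFB07)


def expand_ligatures(text: str) -> str:
    """Replace Latin ligature characters with their component letters.

    Only characters in LATIN_LIGATURES are decomposed, so accented
    letters and other characters are left exactly as they were.
    """
    return "".join(
        unicodedata.normalize("NFKD", ch) if ord(ch) in LATIN_LIGATURES else ch
        for ch in text
    )
```

Applying compatibility normalization to the whole text would also work, but it changes many other characters (superscripts, full-width forms and so on), which is why this sketch restricts it to the ligature range.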
