1

So, I have a program that is a kind of text editor. I need its output format to be PDF, but I also need to be able to edit that PDF again later. Since my program's output is never very complicated, and since my program is the one that creates the PDF, I could read directly from the created PDF, but I thought it would be easier to attach another file to the PDF that is easier to read.

However, I don't want the user to see that a file is attached to the PDF.

I once read somewhere that you can trick PDF readers by changing /EmbeddedFiles to /Embeddedfiles. That way they will not detect that files are attached to the PDF they are processing.

The question is: how can I read the PDF in order to make that change, and then revert it again before editing?

I don't think PDF libraries would help me here, since I'm trying to "corrupt" the PDF. I guess I should parse it as some kind of string and then look for the substring I want to change. But I'm too unfamiliar with the PDF format to know whether it's really that simple or whether there is a specific way to do that...

9
  • :( too localized.... can you show us some code too? Commented Sep 6, 2012 at 11:16
  • I haven't written anything... I'm just asking how to read a PDF as a string that can be edited, saved as a PDF file, and be editable again. But since I'm messing with the core structure of the PDF, I don't expect PDF libraries to support that. If it's any help, the file should always be attached somewhere near the top of the document, and it would be the same for every document created by my program... Commented Sep 6, 2012 at 11:20
  • Just being curious: Why do you want to hide the attachment if it's so essential for the functionality you are trying to offer? Commented Sep 6, 2012 at 11:41
  • 1
    OK, you are not interested in editing the contents of the PDF. But you edit the file itself, and embed it as an attached file? I'm confused. Why are you doing this in such a complicated way? The question is very unclear, my friend. Could you describe what exactly you edit in this PDF file and how other users access it? Commented Sep 6, 2012 at 12:17
  • 1
    I attach a file to the PDF using a PDF library. Then I hide it using some Java method that edits the whole file as a byte stream, or however it should be done. And then I end up with an output file that the user can open with any PDF reader. Commented Sep 6, 2012 at 12:22

2 Answers

2

PDF isn't a format meant for editing, and tacking on an attachment (hidden or not, and I'm not even sure the hiding will work) is kind of iffy. Assuming your trick works:

  • Is this a valid PDF? You may want to trick readers, but you'd be creating invalid PDFs, which worries me more than the method you're trying to use.

  • What if a PDF reader updates its functionality to support invalid syntax? That would mean all of a sudden your file is visible, defeating your intentions.

The best way would be:

Let the user create their document. Store the text in a program folder. Create a PDF. When editing, just load the text document (or whatever) based on the PDF's title. Once again, PDF is not an editing format.

Or use Jonathan's solution, which avoids storing the text locally.

Either way, corrupting a PDF file is not desirable.


3 Comments

I have just found another way. I can open the PDF in a text editor and add % to the beginning of every line that is related to the attachment, commenting it out. That would probably be a better method than the one in the question... However, I still don't know how to do it with Java. Do I just open the PDF like any other text file, or something else?
@Karlovsky120 "I can open the PDF in a text editor and add % to the beginning of every line". No, you cannot do that. PDF is a binary format; if you do something like that to a PDF file it will become corrupt and unreadable for sure.
EXACTLY. So how DO I open it in Java in order for it not to become corrupt? P.S. Okay, not a text editor, I edited it using Eclipse...
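
On the narrow question in the last comment (how to open the file in Java without corrupting it), the usual approach is to treat the PDF as raw bytes rather than text, and to replace a byte sequence only with another of exactly the same length, so that byte offsets recorded elsewhere in the file do not shift. Below is a minimal sketch using only the standard library. `PdfNamePatcher` and its methods are hypothetical names made up for illustration, not part of any PDF library. It assumes the name appears uncompressed in the file, and it does not verify the claim from the question that readers ignore `/Embeddedfiles`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of any PDF library.
class PdfNamePatcher {

    // Returns a copy of data in which every occurrence of 'from' is overwritten by 'to'.
    // Both patterns must have the same length so that no byte in the file changes position.
    static byte[] replaceSameLength(byte[] data, byte[] from, byte[] to) {
        if (from.length == 0 || from.length != to.length) {
            throw new IllegalArgumentException("patterns must be non-empty and equal in length");
        }
        byte[] out = data.clone();
        for (int i = 0; i + from.length <= out.length; i++) {
            if (matchesAt(out, i, from)) {
                System.arraycopy(to, 0, out, i, to.length);
                i += from.length - 1; // continue after the patched region
            }
        }
        return out;
    }

    // True if 'pattern' occurs in 'data' starting at 'offset'.
    private static boolean matchesAt(byte[] data, int offset, byte[] pattern) {
        for (int j = 0; j < pattern.length; j++) {
            if (data[offset + j] != pattern[j]) {
                return false;
            }
        }
        return true;
    }

    // Reads the whole file as bytes (never as a String), patches it, and writes it back.
    static void patchFile(Path pdf, String from, String to) throws IOException {
        byte[] original = Files.readAllBytes(pdf);
        byte[] patched = replaceSameLength(
                original,
                from.getBytes(StandardCharsets.US_ASCII),
                to.getBytes(StandardCharsets.US_ASCII));
        Files.write(pdf, patched);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2 || !(args[0].equals("hide") || args[0].equals("show"))) {
            System.err.println("usage: PdfNamePatcher hide|show <file.pdf>");
            return;
        }
        Path pdf = Paths.get(args[1]);
        if (args[0].equals("hide")) {
            patchFile(pdf, "/EmbeddedFiles", "/Embeddedfiles");
        } else {
            patchFile(pdf, "/Embeddedfiles", "/EmbeddedFiles");
        }
    }
}
```

Running it with `hide` before handing the file to the user and with `show` before re-opening it in the editor would perform the round trip described in the question. The replacement is blind: it will not find the name if the library wrote it inside a compressed object stream, it would also rewrite the same bytes if they happened to occur inside binary stream data, and the concerns in the answer above about producing a non-conforming PDF still apply.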
1

If you just want to create your own version of a binary format and call it PDF, then you can try adding a "custom" entry to any dictionary object of your PDF file and associating a data stream with that entry. Since the entry will be outside the PDF spec, all (well-implemented) readers should be able to ignore it.
You can probably do this with iText using PdfDictionary.put, and you could add your non-standard data to the Catalog dictionary, for example.
