
I have a Java backend to which users can upload files. I want to limit these uploads to a maximum size, check the number of uploaded bytes while the upload happens, and break the transmission as soon as the limit is reached.

Currently I am calling InputStream.available() before allocating the buffer in order to estimate the size, but that is apparently considered unreliable.

Any suggestions?

3 Comments

  • Reliable or not, available() doesn't do what you're looking for. There is an explicit warning in the Javadoc against using it as the number of bytes remaining in the stream. That's not what it's for. Your problem isn't difficult: just keep track of the accumulated read count and abort when it gets too high. Commented Jun 7, 2016 at 11:41
  • Side note: if possible, avoid such a design. One should always strive for fail-fast user interaction. Meaning: if you know that an operation will not work, fail immediately. Thus: if there is any chance that your front-end code can check file sizes, then you should fail right there, before even starting the upload process. (Of course, the ability to do so depends on your frontend, and you still need the backend check - but your users won't appreciate a backend-only solution!) Commented Jun 7, 2016 at 11:42
  • 1
    Of course .available() doesn't work in here. I'd suggest to check Multipart Request file size before you read the data with InputStream. Implementation of such lightweight validation depends on your backend implementation (netty, tomcat, etc?) Also, you could wrap InputStream by a decorator which extends InputStream and uses internal buffer like BufferedInputStream and has a counter of read bytes. Read buffer by buffer until the total byte count < limit. But again, you may spend time if the file is too long. Try to check Multipart file size before. Commented Jun 7, 2016 at 12:03
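The decorator suggested in the last comment could look roughly like this: a minimal sketch, assuming a hypothetical class name LimitedInputStream, which counts bytes as they pass through and aborts with an IOException once the limit is exceeded.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical decorator: wraps any InputStream and fails once more than
// `limit` bytes have been read through it.
class LimitedInputStream extends FilterInputStream {
    private final long limit;
    private long count;

    LimitedInputStream(InputStream in, long limit) {
        super(in);
        this.limit = limit;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0 && ++count > limit) {
            throw new IOException("Upload exceeds limit of " + limit + " bytes");
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0 && (count += n) > limit) {
            throw new IOException("Upload exceeds limit of " + limit + " bytes");
        }
        return n;
    }
}
```

Any code that copies from the wrapped stream then aborts automatically when the quota is hit, without that code needing to know about the limit.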

6 Answers

5

You can use Guava's CountingInputStream or Apache Commons IO's CountingInputStream when you want to know how many bytes have been read.

On the other hand, when you want to stop the upload immediately upon reaching some limit, just count while reading chunks of bytes and close the stream once the limit has been exceeded.
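A minimal sketch of that count-while-reading approach, without any third-party library (the class and method names here are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class UploadLimiter {
    // Copies in -> out in chunks, tallying the total. Closes the input
    // stream (aborting the transmission) as soon as maxBytes is exceeded.
    static long copyWithLimit(InputStream in, OutputStream out, long maxBytes)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) >= 0) {
            total += n;
            if (total > maxBytes) {
                in.close(); // break the transmission
                throw new IOException("Upload exceeded " + maxBytes + " bytes");
            }
            out.write(buffer, 0, n);
        }
        return total;
    }
}
```

Note that this checks the running total, not a header field, so it works even when the sender lies about (or omits) the Content-Length.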


8 Comments

The question explicitly asked for a way to get the number before reading the data. Knowing the number of bytes after you have read them is trivial and doesn't need a special 3rd-party input stream.
@Holger You can't know the number of bytes without reading them.
@Holger Maybe you've read another question. [...] check the amount of uploaded bytes while the upload happens [...]. You cannot know how many bytes will be there before reading. The most you can do is relying on header fields but this is about trusting the sender.
@EJP: unless there is some protocol telling you the number beforehand. However, saying, “there is no solution” would be an appropriate answer.
@Holger I haven't said there is no solution. There is a solution. Read until you have exceeded the quota. NB he says 'while the upload happens', not before it.
0

You don't have to 'allocat[e] the full memory before'. Just use a normally sized buffer, say 8k, perform the normal copy loop, and tally the total transferred. If it exceeds the quota, stop and destroy the output file.
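That loop could be sketched as follows; the class and method names are hypothetical, and it assumes the upload is being written to a file, which is deleted again on failure:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class QuotaCopy {
    // Streams the upload to `target` using a fixed 8k buffer; no need to
    // preallocate memory for the whole file.
    static void saveWithQuota(InputStream in, Path target, long quota)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            int n;
            while ((n = in.read(buffer)) >= 0) {
                total += n;
                if (total > quota) {
                    throw new IOException("Quota of " + quota + " bytes exceeded");
                }
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            Files.deleteIfExists(target); // destroy the partial output file
            throw e;
        }
    }
}
```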

Comments

0
```java
int count = 1;
InputStream stream;
if (stream.available() < 3) {
    count++;
}
```

Result: [0][1][2][3] 1 1 1 1

Comments

-1

If you're using a servlet and a multipart request you can do this:

```java
public void doPost(final HttpServletRequest request, final HttpServletResponse response)
        throws ServletException, IOException {
    String contentLength = request.getHeader("Content-Length");
    if (contentLength != null && maxRequestSize > 0
            && Long.parseLong(contentLength) > maxRequestSize) {
        throw new MyFileUploadException("Multipart request is larger than allowed size");
    }
}
```

(Note the use of Long.parseLong: a Content-Length can legitimately exceed Integer.MAX_VALUE. This check also relies on the client sending an honest header, so it complements rather than replaces counting during the read.)

Comments

-1

My solution looks like this:

```java
public static final byte[] readBytes(InputStream in, int maxBytes) throws IOException {
    byte[] result = new byte[maxBytes];
    int bytesRead = in.read(result);
    if (bytesRead > maxBytes) {
        throw new IOException("Reached max bytes (" + maxBytes + ")");
    }
    if (bytesRead < 0) {
        result = new byte[0];
    } else {
        byte[] tmp = new byte[bytesRead];
        System.arraycopy(result, 0, tmp, 0, bytesRead);
        result = tmp;
    }
    return result;
}
```

EDIT: New variant

```java
public static final byte[] readBytes(InputStream in, int bufferSize, int maxBytes)
        throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[bufferSize];
    int bytesRead;
    while ((bytesRead = in.read(buffer)) >= 0) {
        out.write(buffer, 0, bytesRead);
        if (maxBytes > 0 && out.size() > maxBytes) {
            String message = "Reached max bytes (" + maxBytes + ")";
            log.trace(message);
            throw new IOException(message);
        }
    }
    return out.toByteArray();
}
```

2 Comments

It is impossible to read more than maxBytes when your buffer’s size is exactly maxBytes. Further, read does not necessarily read all bytes, so you should be prepared for the case that you have to call read more than once.
And you don't need a buffer the size of the largest possible input. And you can't assume that all the data will arrive in a single read. It won't.
-2

All implementations of read return the number of bytes read, so you can initialize a counter and increment it with each read to see how many bytes you've read so far. The method available() tells you how many bytes are currently available in the buffer; it has no relation to the total size of the file. It could still be useful for optimizing your reading, though: each time, you can request exactly the chunk that is readily available and avoid blocking. Also, in your case you can predict, before reading, whether the byte count after the upcoming read would exceed your limit, and thus cancel before you even read the next chunk.

3 Comments

You will always read the chunk that is readily available, and you will only block once to do so. Adding available() into the process is just a complete waste of time.
I disagree; I once solved a performance problem by using available(). My reader was faster than the writer, and in a lot of attempts available() was returning 0. So I implemented logic such that if it was indeed 0, I skipped the read attempt until data was available. Plus, you can allocate a buffer of the appropriate size to read your chunk.
You can't know how long to sleep for if you don't read while available() is zero. On average you will sleep twice as long as necessary. Blocking in read on the other hand will block for exactly the correct length of time. Allocating a new buffer per read is not necessary, and wastes both time and space.
