java How to Cache InputStream for Multiple Use?


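// Read the stream once, then hand every consumer its own ByteArrayInputStream.
byte[] bytes = getBytes(inputStream);
POIFSFileSystem fileSystem = new POIFSFileSystem(new ByteArrayInputStream(bytes));

// Copies the whole stream into memory; the caller remains responsible for closing it.
private static byte[] getBytes(InputStream is) throws IOException {
    byte[] buffer = new byte[8192];
    ByteArrayOutputStream baos = new ByteArrayOutputStream(2048);
    int n;

    // read() may return fewer bytes than requested, so loop until end of stream (-1).
    while ((n = is.read(buffer, 0, buffer.length)) != -1) {
        baos.write(buffer, 0, n);
    }

    return baos.toByteArray();
}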
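
On Java 9 or later, the copy loop can probably be replaced by InputStream.readAllBytes(). The following is only a sketch under that assumption, and it also assumes the whole stream fits in memory; the class name SlurpExample and the helpers slurp and freshStream are made up for illustration and are not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class SlurpExample {

    // Reads the source once and closes it; later consumers work on the copy.
    static byte[] slurp(InputStream source) throws IOException {
        try (InputStream in = source) {
            return in.readAllBytes(); // available since Java 9
        }
    }

    // Hypothetical helper: every consumer gets an independent stream over the cached bytes.
    static InputStream freshStream(byte[] cached) {
        return new ByteArrayInputStream(cached);
    }

    public static void main(String[] args) throws IOException {
        InputStream original =
                new ByteArrayInputStream("sample".getBytes(StandardCharsets.UTF_8));
        byte[] cached = slurp(original);

        // Each consumer may close its own stream; the cached bytes are unaffected.
        for (int i = 0; i < 2; i++) {
            try (InputStream in = freshStream(cached)) {
                System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        }
    }
}
```

Because each consumer receives its own ByteArrayInputStream, it does not matter whether a library such as POI closes the stream it was given.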


// Requires a stream that supports mark/reset (check is.markSupported() first).
// The stream is rewound after reading so that it can be consumed again.
private String convertStreamToString(InputStream is) {
    Writer w = new StringWriter();
    char[] buf = new char[1024];
    is.mark(1 << 24); // read limit of 16 MiB
    try {
        Reader r = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
        int n;
        while ((n = r.read(buf)) != -1) {
            w.write(buf, 0, n);
        }
        is.reset();
    } catch (IOException e) {
        Logger.debug(this.getClass(), "Cannot convert stream to string.", e);
    }
    return w.toString();
}

I appreciate your effort, but for the problem I submitted the details matter. The question I asked is more specific: reusing a stream that is consumed by Apache POI, which may or may not work with a String. You answered a more general question, not the specific one I posted. That's why the most specific answer won.

I just add my solution here, as this works for me. It basically is a combination of the top two answers :)

It is great that it works for you, but you shouldn't post answers to your own problems; post answers to the question asked ;)

This is my solution on how to cache an InputStream for multiple use. Isn't that the problem you submitted?



public class ReusableBufferedInputStream extends BufferedInputStream
{

    private int totalUse;
    private int used;

    public ReusableBufferedInputStream(InputStream in, int totalUse)
    {
        super(in);
        if (totalUse > 1)
        {
            super.mark(Integer.MAX_VALUE);
            this.totalUse = totalUse;
            this.used = 1;
        }
        else
        {
            this.totalUse = 1;
            this.used = 1;
        }
    }

    @Override
    public void close() throws IOException
    {
        if (used < totalUse)
        {
            super.reset();
            ++used;
        }
        else
        {
            super.close();
        }
    }
}


class ResetOnCloseInputStream extends InputStream {

    private final InputStream decorated;

    public ResetOnCloseInputStream(InputStream anInputStream) {
        if (!anInputStream.markSupported()) {
            throw new IllegalArgumentException("marking not supported");
        }

        anInputStream.mark( 1 << 24); // magic constant: BEWARE
        decorated = anInputStream;
    }

    @Override
    public void close() throws IOException {
        decorated.reset();
    }

    @Override
    public int read() throws IOException {
        return decorated.read();
    }
}
static void closeAfterInputStreamIsConsumed(InputStream is)
        throws IOException {
    int r;

    while ((r = is.read()) != -1) {
        System.out.println(r);
    }

    is.close();
    System.out.println("=========");

}

public static void main(String[] args) throws IOException {
    InputStream is = new ByteArrayInputStream("sample".getBytes());
    ResetOnCloseInputStream decoratedIs = new ResetOnCloseInputStream(is);
    closeAfterInputStreamIsConsumed(decoratedIs);
    closeAfterInputStreamIsConsumed(decoratedIs);
    closeAfterInputStreamIsConsumed(is);
}

How large a file can it handle with the magic constant in anInputStream.mark(1 << 24)?

What purpose does the mark call serve here?

Forget about it; you can make it a parameter.

You can decorate the InputStream passed to POIFSFileSystem with a version that responds to close() by calling reset().

You can read the entire file into a byte[] (slurp mode) and then pass it to a ByteArrayInputStream.
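
On the question of how much data the constant covers: 1 << 24 is 16,777,216 bytes (16 MiB), and the InputStream contract only promises that reset() works while no more than that many bytes have been read since mark(). What happens beyond the limit depends on the stream class; ByteArrayInputStream, for instance, ignores the limit entirely. The sketch below probes BufferedInputStream, with a deliberately tiny internal buffer so that the read limit decides the outcome. The class and method names are made up for illustration, and the "past the limit" result reflects OpenJDK's implementation as far as I can tell, not a documented guarantee.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class MarkLimitDemo {

    // Marks the stream, reads `toRead` bytes, then reports whether reset() still works.
    static boolean canResetAfter(int readLimit, int toRead) throws IOException {
        byte[] data = new byte[64];
        // 4-byte internal buffer, so the read limit (not the buffer size) is what matters.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 4);
        in.mark(readLimit);
        for (int i = 0; i < toRead; i++) {
            in.read();
        }
        try {
            in.reset();
            return true;
        } catch (IOException markInvalidated) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canResetAfter(4, 4));        // stays within the read limit
        System.out.println(canResetAfter(4, 16));       // reads past the read limit
        System.out.println(canResetAfter(1 << 24, 16)); // generous limit, small read
    }
}
```

Making the limit a constructor parameter, as suggested in the comment above, lets the caller size it to the largest input it expects.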



If the file is big, then you shouldn't care, since the OS will do the caching for you as best as it can.

If the file is not that big, read it into a byte[] array and give POI a ByteArrayInputStream created from that array.

If you want to do it yourself, use a File object to get the length, create the array, and then write a loop which reads bytes from the file. You must loop since read(byte[], int offset, int len) can read fewer than len bytes (and usually does).
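
A minimal sketch of that do-it-yourself loop, assuming the file fits in a single array and does not change while it is being read. The class name FileSlurper and the method readFully are illustrative names, not from the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

final class FileSlurper {

    static byte[] readFully(File file) throws IOException {
        long length = file.length();
        if (length > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single array: " + file);
        }
        byte[] data = new byte[(int) length];
        try (InputStream in = new FileInputStream(file)) {
            int offset = 0;
            // A single read() may deliver fewer bytes than requested, so keep going.
            while (offset < data.length) {
                int n = in.read(data, offset, data.length - offset);
                if (n == -1) {
                    throw new IOException("File shrank while reading: " + file);
                }
                offset += n;
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("slurp", ".bin");
        try {
            Files.write(tmp.toPath(), new byte[] {10, 20, 30});
            byte[] cached = readFully(tmp);
            // Two independent streams over the same cached bytes.
            InputStream first = new ByteArrayInputStream(cached);
            InputStream second = new ByteArrayInputStream(cached);
            System.out.println(first.read() + " " + second.read());
        } finally {
            Files.delete(tmp.toPath());
        }
    }
}
```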

[EDIT] Use Apache commons-io to read the File into a byte array in an efficient way. Do not use int read() since it reads the file byte by byte which is very slow!

The read() method returns an int; how do I split the bytes: little or big endian?

read() always returns a value from 0 to 255, or -1. Check for -1 (end of stream) first; after that you can safely cast the value to byte.
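
To make the point about the return value concrete: each call to read() yields at most one byte, so endianness does not come into it. The sketch below uses single-byte reads purely for illustration (the bulk read(byte[], int, int) remains the faster choice); the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadReturnValue {

    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int value;
        while ((value = in.read()) != -1) { // -1 signals end of stream, never data
            byte b = (byte) value;          // 0..255 narrows to -128..127
            out.write(b);
        }
        return out.toByteArray();
    }

    // Recovers the unsigned value (0..255) that read() originally returned.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) throws IOException {
        byte[] copy = drain(new ByteArrayInputStream(new byte[] {(byte) 0xFF, 0, 0x7F}));
        for (byte b : copy) {
            System.out.println(b + " -> " + toUnsigned(b));
        }
    }
}
```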



UnclosableBufferedInputStream bis = new UnclosableBufferedInputStream(inputStream);

public class UnclosableBufferedInputStream extends BufferedInputStream {

    public UnclosableBufferedInputStream(InputStream in) {
        super(in);
        super.mark(Integer.MAX_VALUE);
    }

    @Override
    public void close() throws IOException {
        super.reset();
    }
}

@androiddeveloper If you are using a library, for example, that needs an InputStream and closes it after using it.

It doesn't matter whether your InputStream supports it or not. BufferedInputStream wraps another stream, buffers the input, and supports marking on its own. The overridden close method also conveniently resets the stream whenever a consumer closes it.

Please check EDIT2 of the question: "... the InputStream i get ... doesn't support markings thus cannot reset..."

Try BufferedInputStream, which adds mark and reset functionality to another input stream, and just override its close method:

and use bis wherever inputStream was used before.
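
To illustrate the claim that BufferedInputStream supplies mark/reset even when the wrapped stream does not, here is a small sketch. The nonMarkable helper is a made-up stand-in for a stream without mark support, and readAllBytes() assumes Java 9 or later. Note that with mark(Integer.MAX_VALUE) everything read ends up held in the buffer, so memory use grows with the stream size.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MarkSupportDemo {

    // Hypothetical stand-in for a source stream that cannot mark or reset.
    static InputStream nonMarkable(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override public boolean markSupported() { return false; }
            @Override public void mark(int readLimit) { /* ignored */ }
            @Override public void reset() throws IOException {
                throw new IOException("mark/reset not supported");
            }
        };
    }

    // Reads the whole content twice through one BufferedInputStream.
    static String readTwice(InputStream source) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(source);
        buffered.mark(Integer.MAX_VALUE);
        String first = new String(buffered.readAllBytes(), StandardCharsets.UTF_8);
        buffered.reset(); // rewinds within the buffer; the source is not touched
        String second = new String(buffered.readAllBytes(), StandardCharsets.UTF_8);
        return first + "|" + second;
    }

    public static void main(String[] args) throws IOException {
        InputStream source = nonMarkable("Foobar".getBytes(StandardCharsets.UTF_8));
        System.out.println(source.markSupported());
        System.out.println(readTwice(source));
    }
}
```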



It is not very wise to presume that the input stream is resettable.

Or do you want to continue reading at the point where the first POIFSFileSystem stopped? That's not caching, and it's very difficult to do. If you can't avoid the stream getting closed, the only way I can think of would be to write a thin wrapper that counts how many bytes have been read, then open a new stream and skip that many bytes. But that could fail when POIFSFileSystem internally uses something like a BufferedInputStream.

What exactly do you mean by "cache"? Do you want the different POIFSFileSystem instances to start at the beginning of the stream? If so, there's absolutely no point caching anything in your Java code; it will be done by the OS, so just open a new stream.
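
The suggestion to simply open a new stream for each consumer could look like the following sketch. It assumes the data lives in a file that can be reopened, which is not the case for every InputStream; ReopenExample and byteSum are illustrative names, and byteSum merely stands in for whatever consumer reads and closes the stream.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReopenExample {

    // Stand-in consumer: opens its own stream, reads it to the end, and closes it.
    static long byteSum(Path file) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long sum = 0;
            int b;
            while ((b = in.read()) != -1) {
                sum += b;
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("reopen", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3});
            // Two passes, two streams; nothing is cached in Java code.
            System.out.println(byteSum(tmp));
            System.out.println(byteSum(tmp));
        } finally {
            Files.delete(tmp);
        }
    }
}
```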



  • write your own InputStream wrapper where you create a temporary file to mirror the original stream content
  • dump everything read from the original input stream into this temporary file
  • when the stream has been completely read, you will have all the data mirrored in the temporary file
  • from then on you can lose the reference to the original stream (it can be garbage collected)
  • add a new method release() which removes the temporary file and releases any open stream
  • you can even call release() from finalize to be sure the temporary file is released in case you forget to call release() (most of the time you should avoid using finalize; always call a method to release object resources). See Why would you ever implement finalize()?
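
The steps above could be sketched roughly as follows. This is a simplified variant, not the wrapper the answer describes: it copies the original stream to the temporary file eagerly in the constructor instead of mirroring bytes as they are read, and it relies on try-with-resources rather than finalize. The class name TempFileBackedSource and the method openStream are made up; release() follows the name used in the list.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TempFileBackedSource implements AutoCloseable {

    private final Path tempFile;

    // Drains and closes the original stream; its content now lives in the temp file.
    TempFileBackedSource(InputStream original) throws IOException {
        tempFile = Files.createTempFile("stream-cache", ".tmp");
        try (InputStream in = original) {
            // The temp file already exists, so REPLACE_EXISTING is required.
            Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(tempFile);
            throw e;
        }
    }

    // Every call returns a fresh stream positioned at the start of the data.
    InputStream openStream() throws IOException {
        return new BufferedInputStream(Files.newInputStream(tempFile));
    }

    // Removes the temporary file; streams opened afterwards will fail.
    void release() throws IOException {
        Files.deleteIfExists(tempFile);
    }

    @Override
    public void close() throws IOException {
        release();
    }

    public static void main(String[] args) throws IOException {
        InputStream original = new ByteArrayInputStream(new byte[] {1, 2, 3});
        try (TempFileBackedSource cache = new TempFileBackedSource(original)) {
            for (int i = 0; i < 2; i++) {
                try (InputStream in = cache.openStream()) {
                    System.out.println(in.read());
                }
            }
        }
    }
}
```

Compared with holding the data in a byte[], this keeps memory use flat for large inputs at the cost of disk I/O.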


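public static void main(String[] args) throws IOException {
    // BufferedInputStream supplies mark/reset regardless of the wrapped stream.
    BufferedInputStream inputStream =
            new BufferedInputStream(IOUtils.toInputStream("Foobar", StandardCharsets.UTF_8));
    inputStream.mark(Integer.MAX_VALUE);
    System.out.println(IOUtils.toString(inputStream, StandardCharsets.UTF_8));
    inputStream.reset(); // rewind to the mark so the content can be read again
    System.out.println(IOUtils.toString(inputStream, StandardCharsets.UTF_8));
}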


public class CachingInputStream extends BufferedInputStream {    
    public CachingInputStream(InputStream source) {
        super(new PostCloseProtection(source));
        super.mark(Integer.MAX_VALUE);
    }

    @Override
    public synchronized void close() throws IOException {
        if (!((PostCloseProtection) in).decoratedClosed) {
            in.close();
        }
        super.reset();
    }

    private static class PostCloseProtection extends InputStream {
        private volatile boolean decoratedClosed = false;
        private final InputStream source;

        public PostCloseProtection(InputStream source) {
            this.source = source;
        }

        @Override
        public int read() throws IOException {
            return decoratedClosed ? -1 : source.read();
        }

        @Override
        public int read(byte[] b) throws IOException {
            return decoratedClosed ? -1 : source.read(b);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return decoratedClosed ? -1 : source.read(b, off, len);
        }

        @Override
        public long skip(long n) throws IOException {
            return decoratedClosed ? 0 : source.skip(n);
        }

        @Override
        public int available() throws IOException {
            return decoratedClosed ? 0 : source.available();
        }

        @Override
        public void close() throws IOException {
            decoratedClosed = true;
            source.close();
        }

        @Override
        public void mark(int readLimit) {
            source.mark(readLimit);
        }

        @Override
        public void reset() throws IOException {
            source.reset();
        }

        @Override
        public boolean markSupported() {
            return source.markSupported();
        }
    }
}

One limitation though is that if the stream is closed before the whole content of the original stream has been read, then this decorator will have incomplete data, so make sure the whole stream is read before closing.

This answer iterates on previous ones (1, 2) based on BufferedInputStream. The main changes are that it allows unlimited reuse and that it takes care of closing the original source input stream to free up system resources. Your OS limits those, and you don't want the program to run out of file handles (that's also why you should always 'consume' responses, e.g. with Apache's EntityUtils.consumeQuietly()). EDIT: Updated the code to handle greedy consumers that use read(buffer, offset, length); in that case BufferedInputStream may try hard to read from the source again, and this code protects against that.

To reuse it, just close it first if it hasn't been closed already.
