Java: Check line for unprintable characters while reading text file?


Open the file with a FileInputStream, then use an InputStreamReader with the UTF-8 Charset to read characters from the stream, and use a BufferedReader to read lines, e.g. via BufferedReader#readLine, which will give you a string. Once you have the string, you can check for characters that aren't what you consider to be printable.

Or, for one less step, open the file with a FileReader and use a BufferedReader to read lines (note that FileReader's classic constructors decode with the default encoding rather than one you specify).

E.g. (without error checking), using try-with-resources (which is available in any vaguely modern Java version):

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

String line;
try (
    InputStream fis = new FileInputStream("the_file_name");
    InputStreamReader isr = new InputStreamReader(fis, Charset.forName("UTF-8"));
    BufferedReader br = new BufferedReader(isr);
) {
    while ((line = br.readLine()) != null) {
        // Deal with the line
    }
}

@abhisheknaik96: Thank you for your edit, but only the isr bit was correct; the () are supposed to be (), not {}, and the last semicolon isn't required (but it's allowed, so I've left it, which is more in keeping with the lines above it).
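To tie the reading and checking steps together, here is a minimal sketch. The class and method names are hypothetical, and "unprintable" is assumed here to mean an ISO control character other than tab; your own definition may well differ. It takes an InputStream so the same code works for a FileInputStream or for in-memory test data.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ControlCharScan {

    // Hypothetical helper: returns the 1-based numbers of the lines that
    // contain at least one ISO control character other than tab.
    static List<Integer> linesWithControlChars(InputStream in) throws IOException {
        List<Integer> hits = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            int lineNo = 0;
            while ((line = br.readLine()) != null) {
                lineNo++;
                for (int i = 0; i < line.length(); i++) {
                    char c = line.charAt(i);
                    // Assumption: tab counts as printable, other control chars do not
                    if (c != '\t' && Character.isISOControl(c)) {
                        hits.add(lineNo);
                        break; // one hit per line is enough
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a file; line 2 contains a BEL character (U+0007)
        byte[] sample = "plain\nbell\u0007here\ntab\tok\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(linesWithControlChars(new ByteArrayInputStream(sample))); // prints [2]
    }
}
```

For a real file you would pass new FileInputStream("the_file_name") instead of the ByteArrayInputStream.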



// Note: FileReader(File) decodes using the JVM's default charset
try (BufferedReader br = new BufferedReader(new FileReader(new File("test.txt")))) {
    String line;
    // readLine() returns null once there are no more lines
    while ((line = br.readLine()) != null) {
        // process each line until the end of the file
    }
}

Nope, delete this: you are using the default encoding and entering a world of pain.
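To illustrate what that pain looks like, here is a small sketch (hypothetical class and method names) that decodes the same UTF-8 bytes with two different charsets. ISO-8859-1 stands in for a default encoding that does not match the file. As far as I know, Java 11 added FileReader constructors that accept a Charset, and Java 18 made UTF-8 the default, but on older runtimes the default depends on the platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    // Hypothetical helper: decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "café" as it would be stored in a UTF-8 file (é takes two bytes)
        byte[] fileBytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(fileBytes, StandardCharsets.UTF_8);
        String wrong = decode(fileBytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("caf\u00e9"));       // true
        System.out.println(wrong.equals("caf\u00c3\u00a9")); // true: é became two wrong characters
        System.out.println(right.length() + " vs " + wrong.length()); // 4 vs 5
    }
}
```

Passing the charset explicitly, as in the InputStreamReader approach, removes the dependency on whatever the default happens to be.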



Just found out that with Java NIO (java.nio.file.*) you can easily write:

List<String> lines = Files.readAllLines(Paths.get("/tmp/test.csv"), Charset.forName("UTF-8"));
for (String line : lines) {
    System.out.println(line);
}

This avoids dealing with BufferedReader and FileInputStream yourself.
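Note that Files.readAllLines loads the whole file into memory. For large files, Files.lines from the same package reads lazily. A minimal sketch under the same assumptions as the snippet above (UTF-8 input); the class and method names are hypothetical, and "unprintable" is again assumed to mean an ISO control character other than tab:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class LazyLineScan {

    // Hypothetical helper: counts lines containing a control character other
    // than tab, streaming the file instead of loading it all at once.
    static long countLinesWithControlChars(Path file) throws IOException {
        // The stream holds an open file handle, so close it with try-with-resources
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines
                .filter(l -> l.chars().anyMatch(c -> c != '\t' && Character.isISOControl(c)))
                .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary file used only for this demonstration
        Path tmp = Files.createTempFile("scan", ".txt");
        try {
            Files.write(tmp, "ok\nbad\u0001\nfine\tline\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countLinesWithControlChars(tmp)); // prints 1
        } finally {
            Files.delete(tmp);
        }
    }
}
```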



This, however, counts whitespace and tab characters as non-printing characters, even though they affect where the words are placed on the page.
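One way to keep whitespace and tabs on the printable side is to classify characters by Unicode general category rather than by a blanket "non-printing" test. The sketch below is only one possible definition; the class name, method names, and the choice of categories are assumptions, not a standard.

```java
public class PrintableCheck {

    // Hypothetical definition: tab is allowed explicitly (it is a control
    // character by category); spaces fall under SPACE_SEPARATOR and pass.
    static boolean isPrintable(int codePoint) {
        if (codePoint == '\t') {
            return true;
        }
        switch (Character.getType(codePoint)) {
            case Character.CONTROL:      // e.g. NUL, BEL, ESC
            case Character.FORMAT:       // invisible formatting characters
            case Character.SURROGATE:    // unpaired surrogate halves
            case Character.PRIVATE_USE:
            case Character.UNASSIGNED:
                return false;
            default:
                return true;
        }
    }

    // Returns the index of the first unprintable character, or -1 if none.
    static int firstUnprintable(String line) {
        for (int i = 0; i < line.length(); ) {
            int cp = line.codePointAt(i);
            if (!isPrintable(cp)) {
                return i;
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUnprintable("a\tb c"));     // prints -1
        System.out.println(firstUnprintable("ab\u0000c"));  // prints 2
    }
}
```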



While it's not hard to do this manually using BufferedReader and InputStreamReader, I'd use Guava (here Files is com.google.common.io.Files):

List<String> lines = Files.readLines(file, Charsets.UTF_8);

You can then do whatever you like with those lines.

EDIT: Note that this will read the whole file into memory in one go. In most cases that's actually fine, and it's certainly simpler than reading it line by line, processing each line as you read it. If it's an enormous file, you may need to do it that way as per T.J. Crowder's answer.

Guava also provides a method with a callback: Files.readLines(File file, Charset charset, LineProcessor<T> callback)

If the purpose is to process the file line by line, using BufferedReader is just as simple. It is also overkill to add another library dependency just for line reading when the core Java library already supports it.
