20.3 Character Streams: Readers and Writers
A character encoding is a scheme for representing characters. Java programs represent values of the char type internally as 16-bit code units in the UTF-16 encoding of Unicode, but the host platform might use another character encoding to represent and store characters externally. For example, the ASCII (American Standard Code for Information Interchange) character encoding is widely used to represent characters on many platforms. However, ASCII defines only 128 characters, which form a small subset of the Unicode standard.
The abstract classes Reader and Writer are the roots of the inheritance hierarchies for streams that read and write Unicode characters using a specific character encoding (Figure 20.3). A reader is an input character stream that implements the Readable interface and reads a sequence of Unicode characters, and a writer is an output character stream that implements the Appendable interface and writes a sequence of Unicode characters. Readers and writers use a character encoding (usually called a charset) to convert between external bytes and internal Unicode characters. The same character encoding that was used to write the characters must be used to read them back; otherwise, the characters can be decoded incorrectly. The java.nio.charset.Charset class represents charsets, and its API documentation provides more details. Two of its static methods are shown below.
Figure 20.3 Selected Character Streams in the java.io Package
static Charset forName(String charsetName)
Returns a charset object for the named charset. Selected common charset names are “UTF-8”, “UTF-16”, “US-ASCII”, and “ISO-8859-1”.
static Charset defaultCharset()
Returns the default charset of this Java virtual machine.
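The following minimal sketch illustrates why the charset used for reading must match the one used for writing. It is not taken from the chapter's examples: the class name CharsetDemo and the method roundTrip() are hypothetical, and the sketch encodes and decodes with the String class rather than with readers and writers, only to keep it short.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
  // Hypothetical helper: encodes the text to bytes with one charset
  // and decodes those bytes with another (possibly different) charset.
  static String roundTrip(String text, Charset writeCharset, Charset readCharset) {
    byte[] bytes = text.getBytes(writeCharset);
    return new String(bytes, readCharset);
  }

  public static void main(String[] args) {
    Charset utf8 = Charset.forName("UTF-8");
    Charset latin1 = Charset.forName("ISO-8859-1");
    System.out.println("Default charset: " + Charset.defaultCharset());

    String text = "bl\u00E5b\u00E6r";   // Contains two non-ASCII characters.
    // Same charset for writing and reading: the text is preserved.
    System.out.println(text.equals(roundTrip(text, utf8, utf8)));
    // Mismatched charsets: each 2-byte UTF-8 sequence is decoded as two characters.
    System.out.println(text.equals(roundTrip(text, utf8, latin1)));
  }
}
```

With matching charsets, the decoded string equals the original. With the mismatch, the six characters are decoded as eight, because the two non-ASCII characters each occupy two bytes in UTF-8.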
Table 20.4 and Table 20.5 give an overview of some selected character streams found in the java.io package.
Table 20.4 Selected Readers
Reader | Description |
BufferedReader | A buffered reader is a high-level input stream that buffers the characters read from an underlying reader. The underlying reader must be specified, and an optional buffer size can be given. |
InputStreamReader | Characters are read from a byte input stream which must be specified. The default character encoding is used if no character encoding is explicitly specified in the constructor. This class provides the bridge from byte streams to character streams. |
FileReader | Characters are read from a file, using the default character encoding, unless an encoding is explicitly specified in the constructor. The file can be specified by a String file name. It automatically creates a FileInputStream that is associated with the file. |
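As a rough illustration of the bridge from byte streams to character streams, the sketch below wraps a byte input stream in an InputStreamReader with an explicitly specified charset. The class name ByteToCharBridge and the method decode() are hypothetical, and a ByteArrayInputStream stands in for a file or network stream so that the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteToCharBridge {
  // Hypothetical helper: decodes all bytes from the input stream into a string,
  // using the specified charset.
  static String decode(InputStream in, Charset charset) {
    StringBuilder result = new StringBuilder();
    try (Reader reader = new InputStreamReader(in, charset)) {
      int c;
      while ((c = reader.read()) != -1) {   // -1 signals the end of the stream.
        result.append((char) c);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);    // Rethrown unchecked to keep the sketch short.
    }
    return result.toString();
  }

  public static void main(String[] args) {
    byte[] utf8Bytes = "caf\u00E9".getBytes(StandardCharsets.UTF_8);
    System.out.println(decode(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8));
  }
}
```

Specifying the charset in the constructor avoids depending on the default character encoding of the platform.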
Table 20.5 Selected Writers
Writer | Description |
BufferedWriter | A buffered writer is a high-level output stream that buffers the characters before writing them to an underlying writer. The underlying writer must be specified, and an optional buffer size can be specified. |
OutputStreamWriter | Characters are written to a byte output stream that must be specified. The default character encoding is used if no explicit character encoding is specified in the constructor. This class provides the bridge from character streams to byte streams. |
FileWriter | Characters are written to a file, using the default character encoding, unless an encoding is explicitly specified in the constructor. The file can be specified by a String file name. It automatically creates a FileOutputStream that is associated with the file. A boolean parameter can be specified to indicate whether the file should be overwritten or appended with new content. |
PrintWriter | A print writer is a high-level output stream that allows text representation of Java objects and Java primitive values to be written to an underlying output stream or writer. The underlying output stream or writer must be specified. An explicit encoding can be specified in the constructor, and also whether the print writer should do automatic line flushing. |
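The sketch below shows one way the writers in Table 20.5 can be chained: a PrintWriter writes text representations to an OutputStreamWriter, which encodes the characters as bytes in a specified charset. The class name CharToByteBridge and the method encodeLine() are hypothetical, and a ByteArrayOutputStream is used in place of a file so that the encoded bytes can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharToByteBridge {
  // Hypothetical helper: writes "<label>=<value>" followed by a newline,
  // and returns the bytes produced by encoding that text in the given charset.
  static byte[] encodeLine(String label, int value, Charset charset) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (PrintWriter out = new PrintWriter(new OutputStreamWriter(bytes, charset))) {
      out.print(label);   // Text representation of a string.
      out.print('=');
      out.print(value);   // Text representation of an int value.
      out.print('\n');
    }                     // Closing the print writer flushes the characters to the byte stream.
    return bytes.toByteArray();
  }

  public static void main(String[] args) {
    String label = "caf\u00E9";
    System.out.println(encodeLine(label, 1, StandardCharsets.UTF_8).length);
    System.out.println(encodeLine(label, 1, StandardCharsets.ISO_8859_1).length);
  }
}
```

The same seven characters occupy eight bytes in UTF-8 but seven bytes in ISO-8859-1, since the non-ASCII character requires two bytes in UTF-8.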
Readers use the following methods for reading Unicode characters:
int read() throws IOException
int read(char[] cbuf) throws IOException
int read(char[] cbuf, int off, int len) throws IOException
The first method reads a single character and returns it as an int value in the range 0 to 65,535 (0x0000–0xFFFF). The last two methods store the characters in the specified array and return the number of characters read. All three methods return the value -1 if the end of the stream has been reached.
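The two reading styles can be sketched as follows, using a StringReader as the source so that no file is needed. The class name ReadDemo and its methods are hypothetical, and the buffer size is assumed to be positive.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadDemo {
  // Reads one character at a time until read() returns -1.
  static String readOneByOne(Reader reader) {
    StringBuilder sb = new StringBuilder();
    try {
      int c;
      while ((c = reader.read()) != -1) {
        sb.append((char) c);   // The int value is cast back to a char.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sb.toString();
  }

  // Reads characters into a buffer; the return value is the number of characters
  // actually stored in the buffer, which can be less than its length.
  // Assumes bufferSize > 0.
  static String readInChunks(Reader reader, int bufferSize) {
    StringBuilder sb = new StringBuilder();
    char[] cbuf = new char[bufferSize];
    try {
      int count;
      while ((count = reader.read(cbuf, 0, cbuf.length)) != -1) {
        sb.append(cbuf, 0, count);   // Append only the characters that were read.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(readOneByOne(new StringReader("Hello, Reader!")));
    System.out.println(readInChunks(new StringReader("Hello, Reader!"), 4));
  }
}
```

Both methods return the same string. The chunked version uses only the first count characters of the buffer in each iteration, as the last read operation typically fills the buffer only partially.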
long skip(long n) throws IOException
A reader can skip over characters using the skip() method, which returns the number of characters actually skipped.
void close() throws IOException
Like byte streams, a character stream should be closed when no longer needed in order to free system resources.
boolean ready() throws IOException
When called, this method returns true if the next read operation is guaranteed not to block. Returning false does not guarantee that the next read operation will block.
long transferTo(Writer out) throws IOException
Reads all characters from this reader and writes the characters to the specified writer in the order they are read. Returns the number of characters transferred. The I/O streams are not closed after the operation.
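A small sketch of skip(), ready(), and transferTo() is given below, again with in-memory streams. The class name SkipAndTransfer and the method skipThenTransfer() are hypothetical, and the number of characters to skip is assumed to be non-negative.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class SkipAndTransfer {
  // Hypothetical helper: skips up to n characters of the text (n >= 0 assumed),
  // then transfers the remaining characters to a string writer.
  static String skipThenTransfer(String text, long n) {
    StringWriter out = new StringWriter();
    try (Reader in = new StringReader(text)) {
      in.skip(n);           // Skips fewer characters if the end of the stream is reached.
      in.transferTo(out);   // Does not close the reader or the writer.
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toString();
  }

  public static void main(String[] args) throws IOException {
    try (Reader in = new StringReader("abc")) {
      // ready() reports whether the next read operation is guaranteed not to block.
      System.out.println("Ready: " + in.ready());
    }
    System.out.println(skipThenTransfer("header:body", 7));
  }
}
```

Since transferTo() leaves both streams open, the reader is still closed by the try-with-resources statement in the sketch.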
Writers use the following methods for writing Unicode characters:
void write(int c) throws IOException
The write() method takes an int as an argument, but writes only the least significant 16 bits.
void write(char[] cbuf) throws IOException
void write(String str) throws IOException
void write(char[] cbuf, int off, int length) throws IOException
void write(String str, int off, int length) throws IOException
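These methods write the characters from an array of characters or a string. In the last two methods, the off and length parameters select the portion of the array or the string that is written.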
void close() throws IOException
void flush() throws IOException
Like byte streams, a character stream should be closed when no longer needed in order to free system resources. Closing a character output stream automatically flushes the stream. A character output stream can also be manually flushed.
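The write() methods above can be exercised with a StringWriter, as in the following sketch. The class name WriteDemo and the method writeAll() are hypothetical. A string writer is chosen only because its contents are easy to inspect; flushing and closing have no visible effect on it, but they matter for writers that are backed by files or other byte streams.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class WriteDemo {
  // Hypothetical helper: calls each write() method once and returns what was written.
  static String writeAll() {
    StringWriter sink = new StringWriter();
    try (Writer out = sink) {
      out.write(0x10041);                               // Only the low 16 bits (0x0041, 'A') are written.
      out.write(new char[] {'B', 'C'});                 // Whole array.
      out.write("DE");                                  // Whole string.
      out.write(new char[] {'x', 'F', 'G', 'x'}, 1, 2); // Two characters, starting at index 1.
      out.write("xxHIxx", 2, 2);                        // Two characters, starting at index 2.
      out.flush();                                      // Manual flush; close() would also flush.
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toString();
  }

  public static void main(String[] args) {
    System.out.println(writeAll());   // Prints ABCDEFGHI
  }
}
```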
Like byte streams, many methods of the character stream classes throw a checked IOException that a calling method must either catch explicitly or specify in a throws clause. The character stream classes also implement the AutoCloseable interface, and character streams can thus be declared in a try-with-resources statement (§7.7, p. 407) that will ensure they are automatically closed after use at runtime.
Analogous to Example 20.1 that demonstrates usage of a byte buffer for writing and reading bytes to and from file streams, Example 20.3 demonstrates using a character buffer for writing and reading characters to and from file streams. Later in this section, we will use buffered readers (p. 1251) and buffered writers (p. 1250) for reading and writing characters from files, respectively.
Example 20.3 Copying a File Using a Character Buffer
/* Copy a file using a character buffer.
   Command syntax: java CopyCharacterFile <from_file> <to_file> */
import java.io.*;

class CopyCharacterFile {
  public static void main(String[] args) {
    try (// Assign the files:
        FileReader fromFile = new FileReader(args[0]);            // (1)
        FileWriter toFile = new FileWriter(args[1])) {            // (2)

      // Copy characters using buffer:                            // (3a)
      char[] buffer = new char[1024];
      int length = 0;
      while ((length = fromFile.read(buffer)) != -1) {
        toFile.write(buffer, 0, length);
      }

      // Transfer characters:
      // fromFile.transferTo(toFile);                             // (3b)

    } catch (ArrayIndexOutOfBoundsException e) {
      System.err.println("Usage: java CopyCharacterFile <from_file> <to_file>");
    } catch (FileNotFoundException e) {
      System.err.println("File could not be copied: " + e);
    } catch (IOException e) {
      System.err.println("I/O error.");
    }
  }
}