The task
When interviewing technical people the most challenging thing is to discover the candidate’s creativity. This post presents an idea of such a task of reversing the order of lines in a text file, e.g. a random one generated by the the following code.
package io.reverse;
import org.apache.commons.lang3.RandomStringUtils;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Random;
public class WriteRandomTextFile {
public static final String DIRECTORY = "spool";
public static final String FILE = "big.txt";
private static final Random random = new Random();
public static void main(String[] args) throws IOException {
try (var writer = Files.newBufferedWriter(Paths.get(DIRECTORY, FILE))) {
for (int i = 0; i < 100_000 - 1; i++)
writeRandomString(writer);
}
}
private static void writeRandomString(BufferedWriter writer)
throws IOException {
String str = RandomStringUtils.randomAlphabetic(random.nextInt(255));
.write(str);
writer.newLine();
writer}
}
A good candidate is expected to elaborate some details, first of all, whether or not an upfront estimate of the file’s size exists and how much memory is available for the process.
Take one
For reasonably small files and as well as for testing purposes we can start off with the simplest possible solution. Let’s read the whole source file into memory and write data in the reverse order. Here a candidate could take some additional points by showing knowledge how to iterate over a collection in the reverse direction.
package io.reverse;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
/**
* starts working from ~ -Xmx20m on JDK11
*/
public class ReverseInMemory {
public static final String DIRECTORY = "spool";
public static final String SOURCE = "big.txt";
public static final String TARGET = "reverse-memory.txt";
public static void main(String[] args) throws IOException {
List<String> lines = Files.readAllLines(Paths.get(DIRECTORY, SOURCE));
var iter = lines.listIterator(lines.size());
try (var writer = Files.newBufferedWriter(Paths.get(DIRECTORY, TARGET))) {
while (iter.hasPrevious()) {
.write(iter.previous());
writer.newLine();
writer}
}
}
}
This works, but takes a lot of memory and obviously not suitable for large files.
And here a game begins.
From my experience most people tend to find a way to read a source file in the reverse order, and this is the trap.
It turns out there is no easy way to do this.
All reading facilities including java.util.Scanner
(which in addition swallows exceptions silently while reading a file) work only in the direct mode and not the other way round.
Take two
Wait, we have java.io.RandomAccessFile
to the rescue!
But it doesn’t provide an easy way to seek between new lines separators.
We can read or even map to memory a fragment of a fixed size, but we don’t know what is the length of the longest line in a file… an algorithm becomes not easy to comprehend, let alone implement.
By far the most favorite mental pattern of mine when locked in such dead-ends is inversion. It works quite often and this case is not an exception. If we are not able to read a source file in the reverse order – relax and don’t worry, let’s read it in a straightforward effective way but inverse the way we write a target file instead. It turns out to be a surprisingly easy thing to do. Having a next line from a source file we know its length and can calculate the position to perform writing, starting from the end of a target file.
NB: This code doesn’t take into account charsets and line separators.
package io.reverse;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Paths;
/**
* works with -Xmx2m on JDK11
*/
public class ReverseWithRAF {
private static final String SOURCE = "spool/big.txt";
private static final String TARGET = "spool/reverse-raf.txt";
public static void main(String[] args) throws IOException {
var in = new File(SOURCE);
try (var out = new RandomAccessFile(new File(TARGET), "rw");
var reader = Files.newBufferedReader(Paths.get(SOURCE))) {
long pos = in.length();
.setLength(pos);
out
String str;
while ((str = reader.readLine()) != null) {
int len = str.length();
-= (len + 1);
pos .seek(pos);
out.writeBytes(str + "\n");
out}
}
}
}
To validate results one can use diff
utility with --report-identical-files
option:
$ diff -s reverse-memory.txt reverse-raf.txt
Files reverse-memory.txt and reverse-raf.txt are identical