Each dump output file is a tar archive. The output is stored as several files of similar size in a given directory. Because the multi-stream files can be read independently, the reader can be parallelized, and with network-based message queues the work can scale beyond a single PC. The current version of mwparserfromhtml is a starting point.
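To illustrate why multi-stream files allow a parallel reader, here is a minimal sketch in Python. It assumes a file made of concatenated bz2 streams whose byte offsets are already known (for example from an accompanying index); the helper names `write_multistream`, `read_stream`, and `read_all_parallel` are hypothetical and not part of any dump tooling.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def write_multistream(path, chunks):
    """Write each chunk as its own bz2 stream and return each stream's byte offset."""
    offsets = []
    with open(path, "wb") as f:
        for chunk in chunks:
            offsets.append(f.tell())
            f.write(bz2.compress(chunk))
    return offsets


def read_stream(path, start, end):
    """Decompress the single bz2 stream stored in bytes [start, end) of the file."""
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read() if end is None else f.read(end - start)
    # A fresh decompressor handles exactly one stream.
    return bz2.BZ2Decompressor().decompress(data)


def read_all_parallel(path, offsets, workers=4):
    """Decompress every stream concurrently, returning results in stream order."""
    # Each stream ends where the next begins; the last one runs to end of file.
    ends = offsets[1:] + [None]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda span: read_stream(path, *span), zip(offsets, ends)))
```

Each stream is decompressed without touching the others, so the same `(start, end)` pairs could instead be handed to worker processes or published on a message queue for other machines to consume.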
nest...