directory-splitting.txt -- 2006.05.04 -- Algorithms and scripts for
splitting directories into DVDs to be backed up

makePathLists.py recursively parses a directory tree. It generates listing
files with names of the form projectID_nn.lst, one for each DVD that will
be created. It populates each .lst file with a list of the files (full
path names) that will be included in the corresponding DVD. When it
reaches a DVD's capacity, it starts a new .lst file.

makePathLists.py calls the following subprograms:

chkDir.py*          -- verifies path and converts to absolute
 +-- splitall.py*   -- not used?
dirIter.py*         -- establishes the iterDir class; returns successively
                       all non-link non-directory files in path.

Problems:

1. It would usually be preferable to split DVDs on whole-directory
   boundaries, even if that uses space slightly less efficiently.
2. .lst files are long and usually useless for any other purpose.
3. Filenames that contain unusual characters (like = ) will cause mkisofs
   to fail if they are included in the .lst file. But if only their
   parent directory is included, they will be properly archived.

Algorithm we want to implement:

If directory and everything in it fits on a DVD, write out directory and quit
Else: descend into directory:
    Make list of each directory and size. Include all non-directory files
    as a special category in directory ".".
    Sort with largest first.
    Foreach directory:
        if directory > DVDsize, descend into directory and repeat from
        there. On return, delete from directory list.
        Else: find next largest directory that can be included without
        exceeding DVDsize. Add and repeat.

Note: if "." is < DVDsize, treat it like any other directory (except the
filespecs will have to be written individually). If it is too big, sort it
alphabetically, take the first DVD worth of files, then repeat.
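
Sketch of the algorithm above (not the existing makePathLists.py code).
This is a minimal illustration under stated assumptions: the directory tree
is given as a nested dict (file name -> size, directory name -> dict), no
single file is larger than a DVD, and the names tree_size and split_tree are
hypothetical. Directories that do not fit are split first, the loose files
in "." are handled as described in the note, and whatever fits whole is then
packed largest first.

```python
def tree_size(node):
    """Total size of a file (int) or a directory (dict of children)."""
    if isinstance(node, int):
        return node
    return sum(tree_size(child) for child in node.values())


def split_tree(path, node, capacity):
    """Return a list of DVDs; each DVD is a list of paths to write out."""
    # If the directory and everything in it fits on a DVD, write it out whole.
    if tree_size(node) <= capacity:
        return [[path]]

    discs = []   # finished DVDs
    items = []   # (size, [paths]) entries that each fit on one DVD

    # Subdirectories: descend into the oversized ones, keep the rest whole.
    for name, child in node.items():
        if isinstance(child, dict):
            child_path = f"{path}/{name}"
            size = tree_size(child)
            if size > capacity:
                discs.extend(split_tree(child_path, child, capacity))
            else:
                items.append((size, [child_path]))

    # Loose files: the special "." category, listed file by file.
    loose = {n: s for n, s in node.items() if isinstance(s, int)}
    names = sorted(loose)
    # While "." is too big, take the first DVD worth alphabetically.
    while sum(loose[n] for n in names) > capacity:
        disc, used = [], 0
        while names and used + loose[names[0]] <= capacity:
            name = names.pop(0)
            disc.append(f"{path}/{name}")
            used += loose[name]
        if not disc:
            raise ValueError(f"{path}/{names[0]} is larger than one DVD")
        discs.append(disc)
    # What remains of "." now fits, so treat it like any other directory.
    if names:
        items.append((sum(loose[n] for n in names),
                      [f"{path}/{n}" for n in names]))

    # Greedy packing: largest first, then the next largest that still fits.
    items.sort(key=lambda item: item[0], reverse=True)
    while items:
        disc, free, remaining = [], capacity, []
        for size, paths in items:
            if size <= free:
                disc.extend(paths)
                free -= size
            else:
                remaining.append((size, paths))
        discs.append(disc)
        items = remaining
    return discs
```

Each inner list returned by split_tree corresponds to one projectID_nn.lst
file. Whole directories appear as a single entry, so files with unusual
names are only listed individually when they sit in an oversized ".".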