Class CsvLoader


  • public class CsvLoader
    extends Object
    This is a basic comma-separated-value (CSV) reader. Its input is ultimately a java.io.Reader, but there is helper support for a java.io.InputStream, a file name, and a java.io.File. One can also specify a separator character other than the default comma (',') and indicate that the input's first line contains the names of the columns (by default this is not assumed). Lastly, only the comment character '#' is supported, and only at the start of a line. This comment support could be generalized, but that task is left to others.

    To use this class one gives it a java.io.Reader and then calls the hasNextLine and nextLine methods, much like a java.util.Iterator, except that the nextLine method returns a String[] holding the (possibly null) values of the parsed next line. The size of the String[] is the size of the first line parsed that contains the separator character (comment lines are not used). If a subsequent line has fewer separator characters than that initial line, the trailing entries in the String[] returned by the nextLine method are null. On the other hand, if a subsequent line has more separator characters, the world ends with an IndexOutOfBoundsException (sorry, making this more graceful is also a task for others). When one is through using a CsvLoader instance, one should call the close method (which closes the Reader).
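
    The row-width rule above can be sketched with a small stand-in class. RowWidthSketch is a hypothetical name invented for this illustration and is not the real CsvLoader; its constructor, the use of String.split, and the handling of lines seen before any separator are all assumptions. Only the method names hasNextLine, nextLine and close are taken from the description.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.NoSuchElementException;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; this is NOT the real CsvLoader.
class RowWidthSketch {
    private final BufferedReader in;
    private final String separatorRegex;
    private int width = -1;   // unknown until a line containing the separator is seen
    private String pending;   // next non-comment line, read ahead by hasNextLine()

    RowWidthSketch(Reader reader, char separator) {
        this.in = new BufferedReader(reader);
        this.separatorRegex = Pattern.quote(String.valueOf(separator));
    }

    boolean hasNextLine() {
        try {
            while (pending == null) {
                String line = in.readLine();
                if (line == null) return false;            // end of input
                if (!line.startsWith("#")) pending = line; // skip comment lines
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String[] nextLine() {
        if (!hasNextLine()) throw new NoSuchElementException();
        String[] parts = pending.split(separatorRegex, -1);
        pending = null;
        if (width < 0) {
            if (parts.length == 1) return parts; // assumption: no separator yet, width stays unknown
            width = parts.length;                // first line with a separator fixes the width
        }
        String[] row = new String[width];        // slots that are not filled stay null
        for (int i = 0; i < parts.length; i++) {
            row[i] = parts[i];                   // too many values: ArrayIndexOutOfBoundsException
        }
        return row;
    }

    void close() {
        try {
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RowWidthSketch loader = new RowWidthSketch(
                new StringReader("# sample data\na,b,c\n1,2\n"), ',');
        try {
            while (loader.hasNextLine()) {
                System.out.println(Arrays.toString(loader.nextLine()));
            }
        } finally {
            loader.close();
        }
    }
}
```

    In this sketch the comment line is skipped, the first row fixes the width at three, and the short second row comes back with a trailing null. Calling close in a finally block mirrors the advice to always close the underlying Reader.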

    All well and good, but there are two additional methods that can be used to extend the capabilities of this CSV parser: the nextSet and putBack methods. With these methods one can, basically, reset the CsvLoader to a state where it does not yet know how many separator characters to expect per line (while staying at the current line in the Reader). The nextSet (next set of CSV lines) method resets the loader, while the putBack method can be used to place the last line returned back into the loader. These methods are used in CsvDBLoader, allowing one to have multiple sets of CSV rows with a differing number of values per set.
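
    One plausible reading of how nextSet and putBack cooperate is sketched below. SetResetSketch is a hypothetical stand-in, not the real CsvLoader or CsvDBLoader: it reads from a List<String> rather than a Reader, always splits on a comma, and lets the first line of each set fix the width. How a caller recognizes the start of a new set (here, a first value of "sku") is likewise an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT the real CsvLoader.
class SetResetSketch {
    private final Iterator<String> lines;
    private int width = -1;      // -1 means "row width not yet known"
    private String lastLine;     // raw text of the row most recently returned
    private String putBackLine;  // row waiting to be read again, if any

    SetResetSketch(List<String> lines) {
        this.lines = lines.iterator();
    }

    boolean hasNextLine() {
        return putBackLine != null || lines.hasNext();
    }

    String[] nextLine() {
        if (putBackLine != null) {
            lastLine = putBackLine;  // re-deliver the row that was put back
            putBackLine = null;
        } else {
            lastLine = lines.next();
        }
        String[] parts = lastLine.split(",", -1);
        if (width < 0) width = parts.length;  // simplification: first row of a set fixes the width
        String[] row = new String[width];     // short rows are padded with nulls
        for (int i = 0; i < parts.length; i++) row[i] = parts[i];
        return row;
    }

    // Forget the expected width; the position in the input is unchanged.
    void nextSet() {
        width = -1;
    }

    // Make the most recently returned row the next one to be read.
    void putBack() {
        putBackLine = lastLine;
    }

    public static void main(String[] args) {
        SetResetSketch loader = new SetResetSketch(
                Arrays.asList("id,name,city", "1,Ann,Oslo", "sku,price", "7,9.5"));
        while (loader.hasNextLine()) {
            String[] row = loader.nextLine();
            if ("sku".equals(row[0]) && row.length == 3) {
                // The row was parsed with the old width: put it back and start a new set.
                loader.putBack();
                loader.nextSet();
                row = loader.nextLine();
            }
            System.out.println(Arrays.toString(row));
        }
    }
}
```

    The point of the pairing is that the header of the second set is first seen padded to the old width, then re-read at its own width once the loader has been reset.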

    There are six pairs of special start/end characters that, when seen, prevent the recognition of both the separator character and new lines:

        double quotes: " "
        single quotes: ' '
        brackets:      [ ]
        parentheses:   ( )
        braces:        { }
        chevrons:      < >
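
    A minimal sketch of how such start/end pairs can shield the separator is given below. PairedSplitSketch and its split method are hypothetical names for this illustration, not part of CsvLoader. The sketch assumes that pairs do not nest, that there is no escape character, and that the start/end characters are kept in the returned value; the description above does not settle any of these points.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is NOT the real CsvLoader logic.
class PairedSplitSketch {
    // Start characters and their matching end characters, index by index.
    private static final String OPENERS = "\"'[({<";
    private static final String CLOSERS = "\"'])}>";

    static List<String> split(String text, char separator) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char closer = 0;  // 0 means "not inside a start/end pair"
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (closer != 0) {
                // Inside a pair: keep everything, including separators and new lines.
                if (c == closer) closer = 0;
                current.append(c);
            } else {
                int k = OPENERS.indexOf(c);
                if (k >= 0) {
                    closer = CLOSERS.charAt(k);  // remember which character ends the pair
                    current.append(c);
                } else if (c == separator) {
                    values.add(current.toString());
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
        }
        values.add(current.toString());  // last value on the line
        return values;
    }

    public static void main(String[] args) {
        System.out.println(split("a,\"b,c\",[d,e],f", ','));
    }
}
```

    A line reader built on the same idea would keep appending input while closer is non-zero, which is how a new line inside a pair would end up in the value rather than ending the row.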
     

    It's certainly not the ultimate such parser, but it's hoped that it's adequate.

    Author:
    Richard M. Emberson