Class Table

  • All Implemented Interfaces:
    TableDataSource
    Direct Known Subclasses:
    AbstractDataTable, CompositeTable, FilterTable, JoinedTable

    public abstract class Table
    extends java.lang.Object
    implements TableDataSource
    This is a definition for a table in the database. It stores the name of the table, and the fields (columns) in the table. A table represents either a 'core' DataTable that directly maps to the information stored in the database, or a temporary table generated on the fly.

    It is an abstract class, because it does not implement the methods to add, remove or access row data in the table.

    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      (package private) class  Table.TableVariableResolver
      An implementation of VariableResolver that we can use to resolve column names in this table to cells for a specific row.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected Table()
      The Constructor.
    • Field Detail

      • DEBUG_QUERY

        protected static boolean DEBUG_QUERY
      • col_name_lookup

        private java.util.HashMap col_name_lookup
      • COL_LOOKUP_LOCK

        private java.lang.Object COL_LOOKUP_LOCK
    • Constructor Detail

      • Table

        protected Table()
        The Constructor. Requires a name and the fields in the table.
    • Method Detail

      • getDatabase

        public abstract Database getDatabase()
        Returns the Database object that this table is derived from.
      • Debug

        public DebugLogger Debug()
        Returns a DebugLogger object that we can use to log debug messages to.
      • getColumnCount

        public abstract int getColumnCount()
        Returns the number of columns in the table.
      • getRowCount

        public abstract int getRowCount()
        Returns the number of rows stored in the table.
        Specified by:
        getRowCount in interface TableDataSource
      • getTTypeForColumn

        public TType getTTypeForColumn​(int column)
        Returns a TType object that would represent values at the given column index. Throws an error if the column can't be found.
      • getTTypeForColumn

        public TType getTTypeForColumn​(Variable v)
        Returns a TType object that would represent values in the given column. Throws an error if the column can't be found.
      • findFieldName

        public abstract int findFieldName​(Variable v)
        Given a fully qualified variable field name, e.g. 'APP.CUSTOMER.CUSTOMERID', this will return the column number the field is at. Returns -1 if the field does not exist in the table.
      • getResolvedVariable

        public abstract Variable getResolvedVariable​(int column)
        Returns a fully qualified Variable object that represents the name of the column at the given index. For example, new Variable(new TableName("APP", "CUSTOMER"), "ID")
      • getSelectableSchemeFor

        abstract SelectableScheme getSelectableSchemeFor​(int column,
                                                         int original_column,
                                                         Table table)
        Returns a SelectableScheme for the given column in the given VirtualTable row domain. The 'column' variable may be modified as it traverses through the tables, however the 'original_column' retains the link to the column in 'table'.
      • setToRowTableDomain

        abstract void setToRowTableDomain​(int column,
                                          IntegerVector row_set,
                                          TableDataSource ancestor)
        Given a set, this trickles down through the Table hierarchy resolving the given row_set to a form that the given ancestor understands. Say you give the set { 0, 1, 2, 3, 4, 5, 6 }, this function may check down three levels and return a new 7 element set with the rows fully resolved to the given ancestor's domain.
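
        The resolution described above can be pictured with a minimal sketch. It assumes, purely for illustration, that each level of the hierarchy is an int array mapping a row index in that level to a row index in the level below; the class and method names are hypothetical and not part of this API.

```java
import java.util.Arrays;

// Hypothetical model: each layer maps its own row index to a row index in its parent.
class RowDomainSketch {
    // Resolves rowSet through the given layers, outermost layer first.
    // The input array is left untouched; a resolved copy is returned.
    static int[] resolve(int[] rowSet, int[][] layers) {
        int[] resolved = Arrays.copyOf(rowSet, rowSet.length);
        for (int[] layer : layers) {
            for (int i = 0; i < resolved.length; i++) {
                resolved[i] = layer[resolved[i]];
            }
        }
        return resolved;
    }
}
```

        With a first layer that reverses seven rows and a second layer that maps them onto rows 10, 12, ..., 22 of the ancestor, the set { 0, 1, 2, 3, 4, 5, 6 } comes back as a 7 element set expressed in the ancestor's row numbers.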
      • getCellContents

        public abstract TObject getCellContents​(int column,
                                                int row)
        Returns an object that represents the information in the given cell in the table. This will generally be an expensive algorithm, so calls to it should be kept to a minimum. Note that the offset between two rows is not necessarily 1. Use 'rowEnumeration' to get the contents of a set.
        Specified by:
        getCellContents in interface TableDataSource
      • rowEnumeration

        public abstract RowEnumeration rowEnumeration()
        Returns an Enumeration of the rows in this table. Each call to 'RowEnumeration.nextRowIndex()' returns the next valid row in the table. Note that the order in which rows are retrieved depends on a number of factors. For a DataTable the rows are accessed in the order they are in the data file. For a VirtualTable, the rows are accessed in the order of the last select operation.

        If you want the rows to be returned by a specific column order then use the 'selectxxx' methods.

        Specified by:
        rowEnumeration in interface TableDataSource
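
        To illustrate why row indexes should be taken from the enumeration rather than counted up from 0, the sketch below models a table whose valid rows are not contiguous. RowEnum and its hasMoreRows() method are hypothetical stand-ins defined here, not the library's RowEnumeration type; only nextRowIndex() is named in the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a row enumeration; not the library's own type.
class RowEnumerationSketch {
    interface RowEnum {
        boolean hasMoreRows();   // assumed helper for this sketch
        int nextRowIndex();
    }

    // Builds an enumeration over an explicit list of valid row indexes.
    static RowEnum over(final int[] validRows) {
        return new RowEnum() {
            private int pos = 0;
            public boolean hasMoreRows() { return pos < validRows.length; }
            public int nextRowIndex() { return validRows[pos++]; }
        };
    }

    // Collects every row index the enumeration yields, in enumeration order.
    static List<Integer> collect(RowEnum e) {
        List<Integer> rows = new ArrayList<>();
        while (e.hasMoreRows()) {
            rows.add(e.nextRowIndex());
        }
        return rows;
    }
}
```

        A caller would pass each collected index as the 'row' argument of a cell lookup instead of assuming rows 0, 1, 2 and so on exist.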
      • getDataTableDef

        public abstract DataTableDef getDataTableDef()
        Returns a DataTableDef object that defines the name of the table and the layout of the columns of the table. Note that for tables that are joined with other tables, the table name and schema for this object become mangled. For example, a table called 'PERSON' joined with a table called 'MUSIC' becomes a table called 'PERSON#MUSIC' in a null schema.
        Specified by:
        getDataTableDef in interface TableDataSource
      • addDataTableListener

        abstract void addDataTableListener​(DataTableListener listener)
        Adds a DataTableListener to the DataTable objects at the root of this table tree hierarchy. If this table represents the join of a number of tables then the DataTableListener is added to all the DataTable objects at the root.

        A DataTableListener is notified of all modifications to the raw entries of the table. This listener can be used for detecting changes in VIEWs, for triggers or for caching of common queries.

      • removeDataTableListener

        abstract void removeDataTableListener​(DataTableListener listener)
        Removes a DataTableListener from the DataTable objects at the root of this table tree hierarchy. If this table represents the join of a number of tables, then the DataTableListener is removed from all the DataTable objects at the root.
      • lockRoot

        public abstract void lockRoot​(int lock_key)
        Locks the root table(s) of this table so that it is impossible to overwrite the underlying rows that may appear in this table. This is used when cells in the table need to be accessed 'outside' the lock, so that we may have late access to cells in the table. 'lock_key' is a given key that will also unlock the root table(s). NOTE: This has nothing to do with the 'LockingMechanism' object.
      • unlockRoot

        public abstract void unlockRoot​(int lock_key)
        Unlocks the root tables so that the underlying rows may once again be used if they are not locked and have been removed. This should be called some time after the rows have been locked.
      • hasRootsLocked

        public abstract boolean hasRootsLocked()
        Returns true if the table has its row roots locked (via the lockRoot(int) method).
      • getColumnDefAt

        public DataTableColumnDef getColumnDefAt​(int col_index)
        Returns the DataTableColumnDef object for the given column index.
      • dumpTo

        public final void dumpTo​(java.io.PrintStream out)
                          throws java.io.IOException
        Dumps the contents of the table in a human-readable form to the given output stream. This should only be used for debugging the database.
        Throws:
        java.io.IOException
      • emptySelect

        public final Table emptySelect()
        Returns a new Table based on this table with no rows in it.
      • singleRowSelect

        public final Table singleRowSelect​(int row_index)
        Selects a single row at the given index from this table.
      • columnMerge

        public final Table columnMerge​(Table table)
        Returns a Table that is a merge of this table and the destination table. The rows that are in the destination table are included in this table. The tables must have
      • rangeSelect

        public final Table rangeSelect​(Variable col_var,
                                       SelectableRange[] ranges)
        A single column range select on this table. This can often be solved very quickly especially if there is an index on the column. The SelectableRange array represents a set of ranges that are returned that meet the given criteria.
        Parameters:
        col_var - the column variable in this table (eg. Part.id)
        ranges - the normalized (no overlapping) set of ranges to find.
      • simpleSelect

        public final Table simpleSelect​(QueryContext context,
                                        Variable lhs_var,
                                        Operator op,
                                        Expression rhs)
        A simple select on this table. We select against a column, with an Operator and a rhs Expression that is constant (only needs to be evaluated once).
        Parameters:
        context - the context of the query.
        lhs_var - the left hand side column reference.
        op - the operator.
        rhs - the expression to select against (the expression must be a constant).
      • simpleJoin

        public final Table simpleJoin​(QueryContext context,
                                      Table table,
                                      Variable lhs_var,
                                      Operator op,
                                      Expression rhs)
        A simple join operation. A simple join operation is one that has a single joining operator, a Variable on the lhs and a simple expression on the rhs that includes only columns in the rhs table. For example, 'id = part_id' or 'id == part_id * 2' or 'id == part_id + vendor_id * 2'

        It is important to understand how this algorithm works because all optimization of the expression must happen before the method starts.

        The simple join algorithm works as follows: Every row of the right hand side table 'table' is iterated through. The select operation is applied to this table given the result evaluation. Each row that matches is included in the result table.

        For optimal performance, the expression should be arranged so that the rhs table is the smallest of the two tables (because we must iterate through all rows of this table). This table should be the largest.
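
        The algorithm above can be sketched in miniature. This is an illustration under stated assumptions, not the library's implementation: columns are plain int arrays, the operator is fixed to equality, the rhs expression is a function of a single rhs column, and a HashMap stands in for an index on the lhs column. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntUnaryOperator;

class SimpleJoinSketch {
    // Joins on lhsColumn[row] == rhsExpr(rhsColumn[row]).
    // Returns {lhsRow, rhsRow} pairs in rhs iteration order.
    static List<int[]> join(int[] lhsColumn, int[] rhsColumn, IntUnaryOperator rhsExpr) {
        // Stand-in for an index on the lhs column: value -> rows holding that value.
        Map<Integer, List<Integer>> index = new HashMap<>();
        for (int row = 0; row < lhsColumn.length; row++) {
            index.computeIfAbsent(lhsColumn[row], k -> new ArrayList<>()).add(row);
        }
        List<int[]> result = new ArrayList<>();
        // Every row of the rhs table is visited exactly once.
        for (int rhsRow = 0; rhsRow < rhsColumn.length; rhsRow++) {
            int value = rhsExpr.applyAsInt(rhsColumn[rhsRow]);
            // The select against the lhs table, given the evaluated value.
            for (int lhsRow : index.getOrDefault(value, Collections.<Integer>emptyList())) {
                result.add(new int[] {lhsRow, rhsRow});
            }
        }
        return result;
    }
}
```

        Because the outer loop runs once per rhs row while each lhs lookup is a cheap indexed select, the sketch reflects the advice above: keep the smaller table on the rhs.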

      • exhaustiveSelect

        public final Table exhaustiveSelect​(QueryContext context,
                                            Expression exp)
        Exhaustively searches through this table for rows that match the expression given. This is the slowest type of query and is not able to use any type of indexing.

        A QueryContext object is used for resolving sub-query plans. If there are no sub-query plans in the expression, this can safely be 'null'.

      • any

        public Table any​(QueryContext context,
                         Expression lhs,
                         Operator op,
                         Table right_table)
        Evaluates a non-correlated ANY type operator given the LHS expression, the RHS subquery and the ANY operator to use. For example:

           Table.col > ANY ( SELECT .... )
         

        ANY creates a new table that contains only the rows in this table that the expression and operator evaluate to true for any values in the given table.

        The IN operator can be represented by using '= ANY'.

        Note that unlike the other join and select methods in this object this will take a complex expression as the lhs provided all the Variable objects resolve to this table.

        Parameters:
        lhs - the left hand side expression. The Variable objects in this expression must all reference columns in this table.
        op - the operator to use.
        right_table - the subquery table, which should contain only one column.
        context - the context of the query.
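
        The ANY semantics described above can be sketched as follows. The sketch assumes int columns, ignores NULL handling, and uses a hypothetical IntOp interface in place of the Operator class; it is an illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class AnySketch {
    // Hypothetical operator type: compares an lhs value with an rhs value.
    interface IntOp {
        boolean test(int lhs, int rhs);
    }

    // Returns the row indexes of lhsColumn for which op holds against
    // at least one value of the single-column right table.
    static List<Integer> any(int[] lhsColumn, IntOp op, int[] rightColumn) {
        List<Integer> rows = new ArrayList<>();
        for (int row = 0; row < lhsColumn.length; row++) {
            for (int value : rightColumn) {
                if (op.test(lhsColumn[row], value)) {
                    rows.add(row);
                    break; // one match is enough for ANY
                }
            }
        }
        return rows;
    }
}
```

        Passing an equality operator gives the IN behaviour mentioned above: a row is kept when its value equals any value in the right table.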
      • all

        public Table all​(QueryContext context,
                         Expression lhs,
                         Operator op,
                         Table table)
        Evaluates a non-correlated ALL type operator given the LHS expression, the RHS subquery and the ALL operator to use. For example:

           Table.col > ALL ( SELECT .... )
         

        ALL creates a new table that contains only the rows in this table that the expression and operator evaluate to true for all values in the given table.

        The NOT IN operator can be represented by using '<> ALL'.

        Note that unlike the other join and select methods in this object this will take a complex expression as the lhs provided all the Variable objects resolve to this table.

        Parameters:
        lhs - the left hand side expression. The Variable objects in this expression must all reference columns in this table.
        op - the operator to use.
        table - the subquery table, which should contain only one column.
        context - the context of the query.
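
        The ALL semantics described above can be sketched as follows. The sketch assumes int columns, ignores NULL handling, and uses a hypothetical IntOp interface in place of the Operator class. Keeping every row when the right table is empty is a choice made in this sketch, not behaviour confirmed by the description above.

```java
import java.util.ArrayList;
import java.util.List;

class AllSketch {
    // Hypothetical operator type: compares an lhs value with an rhs value.
    interface IntOp {
        boolean test(int lhs, int rhs);
    }

    // Returns the row indexes of lhsColumn for which op holds against
    // every value of the single-column right table.
    static List<Integer> all(int[] lhsColumn, IntOp op, int[] rightColumn) {
        List<Integer> rows = new ArrayList<>();
        for (int row = 0; row < lhsColumn.length; row++) {
            boolean matchesAll = true;
            for (int value : rightColumn) {
                if (!op.test(lhsColumn[row], value)) {
                    matchesAll = false;
                    break; // one failure rules the row out
                }
            }
            if (matchesAll) {
                rows.add(row);
            }
        }
        return rows;
    }
}
```

        Passing an inequality operator gives the NOT IN behaviour mentioned above: a row is kept only when its value differs from every value in the right table.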
      • join

        public final Table join​(Table table)
        Performs a natural join of this table with the given table. This is the same as calling the above 'join' with no conditional.
      • outside

        public final VirtualTable outside​(Table rtable)
        Finds all rows in this table that are 'outside' the result in the given table. This is used in OUTER JOINs. We perform a normal join, then determine unmatched joins with this function. We can then create an OuterTable with this result to make the completed table.

        'rtable' must be a descendant of this table.

      • union

        public final Table union​(Table table)
        Returns a new Table that is the union of this table and the given table. A union operation will remove any duplicate rows.
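
        The duplicate-removing behaviour can be sketched on plain value rows. This is an assumption-laden illustration: rows are modelled as lists of integers and compared by value, whereas the real method may well work on row indexes of shared root tables. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UnionSketch {
    // Returns the rows of a followed by the rows of b, with duplicates removed.
    // First-seen order is preserved by the LinkedHashSet.
    static List<List<Integer>> union(List<List<Integer>> a, List<List<Integer>> b) {
        Set<List<Integer>> seen = new LinkedHashSet<>(a);
        seen.addAll(b);
        return new ArrayList<>(seen);
    }
}
```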
      • distinct

        public final VirtualTable distinct()
        Deprecated.
        - not a proper SQL distinct.
        Returns a new table with any duplicate rows in this table removed.
      • distinct

        public final Table distinct​(int[] col_map)
        Returns a new table that has only distinct rows in it. This is an expensive operation. We sort over all the columns, then iterate through the result taking out any duplicate rows.

        The int array contains the columns to make distinct over.

        NOTE: This will change the order of this table in the result.
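
        The sort-then-scan approach described above can be sketched as follows, assuming a table held as an int[][] of rows. The names are hypothetical and the sketch returns row indexes rather than a Table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DistinctSketch {
    // Compares two rows over the columns listed in colMap, in that order.
    static int compareRows(int[][] table, int[] colMap, int r1, int r2) {
        for (int col : colMap) {
            int c = Integer.compare(table[r1][col], table[r2][col]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Returns the indexes of rows that are distinct over colMap, in sorted order.
    static List<Integer> distinct(int[][] table, int[] colMap) {
        Integer[] order = new Integer[table.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort row indexes so that equal rows become neighbours.
        Arrays.sort(order, (a, b) -> compareRows(table, colMap, a, b));
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < order.length; i++) {
            // Keep a row only if it differs from the previous row in sorted order.
            if (i == 0 || compareRows(table, colMap, order[i - 1], order[i]) != 0) {
                result.add(order[i]);
            }
        }
        return result;
    }
}
```

        The result comes back in sorted order rather than the original row order, which matches the NOTE above.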

      • indexStringArray

        private final int indexStringArray​(java.lang.String val,
                                           java.lang.String[] array)
        Helper function. Returns the index in the String array of the given string value.
      • columnContainsValue

        public final boolean columnContainsValue​(int column,
                                                 TObject ob)
        Returns true if the given column number contains the value given.
      • columnMatchesValue

        public final boolean columnMatchesValue​(int column,
                                                Operator op,
                                                TObject ob)
        Returns true if the given column contains a value that the given operator returns true for with the given value.
      • allColumnMatchesValue

        public final boolean allColumnMatchesValue​(int column,
                                                   Operator op,
                                                   TObject ob)
        Returns true if the given column contains all values that the given operator returns true for with the given value.
      • orderByColumns

        public final Table orderByColumns​(int[] col_map)
        Returns a table that is ordered by the given column numbers. This can be used by various functions from grouping to distinction to ordering. Always sorted by ascending.
      • orderedRowList

        public final IntegerVector orderedRowList​(int[] col_map)
        Returns an IntegerVector that represents the list of rows in this table in sorted order by the given column map.
      • orderByColumn

        public final VirtualTable orderByColumn​(int col_index,
                                                boolean ascending)
        Returns a Table which is identical to this table, except it is sorted by the given column. This means that if you access the rows sequentially you will be reading the sorted order of the column.
      • getTableAccessState

        public final TableAccessState getTableAccessState()
        This returns an object that can only access the cells that are in this table, and has no other access to the 'Table' class's functionality. The purpose of this object is to provide a clean way to access the state of a table without being able to access any of the row sorting (SelectableScheme) methods that would return incorrect information in the situation where the table locks (via LockingMechanism) were removed. NOTE: The methods in this class will only work if this table has its rows locked via the 'lockRoot(int)' method.
      • selectRows

        final IntegerVector selectRows​(int[] cols,
                                       Operator op,
                                       TObject[] cells)
        Returns a set that represents the list of multi-column row numbers selected from the table given the condition.

        NOTE: This can be used to exploit multi-column indexes if they exist.

      • selectRows

        final IntegerVector selectRows​(int column,
                                       Operator op,
                                       TObject cell)
        Returns a set that represents the list of row numbers selected from the table given the condition.
      • selectRows

        IntegerVector selectRows​(int column,
                                 TObject min_cell,
                                 TObject max_cell)
        Selects the rows in a table column between two minimum and maximum bounds. This is all rows which are >= min_cell and < max_cell.

        NOTE: The returned IntegerVector _must_ be sorted by the 'column' cells.
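
        The half-open bound (>= min_cell and < max_cell) and the ordering requirement can be sketched as follows, assuming an int column and hypothetical names. A real implementation would normally use an index rather than a full scan.

```java
import java.util.ArrayList;
import java.util.List;

class RangeSelectSketch {
    // Returns the rows whose cell lies in [min, max), ordered by cell value.
    static List<Integer> selectRows(int[] column, int min, int max) {
        List<Integer> rows = new ArrayList<>();
        for (int row = 0; row < column.length; row++) {
            if (column[row] >= min && column[row] < max) {
                rows.add(row);
            }
        }
        // The result must be sorted by the column cells, not by row number.
        rows.sort((a, b) -> Integer.compare(column[a], column[b]));
        return rows;
    }
}
```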

      • selectFromRegex

        final IntegerVector selectFromRegex​(int column,
                                            Operator op,
                                            TObject ob)
        Selects all the rows where the given column matches the regular expression. This uses the static class 'PatternSearch' to perform the operation.

        This method must guarantee the result is ordered by the given column.

      • selectFromPattern

        final IntegerVector selectFromPattern​(int column,
                                              Operator op,
                                              TObject ob)
        Selects all the rows where the given column matches the given pattern. This uses the static class 'PatternSearch' to perform these operations. 'operation' will be either Condition.LIKE or Condition.NOT_LIKE. NOTE: The returned IntegerVector _must_ be sorted by the 'column' cells.
      • allRowsIn

        final IntegerVector allRowsIn​(int column,
                                      Table table)
        Given a table and column (from this table), this returns all the rows from this table that are also in the first column of the given table. This is the basis of a fast 'in' process.
      • allRowsNotIn

        final IntegerVector allRowsNotIn​(int column,
                                         Table table)
        Given a table and column (from this table), this returns all the rows from this table that are not in the first column of the given table. This is the basis of a fast 'not in' process.
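
        The 'in' and 'not in' selections described by allRowsIn and allRowsNotIn can be sketched with a hash set over the other table's first column. The sketch assumes int columns and hypothetical names; it does not show how the library itself makes the operation fast.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InSketch {
    // Collects rows of 'column' whose presence in the other table's
    // first column equals wantPresent.
    private static List<Integer> filter(int[] column, int[] otherFirstColumn, boolean wantPresent) {
        Set<Integer> values = new HashSet<>();
        for (int v : otherFirstColumn) {
            values.add(v);
        }
        List<Integer> rows = new ArrayList<>();
        for (int row = 0; row < column.length; row++) {
            if (values.contains(column[row]) == wantPresent) {
                rows.add(row);
            }
        }
        return rows;
    }

    static List<Integer> allRowsIn(int[] column, int[] otherFirstColumn) {
        return filter(column, otherFirstColumn, true);
    }

    static List<Integer> allRowsNotIn(int[] column, int[] otherFirstColumn) {
        return filter(column, otherFirstColumn, false);
    }
}
```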
      • selectAll

        public final IntegerVector selectAll​(int column)
        Returns an array that represents the sorted order of this table by the given column number.
      • selectAll

        public final IntegerVector selectAll()
        Returns a list of rows that represents the enumerator order of this table.
      • selectRange

        public final IntegerVector selectRange​(int column,
                                               SelectableRange[] ranges)
        Returns an array that represents the sorted order of this table of all values in the given SelectableRange objects of the given column index. If there is an index on the column, the result can be found very quickly. The range array must be normalized (no overlapping ranges).
      • selectLast

        public final IntegerVector selectLast​(int column)
        Returns an array that represents the last sorted element(s) of the given column number.
      • selectFirst

        public final IntegerVector selectFirst​(int column)
        Returns an array that represents the first sorted element(s) of the given column number.
      • selectRest

        public final IntegerVector selectRest​(int column)
        Returns an array that represents the rest of the sorted element(s) of the given column number. (not the 'first' set).
      • singleArrayCellMap

        private TObject[] singleArrayCellMap​(TObject cell)
        Convenience, returns a TObject[] array given a single TObject, or null if the TObject is null (not if TObject represents a null value).
      • getFirstCellContent

        public final TObject getFirstCellContent​(int column)
        Returns the TObject value that represents the first item in the set or null if there are no items in the column set.
      • getFirstCellContent

        public final TObject[] getFirstCellContent​(int[] col_map)
        Returns the TObject value that represents the first item in the set or null if there are no items in the column set.
      • getLastCellContent

        public final TObject getLastCellContent​(int column)
        Returns the TObject value that represents the last item in the set or null if there are no items in the column set.
      • getLastCellContent

        public final TObject[] getLastCellContent​(int[] col_map)
        Returns the TObject value that represents the last item in the set or null if there are no items in the column set.
      • getSingleCellContent

        public final TObject getSingleCellContent​(int column)
        If the given column contains all items of the same value, this method returns the value. If it doesn't, or the column set is empty it returns null.
      • getSingleCellContent

        public final TObject[] getSingleCellContent​(int[] col_map)
        If the given column contains all items of the same value, this method returns the value. If it doesn't, or the column set is empty it returns null.
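
        The single-value check can be sketched for one int column as follows. The names are hypothetical, and a boxed Integer is used so that null can signal both "empty" and "more than one distinct value", as in the description above.

```java
class SingleCellSketch {
    // Returns the shared value if every cell is equal, otherwise null.
    // An empty column also yields null.
    static Integer getSingleCellContent(int[] column) {
        if (column.length == 0) {
            return null;
        }
        for (int value : column) {
            if (value != column[0]) {
                return null;
            }
        }
        return column[0];
    }
}
```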
      • columnContainsCell

        public final boolean columnContainsCell​(int column,
                                                TObject cell)
        Returns true if the given cell is found in the table.
      • compareCells

        public static boolean compareCells​(TObject ob1,
                                           TObject ob2,
                                           Operator op)
        Compares ob1 with ob2 and returns true if the given operator evaluates to true, otherwise false.
      • toMap

        public java.util.Map toMap()
        Assuming this table is a 2 column key/value table, and the first column is a string, this will convert it into a map. The returned map can then be used to access values in the second column.
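
        A minimal sketch of the conversion, assuming the table is available as an array of two-element rows. The names are hypothetical, and letting a later row overwrite an earlier row with the same key is a choice made in this sketch only.

```java
import java.util.HashMap;
import java.util.Map;

class ToMapSketch {
    // Converts rows of {key, value} into a map keyed by the first column.
    static Map<String, Object> toMap(Object[][] rows) {
        Map<String, Object> map = new HashMap<>();
        for (Object[] row : rows) {
            if (row.length != 2) {
                throw new IllegalArgumentException("expected exactly 2 columns");
            }
            // The first column is assumed to hold a string key.
            map.put(String.valueOf(row[0]), row[1]);
        }
        return map;
    }
}
```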
      • fastFindFieldName

        public final int fastFindFieldName​(Variable col)
        A faster way to find a column index given a string column name. This caches column name -> column index in a HashMap.
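
        The caching idea can be sketched as follows. The field names echo col_name_lookup and COL_LOOKUP_LOCK from the Field Detail section, but the class, the String-based column names and the linear findFieldName are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ColumnLookupSketch {
    private final String[] columnNames;
    private final Map<String, Integer> colNameLookup = new HashMap<>();
    private final Object colLookupLock = new Object();
    int slowLookups = 0; // counts uncached lookups, for demonstration only

    ColumnLookupSketch(String... columnNames) {
        this.columnNames = columnNames;
    }

    // Linear scan over the column names; returns -1 if the name is absent.
    int findFieldName(String name) {
        slowLookups++;
        for (int i = 0; i < columnNames.length; i++) {
            if (columnNames[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Consults the cache first and falls back to the slow scan once per name.
    int fastFindFieldName(String name) {
        synchronized (colLookupLock) {
            Integer cached = colNameLookup.get(name);
            if (cached == null) {
                cached = findFieldName(name);
                colNameLookup.put(name, cached);
            }
            return cached;
        }
    }
}
```

        Repeated lookups of the same name hit the map rather than the scan, which is the saving the method description refers to.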
      • toString

        public java.lang.String toString()
        Returns a string that represents this table.
        Overrides:
        toString in class java.lang.Object
      • printGraph

        public void printGraph​(java.io.PrintStream out,
                               int indent)
        Prints a graph of the table hierarchy to the stream.