Deleting Tuples from Relations Oct 17, 1995 ------------------------------ Background ---------- Relations store tuples in their indexes. All relations always have at least one index, which is needed to store the tuples. New indexes can be created and old indexes deleted at any time. Multiple indexes on a relation all use the same tuples, ie the data is only stored once. By default, a relation "owns" the tuples stored within it, and should delete any remaining tuples when it is deleted. A relation either owns all tuples within it, or owns none (and "borrows" them all from another relation.) Temporary relations which "borrow" tuples from an owning relation are allowed; but the owning relation remains unaware of them, and it is the responsibilty of the code using the borrowing relation to ensure that its indexes will not contain any dangling pointers to tuples that have been deleted. If this is not possible, a "borrower list" could be set-up in the owning relation, and all borrowers would be informed of deletes which they would have to perform in a non-lazy fashion; but nothing will be done in this area until we find out how common these borrowing relations are. If a relation cannot be a borrowing relation, it must duplicate the tuples and become an owning relation. Tuples are deleted from indexes in a lazy fashion. During a scan of an index, if a deleted tuple is found, it is deleted from that index as long as this scan was the only one currently open over that index. Plan ---- Assumption: Inserts are frequent, deletes are rare. A delete of a single tuple currently only occurs in programs using grouping, head deletes, or at an explicit request from the user. A large majority of programs only delete entire relations at once, when they are done with them. Programs doing only inserts should not be penalized because other programs want to perform single tuple deletes. We use reference counting, but only _deleted_ tuples are reference-counted. Each tuple has two relevant fields: unsigned int deleted : 1; unsigned short del_ref_cnt; When a tuple is deleted, the 'deleted' bit of the tuple is set and the del_ref_cnt field is set to the number of indexes currently being used. Lazy delete: When the deleted tuple is encountered during a scan of an index, the HashNode container is deleted and the tuple's 'del_ref_cnt' field is decremented. If the ref count is zero, then the actual tuples is deleted as well. Narrowly Avoided Problems: ------------------------- 0) Deleting a relation (all indexes) at once (ie, in ~IndexSet() ): All the indexes but one would be deleted as normal, allowing them a "final" scan of the relation. The last index will use a special version of delete which ignores the reference counts and simply deletes all tuples in this index. Since this index is deleted last, any tuples which were previously deleted from this index but were present in other indexes have already been deleted. 1) Deleting an index that contains tuples marked as deleted: The deleting of an index will also perform these operations if the tuple is marked as deleted, so this would simply be the "final" scan of that index. 2) Adding an index after a tuple is marked as deleted: The tuple will not be inserted into this new index, and so its ref count will still be correct. -----------------------------