/*-------------------------------------------------------------------------
 *
 * nodeSeqscan.c
 *	  Support routines for sequential scans of relations.
 *
 * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *	  src/backend/executor/nodeSeqscan.c
 *
 *-------------------------------------------------------------------------
 */
/*
 * INTERFACE ROUTINES
 *		ExecSeqScan				sequentially scans a relation.
 *		ExecSeqNext				retrieve next tuple in sequential order.
 *		ExecInitSeqScan			creates and initializes a seqscan node.
 *		ExecEndSeqScan			releases any storage allocated.
 *		ExecReScanSeqScan		rescans the relation
 *		ExecSeqMarkPos			marks scan position
 *		ExecSeqRestrPos			restores scan position
 */
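/*
 * Illustrative note (not part of the original interface comment): the
 * executor normally reaches these routines through its generic dispatch
 * layer, roughly as follows:
 *
 *		scanstate = ExecInitSeqScan(plan, estate, eflags);	-- at startup
 *		slot = ExecSeqScan(scanstate);						-- once per tuple,
 *															   via ExecProcNode()
 *		ExecEndSeqScan(scanstate);							-- at shutdown
 *
 * ExecReScanSeqScan restarts the scan (e.g. when this node is the inner
 * side of a nestloop), while ExecSeqMarkPos/ExecSeqRestrPos provide the
 * mark/restore support used by merge joins.
 */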

#include "postgres.h"

#include "access/heapam.h"
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeSeqscan.h"
#include "storage/predicate.h"

static void InitScanRelation(SeqScanState *node, EState *estate);
static TupleTableSlot *SeqNext(SeqScanState *node);

/* ----------------------------------------------------------------
 *						Scan Support
 * ----------------------------------------------------------------
 */

/* ----------------------------------------------------------------
 *		SeqNext
 *
 *		This is a workhorse for ExecSeqScan
 * ----------------------------------------------------------------
 */
static TupleTableSlot *
SeqNext(SeqScanState *node)
{
	HeapTuple	tuple;
	HeapScanDesc scandesc;
	EState	   *estate;
	ScanDirection direction;
	TupleTableSlot *slot;

	/*
	 * get information from the estate and scan state
	 */
	scandesc = node->ss_currentScanDesc;
	estate = node->ps.state;
	direction = estate->es_direction;
	slot = node->ss_ScanTupleSlot;

	/*
	 * get the next tuple from the table
	 */
	tuple = heap_getnext(scandesc, direction);

	/*
	 * save the tuple and the buffer returned to us by the access methods in
	 * our scan tuple slot and return the slot.  Note: we pass 'false' because
	 * tuples returned by heap_getnext() are pointers onto disk pages and were
	 * not created with palloc() and so should not be pfree()'d.  Note also
	 * that ExecStoreTuple will increment the refcount of the buffer; the
	 * refcount will not be dropped until the tuple table slot is cleared.
	 */
	if (tuple)
		ExecStoreTuple(tuple,	/* tuple to store */
					   slot,	/* slot to store in */
					   scandesc->rs_cbuf,		/* buffer associated with this
												 * tuple */
					   false);	/* don't pfree this pointer */
	else
		ExecClearTuple(slot);

	return slot;
}

/*
 * SeqRecheck -- access method routine to recheck a tuple in EvalPlanQual
 */
static bool
SeqRecheck(SeqScanState *node, TupleTableSlot *slot)
{
	/*
	 * Note that unlike IndexScan, SeqScan never uses keys in heap_beginscan
	 * (and this is very bad) - so, here we do not check whether the keys are
	 * ok or not.
	 */
	return true;
}

/* ----------------------------------------------------------------
 *		ExecSeqScan(node)
 *
 *		Scans the relation sequentially and returns the next qualifying
 *		tuple.
 *		We call the ExecScan() routine and pass it the appropriate
 *		access method functions.
 *		For serializable transactions, we first acquire a predicate
 *		lock on the entire relation.
 * ----------------------------------------------------------------
 */
TupleTableSlot *
ExecSeqScan(SeqScanState *node)
{
	PredicateLockRelation(node->ss_currentRelation);
	node->ss_currentScanDesc->rs_relpredicatelocked = true;
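
	/*
	 * (Illustrative note) Setting rs_relpredicatelocked lets the heap access
	 * method skip taking per-tuple predicate locks as tuples are returned:
	 * the relation-level predicate lock acquired above already covers every
	 * tuple this scan can see.
	 */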

	return ExecScan((ScanState *) node,
					(ExecScanAccessMtd) SeqNext,
					(ExecScanRecheckMtd) SeqRecheck);
}

/* ----------------------------------------------------------------
 *		InitScanRelation
 *
 *		This does the initialization for scan relations and
 *		subplans of scans.
 * ----------------------------------------------------------------
 */
static void
InitScanRelation(SeqScanState *node, EState *estate)
{
	Relation	currentRelation;
	HeapScanDesc currentScanDesc;

	/*
	 * get the relation object id from the relid'th entry in the range table,
	 * open that relation and acquire appropriate lock on it.
	 */
	currentRelation = ExecOpenScanRelation(estate,
									 ((SeqScan *) node->ps.plan)->scanrelid);

	currentScanDesc = heap_beginscan(currentRelation,
									 estate->es_snapshot,
									 0,
									 NULL);
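
	/*
	 * (Illustrative note) The last two heap_beginscan arguments mean "no
	 * scan keys": a sequential scan never filters tuples at the access
	 * method level.  Any qual attached to the plan node is evaluated later,
	 * in ExecScan().
	 */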

	node->ss_currentRelation = currentRelation;
	node->ss_currentScanDesc = currentScanDesc;

	ExecAssignScanType(node, RelationGetDescr(currentRelation));
}

/* ----------------------------------------------------------------
 *		ExecInitSeqScan
 * ----------------------------------------------------------------
 */
SeqScanState *
ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
{
	SeqScanState *scanstate;

	/*
	 * Once upon a time it was possible to have an outerPlan of a SeqScan, but
	 * not any more.
	 */
	Assert(outerPlan(node) == NULL);
	Assert(innerPlan(node) == NULL);

	/*
	 * create state structure
	 */
	scanstate = makeNode(SeqScanState);
	scanstate->ps.plan = (Plan *) node;
	scanstate->ps.state = estate;

	/*
	 * Miscellaneous initialization
	 *
	 * create expression context for node
	 */
	ExecAssignExprContext(estate, &scanstate->ps);

	/*
	 * initialize child expressions
	 */
	scanstate->ps.targetlist = (List *)
		ExecInitExpr((Expr *) node->plan.targetlist,
					 (PlanState *) scanstate);
	scanstate->ps.qual = (List *)
		ExecInitExpr((Expr *) node->plan.qual,
					 (PlanState *) scanstate);

	/*
	 * tuple table initialization
	 */
	ExecInitResultTupleSlot(estate, &scanstate->ps);
	ExecInitScanTupleSlot(estate, scanstate);

	/*
	 * initialize scan relation
	 */
	InitScanRelation(scanstate, estate);

	scanstate->ps.ps_TupFromTlist = false;

	/*
	 * Initialize result tuple type and projection info.
	 */
	ExecAssignResultTypeFromTL(&scanstate->ps);
	ExecAssignScanProjectionInfo(scanstate);

	return scanstate;
}

/* ----------------------------------------------------------------
 *		ExecEndSeqScan
 *
 *		frees any storage allocated through C routines.
 * ----------------------------------------------------------------
 */
void
ExecEndSeqScan(SeqScanState *node)
{
	Relation	relation;
	HeapScanDesc scanDesc;

	/*
	 * get information from node
	 */
	relation = node->ss_currentRelation;
	scanDesc = node->ss_currentScanDesc;

	/*
	 * Free the exprcontext
	 */
	ExecFreeExprContext(&node->ps);

	/*
	 * clean out the tuple table
	 */
	ExecClearTuple(node->ps.ps_ResultTupleSlot);
	ExecClearTuple(node->ss_ScanTupleSlot);

	/*
	 * close heap scan
	 */
	heap_endscan(scanDesc);

	/*
	 * close the heap relation.
	 */
	ExecCloseScanRelation(relation);
}

/* ----------------------------------------------------------------
 *						Join Support
 * ----------------------------------------------------------------
 */

/* ----------------------------------------------------------------
 *		ExecReScanSeqScan
 *
 *		Rescans the relation.
 * ----------------------------------------------------------------
 */
void
ExecReScanSeqScan(SeqScanState *node)
{
	HeapScanDesc scan;

	scan = node->ss_currentScanDesc;

	heap_rescan(scan,			/* scan desc */
				NULL);			/* new scan keys */
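
	/*
	 * (Illustrative note) ExecScanReScan, called below, resets the generic
	 * ScanState machinery; in particular, during an EvalPlanQual recheck it
	 * re-arms the EPQ test tuple for this scan's range-table entry so the
	 * rechecking plan can return it again.
	 */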
	ExecScanReScan((ScanState *) node);
}

/* ----------------------------------------------------------------
 *		ExecSeqMarkPos(node)
 *
 *		Marks scan position.
 * ----------------------------------------------------------------
 */
void
ExecSeqMarkPos(SeqScanState *node)
{
	HeapScanDesc scan = node->ss_currentScanDesc;

	heap_markpos(scan);
}

/* ----------------------------------------------------------------
 *		ExecSeqRestrPos
 *
 *		Restores scan position.
 * ----------------------------------------------------------------
 */
void
ExecSeqRestrPos(SeqScanState *node)
{
	HeapScanDesc scan = node->ss_currentScanDesc;

	/*
	 * Clear any reference to the previously returned tuple.  This is needed
	 * because the slot is simply pointing at scan->rs_cbuf, which
	 * heap_restrpos will change; we'd have an internally inconsistent slot if
	 * we didn't do this.
	 */
	ExecClearTuple(node->ss_ScanTupleSlot);

	heap_restrpos(scan);
}