|
|
|
|
@ -3937,3 +3937,564 @@ TIP 6: Have you searched our list archives? |
|
|
|
|
|
|
|
|
|
http://archives.postgresql.org |
|
|
|
|
|
|
|
|
|
From pgsql-hackers-owner+M37860@postgresql.org Fri Apr 11 15:37:03 2003 |
|
|
|
|
Return-path: <pgsql-hackers-owner+M37860@postgresql.org> |
|
|
|
|
Received: from relay3.pgsql.com (relay3.pgsql.com [64.117.224.149]) |
|
|
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h3BJaxv13018 |
|
|
|
|
for <pgman@candle.pha.pa.us>; Fri, 11 Apr 2003 15:37:01 -0400 (EDT) |
|
|
|
|
Received: from postgresql.org (postgresql.org [64.49.215.8]) |
|
|
|
|
by relay3.pgsql.com (Postfix) with ESMTP |
|
|
|
|
id 3F9D0EA81E7; Fri, 11 Apr 2003 19:36:56 +0000 (GMT) |
|
|
|
|
X-Original-To: pgsql-hackers@postgresql.org |
|
|
|
|
Received: from spampd.localdomain (postgresql.org [64.49.215.8]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id D27B2476036 |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Fri, 11 Apr 2003 15:35:32 -0400 (EDT) |
|
|
|
|
Received: from mail1.ihs.com (mail1.ihs.com [170.207.70.222]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id 742DD475F5F |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Fri, 11 Apr 2003 15:35:31 -0400 (EDT) |
|
|
|
|
Received: from css120.ihs.com (css120.ihs.com [170.207.105.120]) |
|
|
|
|
by mail1.ihs.com (8.12.9/8.12.9) with ESMTP id h3BJZHRF027332; |
|
|
|
|
Fri, 11 Apr 2003 13:35:17 -0600 (MDT) |
|
|
|
|
Date: Fri, 11 Apr 2003 13:31:06 -0600 (MDT) |
|
|
|
|
From: "scott.marlowe" <scott.marlowe@ihs.com> |
|
|
|
|
To: Ron Peacetree <rjpeace@earthlink.net> |
|
|
|
|
cc: <pgsql-hackers@postgresql.org> |
|
|
|
|
Subject: Re: [HACKERS] Anyone working on better transaction locking? |
|
|
|
|
In-Reply-To: <eS0la.16229$ey1.1398978@newsread1.prod.itd.earthlink.net> |
|
|
|
|
Message-ID: <Pine.LNX.4.33.0304111314130.3232-100000@css120.ihs.com> |
|
|
|
|
MIME-Version: 1.0 |
|
|
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII |
|
|
|
|
X-MailScanner: Found to be clean |
|
|
|
|
X-Spam-Status: No, hits=-31.5 required=5.0 |
|
|
|
|
tests=BAYES_10,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, |
|
|
|
|
QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE |
|
|
|
|
autolearn=ham version=2.50 |
|
|
|
|
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) |
|
|
|
|
Precedence: bulk |
|
|
|
|
Sender: pgsql-hackers-owner@postgresql.org |
|
|
|
|
Status: OR |
|
|
|
|
|
|
|
|
|
On Wed, 9 Apr 2003, Ron Peacetree wrote: |
|
|
|
|
|
|
|
|
|
> "Andrew Sullivan" <andrew@libertyrms.info> wrote in message |
|
|
|
|
> news:20030409170926.GH2255@libertyrms.info... |
|
|
|
|
> > On Wed, Apr 09, 2003 at 05:41:06AM +0000, Ron Peacetree wrote: |
|
|
|
|
> > Nonsense. You explicitly made the MVCC comparison with Oracle, and |
|
|
|
|
> > are asking for a "better" locking mechanism without providing any |
|
|
|
|
> > evidence that PostgreSQL's is bad. |
|
|
|
|
> > |
|
|
|
|
> Just because someone else's is "better" does not mean PostgreSQL's is |
|
|
|
|
> "bad", and I've never said such. As I've said, I'll get back to Tom |
|
|
|
|
> and the list on this. |
|
|
|
|
|
|
|
|
|
But you didn't identify HOW it was better. I think that's the point |
|
|
|
|
being made. |
|
|
|
|
|
|
|
|
|
> > > Please see my posts with regards to ... |
|
|
|
|
> > |
|
|
|
|
> > I think your other posts were similar to the one which started this |
|
|
|
|
> > thread: full of mighty big pronouncements which turned out to depend |
|
|
|
|
> > on a bunch of not-so-tenable assumptions. |
|
|
|
|
> > |
|
|
|
|
> Hmmm. Well, I don't think of algorithm analysis by the likes of |
|
|
|
|
> Knuth, Sedgewick, Gonnet, and Baeza-Yates as being "not so tenable |
|
|
|
|
> assumptions", but YMMV. As for "mighty pronouncements", that also |
|
|
|
|
> seems a bit misleading since we are talking about quantifiable |
|
|
|
|
> programming and computer science issues, not unquantifiable things |
|
|
|
|
> like politics. |
|
|
|
|
|
|
|
|
|
But the real truth is revealed when the rubber hits the pavement. |
|
|
|
|
Remember that Linus Torvalds was roundly criticized for his choice of a |
|
|
|
|
monolithic development model for his kernel, and was literally told that |
|
|
|
|
his choice would restrict it to "toy" status and that no commercial OS could |
|
|
|
|
scale with a monolithic kernel. |
|
|
|
|
|
|
|
|
|
There's no shortage of people with good ideas, just people with the skills |
|
|
|
|
to implement those good ideas. If you've got a patch to apply that's been |
|
|
|
|
tested to show something is faster EVERYONE here wants to see it. |
|
|
|
|
|
|
|
|
|
If you've got a theory, no matter how well backed up by academic research, |
|
|
|
|
it's still just a theory.  Until someone writes the code to implement it, |
|
|
|
|
the gains are theoretical, and many things that MIGHT help don't because |
|
|
|
|
of the real world issues underlying your database, like I/O bandwidth or |
|
|
|
|
CPU <-> memory bandwidth. |
|
|
|
|
|
|
|
|
|
> > I'm sorry to be so cranky about this, but I get tired of having to |
|
|
|
|
> > defend one of my employer's core technologies from accusations based |
|
|
|
|
> > on half-truths and "everybody knows" assumptions. For instance, |
|
|
|
|
> > |
|
|
|
|
> Again, "accusations" is a bit strong. I thought the discussion was |
|
|
|
|
> about the technical merits and costs of various features and various |
|
|
|
|
> ways to implement them, particularly when this product must compete |
|
|
|
|
> for installed base with other solutions. Being coldly realistic about |
|
|
|
|
> what a product's strengths and weaknesses are is, again, just good |
|
|
|
|
> business. Sun Tzu's comment about knowing the enemy and yourself |
|
|
|
|
> seems appropriate here... |
|
|
|
|
|
|
|
|
|
No, you're wrong. Postgresql doesn't have to compete. It doesn't have to |
|
|
|
|
win.  It doesn't need a marketing department.  All those things are nice, |
|
|
|
|
and I'm glad if it does them, but it doesn't HAVE TO.  Postgresql has to |
|
|
|
|
work. It does that well. |
|
|
|
|
|
|
|
|
|
Postgresql CAN compete if someone wants to put the effort into competing, |
|
|
|
|
but it isn't a priority for me. Working is the priority, and if other |
|
|
|
|
people aren't smart enough to test Postgresql to see if it works for them, |
|
|
|
|
all the better, I keep my edge by having a near zero cost database engine, |
|
|
|
|
while the competition spends money on MSSQL or Oracle. |
|
|
|
|
|
|
|
|
|
Tom and Andrew ARE coldly realistic about the shortcomings of postgresql. |
|
|
|
|
It has issues, and things that need to be fixed. It needs more coders. |
|
|
|
|
It doesn't need every feature that Oracle or DB2 have. Heck some of their |
|
|
|
|
"features" would be considered a mis-feature in the Postgresql world. |
|
|
|
|
|
|
|
|
|
> > > I'll mention thread support in passing, |
|
|
|
|
> > |
|
|
|
|
> > there's actually a FAQ item about thread support, because in the |
|
|
|
|
> > opinion of those who have looked at it, the cost is just not worth |
|
|
|
|
> > the benefit. If you have evidence to the contrary (specific |
|
|
|
|
> > evidence, please, for this application), and have already read all |
|
|
|
|
> the |
|
|
|
|
> > previous discussion of the topic, perhaps people would be interested |
|
|
|
|
> in |
|
|
|
|
> > opening that debate again (though I have my doubts). |
|
|
|
|
> > |
|
|
|
|
> Zeus had a performance ceiling roughly 3x that of Apache when Zeus |
|
|
|
|
> supported threading as well as pre-forking and Apache only supported |
|
|
|
|
> pre forking. The Apache folks now support both. DB2, Oracle, and SQL |
|
|
|
|
> Server all use threads. Etc, etc. |
|
|
|
|
|
|
|
|
|
Yes, and if you configured your apache server to have 20 or 30 spare |
|
|
|
|
servers, in the real world, it was nearly neck and neck to Zeus, but since |
|
|
|
|
Zeus cost like $3,000 a copy, it is still cheaper to just overwhelm it |
|
|
|
|
with more servers running apache than to use zeus. |
|
|
|
|
|
|
|
|
|
> That's an awful lot of very bright programmers and some serious $$ |
|
|
|
|
> voting that threads are worth it. |
|
|
|
|
|
|
|
|
|
For THAT application.  For what a web server does, threads can be very |
|
|
|
|
useful, even useful enough to put up with the problems created by running |
|
|
|
|
threads on multiple threading libs on different OSes. |
|
|
|
|
|
|
|
|
|
Let me ask you, if Zeus scrams and crashes out, and it's installed |
|
|
|
|
properly so it just comes right back up, how much data can you lose? |
|
|
|
|
|
|
|
|
|
If Postgresql scrams and crashes out, how much data can you lose? |
|
|
|
|
|
|
|
|
|
> Given all that, if PostgreSQL |
|
|
|
|
> specific |
|
|
|
|
> thread support is =not= showing itself to be a win that's an |
|
|
|
|
> unexpected |
|
|
|
|
> enough outcome that we should be asking hard questions as to why not. |
|
|
|
|
|
|
|
|
|
There HAS been testing on threads in Postgresql. It has been covered to |
|
|
|
|
death. The fact that you're still arguing proves you likely haven't read |
|
|
|
|
the archive (google has it back to way back when, use that to look it up) |
|
|
|
|
about this subject. |
|
|
|
|
|
|
|
|
|
Threads COULD help on multi-sorted results, and a few other areas, but the |
|
|
|
|
increase in performance really wasn't that great for 95% of all the cases, |
|
|
|
|
and for the 5% it was, simple query planner improvements have provided far |
|
|
|
|
greater performance increases. |
|
|
|
|
|
|
|
|
|
The problem with threading is that we can either use the one process -> |
|
|
|
|
many thread design, which I personally don't trust for something like a |
|
|
|
|
database, or a process per backend connection which can run |
|
|
|
|
multi-threaded. This scenario makes Postgresql just as stable and |
|
|
|
|
reliable as it was as a multi-process app, but allows threaded performance |
|
|
|
|
in certain areas of the backend that are parallelizable to run in parallel |
|
|
|
|
on multi-CPU systems. |
|
|
|
|
|
|
|
|
|
The gain, again, is minimal, and on a system with many users accessing it, |
|
|
|
|
there is NO real world gain. |
|
|
|
|
|
|
|
|
|
> At their core, threads are a context switching efficiency tweak. |
|
|
|
|
|
|
|
|
|
Except that on the two OSes which Postgresql runs on the most, threads are |
|
|
|
|
really no faster than processes. In the Linux kernel, the only real |
|
|
|
|
difference is how the OS treats them; creation and destruction of threads |
|
|
|
|
versus processes is virtually identical there. |
|
|
|
|
|
|
|
|
|
> Certainly it's =possible= that threads have nothing to offer |
|
|
|
|
> PostgreSQL, but IMHO it's not =probable=. Just another thing for me |
|
|
|
|
> to add to my TODO heap for looking at... |
|
|
|
|
|
|
|
|
|
It's been tested, it didn't help a lot, and it made it MUCH harder to |
|
|
|
|
maintain, as threads in Linux are handled by a different lib than in say |
|
|
|
|
Solaris, or Windows or any other OS. I.e. you can't guarantee the thread |
|
|
|
|
lib you need will be there, and that there are no bugs. MySQL still has |
|
|
|
|
thread bug issues pop up, most of which are in the thread libs themselves. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)--------------------------- |
|
|
|
|
TIP 4: Don't 'kill -9' the postmaster |
|
|
|
|
|
|
|
|
|
From pgsql-hackers-owner+M37865@postgresql.org Fri Apr 11 17:34:21 2003 |
|
|
|
|
Return-path: <pgsql-hackers-owner+M37865@postgresql.org> |
|
|
|
|
Received: from relay1.pgsql.com (relay1.pgsql.com [64.49.215.129]) |
|
|
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h3BLYIv28485 |
|
|
|
|
for <pgman@candle.pha.pa.us>; Fri, 11 Apr 2003 17:34:19 -0400 (EDT) |
|
|
|
|
Received: from postgresql.org (postgresql.org [64.49.215.8]) |
|
|
|
|
by relay1.pgsql.com (Postfix) with ESMTP |
|
|
|
|
id 0AF036F77ED; Fri, 11 Apr 2003 17:34:19 -0400 (EDT) |
|
|
|
|
X-Original-To: pgsql-hackers@postgresql.org |
|
|
|
|
Received: from spampd.localdomain (postgresql.org [64.49.215.8]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id EBB41476323 |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Fri, 11 Apr 2003 17:33:02 -0400 (EDT) |
|
|
|
|
Received: from filer (12-234-86-219.client.attbi.com [12.234.86.219]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id CED7D4762E1 |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Fri, 11 Apr 2003 17:32:57 -0400 (EDT) |
|
|
|
|
Received: from localhost (localhost [127.0.0.1]) |
|
|
|
|
(uid 1000) |
|
|
|
|
by filer with local; Fri, 11 Apr 2003 14:32:59 -0700 |
|
|
|
|
Date: Fri, 11 Apr 2003 14:32:59 -0700 |
|
|
|
|
From: Kevin Brown <kevin@sysexperts.com> |
|
|
|
|
To: pgsql-hackers@postgresql.org |
|
|
|
|
Subject: Re: [HACKERS] Anyone working on better transaction locking? |
|
|
|
|
Message-ID: <20030411213259.GU1833@filer> |
|
|
|
|
Mail-Followup-To: Kevin Brown <kevin@sysexperts.com>, |
|
|
|
|
pgsql-hackers@postgresql.org |
|
|
|
|
References: <20030409170926.GH2255@libertyrms.info> <eS0la.16229$ey1.1398978@newsread1.prod.itd.earthlink.net> |
|
|
|
|
MIME-Version: 1.0 |
|
|
|
|
Content-Type: text/plain; charset=us-ascii |
|
|
|
|
Content-Transfer-Encoding: 7bit |
|
|
|
|
Content-Disposition: inline |
|
|
|
|
In-Reply-To: <eS0la.16229$ey1.1398978@newsread1.prod.itd.earthlink.net> |
|
|
|
|
User-Agent: Mutt/1.4i |
|
|
|
|
Organization: Frobozzco International |
|
|
|
|
X-Spam-Status: No, hits=-38.0 required=5.0 |
|
|
|
|
tests=BAYES_10,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, |
|
|
|
|
REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT |
|
|
|
|
autolearn=ham version=2.50 |
|
|
|
|
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) |
|
|
|
|
Precedence: bulk |
|
|
|
|
Sender: pgsql-hackers-owner@postgresql.org |
|
|
|
|
Status: OR |
|
|
|
|
|
|
|
|
|
Ron Peacetree wrote: |
|
|
|
|
> Zeus had a performance ceiling roughly 3x that of Apache when Zeus |
|
|
|
|
> supported threading as well as pre-forking and Apache only supported |
|
|
|
|
> pre forking. The Apache folks now support both. DB2, Oracle, and SQL |
|
|
|
|
> Server all use threads. Etc, etc. |
|
|
|
|
|
|
|
|
|
You can't use Apache as an example of why you should thread a database |
|
|
|
|
engine, except for the cases where the database is used much like the |
|
|
|
|
web server is: for numerous short transactions. |
|
|
|
|
|
|
|
|
|
> That's an awful lot of very bright programmers and some serious $$ |
|
|
|
|
> voting that threads are worth it. Given all that, if PostgreSQL |
|
|
|
|
> specific thread support is =not= showing itself to be a win that's |
|
|
|
|
> an unexpected enough outcome that we should be asking hard questions |
|
|
|
|
> as to why not. |
|
|
|
|
|
|
|
|
|
It's not that there won't be any performance benefits to be had from |
|
|
|
|
threading (there surely will, on some platforms), but gaining those |
|
|
|
|
benefits comes at a very high development and maintenance cost. You |
|
|
|
|
lose a *lot* of robustness when all of your threads share the same |
|
|
|
|
memory space, and make yourself vulnerable to classes of failures that |
|
|
|
|
simply don't happen when you don't have shared memory space. |
|
|
|
|
|
|
|
|
|
PostgreSQL is a compromise in this regard: it *does* share memory, but |
|
|
|
|
it only shares memory that has to be shared, and nothing else. To get |
|
|
|
|
the benefits of full-fledged threads, though, requires that all memory |
|
|
|
|
be shared (otherwise the OS has to tweak the page tables whenever it |
|
|
|
|
switches contexts between your threads). |
|
|
|
|
|
|
|
|
|
> At their core, threads are a context switching efficiency tweak. |
|
|
|
|
|
|
|
|
|
This is the heart of the matter. Context switching is an operating |
|
|
|
|
system problem, and *that* is where the optimization belongs. Threads |
|
|
|
|
exist in large part because operating system vendors didn't bother to |
|
|
|
|
do a good job of optimizing process context switching and |
|
|
|
|
creation/destruction. |
|
|
|
|
|
|
|
|
|
Under Linux, from what I've read, process creation/destruction and |
|
|
|
|
context switching happens almost as fast as thread context switching |
|
|
|
|
on other operating systems (Windows in particular, if I'm not |
|
|
|
|
mistaken). |
|
|
|
|
|
|
|
|
|
> Since DB's switch context a lot under many circumstances, threads |
|
|
|
|
> should be a win under such circumstances. At the least, it should be |
|
|
|
|
> helpful in situations where we have multiple CPUs to split query |
|
|
|
|
> execution between. |
|
|
|
|
|
|
|
|
|
This is true, but I see little reason that we can't do the same thing |
|
|
|
|
using fork()ed processes and shared memory instead. |
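
To make that concrete, here is a minimal sketch (not PostgreSQL code) of
splitting one job across fork()ed workers that hand their results back
through shared memory. It is written in Python for brevity and assumes a
Unix system; the names parallel_sum, workers and SLOT are hypothetical,
chosen only for this illustration.

```python
import mmap
import os
import struct

SLOT = struct.calcsize("q")  # one signed 64-bit result slot per worker


def parallel_sum(values, workers=2):
    """Sum `values` with fork()ed workers that report via shared memory."""
    # Anonymous mapping: on Unix it is MAP_SHARED by default, so pages
    # written by a child after fork() are visible to the parent.
    shared = mmap.mmap(-1, SLOT * workers)
    pids = []
    for w in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child: sum every `workers`-th element, store the partial
            # result in its own slot, and exit without parent cleanup.
            status = 1
            try:
                struct.pack_into("q", shared, w * SLOT, sum(values[w::workers]))
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    for pid in pids:
        # Wait for every worker before reading its slot.
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            raise RuntimeError("worker %d failed" % pid)
    total = sum(struct.unpack_from("q", shared, w * SLOT)[0]
                for w in range(workers))
    shared.close()
    return total
```

Only the result slots are shared; everything else in each worker stays
private, so a bug in one worker cannot scribble over another's memory.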
|
|
|
|
|
|
|
|
|
There is context switching within databases, to be sure, but I think |
|
|
|
|
you'll be hard pressed to demonstrate that it is anything more than an |
|
|
|
|
insignificant fraction of the total overhead incurred by the database. |
|
|
|
|
I strongly suspect that much larger gains are to be had by optimizing |
|
|
|
|
other areas of the database, such as the planner, the storage manager |
|
|
|
|
(using mmap for file handling may prove useful here), the shared |
|
|
|
|
memory system (mmap may be faster than System V style shared memory), |
|
|
|
|
etc. |
|
|
|
|
|
|
|
|
|
The big overhead in the process model on most platforms is in creation |
|
|
|
|
and destruction of processes. PostgreSQL has a relatively high |
|
|
|
|
connection startup cost. But there are ways of dealing with this |
|
|
|
|
problem other than threading, namely the use of a connection caching |
|
|
|
|
middleware layer. Such layers exist for databases other than |
|
|
|
|
PostgreSQL, so the high cost of fielding and setting up a database |
|
|
|
|
connection is *not* unique to PostgreSQL ... which suggests that while |
|
|
|
|
threading may help, it doesn't help *enough*. |
|
|
|
|
|
|
|
|
|
I'd rather see some development work go into a connection caching |
|
|
|
|
process that understands the PostgreSQL wire protocol well enough to |
|
|
|
|
look like a PostgreSQL backend to connecting processes, rather than |
|
|
|
|
see a much larger amount of effort be spent on converting PostgreSQL |
|
|
|
|
to a threaded architecture (and then discover that connection caching |
|
|
|
|
is still needed anyway). |
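
The core idea of connection caching can be sketched in a few lines. This
is only an in-process illustration in Python, not the wire-protocol-aware
middleware described above; ConnectionCache, connect and max_idle are
hypothetical names, and connect stands for whatever function opens a real
database connection.

```python
import queue
from contextlib import contextmanager


class ConnectionCache:
    """Keep idle connections around so the startup cost is paid once."""

    def __init__(self, connect, max_idle=4):
        self._connect = connect              # hypothetical connection factory
        self._idle = queue.LifoQueue(maxsize=max_idle)
        self.opened = 0                      # how many real connections were made

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get_nowait()   # reuse an idle connection if any
        except queue.Empty:
            conn = self._connect()           # otherwise pay the startup cost
            self.opened += 1
        try:
            yield conn
        finally:
            # A real cache would reset or validate the connection here.
            try:
                self._idle.put_nowait(conn)
            except queue.Full:
                close = getattr(conn, "close", None)
                if close is not None:
                    close()
```

Two consecutive "with cache.connection() as conn:" blocks would then share
one underlying connection instead of opening two.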
|
|
|
|
|
|
|
|
|
> Certainly it's =possible= that threads have nothing to offer |
|
|
|
|
> PostgreSQL, but IMHO it's not =probable=. Just another thing for me |
|
|
|
|
> to add to my TODO heap for looking at... |
|
|
|
|
|
|
|
|
|
It's not that threads don't have anything to offer. It's that the |
|
|
|
|
costs associated with them are high enough that it's not at all clear |
|
|
|
|
that they're an overall win. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
-- |
|
|
|
|
Kevin Brown kevin@sysexperts.com |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)--------------------------- |
|
|
|
|
TIP 6: Have you searched our list archives? |
|
|
|
|
|
|
|
|
|
http://archives.postgresql.org |
|
|
|
|
|
|
|
|
|
From pgsql-hackers-owner+M37876@postgresql.org Sat Apr 12 06:56:17 2003 |
|
|
|
|
Return-path: <pgsql-hackers-owner+M37876@postgresql.org> |
|
|
|
|
Received: from relay3.pgsql.com (relay3.pgsql.com [64.117.224.149]) |
|
|
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h3CAuDS20700 |
|
|
|
|
for <pgman@candle.pha.pa.us>; Sat, 12 Apr 2003 06:56:15 -0400 (EDT) |
|
|
|
|
Received: from postgresql.org (postgresql.org [64.49.215.8]) |
|
|
|
|
by relay3.pgsql.com (Postfix) with ESMTP |
|
|
|
|
id 35797EA81FF; Sat, 12 Apr 2003 10:55:59 +0000 (GMT) |
|
|
|
|
X-Original-To: pgsql-hackers@postgresql.org |
|
|
|
|
Received: from spampd.localdomain (postgresql.org [64.49.215.8]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id 7393E4762EF |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Sat, 12 Apr 2003 06:54:48 -0400 (EDT) |
|
|
|
|
Received: from filer (12-234-86-219.client.attbi.com [12.234.86.219]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id 423294762E1 |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Sat, 12 Apr 2003 06:54:44 -0400 (EDT) |
|
|
|
|
Received: from localhost (localhost [127.0.0.1]) |
|
|
|
|
(uid 1000) |
|
|
|
|
by filer with local; Sat, 12 Apr 2003 03:54:52 -0700 |
|
|
|
|
Date: Sat, 12 Apr 2003 03:54:52 -0700 |
|
|
|
|
From: Kevin Brown <kevin@sysexperts.com> |
|
|
|
|
To: pgsql-hackers@postgresql.org |
|
|
|
|
Subject: Re: [HACKERS] Anyone working on better transaction locking? |
|
|
|
|
Message-ID: <20030412105452.GV1833@filer> |
|
|
|
|
Mail-Followup-To: Kevin Brown <kevin@sysexperts.com>, |
|
|
|
|
pgsql-hackers@postgresql.org |
|
|
|
|
References: <20030409170926.GH2255@libertyrms.info> <eS0la.16229$ey1.1398978@newsread1.prod.itd.earthlink.net> <20030411213259.GU1833@filer> <200304121221.12377.shridhar_daithankar@nospam.persistent.co.in> |
|
|
|
|
MIME-Version: 1.0 |
|
|
|
|
Content-Type: text/plain; charset=us-ascii |
|
|
|
|
Content-Transfer-Encoding: 7bit |
|
|
|
|
Content-Disposition: inline |
|
|
|
|
In-Reply-To: <200304121221.12377.shridhar_daithankar@nospam.persistent.co.in> |
|
|
|
|
User-Agent: Mutt/1.4i |
|
|
|
|
Organization: Frobozzco International |
|
|
|
|
X-Spam-Status: No, hits=-39.4 required=5.0 |
|
|
|
|
tests=BAYES_01,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, |
|
|
|
|
QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT |
|
|
|
|
autolearn=ham version=2.50 |
|
|
|
|
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) |
|
|
|
|
Precedence: bulk |
|
|
|
|
Sender: pgsql-hackers-owner@postgresql.org |
|
|
|
|
Status: OR |
|
|
|
|
|
|
|
|
|
Shridhar Daithankar wrote: |
|
|
|
|
> Apache does too many things to be a speed daemon and what it offers |
|
|
|
|
> is pretty impressive from performance POV. |
|
|
|
|
> |
|
|
|
|
> But a database is not a webserver. It is not supposed to handle tons of |
|
|
|
|
> concurrent requests. That is a fundamental difference. |
|
|
|
|
|
|
|
|
|
I'm not sure I necessarily agree with this. A database is just a |
|
|
|
|
tool, a means of reliably storing information in such a way that it |
|
|
|
|
can be retrieved quickly. Whether or not it "should" handle lots of |
|
|
|
|
concurrent requests is a question that the person trying to use it |
|
|
|
|
must answer. |
|
|
|
|
|
|
|
|
|
A better answer is that a database engine that can handle lots of |
|
|
|
|
concurrent requests can also handle a smaller number, but not vice |
|
|
|
|
versa. So it's clearly an advantage to have a database engine that |
|
|
|
|
can handle lots of concurrent requests because such an engine can be |
|
|
|
|
applied to a larger number of problems. That is, of course, assuming |
|
|
|
|
that all other things are equal... |
|
|
|
|
|
|
|
|
|
There are situations in which a database would have to handle a lot of |
|
|
|
|
concurrent requests. Handling ATM transactions over a large area is |
|
|
|
|
one such situation. A database with current weather information might |
|
|
|
|
be another, if it is actively queried by clients all over the country. |
|
|
|
|
Acting as a mail store for a large organization is another. And, of |
|
|
|
|
course, acting as a filesystem is definitely another. :-) |
|
|
|
|
|
|
|
|
|
> Well. Threading does not necessarily imply one thread per connection |
|
|
|
|
> model. Threading can be used to make CPU work during I/O and taking |
|
|
|
|
> advantage of SMP for things like sort etc. This is especially true |
|
|
|
|
> for 2.4.x linux kernels where async I/O can not be used for threaded |
|
|
|
|
> apps. as threads and signal do not mix together well. |
|
|
|
|
|
|
|
|
|
This is true, but whether you choose to limit the use of threads to a |
|
|
|
|
few specific situations or use them throughout the database, the |
|
|
|
|
dangers and difficulties faced by the developers when using threads |
|
|
|
|
will be the same. |
|
|
|
|
|
|
|
|
|
> One connection per thread is not a good model for postgresql since |
|
|
|
|
> it has already built a robust product around process paradigm. If I |
|
|
|
|
> have to start a new database project today, a mix of process+thread |
|
|
|
|
> is what I would choose, but postgresql is not in the same stage of life. |
|
|
|
|
|
|
|
|
|
Certainly there are situations for which it would be advantageous to |
|
|
|
|
have multiple concurrent actions happening on behalf of a single |
|
|
|
|
connection, as you say. But that doesn't automatically mean that a |
|
|
|
|
thread is the best overall solution. On systems such as Linux that |
|
|
|
|
have fast process handling, processes are almost certainly the way to |
|
|
|
|
go. On other systems such as Solaris or Windows, threads might be the |
|
|
|
|
right answer (on Windows they might be the *only* answer). But my |
|
|
|
|
argument here is simple: the responsibility of optimizing process |
|
|
|
|
handling belongs to the maintainers of the OS. Application developers |
|
|
|
|
shouldn't have to worry about this stuff. |
|
|
|
|
|
|
|
|
|
Of course, back here in the real world they *do* have to worry about |
|
|
|
|
this stuff, and that's why it's important to quantify the problem. |
|
|
|
|
It's not sufficient to say that "processes are slow and threads are |
|
|
|
|
fast". Processes on the target platform may well be slow relative to |
|
|
|
|
other systems (and relative to threads). But the question is: for the |
|
|
|
|
problem being solved, how much overhead does process handling |
|
|
|
|
represent relative to the total amount of overhead the solution itself |
|
|
|
|
incurs? |
|
|
|
|
|
|
|
|
|
For instance, if we're talking about addressing the problem of |
|
|
|
|
distributing sorts across multiple CPUs, the amount of overhead |
|
|
|
|
involved in doing disk activity while sorting could easily swamp, in |
|
|
|
|
the typical case, the overhead involved in creating parallel processes |
|
|
|
|
to do the sorts themselves. And if that's the case, you may as well |
|
|
|
|
gain the benefits of using full-fledged processes rather than deal |
|
|
|
|
with the problems that come with the use of threads -- because the |
|
|
|
|
gains to be found by using threads will be small in relative terms. |
|
|
|
|
|
|
|
|
|
> > > At their core, threads are a context switching efficiency tweak. |
|
|
|
|
> > |
|
|
|
|
> > This is the heart of the matter. Context switching is an operating |
|
|
|
|
> > system problem, and *that* is where the optimization belongs. Threads |
|
|
|
|
> > exist in large part because operating system vendors didn't bother to |
|
|
|
|
> > do a good job of optimizing process context switching and |
|
|
|
|
> > creation/destruction. |
|
|
|
|
> |
|
|
|
|
> But why would a database need tons of context switches if it is |
|
|
|
|
> not supposed to service loads of requests simultaneously? If there are |
|
|
|
|
> 50 concurrent connections, how much context switching overhead is |
|
|
|
|
> involved regardless of amount of work done in a single connection? |
|
|
|
|
> Remember that database state is maintained in shared memory. It does |
|
|
|
|
> not take a context switch to access it. |
|
|
|
|
|
|
|
|
|
If there are 50 concurrent connections with one process per |
|
|
|
|
connection, then there are 50 database processes. The context switch |
|
|
|
|
overhead is incurred whenever the current process blocks (or exhausts |
|
|
|
|
its time slice) and the OS activates a different process. Since |
|
|
|
|
database handling is generally rather I/O intensive as services go, |
|
|
|
|
relatively few of those 50 processes are likely to be in a runnable |
|
|
|
|
state, so I would expect the overall hit from context switching to be |
|
|
|
|
rather low -- I'd expect the I/O subsystem to fall over well before |
|
|
|
|
context switching became a real issue. |
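
One way to check this kind of expectation rather than guess is to ask the
OS how many context switches a workload actually incurred. The sketch
below uses Python's resource.getrusage on a Unix system;
count_context_switches and blocking_workload are hypothetical names, and
the sleep loop is only a stand-in for real I/O waits.

```python
import resource
import time


def count_context_switches(workload):
    """Return (voluntary, involuntary) context switches incurred by workload()."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    workload()
    after = resource.getrusage(resource.RUSAGE_SELF)
    # ru_nvcsw: the process gave up the CPU itself (typically it blocked).
    # ru_nivcsw: the kernel preempted it (e.g. its time slice ran out).
    return (after.ru_nvcsw - before.ru_nvcsw,
            after.ru_nivcsw - before.ru_nivcsw)


def blocking_workload():
    # Stand-in for I/O waits: each sleep blocks the process.
    for _ in range(20):
        time.sleep(0.001)
```

The two counters correspond to the two cases above: blocking shows up as
voluntary switches, an exhausted time slice as involuntary ones.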
|
|
|
|
|
|
|
|
|
Of course, all of that is independent of whether or not the database |
|
|
|
|
can handle a lot of simultaneous requests. |
|
|
|
|
|
|
|
|
|
> > Under Linux, from what I've read, process creation/destruction and |
|
|
|
|
> > context switching happens almost as fast as thread context switching |
|
|
|
|
> > on other operating systems (Windows in particular, if I'm not |
|
|
|
|
> > mistaken). |
|
|
|
|
> |
|
|
|
|
> I hear solaris also has very heavy processes. But postgresql has |
|
|
|
|
> other issues with solaris as well. |
|
|
|
|
|
|
|
|
|
Yeah, I didn't want to mention Solaris because I haven't kept up with |
|
|
|
|
it and thought that perhaps they had fixed this... |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
-- |
|
|
|
|
Kevin Brown kevin@sysexperts.com |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)--------------------------- |
|
|
|
|
TIP 2: you can get off all lists at once with the unregister command |
|
|
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
|
|
|
|
|
|
|
|
|
From pgsql-hackers-owner+M37883@postgresql.org Sat Apr 12 16:09:19 2003 |
|
|
|
|
Return-path: <pgsql-hackers-owner+M37883@postgresql.org> |
|
|
|
|
Received: from relay1.pgsql.com (relay1.pgsql.com [64.49.215.129]) |
|
|
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h3CK9HS03520 |
|
|
|
|
for <pgman@candle.pha.pa.us>; Sat, 12 Apr 2003 16:09:18 -0400 (EDT) |
|
|
|
|
Received: from postgresql.org (postgresql.org [64.49.215.8]) |
|
|
|
|
by relay1.pgsql.com (Postfix) with ESMTP |
|
|
|
|
id 507626F768B; Sat, 12 Apr 2003 16:09:01 -0400 (EDT) |
|
|
|
|
X-Original-To: pgsql-hackers@postgresql.org |
|
|
|
|
Received: from spampd.localdomain (postgresql.org [64.49.215.8]) |
|
|
|
|
by postgresql.org (Postfix) with ESMTP id 06543475AE4 |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Sat, 12 Apr 2003 16:08:03 -0400 (EDT) |
|
|
|
|
Received: from mail.gmx.net (mail.gmx.net [213.165.65.60]) |
|
|
|
|
by postgresql.org (Postfix) with SMTP id C6DC347580B |
|
|
|
|
for <pgsql-hackers@postgresql.org>; Sat, 12 Apr 2003 16:08:01 -0400 (EDT) |
|
|
|
|
Received: (qmail 31386 invoked by uid 65534); 12 Apr 2003 20:08:13 -0000 |
|
|
|
|
Received: from chello062178186201.1.15.tuwien.teleweb.at (EHLO beeblebrox) (62.178.186.201) |
|
|
|
|
by mail.gmx.net (mp001-rz3) with SMTP; 12 Apr 2003 22:08:13 +0200 |
|
|
|
|
Message-ID: <01cc01c3012f$526aaf80$3201a8c0@beeblebrox> |
|
|
|
|
From: "Michael Paesold" <mpaesold@gmx.at> |
|
|
|
|
To: "Neil Conway" <neilc@samurai.com>, "Kevin Brown" <kevin@sysexperts.com> |
|
|
|
|
cc: "PostgreSQL Hackers" <pgsql-hackers@postgresql.org> |
|
|
|
|
References: <20030409170926.GH2255@libertyrms.info> <eS0la.16229$ey1.1398978@newsread1.prod.itd.earthlink.net> <20030411213259.GU1833@filer> <1050175777.392.13.camel@tokyo> |
|
|
|
|
Subject: Re: [HACKERS] Anyone working on better transaction locking? |
|
|
|
|
Date: Sat, 12 Apr 2003 22:08:40 +0200 |
|
|
|
|
MIME-Version: 1.0 |
|
|
|
|
Content-Type: text/plain; |
|
|
|
|
charset="Windows-1252" |
|
|
|
|
Content-Transfer-Encoding: 7bit |
|
|
|
|
X-Priority: 3 |
|
|
|
|
X-MSMail-Priority: Normal |
|
|
|
|
X-Mailer: Microsoft Outlook Express 6.00.2800.1106 |
|
|
|
|
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 |
|
|
|
|
X-Spam-Status: No, hits=-25.8 required=5.0 |
|
|
|
|
tests=BAYES_20,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,REFERENCES, |
|
|
|
|
REPLY_WITH_QUOTES |
|
|
|
|
autolearn=ham version=2.50 |
|
|
|
|
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) |
|
|
|
|
Precedence: bulk |
|
|
|
|
Sender: pgsql-hackers-owner@postgresql.org |
|
|
|
|
Status: OR |
|
|
|
|
|
|
|
|
|
Neil Conway wrote: |
|
|
|
|
|
|
|
|
|
> Furthermore, IIRC PostgreSQL's relatively slow connection creation time |
|
|
|
|
> has as much to do with other per-backend initialization work as it does |
|
|
|
|
> with the time to actually fork() a new backend. If there is interest in |
|
|
|
|
> optimizing backend startup time, my guess would be that there is plenty |
|
|
|
|
> of room for improvement without requiring the replacement of processes |
|
|
|
|
> with threads. |
|
|
|
|
|
|
|
|
|
I see there is a whole TODO Chapter devoted to the topic. There is the idea |
|
|
|
|
of pre-forked and persistent backends. That would be very useful in an |
|
|
|
|
environment where it's quite hard to use connection pooling. We are |
|
|
|
|
currently working on a mail system for a free webmail. The mda (mail |
|
|
|
|
delivery agent) written in C connects to the pg database to do some queries |
|
|
|
|
every time a new mail comes in. I didn't find a solution for connection |
|
|
|
|
pooling yet. |
|
|
|
|
|
|
|
|
|
About the TODO items, apache has a nice description of their accept() |
|
|
|
|
serialization: |
|
|
|
|
http://httpd.apache.org/docs-2.0/misc/perf-tuning.html |
|
|
|
|
|
|
|
|
|
Perhaps this could be useful if someone decided to start implementing those |
|
|
|
|
features. |
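
For readers unfamiliar with the pre-fork idea, here is a minimal sketch
in Python (Unix only). It is not how PostgreSQL or Apache implement it;
start_prefork_workers, workers and requests_per_worker are hypothetical
names. The workers are forked before any client connects and all call
accept() on the same listening socket.

```python
import os
import socket


def start_prefork_workers(workers=2, requests_per_worker=1):
    """Fork workers that all accept() on one shared listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    port = listener.getsockname()[1]
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child: started before any client arrives, so the fork cost
            # is paid once rather than per connection.
            try:
                for _served in range(requests_per_worker):
                    conn, _addr = listener.accept()
                    with conn:
                        conn.sendall(b"served by %d\n" % os.getpid())
            finally:
                os._exit(0)
        pids.append(pid)
    listener.close()  # the parent itself never accepts
    return port, pids
```

This sketch simply lets the kernel hand each connection to one waiting
worker; it does not implement the accept() serialization that the Apache
document discusses.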
|
|
|
|
|
|
|
|
|
Regards, |
|
|
|
|
Michael Paesold |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)--------------------------- |
|
|
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org |
|
|
|
|
|
|
|
|
|
|