Discussion:
On the uses of semaphore sets
Louis Munro
2010-09-16 17:46:11 UTC
Hi,
I'm trying to understand the relationship between semaphores, semaphore sets and the relevant resource controls in Solaris 10.

My current understanding is that the total number of semaphores available to a project is the product of
project.max-sem-ids * process.max-sem-nsems (number of sets * number of semaphores per set).
Is that correct?

I also understand that when a set is full, a request for an additional semaphore will be allocated from another set if there is one available.
Again, is that right?

Last, if that is the case, why do we have sets to begin with?
Why not use a single big pool of semaphores?
There must be some use for the sets, or else I don't see why they would still be around, but I can't figure out a case where a large number of small sets would be better than a small number of large sets.
Does anyone know?

Thank you for your advice,

Louis
--
This message posted from opensolaris.org
Jason
2010-09-16 18:17:27 UTC
Post by Louis Munro
Hi,
I'm trying to understand the relationship between semaphores, semaphore sets and the relevant resource controls in Solaris 10.
My current understanding is that the total number of semaphores available to a project is the product of
project.max-sem-ids * process.max-sem-nsems (number of sets * number of semaphores per set).
Is that correct?
That is my understanding.
Post by Louis Munro
I also understand that when a set is full, a request for an additional semaphore will be allocated from another set if there is one available.
Again, is that right?
Not sure, but seems likely. I suspect it's probably sequential
through each set.
Post by Louis Munro
Last, if that is the case, why do we have sets to begin with?
I'm not familiar enough with the history of them wrt implementation,
however on Solaris there is some housekeeping within the kernel
associated with each set (since they are visible to potentially any
process), and with that comes locks on the data structures used to
house them in the kernel, which IIRC are one lock per set.
Post by Louis Munro
Why not use a single big pool of semaphores?
Due to the above, a single set of semaphores means that all semaphore
operations will be serialized.
Post by Louis Munro
There must be some use for the sets, or else I don't see why they would still be around, but I can't figure out a case where a large number of small sets would be better than a small number of large sets.
Does anyone know?
Increased parallelism on semaphore operations (at least on Solaris,
other platforms may differ). Probably the best-known user of
SysV semaphores is Oracle's RDBMS, which provides a good example. I
know of sites that have hit what is arguably a misconfiguration here,
one that made performance terrible (especially on larger machines).
The resource limits were set such that Oracle allocated all
its semaphores in a single set (causing the serialization I
mentioned). Lowering the number of semaphores per set and
allowing multiple sets fixed the problem. Of course, the question
of 'how many sets' is probably best answered by experiment. If I were to
hazard a guess, using #sets = #cpus would probably be a reasonable
starting point, probably up to sqrt(# semaphores needed) if > #cpus.
In the case of Oracle, there is documentation that will tell you how
many semaphores it wants based on various parameters (I believe in the
init file). IIRC, Oracle will allocate as many sets as needed to get
the needed # of semaphores.
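To make the guess above concrete, here is a minimal sketch of that starting point. The function name suggest_set_layout and the reading of "up to sqrt(# semaphores needed) if > #cpus" as "take whichever of #cpus and sqrt(needed) is larger" are assumptions made for illustration, not anything Solaris or Oracle provides. Treat the result as a first value to try in an experiment, not as a recommendation.

```python
import math


def suggest_set_layout(needed, ncpus):
    """Return (number_of_sets, semaphores_per_set) as a starting guess.

    Assumed reading of the heuristic: start with one set per CPU, and go
    up to sqrt(needed) sets when that is larger than the CPU count.
    """
    if needed < 1 or ncpus < 1:
        raise ValueError("needed and ncpus must both be at least 1")
    sets = max(ncpus, math.isqrt(needed))
    # Never suggest more sets than there are semaphores to place in them.
    sets = min(sets, needed)
    # Ceiling division so that sets * per_set covers every semaphore.
    per_set = -(-needed // sets)
    return sets, per_set


# Illustrative numbers only: 1000 semaphores on an 8-CPU machine.
sets, per_set = suggest_set_layout(needed=1000, ncpus=8)
print(sets, per_set)
```

The per-set figure is what you would compare against process.max-sem-nsems, and the set count against project.max-sem-ids.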
j***@public.gmane.org
2010-09-17 01:01:25 UTC
Post by Jason
Post by Louis Munro
I also understand that when a set is full, a request for an additional semaphore will be allocated from another set if there is one available.
Again, is that right?
Not sure, but seems likely. I suspect it's probably sequential
through each set.
No. The manpage [see semget(2)] states that the number of semaphores in
the set is specified at creation time. If you want to change the number
of semaphores in the set, you've got to delete the set and create a new
one.
Post by Jason
Post by Louis Munro
Last, if that is the case, why do we have sets to begin with? There
must be some use to the sets or else I don't see why they would
still be around but I can't figure out a case where a large number
of small sets would be better than a small number of large sets.
Does anyone know?
Lookup by ID is going to be faster than looking up one ID and iterating
through a long list of semaphores. Since the operations are applied
atomically to all semaphores in the set, you'd want other sets for
disjoint operations. The SysV IPC is also part of UNIX standards, so we
can't just change it at will, otherwise applications that depend upon
the interface will break.

-j
David Powell
2010-09-17 02:14:50 UTC
Louis,
Post by Louis Munro
Hi,
I'm trying to understand the relationship between semaphores,
semaphore sets and the relevant resource controls in Solaris 10.
My current understanding is that the total number of semaphores
available to a project is the product of project.max-sem-ids *
process.max-sem-nsems (number of sets * number of semaphores per
set). Is that correct?
There is no limit on the number of semaphores available to a
project. Just a limit on the number of semaphore ids.

In theory max-sem-ids * max-sem-nsems forms an upper bound, but
different processes in a project could have different
process.max-sem-nsems settings (as it is a process rctl). A more
conservative upper bound is max-sem-ids * 32K (32K is the inviolable
limit on the number of semaphores per set, imposed by the semop
interface).
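As a quick worked example of the two bounds just described, here is a small sketch. The function name and the sample values (128 ids, 512 semaphores per set) are made up for illustration. The 32K figure is the per-set ceiling quoted above, not something queried from a system.

```python
# "32K" per-set ceiling as quoted in the post (assumed to mean 32 * 1024).
PER_SET_CEILING = 32 * 1024


def semaphore_upper_bounds(max_sem_ids, max_sem_nsems):
    """Return (nominal, conservative) upper bounds for one project.

    nominal:      max-sem-ids * max-sem-nsems, valid only if every process
                  in the project has the same process.max-sem-nsems.
    conservative: max-sem-ids * 32K, valid whatever the per-process
                  settings are.
    """
    nominal = max_sem_ids * max_sem_nsems
    conservative = max_sem_ids * PER_SET_CEILING
    return nominal, conservative


# Illustrative values only, not read from any real system.
nominal, conservative = semaphore_upper_bounds(max_sem_ids=128, max_sem_nsems=512)
print(nominal, conservative)
```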
Post by Louis Munro
I also understand that when a set is full, a request for an additional
semaphore will be allocated from another set if there is one
available. Again, is that right?
Not at all. The system doesn't keep semaphore sets lying around;
they are created when you allocate them. process.max-sem-nsems just
limits their size.

Once upon a time, sets were allocated from a single large pool. That
was unnecessary, and placed confusing, hard to observe limits on
allocation (e.g. fragmentation of the semaphore pool could result in
allocation failures even though sufficient semaphores were
available). When we replaced the tunables with the resource
controls, we ripped that mechanism out entirely.

Now the logic is simple:

  semget(nsems):

      if project's semids >= max-sem-ids:
          fail

      if nsems > max-sem-nsems:
          fail

      project's semids += 1

      dynamically allocate semaphores
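The logic above can be modelled in a few lines of user-level code. This is only a toy sketch of the two checks, not the kernel implementation. The names Project, SemgetDenied and semget_model are hypothetical, and the per-process limit is passed in as a plain argument because it is a process rctl rather than a project one.

```python
class SemgetDenied(Exception):
    """Raised by the toy model when one of the two checks fails."""


class Project:
    """Hypothetical stand-in for a project's semaphore id accounting."""

    def __init__(self, max_sem_ids):
        self.max_sem_ids = max_sem_ids
        self.semids = 0  # number of semaphore ids currently held


def semget_model(project, nsems, max_sem_nsems):
    """Apply the two checks from the post, then 'allocate' a set."""
    if project.semids >= project.max_sem_ids:
        raise SemgetDenied("project.max-sem-ids reached")
    if nsems > max_sem_nsems:
        raise SemgetDenied("nsems exceeds process.max-sem-nsems")
    project.semids += 1
    # Sets are created on demand; there is no shared pool to draw from.
    return [0] * nsems


# Illustrative limits only.
proj = Project(max_sem_ids=2)
first = semget_model(proj, nsems=4, max_sem_nsems=8)
print(len(first), proj.semids)
```

Note that a request failing the size check does not consume an id, which matches the order of the steps in the post.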
Post by Louis Munro
Last, if that is the case, why do we have sets to begin with?
Why not use a single big pool of semaphores?
There must be some use to the sets or else I don't see why they would
still be around but I can't figure out a case where a large number of
small sets would be better than a small number of large sets. Does
anyone know?
See http://hub.opensolaris.org/bin/view/Project+rm/sysv for more
information on the resource controls and the rationale behind the
changes.

As others have pointed out, the purpose of semaphore sets was to let
you perform multiple operations on a single semaphore set
atomically. I think the number of people who truly need that these
days is small. But the functionality is popular because it lets
people perform many semaphore operations with a single system call.

Should you use large sets or small sets? It's a trade-off.

A large number of small sets scales better if you have a lot of
independent operations, but bulk operations will require more system
calls and you will be limited in what you can do atomically.

A small number of large sets can be a serious bottleneck if you have
a lot of independent operations, but will let you perform large,
aggregate operations for atomicity or performance.
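The trade-off can be put in rough numbers with a toy model. It rests on two simplifying assumptions that are not guaranteed anywhere: that there is one lock per set (as recalled earlier in the thread), and that a single semop call can cover every semaphore in a set (ignoring any per-call operation limit). The function name layout_tradeoff is hypothetical.

```python
def layout_tradeoff(total, per_set):
    """Return (sets, bulk_calls, max_atomic) for a given set size.

    sets:       how many sets are needed, which under the one-lock-per-set
                assumption is also how many independent operations can
                proceed without contending on the same lock.
    bulk_calls: system calls needed to touch every semaphore once, at one
                call per set.
    max_atomic: the most semaphores a single atomic operation can span.
    """
    if total < 1 or per_set < 1:
        raise ValueError("total and per_set must both be at least 1")
    sets = -(-total // per_set)  # ceiling division
    bulk_calls = sets
    max_atomic = min(total, per_set)
    return sets, bulk_calls, max_atomic


# Illustrative comparison for 1024 semaphores.
for size in (1024, 32, 1):
    print(size, layout_tradeoff(1024, size))
```

In this model the same number drives both sides: every extra set adds a lock that independent operations can spread across, and one more system call for any operation that has to touch everything.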

Dave
