RFC: Rename --threads to --cores ? #88

Closed
donald opened this issue Apr 30, 2020 · 6 comments · Fixed by #107

Comments

@donald
Contributor

donald commented Apr 30, 2020

To me it's a terrible misnomer that mxq globally uses the term "threads" where "cores" would in fact be correct. This error is all over the place, starting with mxqsub:

 -j, --threads=NUMBER     set number of threads       (default: 1)

This option in fact sets the number of cores which are reserved for your job on the execution node; it does not limit the number of threads at all. I think it is more difficult for users to select optimal options (or for us to explain them) when we use the wrong terms.

Should we rename that globally, leaving --threads only for compatibility?

@wwwutz
Contributor

wwwutz commented Apr 30, 2020

I don't like that idea. Quote: https://en.wikipedia.org/wiki/Hyper-threading

> Architecturally, a processor with Hyper-Threading Technology consists of two logical processors per core, each of which has its own processor architectural state. Each logical processor can be individually halted, interrupted or directed to execute a specified thread, independently from the other logical processor sharing the same physical core

We are running 95% of our cluster nodes hyper-threaded.

"thread" in mxqd speak has nothing to do with threads in software.

I like "threads" as used here or, if you intend to rename it: how about "slot"?

thatllbefinewithme.

@arndt
Contributor

arndt commented Apr 30, 2020

Although I like this distinction, I like to compare it to how something like that is called in languages a user might use:

in python it would be a thread
in Julia it would be a thread
in R it would be a thread
in make it would be a job
in C/C++ you have to include pthread.h
...

But AFAIK a slot in mxq-speak means something else, because sometimes I have single-threaded jobs occupying more than one slot because they need more memory.

@donald
Contributor Author

donald commented Apr 30, 2020

I think "thread" in "hyper-threading" has a different meaning than "thread" in e.g. pthread_create. The first one is a CPU execution context at the hardware level, while the second is a program execution context at the programmer's level. Both are independent. But the fact that "thread" is used with yet another meaning by CPU manufacturers doesn't make it a better choice for mxqsub -j, because neither is meant by -j.

"slots" is used by mxqd internally with its own definition, but again refers to something different, because these "slots" have a certain amount of memory, too, which differs from node to node. You might submit with "-j 1" but still get more than one "slot", because of memory requirements.

You are right, when hyper-threading is taken into account, "cores" is wrong, too. So I take this suggestion back. In Linux the correct word would be "processor" (/proc/cpuinfo) or "cpu" (lscpu); these are the entities recognized by the Linux scheduler, and this is what you actually get with mxqsub -j. Although these terms, too, have a different meaning when used by CPU manufacturers.

On a system we have n1 hyperthreads per core, n2 cores per socket, n3 sockets, and thus n1 * n2 * n3 cpus or processors in Linux speak. Would "cpus" or "processors" be a better option? I'm not sure.
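
The n1 * n2 * n3 arithmetic can be sketched in a few lines of Python. This is only an illustration: `logical_cpus` is a hypothetical helper and the topology numbers are made up, neither comes from mxq.

```python
import os


def logical_cpus(threads_per_core: int, cores_per_socket: int, sockets: int) -> int:
    """Return the number of "cpus"/"processors" as Linux counts them (n1 * n2 * n3)."""
    return threads_per_core * cores_per_socket * sockets


# Hypothetical node: 2 hyperthreads per core, 16 cores per socket, 2 sockets.
print(logical_cpus(2, 16, 2))  # 64

# What the current machine reports as its number of logical CPUs
# (os.cpu_count() may return None if it cannot be determined).
print(os.cpu_count())
```

With hyper-threading disabled (n1 = 1) the same node would report 32, which is why "cores" and "cpus" only coincide on such machines.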

@arndt
Contributor

arndt commented Apr 30, 2020

What about proc - at least the linux nproc gives the number of such things.

@donald
Contributor Author

donald commented Apr 30, 2020

> Although I like this distinction, I like to compare it to how something like that is called in languages a user might use.

That's my point. These are "threads" as seen by the programmer. And they are different from what is selected by mxqsub -j, which refers to the number of hardware cpus you get. People could run 60 threads with "-j 1", but they might wrongly assume they need "-j 60".
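
A minimal Python sketch of that distinction: the number of software threads a program starts is independent of the number of hardware cpus it may run on. The helper names are hypothetical, and how mxq actually confines a job to its cpus is not stated in this thread, so the sketch only reports what the operating system allows the current process to use.

```python
import os
import threading


def run_threads(n: int) -> int:
    """Start n software threads, wait for all of them, and return how many ran."""
    done = []
    lock = threading.Lock()

    def worker() -> None:
        with lock:
            done.append(1)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done)


def usable_cpus() -> int:
    """Logical CPUs this process may be scheduled on; falls back to os.cpu_count()."""
    if hasattr(os, "sched_getaffinity"):  # available on Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


print("software threads run:", run_threads(60))
print("usable hardware cpus:", usable_cpus())
```

All 60 threads complete even when only one cpu is usable; they just take turns on it.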

> What about proc - at least the linux nproc gives the number of such things.

I think that's the same as "processors" or "cpus". The nproc manpage says "processing units", which, of course, is the "PUS" from "cpus". And --processors could be abbreviated to --proc anyway, which by itself might possibly be misinterpreted as "number of processes".

@donald
Contributor Author

donald commented Aug 21, 2021

Any objections to making "--processors" and "number of hardware processors" the official term while, of course, keeping "--threads" as deprecated for compatibility?
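
For illustration, this is roughly how such a deprecated alias could behave, sketched with Python's argparse. mxqsub itself is not written in Python, and the program name, the `DeprecatedAlias` class, and the warning text are all assumptions made for this sketch, not the actual implementation.

```python
import argparse
import warnings


class DeprecatedAlias(argparse.Action):
    """Store the value like a normal option, but warn that the spelling is deprecated."""

    def __call__(self, parser, namespace, values, option_string=None):
        warnings.warn(
            f"{option_string} is deprecated, use --processors",
            DeprecationWarning,
        )
        setattr(namespace, self.dest, values)


def build_parser() -> argparse.ArgumentParser:
    # "mxqsub-sketch" is a made-up program name for this example.
    parser = argparse.ArgumentParser(prog="mxqsub-sketch")
    parser.add_argument(
        "-j", "--processors", dest="processors", type=int, default=1,
        metavar="NUMBER", help="set number of hardware processors (default: 1)",
    )
    # Old spelling: writes to the same destination, so existing scripts keep working.
    parser.add_argument(
        "--threads", dest="processors", type=int, action=DeprecatedAlias,
        default=argparse.SUPPRESS, metavar="NUMBER",
        help="deprecated alias for --processors",
    )
    return parser


def parse_processors(argv: list) -> int:
    """Parse argv and return the requested number of processors."""
    return build_parser().parse_args(argv).processors


print(parse_processors(["--processors", "8"]))
print(parse_processors(["--threads", "2"]))
```

Because both spellings share one destination, the rest of the program only ever sees a single "processors" value, and the old option can be dropped later without touching anything else.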

3 participants