A really simple job queue utility for Linux written in C.
The goal is to process jobs on a network of workstations (NOW) sharing a filesystem. Jobs are either run immediately, returning stdout and stderr to the user, or queued for execution when sufficient resources are available.
The runnow mode is an rsh splat utility that is useful for maintaining a NOW. The command is run sequentially on all specified hosts using a context very similar to the user's currently active environment and current working directory.
For example,
% q -n host=linux date
mach1: Wed Oct 29 09:03:01 CDT 2025
mach2: Wed Oct 29 09:03:01 CDT 2025
mach3: Wed Oct 29 09:03:01 CDT 2025
mach4: Wed Oct 29 09:03:01 CDT 2025
options:
-n prefix output with the hostname to make the report easier to understand
-sin uses the same stdin text for all runs.
parms:
host specifies either a hostname or machine group to run the command on. The default is all.
Submitting a job to the job queue creates a job description and sends that job description to the master queue server. The master will send the job to an execution server running on a workstation capable of running the job. The job will be sent for execution either immediately upon submission or at the earliest time that it is the highest priority job that could schedule.
Priority order is:
- the user running the fewest jobs,
- the user that least recently had a job scheduled,
- the pri parm (0 first .. 7 last), and
- submission time, with the oldest job first.
However, higher priority jobs that cannot schedule do not block lower priority jobs that can schedule. Users cannot use the pri parm to advance their jobs ahead of another user's jobs.
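The ordering rules above can be read as a four-key comparison followed by a scan for the first job that fits. The following is a minimal sketch under stated assumptions, not the scheduler's actual code: `struct job`, its fields, `job_cmp` and `next_job` are hypothetical names, and `can_schedule` stands in for whatever resource check the master really performs.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical job record; field names are illustrative only. */
struct job {
    int    user_running;    /* jobs the submitting user has running now */
    time_t user_last_sched; /* when that user last had a job scheduled */
    int    pri;             /* pri parm: 0 first .. 7 last */
    time_t submitted;       /* submission time */
    int    can_schedule;    /* nonzero if some host has room right now */
};

/* qsort comparator applying the four keys in the documented order. */
int job_cmp(const void *a, const void *b)
{
    const struct job *x = a, *y = b;

    if (x->user_running != y->user_running)
        return x->user_running < y->user_running ? -1 : 1;
    if (x->user_last_sched != y->user_last_sched)
        return x->user_last_sched < y->user_last_sched ? -1 : 1;
    if (x->pri != y->pri)
        return x->pri < y->pri ? -1 : 1;
    if (x->submitted != y->submitted)
        return x->submitted < y->submitted ? -1 : 1;
    return 0;
}

/* Sort the queue by priority, then return the index of the first job that
 * can schedule, or -1 if none can. Jobs that cannot schedule are skipped,
 * so they never block a lower priority job that fits. */
int next_job(struct job *q, size_t n)
{
    qsort(q, n, sizeof *q, job_cmp);
    for (size_t i = 0; i < n; i++)
        if (q[i].can_schedule)
            return (int)i;
    return -1;
}
```

Because the per-user keys are compared before pri, a user's pri setting only reorders that user's own jobs in this sketch.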
After the job is run, the job description is either deleted or updated with the job status. Often a job will be submitted with a foreach loop or a script. For example, to transcode a video,
% q -e group=linux ffmpeg -i LostInTranslation.mkv -codec copy LostInTranslation.mp4
jobid: jobid uid gid
sch 1
option:
-e enqueue the job rather than run it now
parms:
- jg=0 integer job group, for matching in list and dequeue.
- pri=4 scheduling priority with 0 being first and 7 last.
- group specifies a host or group of hosts capable of running the job.
- keep=error directs the system to leave the job description in place after an error (return code != 0). Could use always instead.
- mem=10G lets the scheduler know the job requires 10GB of memory to execute.
- threads=10 lets the scheduler know the job requires 10 threads to execute.
output:
- jobid returns the created jobid
- sch returns 1 if the job ran immediately or 0 if it is queued
- rej returns the reason the job was rejected
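Parms such as mem=10G carry a size with a unit suffix. As an illustration of how such a value might be turned into a byte count, here is a small sketch. `parse_size` is a hypothetical helper, and the choice of binary multiples (K=2^10, M=2^20, G=2^30) and case-insensitive suffixes is an assumption; q may interpret the suffix differently.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a size such as "10G", "512M" or "20g" into bytes.
 * Returns 0 on success, -1 on malformed input or overflow.
 * Assumption: suffixes are binary multiples and case-insensitive. */
int parse_size(const char *s, unsigned long long *bytes)
{
    char *end;
    unsigned long long n;
    unsigned shift = 0;

    if (!isdigit((unsigned char)*s))   /* rejects "", "-5", " 5", "G" */
        return -1;
    errno = 0;
    n = strtoull(s, &end, 10);
    if (errno != 0)
        return -1;

    switch (toupper((unsigned char)*end)) {
    case 'K':  shift = 10; end++; break;
    case 'M':  shift = 20; end++; break;
    case 'G':  shift = 30; end++; break;
    case '\0': break;                  /* plain number: bytes */
    default:   return -1;
    }
    if (*end != '\0')                  /* trailing junk after the suffix */
        return -1;
    if (shift != 0 && n > (~0ULL >> shift))
        return -1;                     /* would overflow */

    *bytes = n << shift;
    return 0;
}
```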
Job descriptions are stored in ~/.queued as directories, using the
jobid as the directory name.
The command line and the parms as well as stdout and stderr are
stored there.
This requires that ~ is in a shared filesystem and is accessible
to the submitting machine, the master machine and the execution
machine with the same pathname.
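To make the layout concrete, the sketch below builds the pathname of one job description directory. `job_dir` is a hypothetical helper, and treating the jobid as an opaque string is an assumption, since its exact format is not described here.

```c
#include <stdio.h>

/* Build "<home>/.queued/<jobid>" into buf.
 * Returns 0 on success, -1 if an argument is missing or the path
 * would not fit in len bytes. */
int job_dir(const char *home, const char *jobid, char *buf, size_t len)
{
    int n;

    if (home == NULL || jobid == NULL || *jobid == '\0')
        return -1;
    n = snprintf(buf, len, "%s/.queued/%s", home, jobid);
    if (n < 0 || (size_t)n >= len)
        return -1;                 /* encoding error or truncated */
    return 0;
}
```

A caller would typically pass getenv("HOME") as home. Because every machine must see the same pathname, the result is only meaningful when the home directory is mounted at the same location on the submitting, master and execution machines.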
The current list of jobs that are not yet complete can be queried. Generally, users other than root are limited to viewing their own jobs and are not shown jobs queued by other users.
% q -l
username jobhost 2025/10/29@12:04:50 0 jobid command line given to q -e
The listing shows the user's jobs (one line per job) in submission order, whether currently running or waiting to run.
The job list function utilizes parms to control which jobs are included in the list.
parms:
- time=beg,end selects jobs submitted between beg and end. If end is omitted, then the range ends now.
- cmd=regexp selects jobs whose command lines match the regular expression.
- host=hostname selects jobs that are running on a host.
- jg=integer selects jobs that belong to the job group.
- pri=integer selects jobs that were submitted with a priority.
- a=all selects all jobs.
- u=uname selects the jobs submitted by a user or '-' for all users.
- jobid selects a job by its jobid.
output:
- username is the unix username of submitter.
- jobhost is the hostname of host running the job or - if job is waiting to run.
- date@time is the date and time the job was submitted.
- 0 is the job group the user used to submit the job.
- jobid is the job's jobid.
- command is the command line submitted.
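The cmd=regexp selector can be illustrated with the POSIX regex API. This is a sketch only: `cmd_matches` is a hypothetical helper, and the use of unanchored POSIX extended syntax is an assumption, since the regular expression dialect q accepts is not stated here.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if cmdline matches pattern, 0 if it does not,
 * and -1 if the pattern fails to compile.
 * Assumption: POSIX extended syntax, matching anywhere in the line. */
int cmd_matches(const char *pattern, const char *cmdline)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, cmdline, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Under these assumptions, a pattern such as ^ffmpeg would select only jobs whose command line starts with ffmpeg.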
The status of the machines in the NOW can be queried.
% q -s show=jobs hostgroup
jobhost: 24/16cores 4224mips lavg: 300 276 198 xproc 64/1/99 q 6/2/6 2 mem 125gb used 87 avail 45 x 59gb q 33gb users 1 6sec up 5w,2d,45:12 0s-drift
- username jobid 3 1 3 4 command line given to q -e
A report with one section per host is produced. The first line of a host section is the host status. If show=jobs is requested, one line per job running on the host follows the host status.
- jobhost is the hostname of the host.
- 24/16 is the total number of threads and cores available in the host.
- mips is the linux kernel bogo mips rating.
- lavg are the three columns of load average *100.
- xproc are the number of processes, running processes and threads running on the machine external to the queue.
- q are the number of processes, running processes, threads and jobs running on the machine from the queue.
- mem are the total memory, used memory, available memory and the memory used for buffers and cache in GB.
- xv is the virtual memory used by processes external to the queue.
- xr is the resident size of processes external to the queue.
- q is the virtual memory used by jobs running from the queue.
- users provides the number of users logged in to the host who have typed a command in the last 30 mins and the number of seconds since a command has been typed.
- up is the uptime since boot, the current time and the seconds of drift.
A leading - marks a job status line:
- username is the user who submitted the job.
- jobid is the jobid for the job.
- 3 1 3 4 are the number of processes, number of running processes, number of threads running and the virtual size of the job.
- command is the command line submitted to the queue.
Jobs can be removed from the queue before or while running. If the job is running, it will be terminated.
% q -d jobid
dequeue of jobid suceeded
All of the selection logic available for -l is also available for -d.
For complex job dequeuing, use -l to develop the criteria for -d.
The q binary is a set-uid root process and is used both for the user client interface and for the server daemons.
The makefile includes an install target that copies the binary and sets ug+s on it.
You will need to configure the INSTALL_DIR for your system.
% cd src
% make
% sudo make install
Some systems running systemd might benefit from the queued.service example in the source directory.
When q starts, it reads the configuration file /etc/queued.conf.
The source directory contains an example file.
Comments begin with # and continue to the end of line.
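The comment rule can be applied to each line before it is parsed. The sketch below shows one way to do that. `strip_comment` is a hypothetical helper, and it assumes that # is never quoted or escaped inside a value, which the example file may or may not guarantee.

```c
#include <string.h>

/* Cut line at the first '#' and trim trailing whitespace, in place.
 * Assumption: '#' always starts a comment (no quoting or escaping). */
void strip_comment(char *line)
{
    char *hash = strchr(line, '#');
    size_t n;

    if (hash != NULL)
        *hash = '\0';
    n = strlen(line);
    while (n > 0 && (line[n - 1] == ' '  || line[n - 1] == '\t' ||
                     line[n - 1] == '\n' || line[n - 1] == '\r'))
        line[--n] = '\0';
}
```

A line that contains only a comment becomes the empty string, which a caller can simply skip.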
The daemon opens a port for receiving messages from the client or other servers.
The port number defaults to 9090 but can be set with a command line option -p 9090 or
in the config file with:
port server = 9090;
The port number used must not conflict with other services and must be open in the firewall configuration.
There must be a directory in a shared filesystem to store the pid files created by the server daemons.
The pid files will be created when the daemon starts and contain the server key.
The files are owned by root and are readable and writable only by root.
The client needs the key in order to send TCP messages to the daemon.
Daemons also need the keys for inter-server communications.
dir pid = /user/utility/packages/queued/pids;
Specifies the path to the directory where the pid files will be created. The directory should be owned by root:root with mode 755.
Machine groups are used to restrict the job execution to particular machines.
The master group is mandatory and specifies which machine will be the master scheduler.
There can be only one machine in the master group.
group master = mist;
The group all is also mandatory and is used as the default for all operations.
group all = mist smoke asst luke;
The all group does not need to contain every machine. For example, I don't include the laptops because they aren't always connected to the NOW.
You can create any other groupings that make sense.
group name = list of machines;
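As an illustration of the group syntax, the sketch below splits one group line into its name and member list. `struct group`, `parse_group` and the size limits are hypothetical, and the sketch assumes tokens are separated by whitespace and the line ends with a semicolon, as in the examples above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_HOSTS 32
#define NAME_LEN  64

/* Hypothetical in-memory form of one "group" line. */
struct group {
    char name[NAME_LEN];
    char hosts[MAX_HOSTS][NAME_LEN];
    int  nhosts;
};

/* Parse "group <name> = <host> <host> ...;" into g.
 * Returns 0 on success, -1 if the line is not a well-formed group line. */
int parse_group(const char *line, struct group *g)
{
    char rest[512];
    const char *semi;
    size_t len;
    int off = 0;

    g->nhosts = 0;
    /* %n records where the host list starts; it stays 0 if "=" is missing. */
    if (sscanf(line, " group %63s = %n", g->name, &off) != 1 || off == 0)
        return -1;

    semi = strchr(line + off, ';');
    if (semi == NULL)
        return -1;                       /* missing terminator */
    len = (size_t)(semi - (line + off));
    if (len >= sizeof rest)
        return -1;
    memcpy(rest, line + off, len);       /* private copy for strtok */
    rest[len] = '\0';

    for (char *tok = strtok(rest, " \t"); tok; tok = strtok(NULL, " \t")) {
        if (g->nhosts == MAX_HOSTS || strlen(tok) >= NAME_LEN)
            return -1;
        strcpy(g->hosts[g->nhosts++], tok);
    }
    return g->nhosts > 0 ? 0 : -1;
}
```

A caller could additionally reject a master group whose nhosts is not 1, matching the rule that the master group holds exactly one machine.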
You can limit the available resources for each machine.
Jobs won't schedule if they don't fit the available resources.
limits machine = mem=16G buf=20g threads=8 busy=8:00-19:00;
You can create token pools to manage resources such as licenses. Jobs must be able to claim the specified tokens before they can be scheduled, and they hold their tokens until they exit.
token LIC = 2;
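The claim-and-hold behaviour of a token pool amounts to a bounded counter. The sketch below shows the idea. `struct token_pool`, `token_claim` and `token_release` are hypothetical names, and the code assumes a single-threaded scheduler, so it does no locking.

```c
/* Hypothetical counter for one pool, e.g. "token LIC = 2;". */
struct token_pool {
    int total;   /* configured pool size */
    int in_use;  /* tokens currently held by running jobs */
};

/* Try to claim n tokens. Returns 1 and records the claim if enough
 * tokens are free; otherwise returns 0 and leaves the pool unchanged,
 * meaning the job cannot be scheduled yet. */
int token_claim(struct token_pool *p, int n)
{
    if (n < 0 || n > p->total - p->in_use)
        return 0;
    p->in_use += n;
    return 1;
}

/* Give back n tokens when the job exits. */
void token_release(struct token_pool *p, int n)
{
    p->in_use -= n;
    if (p->in_use < 0)
        p->in_use = 0;   /* guard against a double release */
}
```

With the LIC pool of 2 shown above, a third job asking for one token would wait until one of the first two exits.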
Queued schedules jobs on any available machine with sufficient resources when they are submitted. When a job completes on a machine, queued attempts to schedule as many jobs as it can on that machine. By default, if for some reason a new job cannot be scheduled when the last job running on a machine exits, no job will be scheduled on that machine until a new job is submitted. If other uses of the system may consume resources that prevent scheduling and later free them, a kick start timer can be configured with:
alarm kickstart = 30m;
When the kick start timer expires, the master will attempt to schedule jobs on all hosts. Time is in seconds, but can be suffixed with:
m for minutes, h for hours and d for days.
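The suffix rule for timer values can be sketched as a small parser. `parse_duration` is a hypothetical helper. It assumes the value has already been separated from the rest of the config line (for example 30m without the trailing semicolon) and that suffixes are lowercase, as in the list above.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse "30", "30m", "2h" or "1d" into seconds.
 * Returns the number of seconds, or -1 on malformed input or overflow. */
long parse_duration(const char *s)
{
    char *end;
    long n, mult;

    if (!isdigit((unsigned char)*s))   /* rejects "", "-1", "m" */
        return -1;
    errno = 0;
    n = strtol(s, &end, 10);
    if (errno != 0)
        return -1;

    switch (*end) {
    case '\0': return n;               /* no suffix: already seconds */
    case 'm':  mult = 60;    break;
    case 'h':  mult = 3600;  break;
    case 'd':  mult = 86400; break;
    default:   return -1;
    }
    if (end[1] != '\0' || n > LONG_MAX / mult)
        return -1;                     /* trailing junk or overflow */
    return n * mult;
}
```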
A daemon will reload the configuration file when it receives SIGUSR1 or a reload request from a client:
% q -reload [host|group]
If host is not specified, the group master is used by default.
The daemon will not recognize a change to dir pid during an update.