Scheduling in Linux Kernel


There are two ways to set priorities in scheduling. The first is the pair of system calls sched_setscheduler and sched_getscheduler: sched_setscheduler sets the scheduling policy and parameters, and sched_getscheduler reads back the current policy. Run man sched_setscheduler to learn more.
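As a minimal sketch of this first option, Python's os module wraps these system calls on Linux (os.sched_getscheduler, os.sched_setscheduler). The names POLICY_NAMES and current_policy_name below are illustrative helpers, not part of any API. Switching to a real-time policy normally needs root or CAP_SYS_NICE, so the sketch only reads the policy and leaves the setter as a comment.

```python
import os

# Map the policy constants exposed by Python's os module (Linux only) to names.
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE", "SCHED_FIFO", "SCHED_RR")
    if hasattr(os, name)
}

def current_policy_name(pid: int = 0) -> str:
    """Return the scheduling policy name of pid (0 means the calling process)."""
    policy = os.sched_getscheduler(pid)
    # The kernel may OR the SCHED_RESET_ON_FORK flag into the returned value.
    policy &= ~getattr(os, "SCHED_RESET_ON_FORK", 0)
    return POLICY_NAMES.get(policy, f"unknown({policy})")

if __name__ == "__main__":
    print(current_policy_name())
    # Changing the policy usually requires privileges, for example:
    # os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(10))
```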

The other option is to use setpriority, which sets the nice value. A brief guide to priority and nice values: in Linux, priorities run from 0 to 139, where 0 to 99 are for real-time tasks and 100 to 139 are for normal user tasks.

Nice value: nice values are user-space values that we can use to control the priority of a process. The nice value range is -20 to +19, where -20 is the highest priority, 0 is the default, and +19 is the lowest.
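A small sketch of reading and raising the nice value from user space, assuming a Linux or other Unix system where Python's os.getpriority and os.setpriority are available. The helper names current_nice and be_nicer are made up for illustration. Raising the nice value (lowering priority) is normally allowed without privileges, while lowering it usually is not.

```python
import os

def current_nice(pid: int = 0) -> int:
    """Return the nice value of pid (0 means the calling process)."""
    return os.getpriority(os.PRIO_PROCESS, pid)

def be_nicer(increment: int = 1) -> int:
    """Raise our own nice value (lower our priority), capped at +19."""
    target = min(current_nice() + increment, 19)
    os.setpriority(os.PRIO_PROCESS, 0, target)
    return current_nice()

if __name__ == "__main__":
    print("nice before:", current_nice())
    print("nice after :", be_nicer(1))
```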

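The relation between the nice value and the priority shown by tools such as top (the PR column, 0 to 39) is: Priority_value = Nice_value + 20. On the 0 to 139 scale described above, the same nice value corresponds to priority 120 + Nice_value, which covers 100 to 139.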
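The arithmetic can be checked with a short sketch. It assumes two scales: the 0 to 139 scale described above, where normal tasks occupy 100 to 139 (so nice 0 lands on 120), and the 0 to 39 scale that the "+ 20" form produces. The function and constant names are illustrative only.

```python
MAX_RT_PRIO = 100            # priorities 0..99 are reserved for real-time tasks
NICE_MIN, NICE_MAX = -20, 19

def _check(nice: int) -> None:
    if not NICE_MIN <= nice <= NICE_MAX:
        raise ValueError(f"nice value {nice} outside {NICE_MIN}..{NICE_MAX}")

def nice_to_kernel_prio(nice: int) -> int:
    """Map a nice value onto the 0..139 scale (normal tasks use 100..139)."""
    _check(nice)
    return MAX_RT_PRIO + 20 + nice   # i.e. 120 + nice

def nice_to_offset_prio(nice: int) -> int:
    """Map a nice value onto the 0..39 scale (Priority_value = Nice_value + 20)."""
    _check(nice)
    return 20 + nice

if __name__ == "__main__":
    for nice in (NICE_MIN, 0, NICE_MAX):
        print(nice, nice_to_offset_prio(nice), nice_to_kernel_prio(nice))
```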

You can see all the APIs related to scheduling using man 7 sched.

There are two different groups of scheduling policies in Linux:

  1. Normal scheduling policies
  2. Real-time scheduling policies

With normal scheduling policies, i.e. SCHED_OTHER, SCHED_IDLE and SCHED_BATCH, the sched_priority must be set to 0.
In the case of the real-time scheduling policies SCHED_FIFO and SCHED_RR, the sched_priority can have values from 1 (low) to 99 (high). As these numbers (1-99) imply, real-time threads always have higher priority than normal threads.
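These ranges can be queried at run time rather than hard-coded. The sketch below assumes Linux and uses Python's os.sched_get_priority_min and os.sched_get_priority_max wrappers; priority_range is a hypothetical helper name. On Linux the real-time policies are expected to report 1 to 99 and SCHED_OTHER 0 to 0.

```python
import os

def priority_range(policy: int) -> tuple:
    """Return (min, max) sched_priority allowed for the given policy."""
    return os.sched_get_priority_min(policy), os.sched_get_priority_max(policy)

if __name__ == "__main__":
    for name in ("SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE", "SCHED_FIFO", "SCHED_RR"):
        print(name, priority_range(getattr(os, name)))
```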

For a normal process, the priority is the nice value, ranging from -20 to 19, with -20 being the highest. To change the nice value you can use renice -n <priority value> -p <pid>

For a real-time process, the priority is its real-time priority, ranging from 1 to 99, and it can be changed using chrt -r -p <priority> <pid>. In the output of the ps command, you will not see a nice value for a real-time process.

The base scheduler code has two main functions: schedule() and scheduler_tick(). scheduler_tick() is called while the timer interrupt is being processed. schedule() is called by the kernel at rescheduling points, for example when returning from a system call.


The interval timer is programmed using the kernel configuration option CONFIG_HZ (with values from 100 to 1000). At boot time the timer is programmed to generate that many interrupts per second. With a setting of 100, the interval timer generates an interrupt every 10 ms; the minimum interval is usually 1 ms. The kernel uses a global variable called jiffies, a global counter that is incremented on every timer interrupt. If HZ is 1000, jiffies is incremented 1000 times per second. Each increment of the jiffies counter is called a kernel tick. On multicore systems each core can have an individual timer for counting ticks. (The CPU 0 timer is the default system timer.)

We can find the value by searching for CONFIG_HZ in the kernel's config file on the file system.
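A rough sketch of that search, assuming the config file uses the usual KEY=value lines. The function name parse_config_hz and the SAMPLE text are made up for illustration; the sample is not taken from a real system.

```python
import re

def parse_config_hz(config_text: str):
    """Return the integer after 'CONFIG_HZ=' or None if the line is absent."""
    match = re.search(r"^CONFIG_HZ=(\d+)\s*$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None

# Illustrative excerpt only; a real config file has thousands of lines.
SAMPLE = """\
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
"""

if __name__ == "__main__":
    print(parse_config_hz(SAMPLE))
```

On many distributions the file to read is /boot/config-<kernel release>, but the location varies, so the path is left to the caller.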

For example, if CONFIG_HZ = 250 (Hz), the resolution of the timer is 1/250 s = 4 ms, so there is one tick every 4 ms. 4 ms is the scheduler tick, meaning scheduler_tick() runs every 4 ms on a CPU whose periodic tick is active.
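The same arithmetic for a few HZ values, as a small sketch. The helper names tick_period_ms and ticks_in are illustrative, not kernel functions.

```python
def tick_period_ms(hz: int) -> float:
    """Length of one kernel tick in milliseconds for a given HZ."""
    if hz <= 0:
        raise ValueError("HZ must be positive")
    return 1000.0 / hz

def ticks_in(seconds: float, hz: int) -> int:
    """Number of whole ticks (jiffies increments) in the given time span."""
    return int(seconds * hz)

if __name__ == "__main__":
    for hz in (100, 250, 1000):
        print(f"HZ={hz}: one tick every {tick_period_ms(hz)} ms, "
              f"{ticks_in(1, hz)} ticks per second")
```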

The kernel tick becomes the basis of operation for:

  1. All time-related operations.
  2. Process scheduling.
  3. Execution of periodic/single-shot timer routines.
  4. Bounded wait calls (time-out functions).
  5. Network protocol state machines.

<linux/time.h> provides functions to convert a jiffies timestamp into user-space representations and vice versa (using struct timeval and struct timespec). In recent kernels the conversion helpers are declared in <linux/jiffies.h>.
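To make the conversions concrete, here is a user-space model of the arithmetic, not kernel code. The function names echo the kernel helpers only for readability, HZ is passed explicitly as a parameter, and the rounding choices (round down when leaving jiffies, round up when entering them) are assumptions of this sketch.

```python
NSEC_PER_SEC = 1_000_000_000

def jiffies_to_msecs(jiffies: int, hz: int) -> int:
    """Convert a tick count to milliseconds, rounding down."""
    return jiffies * 1000 // hz

def msecs_to_jiffies(msecs: int, hz: int) -> int:
    """Convert milliseconds to ticks, rounding up so a timeout is never short."""
    return -(-msecs * hz // 1000)

def jiffies_to_timespec(jiffies: int, hz: int) -> tuple:
    """Convert a tick count to a (seconds, nanoseconds) pair."""
    sec, rem = divmod(jiffies, hz)
    return sec, rem * NSEC_PER_SEC // hz

def timespec_to_jiffies(sec: int, nsec: int, hz: int) -> int:
    """Convert a (seconds, nanoseconds) pair to ticks, rounding up."""
    return sec * hz + -(-nsec * hz // NSEC_PER_SEC)

if __name__ == "__main__":
    hz = 250
    print(jiffies_to_msecs(250, hz))
    print(msecs_to_jiffies(10, hz))
    print(jiffies_to_timespec(625, hz))
```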