Friday, January 29, 2016

How to enter into Kernel Mode

The only way a user-space application can explicitly initiate a switch to kernel mode during normal operation is by making a system call such as open, read, or write.
Whenever a user application calls one of these system call APIs with the appropriate parameters, a software interrupt/exception (SWI) is triggered.

More generally, a user process ends up in kernel mode in one of these ways:
  • It makes a system call, i.e. explicitly requests a service from the kernel.
  • It traps into the kernel because of either:
    • an error (segmentation violation, invalid instruction, etc.), which is normally fatal to the process,
    • or a page fault, i.e. an access to a memory page that is mapped but not resident.
In the system call case, a piece of kernel code runs at the request of a user process. This code runs in ring 0 (current privilege level, CPL, 0), which is the highest level of privilege in the x86 architecture. All user processes run in ring 3 (CPL 3). So, to implement a system call mechanism, we need 1) a way to call ring 0 code from ring 3 and 2) some kernel code to service the request.
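
Seen from user space, asking the kernel for service can be very small. The sketch below assumes Linux with glibc and uses the generic syscall() wrapper to issue the write system call directly. The helper name write_via_syscall is made up for this post. Which instruction the C library actually uses to cross into ring 0 (a software interrupt, SYSENTER, or something else) depends on the architecture and the libc build, so treat this as an illustration of the request, not of the entry mechanism.

```c
#define _GNU_SOURCE          /* expose syscall() in <unistd.h> on glibc */
#include <string.h>
#include <sys/syscall.h>     /* SYS_write */
#include <unistd.h>          /* syscall(), STDOUT_FILENO */

/*
 * Hypothetical helper: ask the kernel to write msg to standard output.
 * The kernel code servicing the request runs in ring 0; this function
 * and its caller stay in ring 3.
 * Returns the number of bytes written, or -1 on error (errno is set).
 */
long write_via_syscall(const char *msg)
{
    return syscall(SYS_write, STDOUT_FILENO, msg, strlen(msg));
}
```

Calling write_via_syscall("ok\n") from main() should print the text and return the number of bytes the kernel accepted.
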

This software interrupt method turned out to be much slower on Pentium 4 processors. To solve this issue, Linus Torvalds implemented an alternative system call mechanism that takes advantage of the SYSENTER/SYSEXIT instructions provided by Pentium II and later processors. Before going further with this new way of doing it, let's make ourselves more familiar with these instructions.

The SYSENTER instruction is part of the "Fast System Call" facility introduced on the Pentium® II processor. The SYSENTER instruction is optimized to provide the maximum performance for transitions to protection ring 0 (CPL = 0). The SYSENTER instruction sets the following registers according to values specified by the operating system in certain model-specific registers.
  • CS register set to the value of (SYSENTER_CS_MSR)
  • EIP register set to the value of (SYSENTER_EIP_MSR)
  • SS register set to the sum of (8 plus the value in SYSENTER_CS_MSR)
  • ESP register set to the value of (SYSENTER_ESP_MSR)
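
The four rules above are just register loads plus one addition, which a tiny C model makes easy to see. This is only a sketch of the arithmetic as listed. The struct and function names are invented for illustration, and it is not how a kernel programs the MSRs.

```c
#include <stdint.h>

/* Hypothetical model of the three MSRs the operating system fills in. */
struct sysenter_msrs {
    uint32_t cs;   /* SYSENTER_CS_MSR  */
    uint32_t eip;  /* SYSENTER_EIP_MSR */
    uint32_t esp;  /* SYSENTER_ESP_MSR */
};

/* Registers that SYSENTER loads on the way into ring 0. */
struct kernel_entry_regs {
    uint16_t cs;
    uint16_t ss;
    uint32_t eip;
    uint32_t esp;
};

/* Apply the four rules listed above, nothing more. */
struct kernel_entry_regs model_sysenter(const struct sysenter_msrs *m)
{
    struct kernel_entry_regs r;
    r.cs  = (uint16_t)m->cs;        /* CS  = SYSENTER_CS_MSR     */
    r.eip = m->eip;                 /* EIP = SYSENTER_EIP_MSR    */
    r.ss  = (uint16_t)(m->cs + 8);  /* SS  = SYSENTER_CS_MSR + 8 */
    r.esp = m->esp;                 /* ESP = SYSENTER_ESP_MSR    */
    return r;
}
```

With a made-up SYSENTER_CS_MSR value of 0x60, the model gives CS = 0x60 and SS = 0x68. That means the kernel's stack segment descriptor has to sit directly after its code segment descriptor in the descriptor table.
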
It looks like the processor is trying to help us. Let's also take a quick look at SYSEXIT:
The SYSEXIT instruction is part of the "Fast System Call" facility introduced on the Pentium® II processor. The SYSEXIT instruction is optimized to provide the maximum performance for transitions to protection ring 3 (CPL = 3) from protection ring 0 (CPL = 0). The SYSEXIT instruction sets the following registers according to values specified by the operating system in certain model-specific or general purpose registers.
  • CS register set to the sum of (16 plus the value in SYSENTER_CS_MSR)
  • EIP register set to the value contained in the EDX register
  • SS register set to the sum of (24 plus the value in SYSENTER_CS_MSR)
  • ESP register set to the value contained in the ECX register
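
The return path can be modeled the same way. The sketch below encodes only the four SYSEXIT rules listed above and deliberately ignores other details of the real instruction, such as the privilege-level bits in the selectors. All names in it are hypothetical.

```c
#include <stdint.h>

/* Registers that SYSEXIT loads on the way back to ring 3. */
struct user_return_regs {
    uint16_t cs;
    uint16_t ss;
    uint32_t eip;
    uint32_t esp;
};

/*
 * sysenter_cs_msr: value the OS stored in SYSENTER_CS_MSR
 * edx:             user-space instruction pointer to resume at
 * ecx:             user-space stack pointer to restore
 */
struct user_return_regs model_sysexit(uint32_t sysenter_cs_msr,
                                      uint32_t edx, uint32_t ecx)
{
    struct user_return_regs r;
    r.cs  = (uint16_t)(sysenter_cs_msr + 16);  /* CS  = MSR + 16 */
    r.eip = edx;                               /* EIP = EDX      */
    r.ss  = (uint16_t)(sysenter_cs_msr + 24);  /* SS  = MSR + 24 */
    r.esp = ecx;                               /* ESP = ECX      */
    return r;
}
```

Taken together with the SYSENTER rules, a single MSR value fixes four consecutive descriptors: kernel code, kernel stack, user code, and user stack, at offsets 0, 8, 16, and 24 from SYSENTER_CS_MSR.
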
