Writing Device Drivers

Tuning Drivers

The Solaris OS provides kernel statistics structures so that you can implement counters for your driver. The DTrace facility enables you to analyze performance in real time. This section presents the following topics on device performance:

  • Kernel Statistics – The Solaris OS provides a set of data structures and functions for capturing performance statistics in the kernel. Kernel statistics (called kstats) enable your driver to export continuous statistics while the system is running. The kstat data is handled programmatically by using the kstat functions.

  • DTrace for Dynamic Instrumentation – DTrace enables you to add instrumentation to your driver dynamically so that you can perform tasks like analyzing the system and measuring performance. DTrace takes advantage of predefined kstat structures.

Kernel Statistics

To assist in performance tuning, the Solaris kernel provides the kstat(3KSTAT) facility. The kstat facility provides a set of functions and data structures for device drivers and other kernel modules to export module-specific kernel statistics.

A kstat is a data structure for recording quantifiable aspects of a device's usage. Kstats are kept in a null-terminated linked list, in which each kstat points to the next one. Each kstat has a common header section and a type-specific data section. The header section is defined by the kstat_t structure.
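The linked-list layout can be illustrated with a minimal, portable sketch. The mini_kstat structure and the mini_lookup() function below are hypothetical stand-ins, not the real kstat_t or any kstat function. The sketch keeps only the three identifying fields and the next pointer, and the buffer size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MINI_STRLEN 32	/* arbitrary size chosen for this sketch */

/* Hypothetical, simplified stand-in for the kstat_t header. */
struct mini_kstat {
	char ks_module[MINI_STRLEN];	/* module that created the kstat */
	int ks_instance;		/* instance of that module */
	char ks_name[MINI_STRLEN];	/* name within module and instance */
	struct mini_kstat *ks_next;	/* NULL marks the end of the chain */
};

/* Walk the chain and return the first entry that matches, or NULL. */
struct mini_kstat *
mini_lookup(struct mini_kstat *head, const char *module, int instance,
    const char *name)
{
	struct mini_kstat *ksp;

	assert(module != NULL && name != NULL);
	for (ksp = head; ksp != NULL; ksp = ksp->ks_next) {
		if (strcmp(ksp->ks_module, module) == 0 &&
		    ksp->ks_instance == instance &&
		    strcmp(ksp->ks_name, name) == 0)
			return (ksp);
	}
	return (NULL);
}
```

The loop stops at the null pointer that terminates the list, so an empty chain or a missing entry simply yields NULL.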

The article “Using kstat From Within a Program in the Solaris OS” on the Sun Developer Network at http://developers.sun.com/solaris/articles/kstat_api.html provides two practical examples on how to use the kstat(3KSTAT) and libkstat(3LIB) APIs to extract metrics from the Solaris OS. The examples include “Walking Through All the kstat” and “Getting NIC kstat Output Using the Java Platform.”

Kernel Statistics Structure Members

The members of a kstat structure are:

ks_class[KSTAT_STRLEN]

Categorizes the kstat type as bus, controller, device_error, disk, hat, kmem_cache, kstat, misc, net, nfs, pages, partition, rps, ufs, vm, or vmem.

ks_crtime

Time at which the kstat was created. ks_crtime is commonly used in calculating rates of various counters.

ks_data

Points to the data section for the kstat.

ks_data_size

Total size of the data section in bytes.

ks_instance

The instance of the kernel module that created this kstat. ks_instance is combined with ks_module and ks_name to give the kstat a unique, meaningful name.

ks_kid

Unique ID for the kstat.

ks_module[KSTAT_STRLEN]

Identifies the kernel module that created this kstat. ks_module is combined with ks_instance and ks_name to give the kstat a unique, meaningful name. KSTAT_STRLEN sets the maximum length of ks_module.

ks_name[KSTAT_STRLEN]

A name assigned to the kstat in combination with ks_module and ks_instance. KSTAT_STRLEN sets the maximum length of ks_name.

ks_ndata

Indicates the number of data records for those kstat types that support multiple records: KSTAT_TYPE_RAW, KSTAT_TYPE_NAMED, and KSTAT_TYPE_TIMER.

ks_next

Points to next kstat in the chain.

ks_resv

A reserved field.

ks_snaptime

The timestamp for the last data snapshot, useful in calculating rates.

ks_type

The data type, which can be KSTAT_TYPE_RAW for binary data, KSTAT_TYPE_NAMED for name/value pairs, KSTAT_TYPE_INTR for interrupt statistics, KSTAT_TYPE_IO for I/O statistics, or KSTAT_TYPE_TIMER for event timers.
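The ks_crtime and ks_snaptime members are described above as useful for calculating rates. The following sketch shows one way such a calculation might look. The counter_rate() function is a hypothetical helper, and the sketch assumes that both timestamps are 64-bit nanosecond values.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000.0

/*
 * Events per second between two samples of one counter.
 * t0 and t1 are assumed to be nanosecond timestamps, for example the
 * ks_snaptime values of two successive snapshots.
 * Returns 0.0 when no time has elapsed, to avoid dividing by zero.
 */
double
counter_rate(uint64_t v0, int64_t t0, uint64_t v1, int64_t t1)
{
	if (t1 <= t0)
		return (0.0);
	/* Unsigned subtraction stays correct across one counter wrap. */
	return ((double)(v1 - v0) * NSEC_PER_SEC / (double)(t1 - t0));
}
```

Passing 0 and ks_crtime as the first sample gives the average rate since the kstat was created. Passing two successive snapshots gives the rate over the most recent interval.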

Kernel Statistics Structures

The structures for the different kinds of kstats are:

kstat(9S)

Each kernel statistic (kstat) that is exported by device drivers consists of a header section and a data section. The kstat(9S) structure is the header portion of the statistic.

kstat_intr(9S)

Structure for interrupt kstats. The types of interrupts are:

  • Hard interrupt – Sourced from the hardware device itself

  • Soft interrupt – Induced by the system through the use of some system interrupt source

  • Watchdog interrupt – Induced by a periodic timer call

  • Spurious interrupt – An interrupt entry point was entered but there was no interrupt to service

  • Multiple service – An interrupt was detected and serviced just prior to returning from any of the other types

Drivers generally report only claimed hard interrupts and soft interrupts from their handlers, but measurement of the spurious class of interrupts is useful for auto-vectored devices to locate any interrupt latency problems in a particular system configuration. Devices that have more than one interrupt of the same type should use multiple structures.

kstat_io(9S)

Structure for I/O kstats.

kstat_named(9S)

Structure for named kstats. A named kstat is an array of name-value pairs. These pairs are kept in the kstat_named structure.
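The interrupt categories listed for kstat_intr(9S) can be pictured as one counter per category. The sketch below is a portable illustration only. The enum, the intr_counts structure, and both functions are hypothetical names and do not reproduce the actual kstat_intr(9S) definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical indices for the five interrupt categories. */
enum intr_kind {
	INTR_HARD, INTR_SOFT, INTR_WATCHDOG, INTR_SPURIOUS, INTR_MULTSVC,
	INTR_NKINDS
};

/* One counter per category. A device uses one structure per interrupt. */
struct intr_counts {
	uint32_t intrs[INTR_NKINDS];
};

/*
 * Record one entry into a hard-interrupt handler.
 * claimed is nonzero if the device really had an interrupt to service.
 * Returns the category that was incremented.
 */
enum intr_kind
record_hard_intr(struct intr_counts *c, int claimed)
{
	enum intr_kind k = claimed ? INTR_HARD : INTR_SPURIOUS;

	assert(c != NULL);
	c->intrs[k]++;
	return (k);
}

/* Fraction of handler entries that found nothing to service. */
double
spurious_ratio(const struct intr_counts *c)
{
	uint32_t total = c->intrs[INTR_HARD] + c->intrs[INTR_SPURIOUS];

	return (total == 0 ? 0.0 : (double)c->intrs[INTR_SPURIOUS] / total);
}
```

A rising spurious ratio on an auto-vectored device is the kind of signal that the spurious class is meant to expose.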

Kernel Statistics Functions

The functions for using kstats are:

kstat_create(9F)

Allocate and initialize a kstat(9S) structure.

kstat_delete(9F)

Remove a kstat from the system.

kstat_install(9F)

Add a fully initialized kstat to the system.

kstat_named_init(9F), kstat_named_setstr(9F)

Initialize a named kstat. kstat_named_setstr() associates str, a string, with the named kstat pointer.

kstat_queue(9F)

A large number of I/O subsystems have at least two basic queues of transactions to be managed. One queue is for transactions that have been accepted for processing but for which processing has yet to begin. The other queue is for transactions that are actively being processed but not yet done. For this reason, two cumulative time statistics are kept: wait time and run time. Wait time is the time spent prior to service. Run time is the time spent during service. The kstat_queue() family of functions, which consists of kstat_waitq_enter(), kstat_waitq_exit(), kstat_runq_enter(), kstat_runq_exit(), kstat_waitq_to_runq(), and kstat_runq_back_to_waitq(), manages these times based on the transitions between the driver wait queue and run queue.
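The wait time and run time accounting can be sketched in portable C. The queue_times structure and the q_* functions below are hypothetical and only model the idea. They are not the kstat_io(9S) structure or the kstat_queue(9F) functions, and they take the current time as an argument so that the sketch can be exercised with fixed timestamps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of wait/run time accounting. */
struct queue_times {
	int64_t wtime;		/* total time the wait queue was non-empty */
	int64_t wlastupdate;	/* last time the wait queue changed */
	uint32_t wcnt;		/* transactions currently waiting */
	int64_t rtime;		/* total time the run queue was non-empty */
	int64_t rlastupdate;	/* last time the run queue changed */
	uint32_t rcnt;		/* transactions currently being serviced */
};

/* Charge elapsed time to a queue if it was non-empty, then move its clock. */
static void
charge(int64_t *total, int64_t *last, uint32_t cnt, int64_t now)
{
	if (cnt != 0)
		*total += now - *last;
	*last = now;
}

/* A transaction is accepted but not yet started. */
void
q_waitq_enter(struct queue_times *q, int64_t now)
{
	charge(&q->wtime, &q->wlastupdate, q->wcnt, now);
	q->wcnt++;
}

/* A waiting transaction starts being serviced. */
void
q_waitq_to_runq(struct queue_times *q, int64_t now)
{
	assert(q->wcnt > 0);
	charge(&q->wtime, &q->wlastupdate, q->wcnt, now);
	q->wcnt--;
	charge(&q->rtime, &q->rlastupdate, q->rcnt, now);
	q->rcnt++;
}

/* A transaction in service completes. */
void
q_runq_exit(struct queue_times *q, int64_t now)
{
	assert(q->rcnt > 0);
	charge(&q->rtime, &q->rlastupdate, q->rcnt, now);
	q->rcnt--;
}
```

Each transition first settles the time owed to the queue it touches and then changes the queue length, so the totals only grow while at least one transaction is in the corresponding queue.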

Kernel Statistics for Solaris Ethernet Drivers

The kstat interface described in the following table is an effective way to obtain Ethernet physical layer statistics from the driver. Ethernet drivers should export these statistics to help users diagnose and repair Ethernet physical layer problems. With the exception of link_up, all statistics have a default value of 0 when not present. The value of the link_up statistic should be assumed to be 1.

The following example displays all of the link_* statistics for instance 0 of the ce driver. In this case, the kstat name mii is used to filter the statistics.

kstat ce:0:mii:link_*

Table 22-2 Ethernet MII/GMII Physical Layer Interface Kernel Statistics

Kstat Variable

Type

Description

xcvr_addr

KSTAT_DATA_UINT32

Provides the MII address of the transceiver that is currently in use.

  • (0) - (31) are for the MII address of the physical layer device in use for a given Ethernet device.

  • (-1) is used where there is no externally accessible MII interface, and therefore the MII address is undefined or irrelevant.

xcvr_id

KSTAT_DATA_UINT32

Provides the specific vendor ID or device ID of the transceiver that is currently in use.

xcvr_inuse

KSTAT_DATA_UINT32

Indicates the type of transceiver that is currently in use. The IEEE aPhyType enumerates the following set:

  • (0) other undefined

  • (1) none MII interface is present, but no transceiver is connected

  • (2) 10 Mbits/s Clause 7 10 Mbits/s Manchester

  • (3) 100BASE-T4 Clause 23 100 Mbits/s 8B/6T

  • (4) 100BASE-X Clause 24 100 Mbits/s 4B/5B

  • (5) 100BASE-T2 Clause 32 100 Mbits/s PAM5X5

  • (6) 1000BASE-X Clause 36 1000 Mbits/s 8B/10B

  • (7) 1000BASE-T Clause 40 1000 Mbits/s 4D-PAM5

This set is smaller than the set specified by ifMauType, which is defined to include all of the above plus their half duplex and full duplex options. Because this information can be provided by the cap_* statistics, the missing definitions can be derived from the combination of xcvr_inuse and cap_* to provide all the combinations of ifMauType.

cap_1000fdx

KSTAT_DATA_CHAR

Indicates the device is 1 Gbits/s full duplex capable.

cap_1000hdx

KSTAT_DATA_CHAR

Indicates the device is 1 Gbits/s half duplex capable.

cap_100fdx

KSTAT_DATA_CHAR

Indicates the device is 100 Mbits/s full duplex capable.

cap_100hdx

KSTAT_DATA_CHAR

Indicates the device is 100 Mbits/s half duplex capable.

cap_10fdx

KSTAT_DATA_CHAR

Indicates the device is 10 Mbits/s full duplex capable.

cap_10hdx

KSTAT_DATA_CHAR

Indicates the device is 10 Mbits/s half duplex capable.

cap_asmpause

KSTAT_DATA_CHAR

Indicates the device is capable of asymmetric pause Ethernet flow control.

cap_pause

KSTAT_DATA_CHAR

Indicates the device is capable of symmetric pause Ethernet flow control when cap_pause is set to 1 and cap_asmpause is set to 0. When cap_asmpause is set to 1, cap_pause has the following meaning:

  • cap_pause = 0 Transmit pauses based on receive congestion.

  • cap_pause = 1 Receive pauses and slow down transmit to avoid congestion.

cap_rem_fault

KSTAT_DATA_CHAR

Indicates the device is capable of remote fault indication.

cap_autoneg

KSTAT_DATA_CHAR

Indicates the device is capable of auto-negotiation.

adv_cap_1000fdx

KSTAT_DATA_CHAR

Indicates the device is advertising 1 Gbits/s full duplex capability.

adv_cap_1000hdx

KSTAT_DATA_CHAR

Indicates the device is advertising 1 Gbits/s half duplex capability.

adv_cap_100fdx

KSTAT_DATA_CHAR

Indicates the device is advertising 100 Mbits/s full duplex capability.

adv_cap_100hdx

KSTAT_DATA_CHAR

Indicates the device is advertising 100 Mbits/s half duplex capability.

adv_cap_10fdx

KSTAT_DATA_CHAR

Indicates the device is advertising 10 Mbits/s full duplex capability.

adv_cap_10hdx

KSTAT_DATA_CHAR

Indicates the device is advertising 10 Mbits/s half duplex capability.

adv_cap_asmpause

KSTAT_DATA_CHAR

Indicates the device is advertising the capability of asymmetric pause Ethernet flow control.

adv_cap_pause

KSTAT_DATA_CHAR

Indicates the device is advertising the capability of symmetric pause Ethernet flow control when adv_cap_pause is set to 1 and adv_cap_asmpause is set to 0. When adv_cap_asmpause is set to 1, adv_cap_pause has the following meaning:

  • adv_cap_pause = 0 Transmit pauses based on receive congestion.

  • adv_cap_pause = 1 Receive pauses and slow down transmit to avoid congestion.

adv_rem_fault

KSTAT_DATA_CHAR

Indicates the device is experiencing a fault that it is going to forward to the link partner.

adv_cap_autoneg

KSTAT_DATA_CHAR

Indicates the device is advertising the capability of auto-negotiation.

lp_cap_1000fdx

KSTAT_DATA_CHAR

Indicates the link partner device is 1 Gbits/s full duplex capable.

lp_cap_1000hdx

KSTAT_DATA_CHAR

Indicates the link partner device is 1 Gbits/s half duplex capable.

lp_cap_100fdx

KSTAT_DATA_CHAR

Indicates the link partner device is 100 Mbits/s full duplex capable.

lp_cap_100hdx

KSTAT_DATA_CHAR

Indicates the link partner device is 100 Mbits/s half duplex capable.

lp_cap_10fdx

KSTAT_DATA_CHAR

Indicates the link partner device is 10 Mbits/s full duplex capable.

lp_cap_10hdx

KSTAT_DATA_CHAR

Indicates the link partner device is 10 Mbits/s half duplex capable.

lp_cap_asmpause

KSTAT_DATA_CHAR

Indicates the link partner device is capable of asymmetric pause Ethernet flow control.

lp_cap_pause

KSTAT_DATA_CHAR

Indicates the link partner device is capable of symmetric pause Ethernet flow control when lp_cap_pause is set to 1 and lp_cap_asmpause is set to 0. When lp_cap_asmpause is set to 1, lp_cap_pause has the following meaning:

  • lp_cap_pause = 0 Link partner will transmit pauses based on receive congestion.

  • lp_cap_pause = 1 Link partner will receive pauses and slow down transmit to avoid congestion.

lp_rem_fault

KSTAT_DATA_CHAR

Indicates the link partner is experiencing a fault with the link.

lp_cap_autoneg

KSTAT_DATA_CHAR

Indicates the link partner device is capable of auto-negotiation.

link_asmpause

KSTAT_DATA_CHAR

Indicates the link is operating with asymmetric pause Ethernet flow control.

link_pause

KSTAT_DATA_CHAR

Indicates the resolution of the pause capability. The link is operating with symmetric pause Ethernet flow control when link_pause is set to 1 and link_asmpause is set to 0. When link_asmpause is set to 1, link_pause has the following meaning, relative to the local view of the link:

  • link_pause = 0 This station will transmit pauses based on receive congestion.

  • link_pause = 1 This station will receive pauses and slow down transmit to avoid congestion.

link_duplex

KSTAT_DATA_CHAR

Indicates the link duplex.

  • link_duplex = 0 Link is down and duplex is unknown.

  • link_duplex = 1 Link is up and in half duplex mode.

  • link_duplex = 2 Link is up and in full duplex mode.

link_up

KSTAT_DATA_CHAR

Indicates whether the link is up or down.

  • link_up = 0 Link is down.

  • link_up = 1 Link is up.
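The link_* encodings in the table can be decoded as in the following sketch. All names are hypothetical. Treating link_pause = 0 together with link_asmpause = 0 as no flow control is an assumption of this sketch, because the table does not describe that combination. The mii_stat_value() helper applies the defaults stated earlier for statistics that a driver does not export.

```c
#include <stddef.h>

enum pause_mode {
	PAUSE_NONE,			/* assumed meaning of 0/0 */
	PAUSE_SYMMETRIC,
	PAUSE_TX_ON_RX_CONGESTION,	/* station transmits pauses */
	PAUSE_RX_SLOW_TX		/* station receives pauses */
};

enum duplex_mode { DUPLEX_UNKNOWN, DUPLEX_HALF, DUPLEX_FULL };

/* Combine link_pause and link_asmpause into one flow-control mode. */
enum pause_mode
decode_pause(int link_pause, int link_asmpause)
{
	if (link_asmpause == 0)
		return (link_pause ? PAUSE_SYMMETRIC : PAUSE_NONE);
	return (link_pause ? PAUSE_RX_SLOW_TX : PAUSE_TX_ON_RX_CONGESTION);
}

/* Map link_duplex to a mode. 0 means link down, duplex unknown. */
enum duplex_mode
decode_duplex(int link_duplex)
{
	switch (link_duplex) {
	case 1:
		return (DUPLEX_HALF);
	case 2:
		return (DUPLEX_FULL);
	default:
		return (DUPLEX_UNKNOWN);
	}
}

/*
 * Value of a statistic, or its default when the statistic is not
 * present (valuep == NULL): 1 for link_up, 0 for everything else.
 */
int
mii_stat_value(const int *valuep, int is_link_up)
{
	if (valuep != NULL)
		return (*valuep);
	return (is_link_up ? 1 : 0);
}
```

The same decode_pause() logic applies to the cap_*, adv_cap_*, and lp_cap_* pause pairs, because those rows use the same encoding from the point of view of the local device or the link partner.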

DTrace for Dynamic Instrumentation

DTrace is a comprehensive dynamic tracing facility for examining the behavior of both user programs and the operating system itself. With DTrace, you can collect data at strategic locations in your environment, referred to as probes. DTrace enables you to record such data as stack traces, timestamps, the arguments to a function, or simply counts of how often the probe fires. Because DTrace enables you to insert probes dynamically, you do not need to recompile your code. For more information on DTrace, see the Solaris Dynamic Tracing Guide and the DTrace User Guide. The DTrace BigAdmin System Administration Portal contains many links to articles, XPerts sessions, and other information about DTrace.
