
Linux Filesystem Performance Comparison for OLTP




This paper describes the technical background and the performance results of a
transaction processing workload that we ran on an Oracle 9i Release 2 database
server to compare the performance of four Linux filesystems on direct-attached
SCSI storage. We tested two traditional filesystems, ext2 and ext3; and we tested
two special-purpose filesystems, raw device pseudo-files and the Oracle Cluster
Filesystem (OCFS).
In general, filesystems store and manage structured information on disk partitions
which vary in size depending on the configuration and needs of the applications.
Filesystems are generally such a crucial part of computer systems that the reliability
and performance of most applications greatly depend on the reliability and
performance of the underlying filesystem.
However, unlike most applications, the Oracle database server itself also
implements many of the same functions that are implemented by a traditional
filesystem including cache management, IO optimization and data layout. This
overlap of filesystem and database functionality raises several questions: How does
Oracle on Linux perform when the storage is provided by a typical filesystem? Can
the two systems work together? How well can the database perform when it
controls the storage more directly? These questions motivated the performance
tests discussed in this paper.
The test results show that OCFS and raw devices performed similarly; that ext2 and
ext3 performed similarly; and that as the workload scaled up, OCFS and raw device
files yielded significantly greater performance than the ext2 and ext3 filesystems.
The following sections discuss the details of the tests and the results.
FILESYSTEMS
This section briefly describes the filesystems we tested: ext2, ext3, raw, and OCFS.
Nearly all Linux filesystems, including ext2 and ext3, use the buffer cache layer in
the Linux operating system kernel for disk reading or writing. The kernel not only
caches data but also uses algorithms such as read-ahead (which consecutively reads
extra data blocks into the cache in the hope that these will be the next data blocks
requested by the application).
As it turns out, although the kernel buffer cache layer is beneficial for many
applications, it usually decreases performance for a database application such as
Oracle. Why? Primarily because the Oracle database itself already has a buffer
cache, so the double-caching of data wastes system memory; furthermore the
Oracle database has more knowledge of its client IO access patterns than the
filesystem can have, so the cache buffer replacement strategies and other algorithms
(such as read-ahead) are most likely to improve performance if handled in the
database rather than the filesystem. A variety of other factors can also decrease
database performance in a filesystem; for example some OS filesystem
implementations develop resource bottlenecks when Oracle uses large numbers of
files.
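On Linux, this double caching can be avoided by opening data files with the O_DIRECT flag, which is essentially the behavior that raw device IO gives the database. The sketch below is illustrative only (it is not Oracle's code): O_DIRECT requires block-aligned buffers, offsets, and sizes, and some filesystems reject the flag outright, so the sketch falls back to ordinary buffered IO.

```python
import mmap
import os
import tempfile

def write_block_direct(path, data, align=4096):
    """Write one block, bypassing the kernel buffer cache when possible.

    O_DIRECT requires the user buffer, file offset, and transfer size to
    be aligned (typically to the logical block size); an anonymous mmap
    gives a page-aligned buffer. Filesystems that do not support O_DIRECT
    (tmpfs, for example) fail at open(), in which case we fall back to
    buffered IO. Returns True if the direct path was used.
    """
    assert len(data) <= align
    buf = mmap.mmap(-1, align)           # anonymous, page-aligned buffer
    buf[: len(data)] = data
    flags = os.O_WRONLY | os.O_CREAT
    try:
        fd = os.open(path, flags | os.O_DIRECT, 0o644)
        direct = True
    except OSError:                       # filesystem rejects O_DIRECT
        fd = os.open(path, flags, 0o644)
        direct = False
    try:
        os.write(fd, buf)                 # one aligned block
    finally:
        os.close(fd)
        buf.close()
    return direct

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "blk")
        print("O_DIRECT used:", write_block_direct(p, b"hello"))
```

Note that with O_DIRECT the application must do its own caching and alignment bookkeeping, which is exactly the work a database engine is already equipped to do.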
It is important to note that the files which most directly affect the performance of a
database workload such as the OLTP benchmark discussed in this paper are the
Oracle database’s data files. In this paper the filesystems are compared solely for
the performance impact on the chosen benchmark, which means that only the
datafiles are stored in the filesystems for our test (see Appendix A). The following
are some examples of other files used with Oracle that are not data files: executable
binaries, message files, trace files, and shared libraries.
Ext2 Filesystem
Ext2, which is short for “second extended filesystem,” was the de facto standard
on numerous Linux distributions for many years. Ext2 is a reliable and robust
filesystem and provides a rich set of features including subdirectories, attributes,
quotas, and locks.
One potential problem with ext2 and many other filesystems is that in case of an
improper system shutdown such as a power failure, an ext2 filesystem cannot be
used until a filesystem consistency check (fsck) has been performed. The time
taken for the consistency check is heavily dependent on the size of the filesystem:
the bigger the filesystem, the longer it takes to finish the consistency check. With
modern disk capacities reaching hundreds of gigabytes (and doubling each year), it
can take hours or even days to check large filesystems – and during this time the
applications cannot run and the system is unavailable.
Ext3 Filesystem
Ext3 is now available on most Linux distributions. Ext3 is an enhancement of ext2
to implement algorithms for efficiently using a journal of all writes to the on-disk
filesystem. The journal itself is also stored on the disk which enables ext3 to be
reliable, and the IO to the journal is sequential (due to the physical layout of the
journal’s data blocks on the disk) which enables ext3 to perform well.
Ext3 has a number of advantages. A great advantage of ext3 is that when you have
a large disk and need to recover from a system crash, the recovery process
(filesystem consistency check) is much faster than on Ext2. By journaling changes
(writes), ext3 can recover very quickly regardless of the size of the filesystem.
Another advantage is that ext3 is compatible with Ext2 so converting filesystems
between Ext2 and Ext3 is very easy and does not need a reboot or repartitioning of
the disks.
An ext3 filesystem can be mounted in one of 3 modes:
• data=journal – this is the most reliable, but slowest of the 3 modes
• data=ordered – this is the default mode
• data=writeback – this is generally the fastest of the 3 modes
Our performance tests included these three modes of ext3 operation. However, no
further filesystem tuning was done.
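The journaling mode is selected when the filesystem is mounted, via the data= mount option. For illustration only (the device name and mount point below are placeholders, not our test layout):

```shell
# mount an ext3 filesystem in each of the three journaling modes
mount -t ext3 -o data=journal   /dev/sdb1 /u01   # data and metadata journaled
mount -t ext3 -o data=ordered   /dev/sdb1 /u01   # default: data flushed to disk before metadata commit
mount -t ext3 -o data=writeback /dev/sdb1 /u01   # metadata-only journaling

# or persistently, via an /etc/fstab entry:
# /dev/sdb1  /u01  ext3  defaults,data=writeback  1 2
```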
Raw Devices
Linux provides raw device access with a pseudo filesystem which presents a file-like
interface to read and write actual disk partitions. Normally reads and writes to files
go through the Linux filesystem buffer cache; however, raw device IO bypasses the
buffer cache and deals directly with the underlying hardware devices. Each raw
device is bound to exactly one partition. For the 2.4 series Linux operating system kernel that
was used for this performance comparison, the maximum number of raw devices
that a system can have is fixed at a total of 255 raw devices, and the number of
partitions that can be created on any disk is fixed at 14.
Oracle Clustered Filesystem (OCFS)
OCFS is Oracle’s open-source filesystem available on various platforms. This is an
extent-based (an extent is a variable contiguous space) filesystem which is currently
intended for Oracle datafiles and Oracle RAC (real application clusters). In terms
of the file interface it provides to applications, OCFS balances performance and
manageability by providing functionality that is in-between the functionality
provided by raw devices and typical filesystems. While retaining the performance
of raw devices (as we see from the results of the benchmark in the next section),
OCFS provides higher-order, more manageable file interfaces. In this respect, the
OCFS service can be thought of as a filesystem-like interface to raw devices. At the
same time, the cluster features of OCFS go well beyond the functionality of a
typical filesystem. OCFS files can be shared across multiple computers, or nodes,
on a network so that the files are simultaneously accessible by all the nodes, which
is essential in RAC configurations. For example, sharing datafiles allows media
recovery in case of failures, as all the datafiles (and archive log files) are visible from the
nodes that constitute the RAC cluster.
Beyond clustering features and basic file service, OCFS provides a number of
manageability benefits (for example, resizing datafiles and partitions is easy) and
comes with a set of tools to manage OCFS files.
For more information on OCFS, please visit http://otn.oracle.com/linux.
BENCHMARK
This test used an OLTP workload, generated by processes simulating users who
connect to a database and perform transactions. The database simulated a
real-world retail store chain with 1000 warehouses. By varying the number of users, we
controlled the amount of work and the load on the CPU and the IO subsystem.
Configuration

Hardware
• Processor: 4 x Intel Pentium III 700 MHz (2 MB cache)
• Storage: 96 x 8 GB SCSI direct-attached disks (configured as 8 disks in RAID-0)
• Disk controller: MegaRAID v1.73
• Memory: 4 GB

Software
• Linux distribution: Red Hat Advanced Server 2.1
• Kernel: 2.4.9-e12smp
• OCFS: 1.0.8-4
• Oracle Database: 9i Release 2, version 9.2.0.3
Test Setup
The setup consisted of the above server with direct-attached storage. The database
files were distributed evenly over 6 disks. The disk layout is mentioned in
Appendix A. Appendix B contains the database parameter file used for the tests.
Test Goals
Evaluate the performance of ext2, ext3, raw, and OCFS in terms of the following
parameters, which are representative of the database performance.
• Transactions Per Second (TPS) The amount of work done per second; this
is a measure of throughput. The more transactions, the better the overall
system performance.
• Input/Output (IO) The amount of IO done, in kilobytes per second; this is a
measure of bandwidth.
• CPU Utilization The average CPU used during a test run given as a
percentage where 100% = full CPU utilization; this is a measure of system
processing load.
From past experience on this hardware, IO throughput was found to be optimum
for benchmark simulations of 100 users and less, so for this paper the test plan was
to test only up to 100 users. However, since the raw and OCFS results scaled
linearly to 100 users, in retrospect a higher threshold could have been used.
RESULTS
Figure 1: TPS Graph

[Line chart: TPS (0–20000) versus number of users, for raw, ocfs, ext2, ext3j, ext3o, and ext3w.]

TPS measures the rate at which work is done. The more transactions completed per second, the better the overall performance. Since the TPS value indicates the overall database performance, it is the chief metric we used to compare the filesystems. The three modes of ext3 operation are journaled, ordered, and writeback; we denote them here by ext3j, ext3o, and ext3w, respectively.

Figure 1 shows that when using raw or OCFS the transaction throughput increases
linearly. Since raw and OCFS bypass the filesystem cache, there is more memory
available for data. For ext2/ext3, there is a linear increase for some time, after
which the TPS graph stays level. The performance increase stops as the database
cache gets filled up. To acquire free buffers the database writer needs to clear some
space in its own cache, and the system performance becomes IO-bound because
the database is busy writing these disk blocks. Thus we see more “free buffer
waits” in the Oracle statspack reports.

Figure 2: Input Graph

[Line chart: input bandwidth in KB/s versus number of users, for the same six filesystems.]

The input bandwidth in kilobytes per second measures the amount of data read by the system from the disks. We used OS utilities to measure the rate of input.
As seen in Figure 2, the number of bytes read from disk is higher for ext2/ext3
than for raw and OCFS. This is attributable to filesystem read-ahead IO and the
reduced memory available for the database due to the filesystem cache.
Since raw and OCFS bypass the filesystem cache, they do not have any read-ahead
(Oracle itself may issue read-ahead requests when they will directly benefit database
performance, but the raw and OCFS filesystems by themselves do not).
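An application that, like Oracle, manages its own cache can at least hint the kernel not to read ahead on a buffered file, via posix_fadvise(2). A minimal sketch follows; note this is a hint only, and a much weaker tool than the full cache bypass that raw devices and OCFS effectively provide.

```python
import os
import tempfile

def open_for_random_io(path):
    """Open a file for buffered reads, asking the kernel not to read ahead.

    POSIX_FADV_RANDOM tells the page-cache layer that access will be
    random, which disables its sequential read-ahead heuristic. The data
    still passes through (and occupies) the kernel buffer cache.
    """
    fd = os.open(path, os.O_RDONLY)
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    return fd

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * 8192)
        f.flush()
        fd = open_for_random_io(f.name)
        os.lseek(fd, 4096, os.SEEK_SET)
        print(len(os.read(fd, 16)), "bytes read without read-ahead hint")
        os.close(fd)
```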
Figure 3: Output Graph

[Line chart: output bandwidth in KB/s (0–7000) versus number of users, for raw, ocfs, ext2, ext3j, ext3o, and ext3w.]

The output bandwidth in kilobytes per second measures the amount of data written to storage by the system. We used OS utilities to measure the rate of output.

Figure 3, above, shows several effects. First, for ext2 and ext3 the number of bytes
written is greater than for raw or OCFS until the workload is scaled to a high
number of users. Why? Initially there are more Oracle buffer cache flushes for
ext2 and ext3, but for high numbers of users the system does less total work than
with raw and OCFS (see also Figure 1 above). Second, the output for ext2/ext3
eventually stays constant, which is likely due to a bottleneck in the IO subsystem.
Also, in journal mode ext3 writes more because it journals both data and metadata.

Figure 4: CPU Utilization Graph

[Line chart: CPU utilization in percent (0–70) versus number of users, for the same six filesystems.]

The percentage of full CPU utilization measures the processor load and also indirectly provides information about relative system IO load. We used OS utilities to measure CPU utilization.
Figure 4 shows the CPU utilization for all four filesystems tested in this
paper, namely ext2, ext3, raw, and OCFS. In all cases, we observe that the
processor load increases with the number of users, as expected. For very high
numbers of users the ext2 and ext3 filesystems do less processing because their IO
rates are higher and the CPU is less busy while waiting for IO. It is important to
note that there is also an opposite effect on the CPU activity: whenever the IO
increases as it does for ext2 and ext3 under very high benchmark load, the Linux
kernel activity such as queueing, swapping, and other resource management will
take up more CPU time than under more normal conditions. However, the
decrease in benchmark processing due to heavier IO load usually outweighs the
increase in system CPU. Furthermore, this additional system CPU load actually
subtracts from the total processing power available to do directly useful work (i.e.
database transactions).
CONCLUSION
By design, this was purely a performance comparison (factors such as cost,
manageability, and reliability were not considered) and used one database workload,
albeit a typical one. We ran the same database benchmark using 4 different Linux
filesystems for data storage and we measured transaction throughput, IO activity,
and server CPU utilization. Transaction throughput was the primary metric of the
benchmark. The results showed that raw and OCFS performance are nearly
identical and they yield better overall performance for the OLTP database workload
that we tested.
The ext2 and ext3 filesystems have a cache which decreases database workload
performance as the number of users and the load on the system increase. For a
small number of users, ext2 or ext3 may suffice, but for a large number of users
the limits of the IO subsystem are reached sooner than with raw and OCFS storage.
The performance of the overall system depends on the type of application and for
an Oracle database it usually depends greatly on the physical organization of data
on disk. Traditional filesystems such as ext2 and ext3 may perform better for
simple applications which benefit from the caches and algorithms of the filesystem
layer. However, for Oracle database systems the on-disk data layout and access
strategies are best left to the Oracle server and our test results demonstrate the
performance advantages of storing Oracle data in either raw files or OCFS files.
APPENDIX A – DATAFILE LAYOUT
Disk/Partition Raw Device Database File Size (MB)
/dev/sde7 /dev/raw/raw70 stok_0_0 7500
/dev/sde8 /dev/raw/raw77 icom_0_0 3500
/dev/sde9 /dev/raw/raw84 dcom_0_0 2500
/dev/sde10 /dev/raw/raw91 ordl_0_0 8032
/dev/sde11 /dev/raw/raw98 temp_0_0 3000
/dev/sde12 /dev/raw/raw105 cust_0_0 5000
/dev/sde13 /dev/raw/raw118 sp_0 1000
/dev/sdf7 /dev/raw/raw71 stok_0_1 7500
/dev/sdf8 /dev/raw/raw78 icom_0_1 3500
/dev/sdf9 /dev/raw/raw85 dcom_0_1 2500
/dev/sdf10 /dev/raw/raw92 ordl_0_1 8032
/dev/sdf11 /dev/raw/raw99 temp_0_1 3000
/dev/sdf12 /dev/raw/raw106 cust_0_1 5000
/dev/sdf13 /dev/raw/raw114 ctl1 1000
/dev/sdg8 /dev/raw/raw72 stok_0_2 7500
/dev/sdg9 /dev/raw/raw79 icom_0_2 3500
/dev/sdg10 /dev/raw/raw86 dcom_0_2 2500
/dev/sdg11 /dev/raw/raw93 ordl_0_2 8032
/dev/sdg12 /dev/raw/raw100 temp_0_2 3000
/dev/sdg13 /dev/raw/raw107 cust_0_2 5000
/dev/sdh7 /dev/raw/raw73 stok_0_3 7500
/dev/sdh8 /dev/raw/raw80 icom_0_3 3500
/dev/sdh9 /dev/raw/raw87 dcom_0_3 2500
/dev/sdh10 /dev/raw/raw94 ordl_0_3 8032
/dev/sdh11 /dev/raw/raw101 temp_0_3 3000
/dev/sdh12 /dev/raw/raw108 cust_0_3 5000
/dev/sdh13 /dev/raw/raw115 ctl2 1000
/dev/sdi6 /dev/raw/raw74 stok_0_4 7500
/dev/sdi7 /dev/raw/raw81 icom_0_4 3500
/dev/sdi8 /dev/raw/raw88 dcom_0_4 2500
/dev/sdi9 /dev/raw/raw95 ordl_0_4 8032
/dev/sdi10 /dev/raw/raw102 temp_0_4 3000
/dev/sdi11 /dev/raw/raw109 cust_0_4 5000
/dev/sdj6 /dev/raw/raw75 stok_0_5 7500
/dev/sdj7 /dev/raw/raw82 icom_0_5 3500
/dev/sdj8 /dev/raw/raw89 dcom_0_5 2500
/dev/sdj9 /dev/raw/raw96 ordl_0_5 8032
/dev/sdj10 /dev/raw/raw103 temp_0_5 3000
/dev/sdj11 /dev/raw/raw110 cust_0_5 5000
/dev/sdj12 /dev/raw/raw116 roll01 1500
/dev/sdj13 /dev/raw/raw117 sys 4000
/dev/sdk6 /dev/raw/raw76 stok_0_6 7500
/dev/sdk7 /dev/raw/raw83 icom_0_6 3500
/dev/sdk8 /dev/raw/raw90 dcom_0_6 2500
/dev/sdk9 /dev/raw/raw97 ordl_0_6 8032
/dev/sdk10 /dev/raw/raw104 temp_0_6 3000
/dev/sdk11 /dev/raw/raw111 cust_0_6 5000
/dev/sdl6 /dev/raw/raw112 log1 8032
/dev/sdl7 /dev/raw/raw113 log1 8032
APPENDIX B – INIT.ORA PARAMETER FILE
compatible = 9.0.1.3.0
control_files = (/dwork1/ctl1,/dwork3/ctl2)
#control_files = ($diskloc/ctl1,$diskloc/ctl2)
db_name = fstest
db_block_size = 2048
db_files = 100
db_block_buffers = 750000 # SGA 1.7 Gb
sort_area_size = 10485760
shared_pool_size = 25000000
shared_pool_size = 15000000
log_buffer = 1048576
parallel_max_servers = 50
recovery_parallelism = 40
dml_locks = 500
processes = 619
sessions = 619
transactions = 600
cursor_space_for_time = TRUE
undo_management = auto
UNDO_SUPPRESS_ERRORS = true
undo_retention = 60
UNDO_TABLESPACE = undo_ts
max_rollback_segments = 520
db_writer_processes = 1
dbwr_io_slaves = 10
Linux Filesystem Performance Comparison for OLTP
January 2004
Authors: Rajendra Kulkarni, Peter Schay
Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
www.oracle.com
Oracle is a registered trademark of Oracle Corporation. Various
product and service names referenced herein may be trademarks
of Oracle Corporation. All other product and service names
mentioned may be trademarks of their respective owners.
Copyright © 2004 Oracle Corporation
All rights reserved.

Analysis of the Linux Random Number Generator

Zvi Gutterman

Safend and The Hebrew University of Jerusalem
Benny Pinkas
University of Haifa
Tzachy Reinman
The Hebrew University of Jerusalem
March 6, 2006
Abstract
Linux is the most popular open source project. The Linux random number generator is part of the kernel of all
Linux distributions and is based on generating randomness from entropy of operating system events. The output of
this generator is used for almost every security protocol, including TLS/SSL key generation, choosing TCP sequence
numbers, and file system and email encryption. Although the generator is part of an open source project, its source
code (about 2500 lines of code) is poorly documented, and patched with hundreds of code patches.
We used dynamic and static reverse engineering to learn the operation of this generator. This paper presents a
description of the underlying algorithms and exposes several security vulnerabilities. In particular, we show an attack
on the forward security of the generator which enables an adversary who exposes the state of the generator to compute
previous states and outputs. In addition we present a few cryptographic flaws in the design of the generator, as well
as measurements of the actual entropy collected by it, and a critical analysis of the use of the generator in Linux
distributions on disk-less devices.
1 Introduction
Randomness is a crucial resource for cryptography, and random number generators are therefore critical building
blocks of almost all cryptographic systems. The security analysis of almost any system assumes a source of random
bits, whose output can be used, for example, for the purpose of choosing keys or choosing random nonces. Weak
random values may result in an adversary’s ability to break the system, as was demonstrated by breaking the Netscape
implementation of SSL [8], or predicting Java session-ids [11].
Since a physical source of randomness is often too costly, most systems use a pseudo-random number generator.
The state of the generator is seeded, and periodically refreshed, by entropy which is gathered from physical sources
(such as from timing disk operations, or from a human interface). The state is updated using an algorithm which
updates the state and outputs pseudo-random bits.
This paper studies the Linux pseudo-random number generator (which we denote as the LRNG). This is the most
popular open source pseudo-random number generator, and it is embedded in all running Linux environments, which
include desktops, servers, PDAs, smart phones, media centers, and even routers.
Properties required of pseudo-random number generators. A pseudo-random number generator must be secure
against external and internal attacks. The attacker is assumed to know the code of the generator, and might have partial
knowledge of the entropy used for refreshing the generator’s state. We list here the most basic security requirements,
using common terminology (e.g., of [3]). (A more detailed list of potential vulnerabilities appears in [14].)
• Pseudorandomness. The generator’s output looks random to an outside observer.
• Forward security. An adversary which learns the internal state of the generator at a specific time cannot learn
anything about previous outputs of the generator.
• Break-in recovery / backward security. An adversary which learns the state of the generator at a specific time
does not learn anything about future outputs of the generator, provided that sufficient entropy is used to refresh
the generator’s state.
The Pseudorandomness requirement is sufficient in attack scenarios where the adversary does not have knowledge of
the generator’s state. In many scenarios, however, an adversary might be able to learn the state of the generator; for
example, by bypassing access restrictions of the operating system, or by reading the state from memory or from the
hard disk if it is stored there. The forward security requirement ensures that an adversary which learns the generator’s
state at a certain time cannot gain information about the output of the generator at previous times (as we show below,
this requirement is not fully satisfied by the LRNG). The break-in recovery requirement ensures that learning the state
at a particular time does not compromise all future outputs of the generator.
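A toy counter-example makes the forward security requirement concrete (this is purely illustrative and is not the LRNG's actual update function): if the state update is an invertible permutation, an adversary who captures the state can run the generator backwards, whereas a one-way update blocks this.

```python
import hashlib

MULT = 0x9E3779B97F4A7C15  # an odd constant, hence invertible mod 2**64

def weak_step(state):
    """One step of a toy generator whose update is an invertible
    permutation of the 64-bit state: forward security fails."""
    return (state * MULT) & (2**64 - 1)

def weak_step_inverse(state):
    # The modular inverse of the odd multiplier undoes the update, so an
    # attacker who captures the state recovers every earlier state.
    inv = pow(MULT, -1, 2**64)
    return (state * inv) & (2**64 - 1)

def strong_step(state_bytes):
    """With a one-way update, recovering an earlier state would require
    inverting the hash, so previous outputs stay protected."""
    return hashlib.sha1(state_bytes).digest()

if __name__ == "__main__":
    s0 = 0x0123456789ABCDEF
    s1 = weak_step(s0)
    print("previous state recovered:", weak_step_inverse(s1) == s0)
```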
1.1 The Linux Pseudo-Random Number Generator (LRNG)
The Linux kernel is an open source project developed over the last 15 years by a group of developers led by Linus Torvalds.
The kernel is the common element in all the various Linux distributions, on all types of devices.
The output of the LRNG can be used by internal kernel functionalities which use random bits, and by calls to its
application programming interface (API). The Linux kernel uses random data for various purposes, such as generating
random identifiers, computing TCP sequence numbers, producing passwords, and generating SSL private keys. Within
the kernel, the interface for receiving random values from the LRNG is the function get_random_bytes(*buf, nbytes).
The API to the LRNG is through two device drivers named /dev/random and /dev/urandom. Both devices
let users read pseudo-random bits. The difference between the two is in the stated level of security of the random bits,
and the resulting delay. The first device (/dev/random) outputs “extremely secure” bits [1] and may block the user
until such bits can be generated by the system. The second device (/dev/urandom) outputs less secure bits but its
output is never blocked. Section 2.4 explains the difference between the two devices.
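From user space, both devices are read like ordinary files. A minimal sketch using the non-blocking device:

```python
def read_urandom(nbytes):
    """Read pseudo-random bytes from the kernel's non-blocking device.

    /dev/urandom never blocks; /dev/random may block until the kernel
    judges that enough entropy has been gathered (see Section 2.4).
    """
    with open("/dev/urandom", "rb") as dev:
        return dev.read(nbytes)

if __name__ == "__main__":
    print(read_urandom(16).hex())
```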
Why reverse-engineering the LRNG is not easy. The LRNG is part of an open source project and therefore one
might assume that its source code is available for public scrutiny and that its security can be easily analyzed (or at least,
is not based on “security by obscurity”). However, the LRNG is not well documented and there is no clear description
of the implemented algorithm. The LRNG is composed of about 2500 lines of code, and in addition, hundreds of code
patches were applied to the code during the last five years (and consequently, the available documentation does not
always reflect the current code). One example of the complexity of the LRNG code is the fact that for 17 months the
LRNG code included a bug in which entropy addition used a vector of size 4 × n instead of n. We also note that
throughout our analysis we were not helped by any of the LRNG authors.
These factors turned the reverse-engineering of the LRNG into a challenging task. We therefore combined static
reverse-engineering of the source code of the Linux kernel with dynamic tracing to present a clear algorithmic representation
of the LRNG (see Section 3). Dynamic reverse-engineering is not simple in this case, due to two main
restrictions. The first is the fact that any kernel change requires a new build and installation. This process takes a
couple of hours to finish, and performing it dozens of times makes the process very tedious. The second, and more
troubling restriction, is the fact that any change made to the kernel may also result in some influence on the kernel
“noise generation” and hence on the LRNG behavior. We therefore implemented a user-mode simulator of the LRNG
as part of our analysis. It can be downloaded from the authors’ web page.
As a final note on the complexity of the implementation, we add that the complexity of the algorithm, the lack of
documentation, and the high volume of changes to the LRNG code, resulted in dozens of programming bugs. Many
of these bugs resulted in security vulnerabilities during the last five years.
The basic structure of the LRNG. At a high level, the LRNG can be described as three asynchronous components.
The first component translates system events into bits which represent the underlying entropy. The second component
adds these bits to the generator “pool”. When bits are read from the generator, the third component applies three
consecutive SHA-1 operations to generate the output of the generator and the feedback which is entered back into the
pool.
[1] This wording is according to the LRNG designer. Essentially, this type of output is only available when enough physical entropy is gathered.
Our discussion of the resulting security is in Section 3.4.
Each sample of “randomness” originating from system events is collected as two 32-bit words. The first word
measures the time of the event and the second word is the event value, which is usually an encoding of a pressed key, a
mouse move or a drive access. In order to keep track of the amount of physical randomness which is added to the pool,
the LRNG holds a counter for counting an estimate of this value, which is calculated as a function of the frequencies
of the different events. The LRNG denotes this value as entropy (see Section 2.4 for the exact definition) although it
is different from the classical entropy definition by Shannon [21].
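For comparison, the classical Shannon entropy of an observed event sequence can be computed directly. The sketch below is ours, not LRNG code:

```python
import math
from collections import Counter

def shannon_entropy(samples):
    """Shannon entropy, in bits per sample, of an observed sequence.

    H = -sum(p * log2(p)) over the empirical symbol probabilities. The
    LRNG's internal "entropy" counter is instead a heuristic estimate
    based on event frequencies, which need not equal this quantity.
    """
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

if __name__ == "__main__":
    print(shannon_entropy([0, 1, 2, 3]))  # uniform over four symbols -> 2.0
```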
1.2 Our Contributions
This paper describes research conducted on analyzing the LRNG. It provides the following contributions:
• Publication of a description of the LRNG algorithm. As described above, a considerable amount of work was
required in order to analyze the LRNG code and provide a high-level description of the underlying algorithm.
• An attack which breaks the forward-security of the LRNG. Namely, we show how, given a state of the generator,
it is possible to reconstruct previous states. The time complexity of this attack is 2^64 or 2^96, depending on the
attack variant. The memory overhead of the attack is O(1).
• An analysis of the amount of entropy which is added to the generator in one typical implementation.
• An analysis of the security of an implementation on a disk-less system (an OpenWRT based router).
• We also identify some vulnerabilities in the current implementation (including an easy denial-of-service attack)
and provide recommendations for improving future implementations of pseudo-random number generators.
1.3 Related Work
Existing PRNG implementations. In the past, PRNGs were either a separate program or a standard library within
a programming language. The evolution of software engineering and operating systems introduced PRNGs which are
part of the operating system. From a cryptographic point of view, this architecture has three main advantages: (1)
the ability to introduce more complex algorithms which are implemented using unique kernel optimization, (2) the
ability to use kernel based entropy events as input to the generator, and (3) the fact that in a multi-user, multi-threaded
environment, many consumers can read random bits, and therefore an adversary might be prevented from reading a
long stream of consecutive bits of the PRNG output.
A published overview of the use of cryptography in OpenBSD [5] includes a very informative introduction to this
operating system’s usage of PRNGs, and of cryptography’s general role in this operating system. The PRNG of the
FreeBSD operating system is described in [18]. It uses time stamps of events as an entropy (i.e., physical randomness)
source. These are hashed using AES [19] into two pools, each of 256 bits. When output is extracted, AES encryption
is used to repeatedly encrypt a 256-bit counter, using an encryption key that is taken from the entropy pools. FreeBSD
implements a single non-blocking device and the authors declare their preference of performance over security.
Castejon-Amenedo et al. [4] propose a PRNG for UNIX environments. Their system is composed of an entropy
daemon and a buffer manager that handles two devices—blocking and non-blocking. The buffer manager divides
entropy equally between the two devices, such that there is no entropy that is used in both. A notable advantage of this
scheme is the absolute separation between blocking and non-blocking devices, which prevents launching a denial-of-service
attack on the blocking device by using the non-blocking device (such an attack is possible in Linux, as we later
discuss in Section 3.4).
Analysis of PRNGs. A comprehensive discussion of the system aspects of PRNGs, as well as a guide to designing
and implementing a PRNG without the use of special hardware or of access to privileged system services, is given
by Gutmann [9]. Issues related to operating system entropy sources were discussed in a recent NIST workshop on
random number generation [12, 10].
An extensive discussion of PRNGs, which includes an analysis of several possible attacks and their relevance to
real-world PRNGs, is given by Kelsey et al. in [14]. Additional discussion of PRNGs, as well as new PRNG designs
appear in [13, 7].
The recent work of Barak and Halevi [3] presents a rigorous definition and an analysis of the security of PRNGs,
as well as a simple PRNG construction. This work suggests separating the entropy extraction process, which is
information-theoretic in nature, from the output generation process. Their construction is based on a cryptographic
pseudo-random generator G, which can be implemented, for example, using AES in counter mode, and which does
not use any input except for its seed. The state of the PRNG is the seed of G. Periodically, an entropy extractor uses
system events as an input from which it extracts random bits. The output of the extractor is xored into the current
state of G. This construction is much simpler than most existing PRNG constructions, yet its security was proved
in [3] assuming that the underlying building blocks are secure. We note that our analysis shows that the Linux PRNG
construction, which is much more complex than that of [3], suffers from weaknesses which could have been avoided
by using the latter construction.
Analysis of open-source security packages. It is hard to examine the security of implementations of security packages,
and, in the case of proprietary software, one usually has to trust the software authors. However, even in the case
of open source security packages it is hard to trust security, since the fact that source code can be read does not imply
that it is actually examined by security experts. For example, the work of Nguyen [20] examined the (open) source
code of the GNU Privacy Guard secure email software (GnuPG or GPG), and identified several cryptographic flaws.
The most serious of these flaws has been present in GPG for almost four years.
2 The Structure of the Linux Random Number Generator
Our study is based on version 2.6.10 of the Linux kernel, which was released on December 24, 2004.
2.1 General Structure
The generation of random numbers in Linux is composed of three asynchronous procedures. In the first procedure,
operating system entropy is collected from various events inside the kernel. In the second procedure, entropy is fed
into an LFSR-like pool, using a mixing function. When random bits are requested, the third procedure occurs, output
is generated and the pool is updated. The only non-linear cryptographic operation used by these procedures is the
SHA-1 hash function. (We note that the recent attacks on the collision resistance of SHA-1 seem irrelevant for the
purpose of attacking the LRNG.)
Pools and counters. Figure 2.1 describes the LRNG flow. The internal state is kept in three entropy pools: primary,
secondary and urandom, whose sizes are 512, 128 and 128 bytes, respectively. Entropy sources add data to the primary
pool²; output from the primary pool is extracted and fed to the secondary and urandom pools, while the LRNG output
is extracted from the secondary pool or from the urandom pool. During the extraction operation, the inner state of a
pool is modified in a feedback manner.
Each pool has its own entropy estimation counter. This is an integer value between zero and the pool size in
bits, which indicates the current estimated entropy of the pool. When output is extracted from the pool this counter is
decremented, and when entropy is added the counter is incremented. An entropy counter is always decremented by the
number of extracted bits. Incrementing the counter is more complex. If the added bits originate from one of the entropy
sources, then their entropy is estimated, and the counter is incremented accordingly. The entropy estimation uses the
timing of the last few events of the same entropy source (see discussion below). If the entropy bits are transferred from
the primary pool, the entropy counter of the receiving pool is incremented by the number of transferred bits.
The entropy counter of the secondary pool plays a crucial role when extracting entropy using the blocking interface,
/dev/random. Its task is to determine whether there is enough entropy in the pool to supply the requested amount
of random data. If the answer is negative, the LRNG tries to transfer entropy from the primary pool to the secondary
pool, and if this fails, it blocks and waits until some entropy input arrives and increments the entropy counter.
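The blocking policy of /dev/random described above can be modeled with a small decision function. This is an illustrative sketch of the policy, not kernel code; the function name and the counter arithmetic are our own, based only on the description above.

```python
def dev_random_policy(nbits, secondary, primary):
    """Model of the /dev/random decision for a request of nbits.
    secondary/primary are the entropy estimation counters (in bits).
    Returns (action, new_secondary, new_primary)."""
    if secondary >= nbits:                  # enough entropy in the secondary pool
        return "serve", secondary - nbits, primary
    need = nbits - secondary                # shortfall to transfer from primary
    if primary >= need:
        return "serve", 0, primary - need   # refill secondary, then serve it all
    return "block", secondary, primary      # wait until entropy input arrives
```

For example, a 100-bit request against a secondary counter of 30 and a primary counter of 500 is served after transferring 70 bits from the primary pool.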
Adding physical entropy. Entropy bits are added to the primary pool from external sources. Desktop and server
PCs can use four different sources: mouse and keyboard activity, disk I/O operations, and specific interrupts. When
such an event occurs, it produces a 32-bit word representing its timing and a 32-bit word encoding its attributes (e.g.,
² As is described below, system entropy might also be added to the secondary pool, if the LRNG estimates that the primary pool has full entropy.
[Figure 2.1 diagram: four entropy sources (keyboard, mouse, interrupts, disk) feed the 512-byte primary pool; the 128-byte secondary and urandom pools are refilled from it; /dev/random reads (blocking) from the secondary pool, while /dev/urandom and get_random_bytes read (non-blocking) from the urandom pool.]
Figure 2.1: The general structure of the LRNG: Entropy is collected (C) from four sources and is added (A) to the
primary pool. Entropy is extracted (E) from the secondary pool or from the urandom pool. Whenever entropy is
extracted from a pool, some of it is also fed back into this pool (broken line). The secondary pool and the urandom
pool draw in entropy from the primary pool.
which key was pressed). In addition, the differences between the timings of successive events of the same type are
used to estimate the entropy provided by this event. In Section 2.4 we define this procedure in detail.
Due to the asynchronous nature of the system, collected entropy cannot be simply added to the pools but is rather
collected and batched. A few times a minute the batched data is added to the pools (in a process described in Section
2.5). The default operation is to add entropy to the primary pool. If this pool is full (its entropy count equals 4096)
entropy is added to the secondary pool. When the secondary pool is full the process returns to the primary pool, and
so on. Entropy is never added to the urandom pool. This process increments the entropy counter of the respective pool
by the estimated entropy amount.
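The routing of batched entropy described above can be sketched as a small dispatcher. This is illustrative only; the capacities in bits follow from the 512-byte and 128-byte pool sizes given earlier, and the function name is our own.

```python
PRIMARY_BITS = 4096    # 512-byte primary pool
SECONDARY_BITS = 1024  # 128-byte secondary pool

def entropy_target(primary_count, secondary_count):
    """Pick the pool that receives the next batch of collected entropy:
    the primary pool by default, the secondary pool when the primary is
    full, and the primary again when both are full.  The urandom pool
    never receives collected entropy."""
    if primary_count < PRIMARY_BITS:
        return "primary"
    if secondary_count < SECONDARY_BITS:
        return "secondary"
    return "primary"
```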
Generating output. Random bits are extracted from one of the three pools: they are extracted from the urandom
pool when the user uses /dev/urandom and when the kernel calls get_random_bytes; from the secondary pool
when the user uses /dev/random; and from the primary pool when one of the two other pools does not have enough
entropy and needs re-filling. The process of entropy extraction includes three steps: updating the pool’s contents,
extracting random bits to be output, and decrementing the entropy counter of the pool. This process involves hashing
the pool contents using SHA-1, and adding the results to the pool. We study each of the LRNG steps in the following
sections.
2.2 Initialization
Operating system startup includes a sequence of routine actions. This sequence includes the initialization of the LRNG
with constant operating system parameters and with the time-of-day, as well as additional disk operations and system events
which affect the LRNG through the interface for adding external entropy (discussed in Section 2.5). This sequence of
operations might be easily predicted by an adversary, especially in systems which do not have a hard drive. If no
special actions are taken, the LRNG state might include very limited entropy. (For example, the time of day is given as
a count of seconds and of micro-seconds, each represented as a 32-bit value. In reality these values have very limited
entropy, as one can find the computer uptime to within an accuracy of a minute, which leads to a brute-force search of only
60 × 10^6 < 2^26 different options.)
To solve this problem, the LRNG simulates continuity along shutdowns and startups. This is done by saving a
Source:        Keyboard   Mouse   Hard Drive   Interrupts
Unknown bits:         8      12            3            4

Table 1: The number of unknown bits in operating system events.
random-seed at shutdown and writing it back to the pools at startup. A script that is activated during system startups
and shutdowns uses the read and write capabilities of the /dev/urandom interface to perform this operation.
During shutdown the script reads 512 bytes from /dev/urandom and writes them to a file, and during startup
these bits are written back to the /dev/urandom device. This device is defined such that writing to it modifies the
primary pool and not the urandom pool (as one could expect from its name). The resulting operations applied to the
primary pool are essentially identical to the effect of receiving these 512 bytes as the encoding of system events, and
adding them to the primary pool using the usual procedure for adding entropy, which is outlined in Section 2.5. The
only difference is that the added bytes do not increment the entropy estimation. The secondary pool and the urandom
pool are refreshed by the primary pool, and therefore the script affects all three pools.
It is important to note that this script is part of the Linux distribution (such as Red Hat), and not part of the
kernel code itself (this is also the reason that the script must interact with the pools using a device driver rather than
reading and writing from/to the pools). The author of the LRNG ([22]) instructs Linux distribution developers to add
this script in order to ensure the unpredictability of the LRNG at system startups. This implies that the security of the
LRNG is not completely stand-alone, but dependent on an external component, which can be predictable in certain
Linux distributions.
Security implications: Some Linux distributions, such as the KNOPPIX distribution which is a bootable PC system
on a CD or a DVD [1], or the OpenWRT Linux distribution for routers [2], do not use a script of this type and
therefore initialize the LRNG from scratch in each reboot. This might result in an initial LRNG state which is rather
predictable. It is also obvious that when the seed is saved in a file on the hard disk after system shutdown, anyone who
can physically access the disk can read that file and learn the seed, or alternatively replace the seed with a different
value (say, a previous value of the seed for which the adversary already recorded the output of the generator). Access
permissions are not too helpful here, since they are related to the operating system and not to the hard disk.
2.3 Collecting Entropy
In a PC environment, the LRNG collects entropy from events originating from the keyboard, mouse, disk and system
interrupts. When such an event occurs, two 32-bit words are used as input to the entropy pools. The first word encodes
the timing of the event in jiffies (namely, the number of milliseconds from the time the machine was booted) or in
cpu-cycle granularity (currently, cpu-cycle granularity is only used on SMP). The second word encodes the event type.
For example, in case of a keyboard event the word encodes the key that was pressed. Table 1 presents the number of
unknown bits per each type of event. Note that the actual entropy of these events is much lower, as most of them are
predictable to a large extent. Appendix A describes in detail how each value is calculated.
In other environments, the LRNG gathers entropy from the available resources. For example, an OpenWRT router
does not include a hard disk, mouse or keyboard, and therefore these cannot be used as entropy sources. On the
other hand, the router collects entropy from network events.
2.4 Estimating the Entropy Amount
One of the fundamental issues in using physical entropy is estimating the amount of entropy which is added to the
generator, and more generally, estimating the current amount of entropy in the pools.
As noted above, the difference between /dev/random and /dev/urandom is that the /dev/random interface
does not return more bits than its current entropy estimation and thus might block. The /dev/urandom
interface, and the kernel interface (get_random_bytes), return any number of pseudo-random bits, according to
the request. This difference implies that entropy estimation is important mainly for the /dev/random interface.
The LRNG estimates the amount of entropy of an event as a function of its timing only, and not of the event type.
The estimate is done in the following manner:
Definition 2.1 Let t_n denote the timing of event number n. Define

    δ_n   = t_n - t_{n-1}
    δ^2_n = δ_n - δ_{n-1}
    δ^3_n = δ^2_n - δ^2_{n-1}

Note that t_n, δ_n, δ^2_n and δ^3_n are each 32 bits long.

The amount of entropy added by the event is defined to be log_2( min(|δ_n|, |δ^2_n|, |δ^3_n|)[19-30] ), where S[a-b]
denotes bits a to b (inclusive) of S (where location 0 is the MSB). If min(|δ_n|, |δ^2_n|, |δ^3_n|)[19-30] = 0 then the
estimate is 0. (Even if the estimate is 0, the event is used to update the state of the LRNG. The entropy count, however,
is only updated if the entropy estimate of the event is positive.)
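A sketch of this estimator in Python. The bit numbering follows the definition above (bit 0 is the MSB of a 32-bit word); the flooring of the logarithm is our assumption, since the definition does not specify how the non-integer value is rounded.

```python
import math

def entropy_estimate(t):
    """Estimate the entropy credited for the latest event, given the list t
    of the last four event timings, per Definition 2.1."""
    d1 = [t[i] - t[i - 1] for i in (1, 2, 3)]    # first differences  delta_n
    d2 = [d1[1] - d1[0], d1[2] - d1[1]]          # second differences delta^2_n
    d3 = d2[1] - d2[0]                           # third difference   delta^3_n
    m = min(abs(d1[2]), abs(d2[1]), abs(d3)) & 0xFFFFFFFF
    v = (m >> 1) & 0xFFF                         # bits 19..30 (bit 0 = MSB)
    return int(math.log2(v)) if v else 0
```

Perfectly regular timings yield an estimate of 0, while irregular timings are credited a few bits.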
This estimation is relevant only in the case of adding entropy from OS sources to the pools. When a user writes
data to one of the device drivers (/dev/random or /dev/urandom) the entropy counter is not changed. When n
bits are extracted from a pool the entropy estimation is decremented by n. When bits are transferred from one pool to
another the first is decremented and the second incremented—both by the amount of transferred bits.
2.5 Updating the Pools
The mechanism for updating the pools is based on a TGFSR (Twisted Generalized Feedback Shift Register [15, 16]).
The main advantage of the TGFSR is its extended cycle length for any initial seed. The period of a TGFSR with a
state of 128 words (on a 32-bit PC) can be 2^(128×32) - 1 steps.
Definition 2.2 (TGFSR) A series a_i ∈ {0,1}^w and a w×w matrix A are a TGFSR based on a primitive polynomial
x^p + x^(p-t_1) + ... + x^(p-t_m) + 1 (with 1 ≤ t_1, ..., t_m < p) if and only if

    a_i = a_{i-p+t_1} ⊕ ... ⊕ a_{i-p+t_m} ⊕ a_{i-p}A,    i = p, p+1, ...

The only input to the TGFSR is the initial value of the state (which is a p×w bit seed), and in each iteration the
internal state is used to generate the new state.
The shift register used in the LRNG is based on the TGFSR (and its implementation which is described in [15])
but is different from it since it adds entropy in each iteration. We define the pools of the LRNG as arrays of length m
words (m = 32 or m = 128) which are indexed by an index j.
Adding entropy. Entropy is added in each round of state and output computation, as well as when entropy is added
from external sources, or from the primary pool to the secondary and urandom pools. Entropy is added by running the
algorithm add(pool,j,g) and updating the value of the index j, where g is the new entropy word which is added
to the pool. Figure 2.2 defines the update operation for a pool of size 32 words.
Each pool is updated based on a primitive polynomial. The polynomial is chosen according to the size of the pool,
and therefore the secondary and the urandom pools share the same polynomial, which is x^32 + x^26 + x^20 + x^14 + x^7 +
x + 1. The primary pool polynomial is x^128 + x^103 + x^76 + x^51 + x^25 + x + 1, and the entropy addition for that pool
is identical to that of the smaller pools, except for using this polynomial for updating the pool (namely, xoring g with
entries j, j+1, j+25, j+51, j+76 and j+103, modulo 128).
Entropy addition can be analyzed by assuming that the generator is reseeded in each iteration. Alternatively, one
might analyze it by assuming that the TGFSR is used to encrypt the entropy input. The LRNG reseeding process
changes the elementary properties of the TGFSR. The long period can no longer be guaranteed, and the process is no
longer a linear function of the initial state.
Algorithm add(pool, j, g):
    temp := g
    temp := temp xor pool[j]
    temp := temp xor pool[(j+1) mod 32]
    temp := temp xor pool[(j+7) mod 32]
    temp := temp xor pool[(j+14) mod 32]
    temp := temp xor pool[(j+20) mod 32]
    temp := temp xor pool[(j+26) mod 32]
    temp := (temp >> 3) xor table[temp & 7]
        // the last 3 bits of temp choose a table
        // entry which is xored to (temp >> 3)
    pool[j] := temp

    // table[] is defined as follows:
    // table[0] = 0x0
    // table[1] = 0x3b6e20c8
    // table[2] = 0x76dc4190
    // table[3] = 0x4db26158
    // table[4] = 0xedb88320
    // table[5] = 0xd6d6a3e8
    // table[6] = 0x9b64c2b0
    // table[7] = 0xa00ae278
Figure 2.2: Pseudo-code of the entropy addition algorithm for the urandom and secondary pools. pool is the pool to
add entropy to, g is the added entropy, table is a table with eight words, and j is the current position in pool.
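The pseudo-code of Figure 2.2 translates almost line for line into Python. This is a sketch for a 32-word pool, using the table values listed in the figure.

```python
TWIST = [0x00000000, 0x3b6e20c8, 0x76dc4190, 0x4db26158,
         0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278]

def add(pool, j, g):
    """Mix the 32-bit entropy word g into a 32-word pool at position j."""
    temp = g
    for off in (0, 1, 7, 14, 20, 26):   # taps of x^32+x^26+x^20+x^14+x^7+x+1
        temp ^= pool[(j + off) % 32]
    # the last 3 bits of temp choose a table entry, xored to (temp >> 3)
    pool[j] = ((temp >> 3) ^ TWIST[temp & 7]) & 0xFFFFFFFF
```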
2.6 Extracting Random Bits
Entropy is extracted from the secondary pool in case of /dev/random and from the urandom pool in case of
/dev/urandom or get_random_bytes. It is also extracted from the primary pool for the purpose of refreshing
the other pools.
Extracting entropy from a pool is not a simple operation. It involves hashing the extracted bits, modifying the
pool’s state and decrementing the entropy estimate by the number of extracted bits.
Figures 2.3 and 2.4 present a pseudo-code and a diagram of the extraction algorithm. Entropy extraction is done
in blocks of 10 bytes. The process is described for the case of the urandom or secondary pools, which are 32 words
long. For simplicity the description does not include the steps of decrementing the entropy estimation, and the entropy
refilling process. The algorithm applies the SHA-1 function to the first 16 words, and adds part of the result to location
j. It then applies a variant of SHA-1, which we denote as SHA-1’ (see below), to the right half of the pool, and adds
parts of the result to locations j-1 and j-2. Finally, it applies SHA-1’ to the 16 words ending at location j-2, and
uses the result to compute the output in the following way: The output of SHA-1’ is 5 words (20 bytes) long. These
words are folded as described in Figure 2.5, and the resulting 10 bytes are the output of the iteration. This output is
copied to the target (user or kernel) buffer, and the number of bytes to be copied is updated. The loop continues until
the requested number of bytes are output.
The SHA-1 variant being used. The first hash function used in the procedure is the original SHA-1 function (see
[6] for the exact definition of SHA-1). Each of the following hash operations in this iteration, which we denote as
SHA-1’, use for their five initial constant values (IV) the five output words of the previous hash result. (We note that
if the LRNG had used the original SHA-1 function, the attacks in Section 3.1 would have been considerably more
efficient, as is described there.)
Extracting randomness from the primary pool. Bits are extracted from the primary pool when it is required to
refresh one of the other pools. In this case, the algorithm is slightly different since the primary pool is longer. The
SHA-1 (or SHA-1’) operation is applied to each of the eight 16-word chunks of the primary pool, and once more to
the 16 words ending at location j-8. The first eight SHA-1 operations update eight words in the pool, and the last
operation generates the output. The algorithm is described in Appendix B.
Algorithm Extract(pool, nbytes, j):
    while nbytes > 0
        tmp := SHA-1(pool[0..15])
            // the result is 5 words long
        add(pool, j, tmp[0])
        tmp := SHA-1’(pool[16..31])
        add(pool, (j-1) mod 32, tmp[2])
        add(pool, (j-2) mod 32, tmp[4])
        tmp := SHA-1’(pool[(j-2-15) mod 32 .. (j-2) mod 32])
        tmp := folding(tmp[0..4])
            // the result is 2.5 words long
        output(tmp, min(nbytes, 10))
        nbytes := nbytes - min(nbytes, 10)
        j := (j-3) mod 32
    end while
Figure 2.3: Pseudo-code of the extraction algorithm. pool is a 32 word pool from which entropy is extracted, nbytes
is the number of requested bytes, and j is the current position in pool.
[Figure 2.4 diagram: the 32-word pool is drawn three times; SHA-1 is applied to words 0..15, SHA-1’ to words 16..31 and to the 16 words ending at position j-2, with the results fed back into positions j, j-1 and j-2 and the final hash folded into the output.]

Figure 2.4: Extraction algorithm
input:  W_0, W_1, W_2, W_3, W_4
output: W_0 ⊕ W_3,  W_1 ⊕ W_4,  W_2[0-15] ⊕ W_2[16-31]

Figure 2.5: Folding operation. Folding 5 words (160 bits) to 2.5 words (80 bits). W_i denotes the i-th word, and W_i[l-m]
denotes bits l to m (inclusive) of the i-th word.
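The folding step can be written out explicitly. In this sketch, words are 32-bit integers and the high half of W_2 corresponds to bits 0-15 (bit 0 being the MSB), per the convention above.

```python
def fold(w):
    """Fold five 32-bit words (160 bits) into 10 bytes (80 bits):
    two full-word xors and one 16-bit half-word xor, per Figure 2.5."""
    assert len(w) == 5
    out0 = w[0] ^ w[3]                      # W0 xor W3
    out1 = w[1] ^ w[4]                      # W1 xor W4
    out2 = (w[2] >> 16) ^ (w[2] & 0xFFFF)   # high half of W2 xor low half
    return (out0.to_bytes(4, "big") + out1.to_bytes(4, "big")
            + out2.to_bytes(2, "big"))
```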
3 Analysis
Our analysis of the security of the LRNG is composed of four parts: (1) a cryptanalytic attack on the forward security
of the LRNG, (2) an analysis of the entropy added by system events, (3) observations on the insecurity of the LRNG
in the OpenWRT Linux distribution for routers, and (4) observations on security engineering aspects of the LRNG,
including a denial-of-service attack.
3.1 Forward Security
We first observe that the output which is extracted from a pool is computed as the last step of the Extract algorithm,
namely, after the state of the pool is updated. This means that if the state of the pool at time t is known
then it is easy to compute the output which was extracted from the pool during its last state transition (i.e., the output
which was computed in the transition from time t-1 to time t). This is a flaw in the forward security of the LRNG,
since it enables anyone who observes the state of the LRNG at a certain time to compute the last output of the LRNG.
We show below how to mount a stronger attack on the forward security of the LRNG and compute, given the state at
time t, previous states, which consequently enables computing previous outputs of the LRNG.
We describe below how to reverse the state of a single pool, assuming that in its last update it was not refreshed
with new entropy. Let us recall that the states of the urandom and secondary pools are updated when random bits are
extracted from the LRNG, and that if the entropy estimates of these pools are low, the LRNG attempts to refresh them
with output from the primary pool. Only the secondary pool, however, blocks if no such refresh is available.
The state of the primary pool is updated with randomness from system events whenever these are batched in. The attack
is therefore mostly relevant to the urandom pool, which is often used for extracting many bits while not receiving any
entropy updates. The attack is also relevant to the secondary pool, if the attack starts from a time in which the value
of the entropy counter is high, and to the primary pool, if the entropy which is added to the pool is mostly predictable.
The input of the attack is the state of a pool at time t (denoted pool_t). It computes the state of the pool at time
t-1 (denoted pool_{t-1}). Now, given pool_{t-1} it is easy to compute the random value which was output in the extract
operation that transitioned the pool state from pool_{t-2} to pool_{t-1}. In other words, the forward security requirement
is not satisfied, since it is possible to compute a previous output of the LRNG. The same analysis can be continued
to compute the states and the corresponding outputs at times t-2, t-3 and so on, until the last time that the pool
received an entropy update.
Below we describe two methods for reversing the state of the LRNG. The first method is a generic attack with an
overhead of 2^96, which is much better than an exhaustive search (whose overhead is 2^1024 for the case of a 32-word
pool), but is rather impractical. The second attack is almost practical, with an overhead of 2^64, but it is only applicable
when the index j is in a specific range (covering 18 of the 32 possible values of j in a 32-word pool). This means that
a lucky attacker, who starts the attack when the value of j is at the end of this range, can compute five previous states
of the pool (and the corresponding outputs) with an overhead of about 2^64. Both attacks use O(1) memory.
The Extract algorithm is described in Figure 2.3 and is used for advancing the pool and extracting an output
from it. To simplify the analysis, let us first assume that the pool is either the secondary or the urandom pool, and
is therefore 32 words long, and that the add operation is a simple addition modulo 2^32 - 1, instead of the TGFSR
operation detailed in Section 2.5.
The analysis starts with knowledge of the state of the pool at time t (namely, pool_t), and of the value of the index j.
(To simplify the notation, we denote by j_t, or simply by j, the value of the index j at the beginning of the computation
of the Extract operation that transitions the pool from time t-1 to time t.)
A generic attack: As a warmup we describe a generic attack which is based on a simple observation: all but three
words of the pool (those indexed by j, j-1 and j-2) are identical in both pool_t and pool_{t-1}. Given pool_t there
are therefore only 2^96 possible candidate values, or “guesses”, for pool_{t-1}: those obtained by copying the values of
the words in locations outside [j-2, j] and going over all possible values for the 96 bits in words j, j-1 and
j-2. The transition from pool_{t-1} to pool_t is deterministic and defined by the Extract algorithm in Figure 2.3. The
attack therefore applies this algorithm to each candidate value of pool_{t-1}, and checks if the result is equal to pool_t. If
the two values are not equal then the candidate value is dismissed. Otherwise, the candidate value is put in a short list
of possible values for pool_{t-1} (we show immediately that this list is indeed short).
Note that although the Extract algorithm uses three invocations of SHA-1, which cause the bulk of the overhead,
all but a fraction of 2^{-32} of the candidates for pool_{t-1} can be dismissed after a single application of SHA-1. The
overhead of the search is therefore about 2^96 applications of SHA-1.
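The generic attack can be demonstrated at toy scale. The sketch below shrinks the pool to 8 words of 4 bits, replaces SHA-1 with a truncated hash, and uses a simplified transition that, like Extract, modifies only positions j, j-1 and j-2; brute-forcing the three unknown words (2^12 candidates instead of 2^96) then recovers the previous state. All names, sizes and the transition function here are illustrative, not the real LRNG.

```python
import hashlib, itertools

W, BITS = 8, 4                  # toy pool: 8 words of 4 bits (real: 32 of 32)
MASK = (1 << BITS) - 1

def h(words):
    """Toy stand-in for SHA-1 over the pool contents."""
    d = hashlib.sha1(bytes(words)).digest()
    return d[0] & MASK, d[1] & MASK, d[2] & MASK

def step(pool, j):
    """Simplified transition: hash the pre-state and mix the result into
    positions j, j-1, j-2 (the positions Extract modifies)."""
    new = list(pool)
    a, b, c = h(new)
    new[j % W] ^= a
    new[(j - 1) % W] ^= b
    new[(j - 2) % W] ^= c
    return new

def reverse_candidates(pool_t, j):
    """Generic attack: words outside {j, j-1, j-2} are unchanged, so
    enumerate the 3 unknown words and keep guesses mapping to pool_t."""
    found = []
    for guess in itertools.product(range(1 << BITS), repeat=3):
        cand = list(pool_t)
        cand[j % W], cand[(j - 1) % W], cand[(j - 2) % W] = guess
        if step(cand, j) == pool_t:
            found.append(cand)
    return found
```

The true pre-state is always in the returned list, occasionally joined by a few false positives, matching the distribution analysis below.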
Estimating the accuracy of the attack. We know that the true value of pool_{t-1} is in the computed short list, but
that there might be some additional “false positives”, i.e., values for locations [j-2, j] at time t-1 which are different
from the true value, but for which applying the Extract algorithm results in the right value for pool_t. Fortunately,
we do not expect many false positives. There are 2^96 - 1 false candidates for pool_{t-1}[j-2, j], and each of them has
a probability of 2^{-96} of becoming a false positive (assuming that we model SHA-1 as a random function, and therefore
the probability of computing the right value of pool_t[j-2, j] is 2^{-96}). The number of false positives is therefore k
with probability of about C(n,k) n^{-k} (1 - 1/n)^{n-k}, where n = 2^96 - 1. Namely, with probability e^{-1} there are no false
positives (k = 0), with probability e^{-1} there is a single false positive, with probability of about 0.5e^{-1} there are two
false positives, and so on.
Given the short list of possible values of pool_{t-1}, the previous procedure can be applied to each value in the list
to compute all possible values of pool_{t-2}. Applying this procedure to the correct value of pool_{t-1} always results in
one or more candidates for pool_{t-2}, which include the correct value of pool_{t-2} and possibly additional false positives
(according to the distribution that was detailed above). However, applying the procedure to a value which is a false
positive for pool_{t-1} results, with probability e^{-1}, in no candidate for pool_{t-2}. If this event happens then it can
be concluded that the tested value for pool_{t-1} is a false positive, and this value can be removed from the list. (If this
is not the case then with probability e^{-1} the tested value results in a single candidate for pool_{t-2}, with probability
0.5e^{-1} it results in two candidates for pool_{t-2}, etc.) The number of possible values for pool_{t-k} is therefore a random
variable. In Appendix C we show that the sequence {|pool_{t-k}| - k}_{k=1,2,...} is a martingale, that E(|pool_{t-k}|) = k,
and that the probability that |pool_{t-k}| > k + b is at most 1/b. We can therefore conclude that for all practical purposes
the procedure outputs a list of reasonable size of the possible values of the state at time t-k.
A more efficient attack. We now show that for 18 of the possible 32 values of the index j, it is possible to reverse
the pool by a procedure with an overhead of 2^64. This procedure is applicable if the value of j is in the range [16, 31],
and for j = 1, 2. We detail the procedure for the case of j ∈ [18, 31]. In this case the words which are affected by
the state transition are located in the upper half of the pool, while the first half of the pool does not change from time
t-1 to time t (namely, pool_t[0, 15] = pool_{t-1}[0, 15]). It is therefore possible, given pool_t, to apply SHA-1 to words
[0, 15] and compute the value that was added to location j. Given this value it is possible to compute pool_{t-1}[j] from
pool_t[j]. In addition, the initialization vector for the second application of SHA-1 is computable from pool_{t-1}[0, 15].
It is therefore possible to go over all 2^64 potential values of pool_{t-1}[j-2, j-1], apply SHA-1 to pool_{t-1}[16, 31] and
compute the resulting values that were added to locations j-2 and j-1. If these values are not equal to the difference
between the values of these locations at time t and at time t-1, it is safe to dismiss the “guess” of pool_{t-1}[j-2, j-1].
The true value of pool_{t-1} is never dismissed, while false positives appear with the same probability distribution as
in the previous analysis (the only difference is the value of n, which is 2^64 instead of 2^96). The number of possible
candidates for pool_{t-k} behaves according to the same distribution as in the previous procedure, as is analyzed in
Appendix C. The expected number of candidates for pool_{t-k} is therefore only k. The overhead of the attack is O(2^64)
operations and O(1) memory.
We note here that the pool could have been reversed with an overhead of 2^64 or fewer operations for any value of j, if
the Extract algorithm had used the original SHA-1 function rather than the variant which changes the initialization
vector after the first invocation of the function. The effect of the SHA-1 variant which is used in the LRNG is that one
cannot compute the values added to locations j-1 and j-2 before computing pool_{t-1}[0, 15]. For j ∈ [3, 15] this
means that we must check all options for pool_{t-1}[j-2, j].
Reversing the primary pool. Assume that we are given the state of the primary pool, and that we know which
entropy value was added to the pool when it was advanced to this state. The only difference between this case and the
case of the shorter pools is that the size of the pool is 128 words. The generic attack can still be applied with a 2^96
overhead. The more efficient attack can be applied if j is in the range [120, 127].
The add operation. The previous analysis assumed that the add operation in the Extract algorithm is an addition
modulo 2^32 - 1. However, this operation is implemented as described in Section 2.5, and therefore the value added to
location j is a function of the current values of several locations in the pool, as depicted in Figure 2.2. This change
obviously does not affect the generic attack, since pool_t is still a deterministic function of pool_{t-1}. The second, more
efficient, attack is also not affected: the value added to location j is a function of the result of applying SHA-1 to
pool_{t-1}[0, 15] = pool_t[0, 15] and computing a function of the SHA-1 result, of pool_{t-1}[j], and of five other locations
in pool_{t-1} which are not changed in the transition to pool_t. Given pool_t[j], we can examine its three most significant
bits and identify the entry of table which was xored to temp in the transition (this is easy since each of the eight
entries in table has a different value in its three most significant bits). The index of the entry identifies the three
least significant bits of temp (before it was shifted to the right). It is now possible to reverse the operation that was
performed in the second-to-last line of add, and xor the result with g and with the five pool entries which were used
to generate it. The result is pool_{t-1}[j].

    δ_n      Frequency
    77             243
    78            1730
    79           11468
    80          113402
    81           11786
    82             625
    83               0
    84               0
    > 84          1112

Table 2: Frequency of time differences between two consecutive hard-disk entropy addition events.
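The inversion of add described above can be made concrete. In this sketch, add restates Figure 2.2 for self-containment, and reverse_add recovers pool_{t-1}[j] given the post-state and the mixed-in word g; the function names are ours.

```python
TWIST = [0x00000000, 0x3b6e20c8, 0x76dc4190, 0x4db26158,
         0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278]

def add(pool, j, g):
    """Entropy addition of Figure 2.2 (32-word pool)."""
    temp = g
    for off in (0, 1, 7, 14, 20, 26):
        temp ^= pool[(j + off) % 32]
    pool[j] = ((temp >> 3) ^ TWIST[temp & 7]) & 0xFFFFFFFF

def reverse_add(pool_t, j, g):
    """Recover pool_{t-1}[j]: the top 3 bits of pool_t[j] identify the
    twist-table entry (all eight entries differ there), whose index also
    restores the 3 bits lost to the shift; undoing the xors then yields
    the previous word at position j."""
    after = pool_t[j]
    idx = next(i for i, e in enumerate(TWIST) if e >> 29 == after >> 29)
    temp = (((after ^ TWIST[idx]) << 3) | idx) & 0xFFFFFFFF
    prev = temp ^ g
    for off in (1, 7, 14, 20, 26):   # these entries are unchanged by add
        prev ^= pool_t[(j + off) % 32]
    return prev
```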
3.2 Entropy Measurements
When entropy is added to the LRNG pool, each event adds two 32-bit words. In Section 2.3 we described the actual
active range of bits within the type-value field and Appendix A provides the details for each case.
We now turn to the entropy of the timing of the encoded events. As mouse and keyboard events are usage dependent
and network interrupts are not commonly used as an entropy source for the LRNG, we ran a trial which measured
the entropy added to the LRNG by hard disk events. Our trial included over ten days of measurements on
a single Linux machine, with a total of over 140,000 entropy addition events. The system was mostly idle, and the
measurements were recorded on a different computer, without affecting the disk of the examined machine.
We denote by t_n the time of event n, and define the difference between two consecutive events as δ_n := t_n − t_{n−1}.
Table 2 presents the frequency of different δ_n values. Note that all but 0.8% of the values are in the range [77, 82].
All other values are larger by about four orders of magnitude. The resulting entropy is only H = 1.03 bits per event,
which is to say that the event timing, although encoded as a 32-bit field, had in this setting an entropy of at most a single bit.
The recordings furthermore show that there is a correlation between consecutive δ_n values (and therefore the entropy
is even lower): the entropy of pairs of events is 1.53 bits per pair, and consequently the conditional entropy of δ_n
given δ_{n−1} is only 0.5 bits.3
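The per-event figure can be reproduced directly from the frequencies in Table 2. The short sketch below computes the Shannon entropy of the empirical δ_n distribution, treating the ">84" bucket as a single symbol (which slightly underestimates the true value):

```python
import math

# Empirical frequencies of delta_n from Table 2; the ">84" bucket is
# collapsed into one symbol.
freq = {77: 243, 78: 1730, 79: 11468, 80: 113402,
        81: 11786, 82: 625, 83: 0, 84: 0, '>84': 1112}

total = sum(freq.values())
# Shannon entropy H = -sum p*log2(p) over the non-empty buckets.
H = -sum((f / total) * math.log2(f / total) for f in freq.values() if f > 0)
print(round(H, 2))  # -> 1.03 bits per event
```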
We also checked the entropy estimate that would have been calculated by the LRNG for these measurements,
using the procedure outlined in Section 2.4. This estimate turns out to be very conservative. Only 2,224 of the 140,000
measurements resulted in a positive addition to the entropy estimate. These events occurred, more or less, only when
δ_n had a very large value, and consequently measurements n and n + 1 had a large entropy estimate (namely, the
roughly 1,100 events of Table 2 for which δ_n > 84 each resulted in a pair of consecutive measurements which contributed to
the entropy count). The total estimate of the added entropy was 17,500 bits. Given that the measurements were taken
over a period of ten days, there was an average delay of about 15 minutes between pairs of events which had a positive
3 Closer examination reveals patterns such as the following one: for about 84% of the measurements it holds that δ_{i+1} = δ_i. Given this event,
the conditional probability that δ_{i+2} = δ_i is about 90%, and the conditional probability that δ_{i+2} ≠ δ_i but δ_{i+3} = δ_i is 9.9%. Consequently, in
only 0.1% of the cases in which δ_{i+1} = δ_i are both δ_{i+2} and δ_{i+3} different from δ_i.
contribution to the entropy count. The average contribution of these pairs was about 16 entropy bits. This is quite a
severe bottleneck for the blocking interface to the LRNG.
3.3 Analysis of the OpenWRT Linux Distribution
OpenWRT [2] is a Linux distribution for wireless routers. It provides many cryptographic services such as SSL
termination, an SSH server, and wireless encryption. The security of all these services depends on the output
of the LRNG.
An OpenWRT router has very limited entropy sources. There is no keyboard, mouse or hard drive attached to the
router. The WRT uses its flash memory (of size 4–16 MBytes) as a special file system, but this file system does not
provide entropy to the LRNG.
In addition, the WRT implementation does not save any LRNG state between reboots. Hence, the only entropy
source for the LRNG is network interrupts, which might be observable by an adversary (!). This is true in particular for
wireless routers where most network interrupts are caused by wireless activity, which is easily visible by an external
adversary.
Given this data we can conclude that the WRT implementation of the LRNG is weak. The state of the LRNG is
reset in every reboot to a predictable value (composed of the time of day and a constant string), and the only source of
entropy is, to a large extent, observable by external adversaries. The result is that an adversary can simulate the state
and the output of the LRNG. We note that security can be improved by saving LRNG output at shutdown and loading
it into the state at reboot. This would require potential attackers either to eavesdrop on all network traffic from the time
the router was initialized, or to obtain access to the LRNG state.
3.4 Security Engineering
Denial of service. There is no limitation on the number of bits a user can read from the random devices per time
unit. However, the secure interface /dev/random blocks its output when the entropy estimate is low, until additional
“noise” is added to the pools. These facts together suggest two denial of service attacks which block all users from
reading /dev/random bits.
The first attack is simply to read bits from /dev/random. As there is no limit and no prioritization, this blocks
other users, and might delay them for a long period of time. We tested this attack using a simple
dd if=/dev/random and were able to block other readers of /dev/random.
Furthermore, the attack can even be mounted remotely, by triggering system requests for random bytes (get_random_bytes)
at a significantly higher rate than that of the entropy input events. Since the urandom non-blocking pool (from which
these bits are taken) is refilled from the (blocking) primary pool, this attack results in denial of service for the
primary and secondary pools. A simple way for an adversary to mount this attack is to open many TCP connections:
for each connection a TCP SYN cookie is generated, which requires 128 bytes from the non-blocking pool, hence
reducing the entropy count.
Solution. As entropy is a limited and valuable resource, its consumption must be controlled. The common solution
in operating systems for such a resource is to define a quota, per user or group, for the consumption
of random bits.
Guessable passwords. Usually, the first user-operation in a computer system is user login, and the first input entered
by the user is the password, or a user-name and password pair.
In a scenario of a disk-less system, without a random seed that is saved between startups, we can imagine a situation
where an attacker knows the initial state of the LRNG, and where the sequence of updates of the LRNG during system
reboot is quite predictable. The state of the LRNG might therefore be, to a large extent, a deterministic function of the
initial password entered by the user. If the attacker is able to read random bits fast enough, it might be able to identify
the password by going over all possible password values and checking which one results in the observed LRNG
output.
This attack, which was noted by Kelsey et al. [14], is particularly relevant if the LRNG does not obtain input from
a hard disk or from interrupts resulting from network events (as is the case, for example, with one of the most popular
network cards, the 3Com PCI 3c905B Cyclone 100baseTx). In this case the input to the LRNG comes mostly from
the user’s input. We attempted to use this observation to extract the user password from the output of the LRNG of
the KNOPPIX Linux distribution [1], which is bootable from a CD and therefore does not save the LRNG state. We
were unsuccessful in this attack, largely because the examined system used a hard disk which provided a considerable
amount of entropy during the boot process.
Solution. It is better to remove the influence of the values of keyboard events on the LRNG. Keyboard entropy
should be based on the timing of its events, and not on the type-values. (The timing of keyboard events might also
reveal information about data entered by the user, but this seems like a second-order risk compared to the risk from
the values of keyboard events.)
An adversary can create noise that directly affects the LRNG output. Normally, there is a separation between
the input and output of the LRNG, but when the primary pool is full, the batched entropy is added directly to the
secondary pool, from which it is output when /dev/random is used. This direct flow between the input and the
output of the LRNG might provide an adversary with the ability to create noise that directly affects the generator’s
output.
Solution. It is best to always flush the batched entropy to the primary pool, even if it is full. This enforces a strict
separation between the input and the output sides of the LRNG.
The LRNG state reveals the previous LRNG output. The Extract algorithm (Fig. 2.3) first updates the pool
and then computes its output. The result of this design decision is that an adversary which learns the internal state of
the LRNG learns the state of the pool which was used to compute the last LRNG output, and can easily compute this
output (and hence break the forward-security of the LRNG).
Solution. It is best to switch the order of operations: the state update would then take place after the LRNG output is computed.
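The effect of the ordering can be seen in a toy model (hypothetical code of ours, not the kernel's): if the pool is updated before the output is derived, anyone who captures the state can recompute the last output; if the output is derived from the old state and a one-way update is applied afterwards, the captured state no longer determines it.

```python
import hashlib

def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def extract_lrng_order(state: bytes):
    """Toy model of the LRNG order: update first, then derive the output
    from the updated state. This breaks forward security."""
    new_state = sha1(b'update' + state)
    output = sha1(b'output' + new_state)
    return new_state, output

def extract_fixed_order(state: bytes):
    """Suggested fix: derive the output from the old state, then apply
    the one-way update."""
    output = sha1(b'output' + state)
    new_state = sha1(b'update' + state)
    return new_state, output

# An adversary who captures new_state recomputes the last output in the
# first variant, since the output depends only on new_state:
state = sha1(b'seed')
new_state, out = extract_lrng_order(state)
recomputed = sha1(b'output' + new_state)
print(recomputed == out)  # True: the last output is exposed
```

In the fixed variant the same recomputation fails, because recovering the old state from new_state would require inverting SHA-1.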
4 Conclusions and Recommendations
This paper analyzes the Linux random number generator. The LRNG algorithm is complex and includes a large state
made of three different storage pools, a complex mechanism for adding entropy from system events, and an extraction
algorithm based on a shift register and several SHA-1 operations.
We showed that these layers add complexity to the implementation but do not prevent attacks on the forward
security of the LRNG. In addition we described weaknesses in the OpenWRT Linux distribution.
Our study was conducted on the latest (at the time) Linux kernel, version 2.6.10, which was released on
December 24, 2004. Since then the kernel has kept developing: version 2.6.15 was released in January 2006, and
patches have been published since then4.
Limitations. As far as we could tell, the different Linux distributions for PCs (e.g., RedHat, Debian, Slackware) have
little if any effect on the LRNG structure, since all distributions use the same kernel source. Changes occur only in
the system startup and shutdown procedures, and as far as we found they are only cosmetic. As there are hundreds of different
distributions, this statement may not be true for all of them.
Our study does not cover all Linux kernel options. For example, we did not take into account multi-CPU hardware
configurations, or unique hardware configurations such as the Qtronix keyboard and mouse device5 whose entropy
collection method is different from the one described here.
Open source security. The LRNG is an open source project which enables an adversary to read the entire source
code and even trace changes inside the source configuration management system. This feature gives powerful tools to
the adversary. On the other hand, open source benefits security by enabling security audits, and enabling easy changes
to the code. It is rather easy to add patches to the current LRNG code in order to prevent the attacks we described in
this paper (this would have been much harder, if at all possible, for closed source PRNGs).
“Open” is not a synonym for “secure”. We feel that the open source community should have a better policy for
security-sensitive software components. These components should not be treated like other source elements. We suggest
4See http://www.linuxhq.com/kernel/file/drivers/char/random.c for the incremental changes in random.c.
5http://lxr.linux.no/source/drivers/char/qtronix.c?v=2.6.10
adding a better quality-assurance procedure for the cryptographic elements of the kernel. For example, the PRNG must
pass statistical tests, which can be put into the kernel build process. Open source must also have, in our opinion,
clear and up-to-date documentation of the algorithms used in the code. Such documentation could have saved us
the trouble of reverse engineering the code, and would have provided better access for other researchers to review the
security of the LRNG.
4.1 Recommendations
Following our analysis of the LRNG, we suggest the following recommendations for the design of pseudo-random
number generators.
• Fixing the LRNG. The issues reported in this paper should be fixed. In particular, the LRNG code
should be changed to prevent attacks on its forward security. The OpenWRT implementation should be changed
to provide more entropy to the LRNG, or at least save its state during shutdown.
• Implementing a quota for the consumption of random bits. Random bits are a limited resource, and attackers can
easily mount a denial-of-service attack (even remotely) by consuming random bits at a high rate. The common
solution for this type of problem is to implement a quota system which limits the effect of each user, or each
process, on the operation of other users of the same system. Such a quota system should be added to the Linux
kernel.
• Adopting the Barak-Halevi construction. The Barak-Halevi (BH) construction and its analysis [3] are attractive
in their simplicity, which clearly identifies the role of every component of the system, and enables a simple
implementation. In comparison, the current LRNG construction is overkill in some aspects (such as the size
of the pools or the number of SHA-1 invocations), but its complexity does not improve its security but rather
hides its weaknesses. We suggest that future constructions of pseudo-random number generators follow the BH
construction (and in general, try to “keep it simple”).
• Since randomness is often consumed in a multi-user environment, it makes sense to generalize the BH model
to such environments. Ideally, each user should have its own random-number generator, and these generators
should be refreshed with different data which is all derived from the entropy sources available to the system
(perhaps after going through an additional PRNG). This architecture should prevent denial-of-service attacks,
and prevent one user from learning about the randomness used by other users.6
References
[1] http://www.knoppix.org.
[2] http://www.openwrt.org.
[3] B. Barak and S. Halevi. A model and architecture for pseudo-random generation with applications to /dev/random.
In ACM Conference on Computer and Communications Security, pages 203–212, 2005.
[4] J. Castejon-Amenedo, R. McCue and B.H. Simov. Extracting randomness from external interrupts. In The
IASTED International Conference on Communication, Network, and Information Security, pages 141–146, USA,
December 2003.
[5] T. de Raadt, N. Hallqvist, A. Grabowski, A. D. Keromytis, and N. Provos. Cryptography in OpenBSD: an
overview. Pages 93–101. USENIX, 1999. http://citeseer.ist.psu.edu/article/raadt99cryptography.html.
6In the current centralized system this property is not guaranteed. A user which learns the state of the generator can simulate future outputs of
the generator (assuming that no external entropy is added). If the user later reads random bits it can identify their location in the simulated generator
output and learn the values used by other users.
[6] D. Eastlake 3rd, and P. Jones. US secure hash algorithm 1 (SHA1). Request for Comments 3174, Internet
Engineering Task Force, September 2001.
[7] N. Ferguson and B. Schneier. Practical Cryptography. John Wiley & Sons, 2003.
[8] I. Goldberg and D. Wagner. Randomness and the Netscape browser. Dr Dobb’s, pages 66–70, January 1996.
[9] P. Gutmann. Software generation of practically strong random numbers. In Proc. of the 7th USENIX Security
Symposium, 1998. An updated version appears at http://www.cypherpunks.to/~peter/06_random.pdf.
[10] P. Gutmann. Testing issues with OS-based entropy sources. http://www.cs.auckland.ac.nz/
~pgut001/pubs/nist_rng.pdf, July 2004.
[11] Z. Gutterman and D. Malkhi. Hold your sessions: An attack on Java session-id generation. In A. J. Menezes,
editor, CT-RSA, LNCS vol. 3376, pages 44–57. Springer, February 2005.
[12] J. Kelsey. Entropy and entropy sources in x9.82. In Proceeding of the NIST Random Number Generation
Workshop, July 2004.
[13] J. Kelsey, B. Schneier, and N. Ferguson. Yarrow-160: Notes on the design and analysis of the yarrow cryptographic
pseudorandom number generator. In Selected Areas in Cryptography, LNCS vol. 1758, pages 13–33.
Springer, 1999.
[14] J. Kelsey, B. Schneier, D. Wagner, and C. Hall. Cryptanalytic attacks on pseudorandom number generators. In
Fast Software Encryption, LNCS vol. 1372, pages 168–188. Springer, 1998.
[15] M. Matsumoto and Y. Kurita. Twisted GFSR generators. ACM Transactions on Modeling and Computer Simulation,
2(3):179–194, 1992.
[16] M. Matsumoto and Y. Kurita. Twisted GFSR generators II. ACM Transactions on Modeling and Computer
Simulation, 4(3):254–266, 1994.
[17] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
[18] M. R. V. Murray. An implementation of the Yarrow PRNG for FreeBSD. In S. J. Leffler, editor, BSDCon, pages
47–53. USENIX, 2002.
[19] National Institute of Standards and Technology (NIST). Advanced Encryption Standard. Available on: http:
//csrc.nist.gov/CryptoToolkit/aes/.
[20] P. Q. Nguyen. Can we trust cryptographic software? cryptographic flaws in GNU Privacy Guard v1.2.3. In
EUROCRYPT 2004, LNCS vol. 3027, pages 555–570. Springer, 2004.
[21] C. E. Shannon. Communication theory of secrecy systems. Bell System Technical Journal, 28(4), October 1949.
[22] T. Ts’o. random.c—Linux kernel random number generator. http://www.kernel.org.
A Entropy Collection
We explain each “noise” source and the different valid values for the 32 bits of the type-value during the entropy
addition procedure:
• keyboard event. The type-value contains the keyboard press and release codes, with a valid range of [0, 255].
• mouse event.
type-value := (type << 4) ^ code ^ (code >> 4) ^ value
Here type describes the event type: press or release in the case of mouse buttons, and start or end of movement
in the case of mouse movement; code is the mouse button pressed (left, right or middle) or wheel
scrolling in the case of mouse buttons, and the axis of the movement (horizontal or vertical) in the case of mouse
movement; value is true/false in the case of a mouse button press or release7, 1 or −1 denoting the scrolling direction
(1 for up, −1 for down) in the case of wheel scrolling, and the size of the movement in the case of mouse movement. In
short, the mouse data is a 32-bit word with the movement size as its main entropy factor. However, only 10 bits
are used for movement and another 2 bits for the buttons, so in fact only 12 of the 32 bits are effective.
• disk event. Computed at completion of a disk (IDE, SCSI, floppy, or other block device) I/O operation. Its
type-value is composed of the major and minor numbers (operating-system identifiers that together uniquely
define a device):
type-value := 0x100 + ((major << 20) | minor)
If there is only one IDE disk, the type-value is fixed, since the major and minor numbers are constants. (In
most cases the major is 3 (first IDE disk) and the minor is 0 (master), and their combined type-value yields
0x300100.) Assuming an average machine has no more than 8 disks, the actual span of the type-value is limited to 3
bits.
• interrupt event. The result of an interrupt occurrence is the IRQ (interrupt request channel) number, with a
valid range of [0, 15]. It is important to note that, as of the current kernel versions, only a very limited number of
hardware device drivers supply interrupt values to the LRNG. In many setups interrupts will not add any entropy
events.
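The encodings above are easy to compute directly. The sketch below reproduces the disk-event constant for the first IDE disk (major 3, minor 0) and the mouse formula; the input values are illustrative only.

```python
def disk_type_value(major: int, minor: int) -> int:
    # type-value := 0x100 + ((major << 20) | minor)
    return 0x100 + ((major << 20) | minor)

def mouse_type_value(type_: int, code: int, value: int) -> int:
    # type-value := (type << 4) ^ code ^ (code >> 4) ^ value
    return (type_ << 4) ^ code ^ (code >> 4) ^ value

# First IDE disk (major 3, minor 0) always yields the same constant:
print(hex(disk_type_value(3, 0)))  # -> 0x300100
```

This makes concrete why a single-disk machine contributes a fixed disk type-value: both inputs to the formula are constants.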
B Extracting Randomness from the Primary Pool
The primary pool is updated using a procedure which is described in the following pseudo-code:
Algorithm Extract(pool, nbytes, j):
    while nbytes > 0
        tmp := SHA-1(pool[0..15])          // the result is 5 words long
        add(pool, j, tmp[0])
        tmp := SHA-1'(pool[16..31])
        add(pool, (j-1) mod 128, tmp[2])
        tmp := SHA-1'(pool[32..47])
        add(pool, (j-2) mod 128, tmp[4])
        tmp := SHA-1'(pool[48..63])
        add(pool, (j-3) mod 128, tmp[1])
        tmp := SHA-1'(pool[64..79])
        add(pool, (j-4) mod 128, tmp[3])
        tmp := SHA-1'(pool[80..95])
        add(pool, (j-5) mod 128, tmp[0])
        tmp := SHA-1'(pool[96..111])
        add(pool, (j-6) mod 128, tmp[2])
        tmp := SHA-1'(pool[112..127])
        add(pool, (j-7) mod 128, tmp[4])
        add(pool, (j-8) mod 128, tmp[1])
        tmp := SHA-1'(pool[(j-8) mod 128 .. (j-8-15) mod 128])
        tmp := folding(tmp[0..4])          // the result is 2.5 words long
        output(tmp, min(nbytes, 10))
        nbytes := nbytes - min(nbytes, 10)
        j := (j-9) mod 128
    end while

7 Each action produces an input for all three buttons to the mouse type-value formula. The button that was active gets value = 1 while
the others get value = 0.
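For concreteness, the folding step can be sketched as follows. This is our reading of how a 5-word SHA-1 result is folded into 2.5 words (the outer word pairs are xored, and the two 16-bit halves of the middle word are xored together); it should be checked against random.c rather than taken as authoritative.

```python
def folding(w):
    """Fold five 32-bit words into 10 bytes (2.5 words): xor the outer
    word pairs, and xor the two 16-bit halves of the middle word.
    (Our sketch of the step named 'folding' in the pseudo-code.)"""
    assert len(w) == 5
    out0 = w[0] ^ w[3]
    out1 = w[1] ^ w[4]
    half = (w[2] >> 16) ^ (w[2] & 0xffff)
    # 10 output bytes: two full words plus one 16-bit half-word.
    return (out0.to_bytes(4, 'little') +
            out1.to_bytes(4, 'little') +
            half.to_bytes(2, 'little'))

print(len(folding([0, 1, 2, 3, 4])))  # -> 10 bytes, i.e., 2.5 words
```

Whatever the exact pairing, the point of the fold is that each output bit depends on two independent halves of the hash result, so the full 160-bit SHA-1 output is never exposed directly.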
C Probability Calculations
The number of false positives generated by the procedure for reversing the pool is a random variable with the following
distribution: the probability of having k false positives, for k = 0, ..., n, is C(n, k) · n^{-k} · (1 − 1/n)^{n−k} ≈ e^{−1}/k!, where
n is either 2^64 or 2^96.
Starting from a single state of the pool at time t, let us denote by d_i the number of false positives at time t − i.
Note that for every i there exists, in addition to the false positives, an additional candidate which is the true value of
the pool at time t − i.
At time t we have one good candidate and no false positives (d_0 = 0). Now,

    E(d_1) = sum_{k=1}^{n} k · e^{−1}/k! = e^{−1} · sum_{k=0}^{n−1} 1/k! = e^{−1} · e = 1.

It also holds that E(d_i | d_{i−1} = 1) = E(d_1) = 1. As a result, E(d_i | d_{i−1} = c) = (c + 1) · E(d_i | d_{i−1} = 1) = c + 1.
(We multiply E(d_i | d_{i−1} = 1) by c + 1 since we obtain false positives at time t − i from the c false positives at time
t − i + 1 and in addition from the true value of pool_{t−i+1}.)
A martingale is a sequence of random variables X_0, X_1, X_2, ... which satisfies the relation

    E(X_i | X_{i−1}, ..., X_0) = X_{i−1}.

It is known that for a martingale E(X_i) = E(X_0), and there are known tail bounds on the divergence from this
expectation (see [17] for details).
Let us define the sequence z_i = d_i − i (the deviation of d_i from the value i). The sequence {d_i} is not a martingale,
but the sequence {z_i} is, since E(z_i | z_{i−1} = c) = E(d_i | d_{i−1} = c + i − 1) − i = c + i − 1 + 1 − i = c = z_{i−1}. We
therefore get that E(z_i) = E(z_0) = 0 and E(d_i) = i.
It is now possible to apply the Kolmogorov-Doob inequality (see [17]), which states that Pr(max(z_i) > b) <
E(z_0)/b = 1/b (for the purpose of using this inequality we define z_i = d_i − i + 1 and therefore z_0 = 1). As a
corollary we obtain, for example, that the probability that d_i is greater than i + 100 is at most 1/100.
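The expectation E(d_1) = 1 can be checked numerically: the false-positive count follows a Poisson(1) law, and summing k · e^{−1}/k! over k recovers its mean of 1.

```python
import math

# E(d_1) = sum_{k>=1} k * e^{-1} / k!  -- the mean of a Poisson(1) variable.
# Truncating the sum at k = 60 leaves a negligible tail.
E_d1 = sum(k * math.exp(-1) / math.factorial(k) for k in range(1, 60))
print(round(E_d1, 10))  # -> 1.0
```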

Linux and Samba
Andrew Tridgell
Samba Team
Semantic Mapping
● Providing CIFS file services on Linux is an
exercise in “semantic mapping”. The detail of
mapping that is needed depends on the role the
server needs to play
● most detailed as a NAS box
● dual-mapping for multi-protocol server
● A good example of the semantic mapping
problem is the CIFS equivalent of open(), called
NTCreateX().
● takes 11 parameters and returns 14
CIFS meta-data
● File meta-data in CIFS is more complex than in
POSIX
● 4 settable times (POSIX has “2 and a half” time fields)
● DOS attributes, ACLs and SIDs
● separate allocation size
● 8.3 names
● file IDs
● alternate data streams
● Unfortunately applications do end up relying on
all these bits of meta-data
● the perils of a software monoculture
Some bits already done
● Some bits of CIFS semantics have already been
added to Linux for the 2.4.0 kernel and above
● oplocks
● simple share modes
● directory notify
● These have helped a lot for Samba, but some
have caused maintenance headaches for the
kernel
● How to integrate future CIFS features with fewer headaches?
Case-Insensitivity
● CIFS needs to be able to export a case insensitive
view of a filesystem. The problem is doing this
efficiently.
● very contentious issue
● problems with charsets
● NT is not quite UTF-16
● kernel maintainers have proposed a possible solution
● smbd to kernel dcache coherence mechanism
● log(N) lookup important?
Locking
● File byte range locking is rarely used in POSIX
● works badly, so programmers avoid it
● few users of it, so not priority to fix it
● CIFS needs more sophisticated byte range
locking
● true 64 bit (not 63 bit or 31 bit)
● no brain-dead “close loses locks on other fds” features
● mandatory locking (needs hook in read/write path)
● lock stacking
● Just solve in user space?
● works, but not good for multi-protocol file servers
File access control
● CIFS users expect full NT ACLs
● impossible to correctly map to POSIX ACLs
● needs SIDs for task security context
● Solve via LSM module?
● Samba LSM module
● NT ACLs and other attributes in an EA
● has sufficient hooks for share modes and locking as well?
Sendfile
● Sendfile seems like an obvious fit for Samba, but
there are potential problems
● header sent first, what to do when sendfile returns short?
● maybe doesn't matter as NT gets it wrong too
● what happens with WinXP SP2 and mandatory packet
signing?
Async IO
● Samba4 is designed around asynchronous
operation, whereas Samba3 is very synchronous
in nature
● How do we do async filesystem requests, like open(),
rename() etc?
● do we have to use pthreads? What about pthread
performance overheads?
● see thread_perf.c benchmark - doesn't look good
● can we use direct clone() wrappers, bypassing glibc?
EAs and ACLs
● Samba4 will make extensive use of EAs and
ACLs, for a closer semantic match to CIFS
● use EAs for alternate data streams?
● EAs limited to 64k. What to do about large streams?
● do we really have to read all or nothing? nasty.
● what about performance?
● nobody benchmarks filesystems with ACLs and EAs!
● some filesystems don't journal inode operations involving EAs
● Can we hook all this into LSM sanely?
● Looks like we can
Alternate Data Streams
● NTFS and CIFS have “file streams”
● arbitrarily named additional streams of data in files
● mostly used for meta-data now, like who wrote it
● WinXP SP2 uses them for “security zone” information
● this makes streams much more urgent
● How should we store them?
● In EAs?
● in dot-files or a dot-directory?
● what about large streams?
Content Indexing
● A core part of Longhorn and WinFS
● Can perhaps be summarised as “open by content”
● real-time indexing essential
● can be very quickly deployed by Microsoft
● Win2k implementation uses periodic indexing
● We need this in Linux!
● Users could quickly become addicted to it
● must be supported on network drives
● uses a strange pipe format

Welcome to SOAD – Suse On Active Diet – Linux! This 'welcome' is a brief introduction to the system and a simple
addition to the official SuSE documentation.
SOAD Linux is not 'yet another distro'. It is OpenSUSE. Almost all installed components are available via public
repositories or the OBS (OpenSUSE Build Service). The best place to look for fresh software is Webpin. You are
welcome to explore the OpenSUSE project wiki, walk through the official forum, or chat on irc.freenode.net (#suse
channel). OpenSUSE provides superior help resources. Use them the way you like.
The main purpose of this disk was to bring the cutting-edge svn code of Enlightenment (E) to your desktop. After the
first prototype was released it became clear that our modest efforts were not useless. We tried to pick the best software
and make some tiny adjustments to save your time and deliver the best experience 'out of the box'. The following E
resources could be handy: the "Enlightenment for openSUSE" wiki page, the official site, the E-wiki, the User Guide, Exchange,
the E-forum, various stuff, our download page, themes for E16, the E-thread hidden inside the OpenSUSE forum, and this
LiveCD's thread on the E-forum.
To begin with, we set up a public repository in the OBS – the Enlightenment-cvs-core-metapackage – and it is recommended for
everyone interested in Enlightenment (E). It uses the latest svn code and is updated monthly (or on request if the amount of
changes is sufficient). We also rely on the famous Packman repo, and build a couple of packages manually only for
this disk (we also contribute to Packman the packages which cannot be created in the OBS). The core benefits of
our OpenSUSE Enlightenment repository are proper localization patches (you can read your local UTF-8 RSS feeds
with the Enlightenment 'news' module, for example) and the freedom of each user to rebuild E and update installed packages to
the current svn version. An example rebuild script can be found here.
All required repos in your SOAD system are set up by default. You only need to refresh them (to get the current metadata)
and select the desired components if they are missing from the disk, or apply the required updates.
We would be glad if you like the results of our modest work. Any questions, comments and suggestions are encouraged.
SOAD team.
Mailto: sda00 [at] himki [dot] net
Mailto: dmitry.serpokryl [at] gmail [dot] com
P.S. If you like the results of our efforts or wish to see some improvements – just send us an e-mail.
August 9, 2009. revision 7.
THE CORE
The main concept is to provide a relatively small, fast, stable, secure and "transparent" distribution – for those
who are not too lazy to 'vim' /etc/sysconfig/SuSEfirewall2, for example, and who remember the good old days of SuSE versions
8.2-Pro and below. The Enlightenment DR17 Desktop Shell (E17), along with the EFL (Enlightenment Foundation
Libraries) applications, is aimed to be the major environment and is used as the default WM. IceWM is provided for those
who can't live without a panel at the bottom of the screen and the 'magic Start button' in the lower left
corner. TWM is there for deep meditation in the dark. Selected programs, proven to be the best in their area, are included in
the base distribution set. The legendary Enlightenment DR16, WindowMaker and JWM are here to recall the glory of the
good old days. Ecomorph is TEMPORARILY unavailable until we fix the driver-related issues. We're sorry!
No KDE/Gnome/Xfce/Others. No Mono. No Java. No useless bloat to convince you that you need X GB of RAM,
a 3D powerhouse and the latest X-Core CPU. No frills.
IGNITION!
It's time to press F3 and select the desired resolution (if you wish to boot into the console, type 'linux 3' here; this is very
useful if you don't see the 'sax2' setup screen on boot or have any other issues with the video drivers). We have only
one CD, so don't expect applications to have the desired localizations out of the box. E17 itself is localized;
other applications have only English menus. It's easy to install the required language packages from the pre-selected
repositories using YaST or 'zypper'.
The first thing you should notice is a command prompt on boot which asks you to choose the
configuration method to be used for system start-up.
The system awaits your selection.
1 – the default automatic X server configuration; no other questions will be asked until you see the 'Entrance'
login screen.
3 – the emergency mode. X (the GUI) will not be started and will not be pre-configured (no xorg.conf); you log in to the
console and are welcome to do whatever you want. This mode can be very useful if you experience issues with
the 'Automatic Configuration' and are unable to switch to a tty (Ctrl+Alt+F1-4) to see what's going on. Those who prefer to
explore new systems in 'qemu' should appreciate it. After pressing Enter you get the
good old login prompt on tty1. All default services are started except X (the GUI).
2 – the advanced mode, which lets you tweak your peripherals before login. After a delay you will see the 'Sax2' configuration
window, which offers to confirm the proper detection of your devices such as the monitor, keyboard, etc. It's a nice idea to
press the "Change Configuration" button and select your keyboard layout or redefine the monitor auto-configuration. Test
all changes until you get exactly what you want.
N.B. Starting from version 3.1.1 of the Enlightenment LiveCD we provide pre-installed NVIDIA proprietary drivers for
modern GPUs (video cards). ATI users may experience some issues with the open-source drivers and are HIGHLY
advised to visit the openSUSE wiki page:
ATI Driver HOWTO
and install the required drivers manually!
NVIDIA users with legacy video cards should also use the link below to get the proper drivers:
nVidia install HOWTO
Please note that, despite our tests (which are a must before yet another release of SOAD leaves the development
farm), we cannot ensure proper operation of the proprietary drivers! Please e-mail us if you wish them to be excluded from
future releases!
Don't forget to browse through all supported modes for your video card and monitor. Here you can select the desired
resolution. We tried to set a constant font resolution of 96.0 DPI regardless of the actual screen resolution. At least it
works with 'Entrance'.
After your configuration 'Sax2' will save the settings, and the next thing you should choose is the login manager. The
default is 'Entrance'. A simple and elegant 'emergency text-mode' is also available. The “GDM” login manager is
planned as an option for future releases.
It's recommended to type '1' (without quotes) and press Enter. The most interesting option is '3' (emergency text mode),
of course. Log in as root (root/soad) and you're welcome to do whatever you like.
The init scripts for X configuration and start-up are adjusted by the SOAD team. If something goes wrong – please ask
on the forums:
1) Enlightenment forum thread
2) openSUSE forum thread
In rare cases it's best to select the text-mode, log in as 'root' and type 'entranced' at the command prompt.
If you decide to install the system to your hard drive, all the questions above will appear only once – or again if you
change/upgrade your video card. If you need the emergency text-mode login for the installed system, see the
recommendation above (type 'linux 3' or 'init 3' at the 'boot:' prompt).
Here is the default 'Entrance' login screen.
Default login details:
User: linux
Pass: soad
User: root
Pass: soad
Choose your favorite environment.
N.B.: please note that we may ship a custom kernel with our unofficial Enlightenment LiveCD. We have limited resources to test
our disk. We're doing our best but are unable to guarantee that everything will work for Your PC configuration. Thanks!
'E16' (Enlightenment-DR16) is one of the most advanced WMs (Window Managers) in the *nix world. Released in
1999 and still under development – stable, powerful and beautiful. We made some customizations to its look and
behaviour. Hope you like it. Unfortunately you need to issue the following command to allow modification of the
“E-Toolbox” epplet:
chmod +w ~/.e16/epplet_config/E-Toolbox/E-Toolbox.cfg
The Tint2 panel is located at the bottom of this screenshot. It has a taskbar, systray and clock. You can modify its
configuration by editing the
$HOME/.config/tint2/tint2rc
file. We also included several nice “legendary” E16 themes like “Lave” (set as the default), “23Oz”, “DarkOne”,
“Sedation”, “Presence” and the default “Winter” theme. Please copy/link the extra wallpapers to your home dir to make
them visible to all E16 themes:
find /usr/share/wallpapers/ -type f -exec ln -sf {} $HOME/.e16/backgrounds/ \;
Another option is IceWM. It has fewer features than E16 and eats more resources. Its single advantage is a kind of
“default traditional look”. Nothing special (compared to E16, of course).
That's exactly what you get by selecting IceWM. All customizations are made by openSUSE; we only set the
background and select the theme to be used.
Our main playground is E17. Known as a “pre-alpha stage forever” project, it grows without too much noise, with a
solid community and a talented dev team. You will like it or hate it at once.
E17 has basic systray support (shown in the right corner of the shelf placed atop the screen). We also included the
'stalonetray' application on the disk just in case.
'Welcome.pdf' is the document you're reading now (the recent version is always available here).
The theme shown above is the new default one for Enlightenment-DR17. E17 has a modular structure/design. Each
component/toolkit has its own environment and style. If you wish to have a 'unified' look for applications, you need to
set similar themes for each toolkit they use. Use 'etk_prefs' to change the ETK theme (ETK – Enlightenment Toolkit),
'ewl_config' for EWL (Enlightenment Widget Library), and EC (Enlightenment Configuration Panel) for E17. Just
press Alt+Esc to get access to the outstanding 'command line' window. Everything you type there will be matched
against the records in '*.desktop' files (upper part of the window) and exact filenames in your $PATH environment
(lower part of the window). To select EC, left-click on the desktop or click the start module icon and select the
“Configuration Panel” option in the “Configuration” section. EC is a GUI analogue of a mighty command –
'enlightenment_remote'. If you wish to explore all customization options – use it. It has a simple structure and nice
context help.
Hint: some apps require a separate 'terminal' for a successful launch. Just hit Ctl+Enter after you type the required command in the 'Exec buffer'.
The first launch of E17 creates your basic environment, which can be easily adjusted/changed later. Both E17 and
Ecomorph use the same user settings, so it's best to have separate “Profiles” (Menu → Settings → Settings
Panel → Settings → Profiles) for each of them. The first screen will ask you to define your locale (choose your
preferred language).
The second screen will ask you to choose the base “Profile” from the available presets. The first three rows, named
“SOAD Default”, “SOAD Orange” and “SOAD Sedation”, are our modest attempt to customize the defaults.
Later you can evaluate all “presets” without even logging out by selecting them via the above-mentioned
“Enlightenment Settings Panel” (Menu → Settings → Settings Panel → Settings → Profiles).
Then you should choose the base for the xdg menu (select the first “System Default” option) and pick the applications
for quick launch via “IBar” (the app launcher module). We suppose it'd be much easier to define “IBar” launchers later
by navigating to
Menu → Settings → Settings Panel → Apps → IBar Applications
N.B.: Please note that the SOAD* profiles operate with an auto-hidden shelf/(panel) placed at the top of your screen!

Another shot of the new default look is right below:
We're glad to note that you have 'SCIM' (Smart Common Input Method) installed and the disk is ready to support input
for almost all locales available in 'glibc'. Japanese, Chinese, Russian, Korean – configure your keyboard shortcuts via
'scim-setup', start/restart 'scim' and enjoy!
Please enable 'SCIM' as your preferred input method in 'Settings → Language → Input Method Settings → System'.
CONFIGURATION
The more time you spend discovering the configuration capabilities, the more experience you get. “Enlightenment is a
new philosophy in Window Managers. The premise is simple enough - complete control” (Rob Malda). We recommend
getting familiar with all EC configuration options. Some of them can save you a hell of a lot of time, like the “Keyboard
& Mouse” configuration section or the tiny “Profiles” option hidden inside the “Advanced” section. Use the “Language”
section to set your default locale, “Application” to control autostart/restart applications or create new '.desktop'
launchers (new applications), and “Extensions” to deal with the various E17 modules (all E17 components are modules,
and this section is a VERY important one: the fewer modules you load/enable, the fewer resources E will consume). In
the “System” section you'll find all YaST configuration options; “File Manager” controls EFM (Enlightenment File
Manager). The default section you can see above is “Appearance”. The most interesting elements here are “Fonts” and
“Colors”. With “Colors” you can set the font color for selected theme parts; “Fonts” controls the way your fonts
are rendered by E17. Unfortunately this doesn't mean that your GTK or Qt apps will inherit these settings. You need to
modify the ETK or EWL themes to change the fonts in related applications. 'qtconfig' will help you set the base font for
Qt applications, and 'gconf-editor' can accomplish a similar task for GTK (heh, only if you install and launch
'gnome-settings-daemon'; you also need to install 'gconf-editor' itself). But the best results are achieved by modifying
your ~/.gtkrc-2.0 and ~/.gtkrc files. You need to create the ~/.gtkrc file if you wish to modify the look'n'feel of good
old gtk1 applications like 'xmms'. Look at the preset ~/.gtkrc-2.0:
# Auto-written by gtk2_prefs. Do not edit.
gtk-theme-name = "vision"
style "user-font"
{
#font_name="DejaVu Sans 10"
font_name="Verdana 10"
}
widget_class "*" style "user-font"
gtk-font-name = "Verdana 10"
gtk-icon-theme-name="black-white_2-Gloss"
It's crystal clear that we use the “Verdana” font, the “vision” theme and the "black-white_2-Gloss" icon theme as the
defaults for our GTK+ applications. The 'gtk2_prefs' tool requires half of the Gnome environment to be installed as a
dependency – we don't need such a favor.
We also have a nice GUI configuration utility for your GTK+ apps: 'lxappearance':
'Fontconfig' is compiled with full support for the BCI (bytecode interpreter) but without subpixel rendering (because
E17 doesn't use it, but operates with bytecode quite well). To get a basic idea of how to tweak the way your fonts are
displayed, read the following files:
less /etc/fonts/conf.d/README
vim ~/.fonts.conf
Though we hope that our default presets require no adjustments.
N.B. This system uses mostly the standard openSUSE configuration utilities (YaST, Sax2), but there are some
VERY IMPORTANT differences:
1) after any manual adjustment of your “/etc/X11/xorg.conf” file, or after any modifications made with Sax2, please run:
> sudo /etc/init.d/create_xconf save-profile
if you wish to save your changes. We implemented a simple “profiler” which automatically remembers and restores the
exact “/etc/X11/xorg.conf” settings for each PC the system was ever running on. No need to reconfigure various
parameters if you use a single USB-stick but work with different computers.
2) use the “sys_update” command to update installed packages (perform the update in general) ONLY if the system is
installed to a harddrive/(another USB-stick)/(other media). Check the scripts in your “~/bin/” folder!
To adjust your font size, please load the 'Settings – Scaling' module and set the desired scale.
If you wish to use several language layouts and switch between them – start 'sax2' as 'root' (back up your 'xorg.conf'
first!) or edit your /etc/X11/xorg.conf file manually. This is the only TRUE way; it works no matter how you start X:
Section "InputDevice"
Identifier "Keyboard[0]"
Driver "kbd"
Option "Protocol" "Standard"
Option "XkbLayout" "us,ru(winkeys)"
Option "XkbModel" "microsoftpro"
Option "XkbOptions" "grp:lctrl_lshift_toggle,grp_led:scroll"
Option "XkbRules" "xorg"
EndSection
In this case we have a Left Control + Left Shift combination to change our layout from the default 'us' to 'ru(winkeys)'
and LED scroll indicates the change of the layout.
N.B. The following command can be run by an ordinary user and achieves the same result as the 'xorg.conf' modification:
setxkbmap -layout us,ru -option grp:lctrl_lshift_toggle,grp_led:scroll -variant winkeys
Hint: ! Look inside your $HOME/bin folder !
NETWORK
The network configuration in openSUSE is integrated into YaST and bound to applications we didn't want to
use. 'Network Manager' is not an option if you have a vpn connection or several virtual tunnels to manage.
'Kinternet' is good but “not enough”. We made YaST believe that our system has 'kinternet'. It does not, in fact. But if
you wish to use YaST for network configuration, keep in mind that
the Traditional Method with ifup is the only one that works 100% in our case. 'Exalt' is a nice EFL-based tool for
handling network setup. It is under development now and may be included in or excluded from our releases. Right now
'Wicd' is here to help you configure your network. We hope that 'Wicd' is flexible enough to match your strict
requirements: 'Wicd' allows you to run custom scripts for fine network interface setup and has a nice feature of
remembering your various 'profiles' (very useful if you travel a lot).
An example of a shell script controlling a vpn (dsl0) connection to the ISP could look like this:
cat ~/bin/vpn_up
#!/bin/sh
# Tear down the old connection and any stale daemons
# (pkill is safer than `kill -9 `pgrep ...`` when no process matches)
sudo ifconfig dsl0 down
sudo pkill -9 smpppd-ifcfg
sudo pkill -9 pptp
sudo pkill -9 pppd
sudo /etc/init.d/smpppd restart
# Rebuild the routing table (NETWORK/NETMASK/GATEWAY are placeholders)
sudo route del default
sudo route add -net NETWORK netmask NETMASK gw GATEWAY
sudo route add -net NETWORK netmask NETMASK gw GATEWAY
sudo route add -net NETWORK netmask NETMASK gw GATEWAY
#***ADJUST REQUIRED ROUTING FOR A NEW INTERFACE****#
sudo route add default gw GATEWAY
# Bring up the dsl0 interface and route everything through it
sudo /usr/sbin/smpppd-ifcfg --ifcfg=ifcfg-dsl0 --provider=PROVIDER --user=USER
sudo ifconfig dsl0 up
sudo route del default
sudo route add default gw GATEWAY_FOR_DSL0_INTERFACE
All you need is to put the full path to the script in 'Wicd' and press the 'Apply' button. Of course, you should configure
this 'vpn' with YaST as a DSL connection first.
Another option is 'cinternet' – a simple frontend for 'smpppd'. The choice is all yours.
There's no NetworkManager installed by default. If you need it – just install it from the public internet repositories.
The disk is loaded with 'madwifi', 'ndiswrapper' and a lot of firmware and drivers for your equipment (take a look at
the '/lib/firmware' folder).
APPLICATIONS AND UTILITIES
The packages selected here match simple criteria: they're essential to run E16/E17 (or are created with EFL), or
they're the best in their field. The result is quite predictable – a “zoo under the hood”. The only excuse: it works. It
works damn good, to be exact. Please refer to the complete list of installed packages to see what is included in the
Enlightenment LiveCD.
Aria2c:
A famous multi-threaded download utility. The supported protocols are HTTP(S), FTP, BitTorrent (DHT, PEX, MSE/PE),
and Metalink. It can download a file from multiple sources/protocols and tries to utilize your maximum download
bandwidth. It even supports downloading a file from HTTP(S)/FTP and BitTorrent at the same time, while the data
downloaded from HTTP(S)/FTP is uploaded to the BitTorrent swarm. Using Metalink's chunk checksums, aria2
automatically validates chunks of data while downloading a file like BitTorrent.
Sedtetris:
An amazing 'Tetris' game written entirely in 'sed' by Julia Jomantaite
and located in your ~/bin folder. Just type
sedtris.sh
to start the game.
Dear Julia, it's a piece of art! Thank You!
Chmsee – a viewer for CHM (HTML Help) files:
Office:
The complete OpenOffice suite, without OpenOffice-Base (database support), is installed – for the USB-stick system only!
Multimedia:
The best audio player packed with a killer 31-band equalizer and other 'blow-my-roof-off' plugins is still alive. We
contributed all the required packages to 'Packman'.
Compiled with all 'switches' turned 'on' – bet you'll like it. Thanks to 'Packman' for the nice 'Mplayer' package.
Other EFL based applications:
Extrackt. Estickies – sticky notes.
Rage – designed for watching TV and movies. It can't handle audio and pictures yet, but ... who knows :).
Run 'rpm -qi rage' to get an idea of how to set up the basic environment.
EFM (Enlightenment File Manager) – a simple file manager for Enlightenment-DR17.
INSTALLATION TO THE HARD DRIVE
If you like the distribution and are ready to install, please read this chapter. If not – skip it. The installation itself is very
simple. Double-click on the 'Installer...' desktop icon, or open 'xterm', become 'root' by typing 'sudo -s' or just 'su', then
type 'yast live-installer' and hit Enter. It's straightforward and gives you all the options you can expect from SUSE.
When you are asked to create a new user for the installed system, don't hurry: check the groups this user is included in
by default. Inclusion in the following groups could be useful:
Other nasty questions are “how to partition the drive?”, “how much disk space is needed?” and so on. The best
practice is to have at least your '/home' on a separate disk/partition. The rest is up to you. Don't forget to make a 'swap'
partition. We test each LiveCD release in 'qemu' with a total disk capacity of 4Gb, divided into a 'swap' partition
(256Mb) and a 'root' partition (the rest of the disk, 3.7Gb). These settings allow the installed system to boot into E17
with only 32Mb of RAM (swap is used for the rest). It's not optimal, though. If you wish to install all components from
the LiveCD to the drive, 1.8-1.9 Gb will be occupied, plus the inevitable 'swap'. Now you can estimate the total
requirements according to your needs.
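The estimate above can be turned into quick shell arithmetic (the figures are taken from the text; adjust them for your own layout):

```shell
#!/bin/sh
# Rough minimum-disk estimate for a full install of the LiveCD contents.
system_mb=1900   # upper bound for all LiveCD components on disk (from above)
swap_mb=256      # the swap size we use in our qemu tests
total_mb=$((system_mb + swap_mb))
echo "estimated minimum disk: ${total_mb} Mb"
# prints: estimated minimum disk: 2156 Mb
```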
There should be no issues with logging into the newly installed system. The following commands will allow you to
switch login managers:
/usr/bin/switch_to_entrance
and
/usr/bin/switch_to_gdm
The standard SUSE way of doing it via YaST will not work and could cause some inconvenience. To restore/repair the
GUI, log into the 'emergency text-mode' as 'root' and issue one of the commands above.
Another hint concerns your network setup. If you have a stable network environment that can be managed by YaST,
you may wish to disable 'Wicd'. Open your 'xterm', become 'root' (type 'sudo -s' and your password) and type:
chkconfig wicd off
chkconfig network on
If you're happy with 'Wicd' then no adjustments are needed. Of course, you can start 'yast2' with its nice GTK GUI, go
to 'System' → 'System Services (Runlevel)' and adjust the running services from the GUI, but... it's too slow :).
Please use 'yast2' or 'sax2' for routine system maintenance. Don't forget to make a backup of /etc/X11/xorg.conf
BEFORE you launch 'sax2'.
HINTS
Hint:
If you wish to remove 'stalonetray' from E17's startup routine, open 'EC' → 'Applications' → 'Startup Applications' and delete the
'stalonetray' entry. If you wish to modify the 'stalonetray' parameters (like size, color, default positioning etc.), adjust the
'stalonetray.desktop' file located in '~/.local/share/applications' (change the value of the 'Exec' string).
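A hypothetical 'Exec' line could look like the excerpt below – the geometry, icon size and color values are illustrative assumptions, not our shipped defaults:

```shell
# ~/.local/share/applications/stalonetray.desktop (illustrative excerpt):
# [Desktop Entry]
# Type=Application
# Name=stalonetray
# Exec=stalonetray --geometry 4x1-0+0 --icon-size 24 --background "#2e3436"
```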
Hint:
Check the entries in '/etc/fstab' if you can't write to NTFS with ordinary user privileges. They should look like:
/MOUNT/VOLUME /MOUNT/POINT ntfs-3g user,users,gid=users,locale=en_US.UTF-8 0 0
with no crap like 'dmask' or 'fmask' here :).
Hint:
If you need a sophisticated firewall/iptables configuration, just edit the '/etc/sysconfig/SuSEfirewall2' file. You can even load your own
rules into the SuSEfirewall schema. Find the FW_CUSTOMRULES variable and look at the examples in '/etc/sysconfig/scripts/'.
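For example, the variable could point at the sample hook script shipped in that folder (a sketch – verify the exact file name on your system):

```shell
# /etc/sysconfig/SuSEfirewall2 (excerpt): point FW_CUSTOMRULES at a hook
# script with your own iptables rules; openSUSE ships a commented sample
# under /etc/sysconfig/scripts/ that you can copy and edit.
FW_CUSTOMRULES="/etc/sysconfig/scripts/SuSEfirewall2-custom"
```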
Hint:
Read this post to get an idea of why the native E17 screenlock doesn't work in SuSE and how to improve the situation. Another solution is to
switch to the console (Ctl+Alt+F1), log in as root, install the 'gdb' debugger ('zypper in gdb') and run the 'pstree -p' command to get the PID
of the running E17 (3137 in this case):
then run:
gdb -p "3137"
call e_desklock_hide()
quit
Now you can switch back to GUI (Ctl+Alt+F7).
Hint:
To enable the Xorg Composite extension, add the following to the '/etc/X11/xorg.conf' file:
Section "Extensions"
Option "Composite" "Enable"
EndSection
P.S. NVIDIA users should use 'nvidia-xconfig' instead.
Hint:
To enable your Atheros wifi card you may try typing 'pci=nommconf' (no quotes) at the 'boot:' line of the Welcome screen (the first one,
where you can press F3 and change the resolution). Or add 'pci=nommconf' to the kernel line of /boot/grub/menu.lst (adjust the startup
parameters). If you fail to configure your Atheros card – try updating your kernel to the latest Vanilla and reboot. You can find the Vanilla
kernels here:
openSUSE Vanilla Kernels
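A sketch of the menu.lst change – the root device and other parameters below are placeholders from a typical install, not your actual values:

```shell
# /boot/grub/menu.lst (illustrative excerpt) – append pci=nommconf
# to the end of the existing kernel line:
# kernel /boot/vmlinuz root=/dev/sda2 splash=silent pci=nommconf
```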
Hint:
Browse your ~/bin/ folder. We put all the tiny commands that might be helpful for you there.
Hint:
Due to high demand we included a sweet command which never works as expected...
/usr/bin/Kill_X_server
ARE YOU READY TO BE ENLIGHTENED?!
THINK E, BE HAPPY!