
Extending an ext4 filesystem


To resize an ext4 file system you must use the resize4fs command. The resize4fs command is provided by the e4fsprogs package.

# yum install e4fsprogs


Once installed, resize the filesystem with the command below. This can also be performed online.


# resize4fs /dev/VolGroup00/LogVol02



If the filesystem is "dirty" or has contiguity errors, a file system check (fsck) must be run against it first. After that, the resize4fs command above can be used and the file system mounted again.

# e4fsck -f /dev/VolGroup00/LogVol02
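Putting the steps together, the offline sequence looks roughly like the sketch below. The device and mount point are placeholders, and the function only prints the commands so the plan can be reviewed before anything is run as root:

```shell
# Sketch only: prints the offline ext4 resize sequence for review.
# DEV and MNT are placeholder names, not taken from a real system.
DEV=/dev/VolGroup00/LogVol02
MNT=/data

resize_offline() {
    echo "umount $MNT"
    echo "e4fsck -f $DEV"      # filesystem must be clean before resizing
    echo "resize4fs $DEV"      # grows to fill the logical volume by default
    echo "mount $DEV $MNT"
}
resize_offline
```

Remove the echo wrappers to actually execute the sequence.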




How to resize a logical volume and filesystem together?

To achieve this you can use the lvresize command with the -r option, which resizes the filesystem along with the logical volume. The -n option avoids running fsck on a mounted filesystem.

-----------------------------------------------------------------
# lvresize -L+100m -r /dev/vg_test/testvol
  Rounding up size to full physical extent 128.00 MB
  fsck from util-linux-ng 2.16
  e2fsck 1.41.9 (22-Aug-2009)
  /dev/mapper/vg_test-testvol is mounted.

WARNING!!!  Running e2fsck on a mounted filesystem may cause
SEVERE filesystem damage.

Do you really want to continue (y/n)? yes

/dev/mapper/vg_test-testvol: recovering journal
/dev/mapper/vg_test-testvol: clean, 11/49152 files, 11862/196608 blocks
Extending logical volume testvol to 320.00 MB
Logical volume testvol successfully resized
resize2fs 1.41.9 (22-Aug-2009)
Filesystem at /dev/mapper/vg_test-testvol is mounted on /mnt; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 2
Performing an on-line resize of /dev/mapper/vg_test-testvol to 327680 (1k) blocks.
The filesystem on /dev/mapper/vg_test-testvol is now 327680 blocks long.
-----------------------------------------------------------------

To skip the fsck, use the -n option:

-----------------------------------------------------------------
# lvresize -L+100m -r -n /dev/vg_testvg/testvol
  Rounding up size to full physical extent 128.00 MB
  Extending logical volume testvol to 448.00 MB
  Logical volume testvol successfully resized
resize2fs 1.41.9 (22-Aug-2009)
Filesystem at /dev/mapper/vg_testvg-testvol is mounted on /mnt; on-line resizing required
old desc_blocks = 2, new_desc_blocks = 2
Performing an on-line resize of /dev/mapper/vg_testvg-testvol to 458752 (1k) blocks.
The filesystem on /dev/mapper/vg_testvg-testvol is now 458752 blocks long.

-----------------------------------------------------------------

Volume group "vgname" metadata archive failed

lvextend fails with the error message "Volume group "vgname" metadata archive failed". Below is an example of the error:

# lvextend -L +1G /dev/vgname/lvname
  Couldn't create temporary archive name.
  Volume group "vgname" metadata archive failed.

To solve this issue, ensure that the /etc/lvm/archive directory exists and is writable.
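A quick check for the usual culprit can be sketched as below. The path is the default archive location; adjust it if lvm.conf changes it:

```shell
# Check that LVM's metadata archive directory exists and is writable;
# /etc/lvm/archive is the default location.
check_writable() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "ok: $1 is writable"
    else
        echo "fix: mkdir -p $1 && chmod 0700 $1"   # run as root
    fi
}
check_writable /etc/lvm/archive
```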

How to sync the system clock faster with NTP

Using the "iburst" parameter helps reduce the delay before the clock synchronisation process starts. An example /etc/ntp.conf with the "iburst" parameter added:

.................................................
# cat /etc/ntp.conf
.
.
.
server ntpserver1 iburst
server ntpserver2 iburst
server ntpserver3 iburst
.
..................................................


Syncing an NTP client with its hardware clock

The example below shows how to make the sync more accurate. Edit /etc/sysconfig/ntpd, which is the key configuration file for ntpd, and add the line below:
.
..........................................................
# Set to 'yes' to sync hw clock after successful ntpdate.
SYNC_HWCLOCK=yes
..........................................................
.

All servers have two kinds of clocks: the system clock and the hardware clock. The system clock is owned by the OS, and the hardware clock is owned by the CMOS. While the server is running, the system clock is normally the one in use. When the server is shut down, the system clock is saved to the hardware clock, and on reboot the system clock is set from the hardware clock. The problem is that there can be some offset between the two clocks: if the server has been shut down for a long time and is then rebooted, the system clock may be set from a hardware clock that is no longer accurate. If SYNC_HWCLOCK=yes is set in /etc/sysconfig/ntpd, the hardware clock is synced after a successful ntpdate, which reduces the offset and skew between the two clocks.
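To compare the two clocks by hand, something like the sketch below can be used. hwclock needs root, so those lines are commented out here:

```shell
date                  # system clock, owned by the OS
# hwclock --show      # hardware (CMOS) clock, as root
# hwclock --systohc   # copy system time into the hardware clock, as root
```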

How to debug NFS issues


rpcdebug is very helpful for analysing NFS or RPC communication; how to use it is explained in the rpcdebug section further down this page.


In cases where more debugging is required to resolve an issue, advanced kernel-level logging via sysctl should be considered. It is not recommended to mix rpcdebug with the following kernel parameters, since this would fill up the logs rapidly and make them hard to read. To start with, only one of them should be used for debugging.

To enable additional logging for nfsd use:

# sysctl -w sunrpc.nfsd_debug=1023

for an NFS client issue:

# sysctl -w sunrpc.nfs_debug=1023

and finally for RPC:

# sysctl -w sunrpc.rpc_debug=1023

These commands also produce additional messages in /var/log/messages, which can be viewed via "dmesg" as well. The respective debugging can be disabled using the same commands as above, changing "1023" to "0":

# sysctl -w sunrpc.nfsd_debug=0

# sysctl -w sunrpc.nfs_debug=0

# sysctl -w sunrpc.rpc_debug=0




How do I test my Samba configuration file for errors?

If you are unsure about the configuration of your /etc/samba/smb.conf file and would like to check it, you can run the testparm command against the configuration file.

# testparm /etc/samba/smb.conf

If the configuration is correct and no errors are detected, the output of the command will contain:

.
.
Loaded services file OK.
.
.

Otherwise, an error may appear:

------------------------------------------------------------------------------------------------------
params.c:Parameter() - Ignoring badly formed line in configuration file: RANDOM TEXT HERE TO CAUSE ISSUES

lp_bool(boolean_name): value is not boolean!
-------------------------------------------------------------------------------------------------------


Analyzing Memory Usage in Linux



Lack of free memory can either be a symptom of a bigger problem or nothing to worry about at all. Different operating systems handle memory in different ways, so it is important to start with a basic understanding of how memory is handled by the Linux kernel.

Why don't I have more free memory?

The example shown below lists the relevant fields of /proc/meminfo.
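The fields discussed below can be pulled out with a single grep; the values obviously vary from system to system:

```shell
# Show only the /proc/meminfo fields relevant to this discussion.
grep -E '^(MemTotal|MemFree|Buffers|Cached|Dirty|Slab):' /proc/meminfo
```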


Let's look at the important fields needed to understand the basics of memory usage. The Linux kernel attempts to optimize I/O performance by copying what is on disk into memory for faster access. The amount of memory used by the cache is listed in /proc/meminfo (noted above). Cached memory can be freed quickly if memory is needed for other reasons. However, there are two types of cached pages, and the amount of cached memory that can be evicted depends on how much of the cache is considered dirty.

Dirty cached pages are those that contain changes that the system still needs to write to disk. Whenever the system needs to reclaim memory, it can evict clean cached pages, but dirty cached pages must first be copied back to disk.

Processing can become less efficient if you have a lot of write-heavy operations and your dirty cache is large. There are several tunables you can adjust to reduce the amount of data cached by the Linux kernel. The most effective in this case is dirty_background_ratio, which contains, as a percentage of total system memory, the number of pages at which the pdflush background writeback daemon will start writing out dirty data. The default value is 10 percent.
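As a worked example of the threshold (assuming 8 GiB of RAM and the default ratio of 10), background writeback kicks in once roughly 800 MiB of dirty pages accumulate:

```shell
# Illustrative arithmetic only: dirty threshold = MemTotal * ratio / 100.
awk 'BEGIN {
    total_kb = 8 * 1024 * 1024      # 8 GiB of RAM, in kB
    ratio    = 10                   # default dirty_background_ratio
    printf "background writeback starts at %d kB\n", total_kb * ratio / 100
}'
```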


How much memory is the kernel taking up?

The memory used by the Linux kernel can be found by adding three of the /proc/meminfo values: Slab, Dirty, and Buffers.

The buffers value reflects the amount of data read off the disk that is not part of a file, which includes data structure information that points to actual files (for example, ext4 inode information).
Most of the memory used by the Linux kernel is listed under slab. When the kernel allocates memory out of the slab cache, it is labeled, and the purpose for which the kernel allocates the memory is recorded in /proc/slabinfo.
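The addition described above can be done directly with awk. This is a sketch; all three fields are reported in kB:

```shell
# Approximate kernel memory use by summing Slab, Dirty and Buffers.
awk '/^(Slab|Dirty|Buffers):/ { kb += $2 }
     END { printf "kernel memory (approx): %d kB\n", kb }' /proc/meminfo
```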

Why Is the Kernel Using Swap When There Are Cached Pages Available?

When the kernel swaps out a page, that page stays swapped out until it is used again. This means that if there is a sudden spike in memory usage and pages get swapped out, those pages will not be swapped back in as soon as memory becomes free. The pages will not return until the application to which the pages belong tries to access them.

It should also be noted that Linux tends to favor clearing out the pages used least frequently, regardless of whether they are cached pages that need to be cleared or normal pages that need to be swapped. It might make more sense, according to the kernel's heuristics, to swap out a page rather than to free the cache. There is some bias in which action the kernel will prefer, and that bias is tunable via /proc/sys/vm/swappiness.

Machine check events

x86 CPUs report errors they detect as machine check events (MCEs). These can be data corruption detected in the CPU caches, in main memory by an integrated memory controller, data transfer errors on the front side bus or CPU interconnect, or other internal errors. Possible causes include cosmic radiation, unstable power supplies, cooling problems, broken hardware, or bad luck. Most errors can be corrected by the CPU through internal error correction mechanisms; uncorrected errors cause machine check exceptions, which may panic the machine. When a corrected error happens, the x86 kernel writes a record describing the MCE into an internal ring buffer, available through the /dev/mcelog device. The mcelog program retrieves errors from /dev/mcelog, decodes them into a human-readable format, and prints them on standard output or, optionally, to the system log.

You will see messages like the following when a machine check event occurs:


Aug 20 17:59:28 hostname kernel: Machine check events logged
Aug 20 18:04:28 hostname kernel: Machine check events logged

This log message indicates that machine check events have been detected and are available for processing in /dev/mcelog. These logs are then redirected to /var/log/mcelog by the /etc/cron.hourly/mcelog.cron cron job. The log can also be checked by running the following command:

# mcelog




rpcdebug

The rpcdebug command allows an administrator to set and clear the Linux kernel's NFS client and server debug flags.  Setting these flags causes the kernel to emit messages to the system log in response to NFS activity; this is typically useful when debugging NFS problems.


Run the following command on the NFS client; it will log all NFS activity to /var/log/messages:


# rpcdebug -m nfs -s all

----------------------------------------------------------------

-m is used to specify the module. Valid modules are:

         nfsd   The NFS server.

         nfs    The NFS client.

         nlm    The Network Lock Manager, in either an NFS client or server.

         rpc    The Remote Procedure Call module, in either an NFS client or server.
-----------------------------------------------------------------

To disable the NFS logging on the client once you have finished:

# rpcdebug -m nfs -c all

where the "-c" option clears the given debug flags.

How to export a directory with NFS version 3



To start with, add the entry below to the /etc/exports file on the NFS server:

/nfs    192.168.0.*(rw,sync)

In the above example, the /nfs directory will only be shared with clients in the 192.168.0.0/24 subnet, as denoted by 192.168.0.*. The share options are read-write (rw) and sync (only reply to requests after changes have been committed to stable storage).
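The wildcard behaves much like a shell glob against the client address. A small sketch of the matching logic (the function name is made up for illustration):

```shell
# Mimic how the 192.168.0.* wildcard in /etc/exports matches client IPs.
client_allowed() {
    case "$1" in
        192.168.0.*) echo "allowed" ;;
        *)           echo "denied"  ;;
    esac
}
client_allowed 192.168.0.42   # a host inside the subnet
client_allowed 10.0.0.5       # a host outside it
```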

Start the necessary services on the NFS server.

Red Hat Enterprise Linux 6 and above

 # service rpcbind start 
 # chkconfig rpcbind on
 # service nfslock start
 # chkconfig nfslock on
 # service nfs start
 # chkconfig nfs on

Red Hat Enterprise Linux 5 and below

 # service portmap start
 # chkconfig portmap on
 # service nfslock start
 # chkconfig nfslock on
 # service nfs start
 # chkconfig nfs on

Start the necessary services on the NFS client.

Red Hat Enterprise Linux 6 and above

 # service rpcbind start
 # chkconfig rpcbind on
 # service nfslock start
 # chkconfig nfslock on

Red Hat Enterprise Linux 5 and below

 # service portmap start
 # chkconfig portmap on
 # service nfslock start
 # chkconfig nfslock on

Mount the NFS share on the NFS client, where <nfs server> is the IP address or hostname of the NFS server.

 # mount -t nfs <nfs server>:/nfs /mnt/nfs
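To pin the mount to NFS version 3 explicitly, pass "-o vers=3" to the mount command (an assumption worth making here, since without it the client negotiates the highest mutually supported version). To make the mount persistent, an /etc/fstab entry along these lines can be used, keeping the placeholder server name:

```
<nfs server>:/nfs  /mnt/nfs  nfs  vers=3  0 0
```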




NFS clients are not able to mount the NFS share after a reboot


NFS clients are not able to mount the NFS share after a reboot. The same share is still accessible on other systems where it was already mounted, but even on those systems fresh mount attempts fail. The NFS mount command on the NFS client results in a permission denied error.

Client error :

 mount: NFS-SERVER:/share failed, reason given by server: Permission denied

NFS Server Messages :

nfsserver mountd[11412]: authenticated mount request from 192.168.0.1:859 for /clients (/clients)


The main cause of this error is that the nfsd filesystem is not mounted. The nfsd filesystem is a special filesystem that provides access to the Linux NFS server; it is normally mounted automatically when the nfsd module is loaded. The exportfs and mountd programs (part of the nfs-utils package) expect to find it mounted at /proc/fs/nfsd. Restarting the NFS service will not mount the missing nfsd filesystem, because the module is not reloaded.

Check whether the nfsd fs is mounted on the NFS server:

# cat /proc/mounts | grep nfsd

------------------------------------------------------
cat /proc/mounts |grep nfsd
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
------------------------------------------------------

Execute the following command on the NFS server to resolve the issue

# mount -t nfsd nfsd /proc/fs/nfsd

Try the NFS mounts now
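If the nfsd filesystem keeps disappearing across reboots, it can be pinned with an /etc/fstab entry along these lines (a sketch; normally the nfs init script mounts it for you):

```
nfsd  /proc/fs/nfsd  nfsd  defaults  0 0
```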



How can I configure RHEL guests to shut down instead of suspend when the host shuts down?



Change the parameters below in /etc/sysconfig/libvirt-guests:

ON_SHUTDOWN=shutdown


SHUTDOWN_TIMEOUT=300


Note: all running guests are asked to shut down. Be careful with this setting, since there is no way to distinguish between a guest that is stuck or ignores shutdown requests and a guest that simply needs a long time to shut down. When setting ON_SHUTDOWN=shutdown, you must also set SHUTDOWN_TIMEOUT to a value suitable for your guests; by default SHUTDOWN_TIMEOUT is 300 seconds.





How to create an rpm from an installed package



To demonstrate this we will use the ftp package:

# rpm -qa | grep ftp
lftp-3.7.11-8.el5
tftp-server-0.49-2
ftp-0.17-38.el5

# rpm -e --repackage ftp-0.17-38.el5

# rpm -qa | grep ftp
lftp-3.7.11-8.el5
tftp-server-0.49-2

# ls /var/spool/repackage/
ftp-0.17-38.el5.x86_64.rpm

The ftp package was uninstalled, repackaged, and placed in the /var/spool/repackage directory. The resulting rpm can now be used as needed.