Entries Tagged as 'Linux'

Ubuntu – Creating A RAID5 Array

A RAID5 array is a fault-tolerant disk configuration which uses distributed parity; this provides the ability to lose one drive (or have damaged sectors on one drive) and still retain data integrity.

RAID5 will likely have slightly lower write performance than a single drive; but will likely have significantly better read performance than a single drive. Other types of RAID configurations will have different characteristics.  RAID5 requires a minimum of three drives, and may have as many drives as desired; however, at some point RAID6, with multiple parity blocks, should be considered because of the potential for an additional drive failure during a rebuild.

The following instructions will illustrate the creation of a RAID5 array with four SATA drives.

Remember, all these commands will need to be executed with elevated privileges (as super-user), so they’ll have to be prefixed with ‘sudo’.

First step, select your disks — preferably identical (but as close to the same size as possible) — that don’t have any data on them (or at least don’t have any important data on them). You can use Disk Utility (GUI), gparted (GUI), cfdisk (CLI), or fdisk (CLI) to confirm that each disk has no data and to change (or create) the partition type to “Linux raid autodetect” (type “fd”) — also note the device names that correspond to the drives; they will be needed when building the array.

Check to make sure that mdadm is installed; if not, you can use the GUI package manager to download and install it, or simply type:

  • apt-get install mdadm

For this example, we’re going to say the drives were /dev/sde /dev/sdf /dev/sdg and /dev/sdh.

Create the RAID5 by executing:

  • mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sd{e,f,g,h}1

Now you have a RAID5 fault tolerant drive sub-system, /dev/md1 (the defaults for chunk size, etc are reasonable for general use).

At this point you could set up an LVM volume, but we’re going to keep it simple (and for most users, there’s no real advantage to using LVM).

Now you can use Disk Utility to create a partition (I’d recommend a GPT style partition) and format a file system (I’d recommend ext4).
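If you prefer the command line, the equivalent is a couple of commands — this is only a sketch, and it assumes the /dev/md1 device from above (double-check the device name first, since these commands are destructive):

```shell
# Create a GPT partition table with a single partition spanning the array,
# then format it ext4; the partition appears as /dev/md1p1
sudo parted -s /dev/md1 mklabel gpt mkpart primary ext4 1MiB 100%
sudo mkfs.ext4 /dev/md1p1
```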

You will want to decide on the mount point.

You will probably have to add an entry to /etc/fstab and /etc/mdadm/mdadm.conf if you want the volume mounted automatically at boot (I’d recommend using the UUID rather than the device names).

Here’s an example mdadm.conf entry

  • ARRAY /dev/md1 level=raid5 num-devices=4 UUID=d84d477f:c3bcc681:679ecf21:59e6241a

And here’s an example fstab entry

  • UUID=00586af4-c0e8-479a-9398-3c2fdd2628c4 /mirror ext4 defaults 0 2

You can use mdadm to get the UUID of the RAID5 container

  • mdadm --examine --scan

And you can use blkid to get the UUID of the file system

  • blkid

You should probably make sure that you have SMART monitoring installed on your system so that you can monitor the status (and predictive failure) of drives. To get information on the RAID5 container you can use the Disk Utility (GUI) or just type

  • cat /proc/mdstat
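Beyond /proc/mdstat, mdadm itself can report detailed array status, and smartctl (from the smartmontools package) reports per-drive SMART health — a sketch, assuming the device names from this example:

```shell
sudo mdadm --detail /dev/md1    # array state, sync progress, member status
sudo smartctl -H /dev/sde       # overall SMART health of one member drive
```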

There are many resources on setting up RAID5 sub-systems on Linux; for starters you can simply look at the man pages for the mdadm command.

NOTE: This procedure was developed and tested using Ubuntu 10.04 LTS x64 Desktop.

Originally posted 2010-06-29 02:00:15.

Ubuntu – RAID Creation

I think learning how to use mdadm (/sbin/mdadm) is a good idea, but in Ubuntu Desktop you can use Disk Utility (/usr/bin/palimpsest) to create most any of your RAID (“multiple disk”) configurations.

In Disk Utility, just access “File->Create->Raid Array…” on the menu and choose the options.  Before doing that, you might want to clear off the drives you’re going to use (I generally create a fresh GPT partition table to ensure the drive is ready to be used as a component of the RAID array).

Once you’ve created the container with Disk Utility; you can even format it with a file system; however, you will still need to manually add the entries to /etc/mdadm/mdadm.conf and /etc/fstab.
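A sketch of those manual steps, assuming the array came up as /dev/md0 — mdadm can emit its own configuration line, and blkid supplies the file system UUID for the fstab entry:

```shell
# Append the array definition to mdadm.conf
mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
# Get the file system UUID for the fstab entry
sudo blkid /dev/md0
# Rebuild the initramfs so the array is assembled at boot (Ubuntu/Debian)
sudo update-initramfs -u
```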

One other minor issue I noticed.

I gave my multiple disk containers names (mirror00, mirror01, …) and Disk Utility will show them mounted on device /dev/md/mirror00 — in point of fact, you want to use device names like /dev/md0, /dev/md1, … in the /etc/mdadm/mdadm.conf file.  Also, once again, I highly recommend that you use the UUID for the array configuration (in mdadm.conf) and for the file system (in fstab).

Originally posted 2010-07-12 02:00:33.

Desktop Sharing

Maybe I’ve become spoiled, but I just expect desktop sharing (remote control) to be easy and fast.

Nothing, absolutely nothing compares to Microsoft’s RDP; and virtually any Windows machine (except home editions) can be accessed remotely via RDP; and all Windows machines and Macs can access a remote Windows machine.

Apple has their own Remote Desktop Client, and it works well — but it’s far from free (OUCH, far from free).  And Apple does build VNC into OS X (can you say dismally slow)… but they don’t provide any Windows client.

On Linux and other *nix operating systems you can use an X session remotely; or VNC (zzzzzzzzzzzzz… again, slow).

As a “universal” desktop sharing solution VNC isn’t horrible (and it’s certainly priced right, and there’s plenty of different ports and builds of it to choose from), but it’s old school and old technology.

I personally think it would be great to have an efficient remote desktop sharing standard that all computers (and PDAs) could use… one ring — eh, got carried away there; one client could talk to any server, and each operating system vendor would only need to optimize their own server and client…

Originally posted 2009-02-23 01:00:41.

Ubuntu – Disk Utility

When you install Ubuntu 10.04 Desktop, the default menu item for Disk Utility isn’t extremely useful; after all, it’s on the System->Administration menu, so you would assume that it’s meant to administer the machine, not just view the disk configuration.

What I’m alluding to is that by default Disk Utility (/usr/bin/palimpsest) is not run with elevated privileges (as super-user), but rather as the current user — which, if you’re doing things as you should be, means you won’t be able to effect any changes, and Disk Utility will probably end up being a waste of time and effort.

To correct this problem all you need do is modify the menu item which launches Disk Utility to elevate your privileges before launching (using gksu) — that, of course, assumes that you’re permitted to elevate your privileges.

To add privilege elevation to Disk Utility:

  1. Right click your mouse on the menu bar along the top (right on ‘System’ is good) and select ‘Edit Menus’
  2. Navigate down to ‘Administration’ and select it in the left pane
  3. Select ‘Disk Utility’ in the right pane
  4. Select ‘Properties’ in the buttons on the right
  5. Under ‘Command’, prefix it with ‘gksu’ or substitute ‘gksu /usr/bin/palimpsest’ (putting the entire path there)
  6. Then click ‘Close’ and ‘Close’ again…
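The net effect of the edit is that the launcher’s command line gains a gksu prefix — something equivalent to this line in the Disk Utility launcher (the exact .desktop file path may vary by release):

```
Exec=gksu /usr/bin/palimpsest
```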

Originally posted 2010-06-27 02:00:33.

Disk Bench

I’ve been playing with Ubuntu here of late, and looking at the characteristics of RAID arrays.

What got me on this is that when I formatted an ext4 file system on a four drive RAID5 array created using an LSI 150-4 [hardware RAID] controller, I noticed that it took longer than I thought it should. While most readers probably won’t be interested in whether or not to use the LSI 150 controller they have in their spare parts bin to create a RAID array on Linux, the numbers below are interesting just in deciding what type of array to create.

These numbers are obtained from the disk benchmark in Disk Utility; this is only a read test (write performance is going to be quite a bit different, but unfortunately the write test in Disk Utility is destructive, and I’m not willing to lose my file system contents at this moment; but I am looking for other good benchmarking tools).

Configuration         Drives  Avg Access Time  Min Read Rate  Max Read Rate  Avg Read Rate
ICH8 Single           1       17.4 ms          14.2 MB/s      23.4 MB/s      20.7 MB/s
ICH8 RAID1 (Mirror)   2       16.2 ms          20.8 MB/s      42.9 MB/s      33.4 MB/s
ICH8 RAID5            4       18.3 ms          17.9 MB/s      221.2 MB/s     119.1 MB/s
SiI3132 RAID5         4       18.4 ms          17.8 MB/s      223.6 MB/s     118.8 MB/s
LSI150-4 RAID5        4       25.2 ms          12.5 MB/s      36.6 MB/s      23.3 MB/s

All the drives used are similar class drives; Seagate Momentus 120GB 5400.6 (ST9120315AS) for the single drive and RAID1 (mirror) tests, and Seagate Momentus 500GB 5400.6 (ST9500325AS) for all the RAID5 tests.  Additionally, all drives show that they are performing well within acceptable operating parameters.
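In the meantime, for a quick non-destructive sequential-read number without Disk Utility, dd works in a pinch — a sketch run against a scratch file here; for a real measurement you would read from the raw device (e.g. if=/dev/md1) and drop the page cache first:

```shell
# Create a 64 MiB scratch file, then time reading it back sequentially;
# the second dd prints throughput (MB/s) when it completes
dd if=/dev/zero of=/tmp/bench.tmp bs=1M count=64 2>/dev/null
sync
dd if=/tmp/bench.tmp of=/dev/null bs=1M
```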

Originally posted 2010-06-30 02:00:09.


7-Zip

I’ve written about 7-Zip before; but since we’re on the verge of a significant improvement I felt it was time to highlight it again.

7-Zip is a file archiver written by Igor Pavlov.  It was originally available only for Windows, but is now available for almost every operating system.

7-Zip was one of the first archiving tools to include LZMA (Lempel-Ziv-Markov chain algorithm); and consistently demonstrated much higher compression ratios at much higher compression rates than any other compression scheme.

The next release of 7-Zip (9.10) will include LZMA2.

The source code for the LZMA SDK has been put into the public domain, and is freely available for use in other products.  The SDK includes the main line C++ source; ANSI-C compatible LZMA and XZ source code; C# LZMA compression and decompression source code; Java LZMA compression and decompression source code; as well as other source code.

You can read about all the features of LZMA as well as download the Windows version of 7-Zip and locate links to p7zip for *nix operating systems.  You can also search for other ports for *nix based systems as well.

This is the only archive utility you need; it would have been nice had Microsoft chosen to base the folder compression in Windows 7 on the LZMA SDK, or at least made it easy to replace the compression module — but 7-Zip installs a Windows shell extension, so you have a separate (though confusing for some) menu item for compression and decompression.
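For reference, basic 7-Zip usage from the command line looks like this (assuming the 7z binary from the Windows build, or a *nix port, is on your path; names below are just examples):

```shell
7z a backup.7z Documents/      # add (compress) a directory into backup.7z
7z l backup.7z                 # list the archive contents
7z x backup.7z -odest/         # extract, preserving paths, into dest/
```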


Originally posted 2010-01-21 01:00:14.

Usability Summary

I think I can sum up the real problem with Linux or any open source system where there’s no real usability mandate…

Developers make arbitrary decisions that suit their needs without any regard to how others will view their decisions or even figure out how to use what they do… the real difference between Windows and OS-X and Linux is that two of those are the cooperative efforts of “experts” who try very hard to address the needs of a target audience who wouldn’t be capable of writing their own operating system.

And, of course, with something like Linux it’s geometrically worse than most open source software since any given Linux is the culmination of hundreds of separate open source modules put together in a completely arbitrary fashion.

It really is funny that what I’ve been describing as a lack of cohesiveness is layered; and I suspect that no matter the intentions of a single developer to wrap it inside a nice pretty shell that gives a forward facing pretense of a system planned and targeted for productivity, the ugly truth of how much of a patchwork it is will show through. We can look back on early versions of Windows and MacOS and see just that… it’s really only been within the last five or six years that those systems have risen to the point that they are in fact fairly cohesive, and designed to be tools for people to solve problems with; not projects built for the sole purpose of developing a life of their own.

Without some unifying direction, the only Linux I can see succeeding is Android; and that, my friends, is likely to become a collection of closed source tools running on top of an open source kernel.  Trust me, you haven’t seen an evil empire until Google gets on your desktop, phone, settop box, etc…

Originally posted 2010-01-11 01:00:10.

Linux – Desktop Search

A while ago I published a post on Desktop Search on Linux (specifically Ubuntu).  I was far from happy with my conclusions and I felt I needed to re-evaluate all the options to see which would really perform the most accurate search against my information.

Primarily my information consists of Microsoft Office documents, Open Office documents, pictures (JPEG, as well as Canon RAW and Nikon RAW), web pages, archives, and email (stored as RFC822/RFC2822 compliant files with an eml extension).

My test metric was to take a handful of search terms which I knew existed in various types of documents, and check the results (I actually used Microsoft Windows Search 4.0 to prepare a complete list of documents that matched the query — since I knew it worked as expected).

The search engines I tested were:

  • Beagle
  • Google Desktop
  • Tracker
  • Pinot
  • Recoll
  • Strigi

I was able to install, configure, and launch each of the applications.  Actually none of them were really that difficult to install and configure; but all of them required searching through documentation and third party sites — I’d say poor documentation is just something you have to get used to.

Beagle, Google, Tracker, Pinot, and Recoll all failed to find all the documents of interest… none of them properly indexed the email files — most of them failed to handle plain text files; that didn’t leave a very high bar to pick a winner.

Queries on Strigi actually provided every hit that the same query provided on Windows Search… though I have to say Windows Search was easier to setup and use.

I tried the Nepomuk (KDE) interface for Strigi — though it just didn’t seem to work as well as strigiclient did… and certainly strigiclient was pretty much at the top of the list for butt-ugly, user-hostile, un-intuitive applications I’d ever seen.

After all of the time I’ve spent on desktop search for Linux I’ve decided all of the search solutions are jokes.  None of them are well thought out, none of them are well executed, and most of them outright don’t work.

Like most Linux projects, more energy needs to be focused on working out a framework for search, rather than everyone going off half-cocked and creating a new search paradigm.

The right model is…

A single multi-threaded indexer running in the background indexing files according to a system wide policy aggregated with user policies (settable by each user on directories they own) along with the access privileges.

A search API that takes the user/group and query to provide results for items that the user has (read) access to.

The indexer should be designed to use plug-in modules to handle particular file types (mapped both by file extension, and by file content).

The indexer should also be designed to use plug-in modules for walking a file system and receiving file system change events (that allows the framework to adapt as the Linux kernel changes — and would support remote indexing as well).

Additionally, the index/search should be designed with distributed queries in mind (often you want to search many servers, desktops, and web locations simultaneously).

Then it becomes a simple matter for developers to write new/better indexer plug-ins; and better search interfaces.
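None of that framework exists; but even a toy sketch shows the shape of the query side — a naive index-free search with grep, where a real framework would substitute per-file-type plug-in indexers, a persistent index, and access-control checks (the paths and terms below are made up for the demo):

```shell
# Build a small demo corpus
mkdir -p /tmp/search_demo
echo "quarterly raid5 benchmark notes" > /tmp/search_demo/notes.txt
echo "vacation photo list" > /tmp/search_demo/photos.txt
# "Query" step: case-insensitive, list only readable files that match
grep -ril "raid5" /tmp/search_demo
```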

I’ve pointed out in a number of recent posts that you can effectively use Linux as a server platform in your business; however, it seems that if search is a requirement you might want to consider ponying up the money for Microsoft Windows Server 2008 and enjoy seamless search (that works) between your Windows Vista / Windows 7 desktops and Windows Server.



Originally posted 2010-07-16 02:00:19.

Linux Server

I’ve been experimenting with a Linux server solution for the past couple months — I was prompted to look at this when my system disk failed in a Windows Server 2008 machine.

First, I’m amazed that after all these years Microsoft doesn’t have a standard module for monitoring the health of a system — at the very least the SMART status of disk drives.

I do have an Acronis image of the server from when I first installed it, but it would be a pain to reconfigure everything on that image to be as it was — and I guess I just haven’t been that happy with Windows Server 2008.

I personally find Windows Server 2008 needlessly complicated.

I’m not even going to start ranting on Hyper-V (I’ve done that enough, comparing it head-to-head with other technology… all I will say is it’s a good thing their big competitor is VMware, or else Microsoft would really have to worry about having such a pathetic virtualization offering).

With a Linux distribution it’s a very simple thing to install a basic server. I actually tried Ubuntu, CentOS, and Fedora. I also looked at the Xen distribution as well, but that wasn’t really of interest for a general purpose server.

Personally I found CentOS (think Red Hat) to be a little too conservative on its releases/features; I found Fedora to be a little too bleeding edge on its releases/features (plus there’s no long term support commitment); so I was really just left with Ubuntu.

I didn’t really see any reason to look exhaustively at every Debian based distribution — Ubuntu was, in my mind, the best choice of that family; and I didn’t want to look at any distribution that wasn’t available at no cost, nor any distribution that didn’t have a good, stable track record.

With Ubuntu 10.04 LTS (10.04 is a Long Term Support release — which makes it a very good choice to build a server on) you could choose the Desktop or the Server edition — the main difference of the Server versus the Desktop is that the Server does not install the X server and graphical desktop components (you can add them).

The machine I was installing on had plenty of memory and processor to support a GUI, and I saw no reason not to install the Desktop version (I did try out the Server version on a couple installs — and perhaps if you have an older machine, or a machine with very limited memory, or a machine that will be taxed to its limits, or a machine where you want the absolute smallest attack surface, you’d want Server — though almost all those requirements would probably make me shift to CentOS rather than Ubuntu).

My requirements were fairly simple — I wanted to replace the failed Windows 2008 Server with a machine that could perform my DNS, DHCP, web server, file store (home directories — served via CIFS/Samba), and active P2P downloads.

Additionally, the server would have to have fault-tolerant file systems (as did the Windows server).

Originally my testing focused on just making sure all the basic components worked, and worked reasonably well.

Then I moved on to getting all the tools I had written working (I converted all the C# code to PHP).

My final phase involved evaluating fault tolerant options. Initially I’d just used the LSI 150-4 RAID controller I had in the Windows Server 2008 (Linux supported it with no real issues — except that Linux was not able to monitor the health of the drives or the array).

I didn’t really see much need to use RAID5 as I had done with Windows Server 2008; so I concentrated on just doing RAID1 (mirroring) — I tried basic mirrors just using md, as well as using lvm (over md).

My feeling was that lvm added an unnecessary level of complexity on a standalone server (that isn’t to say that lvm doesn’t have features that some individuals might want or need). So my tests focused primarily on just simple mirrors using md.
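For reference, a simple md mirror is created much like the RAID5 example earlier — a sketch only, with hypothetical device names:

```shell
# Create a two-drive RAID1 (mirror) array; device names are examples only
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sd{a,b}1
cat /proc/mdstat    # watch the initial resync
```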

I tested performance of my LSI 150-4 RAID5 SATA1 PCI controller (with four SATA2 drives) against RAID1 SATA2 using Intel ICH9 and SiI3132 controllers (with pairs of SATA1 or SATA2 drives). I’d expected that the LSI 150-4 would outperform the md mirror with SATA1 drives on both read and write, but that with SATA2 drives I’d see better reads on the md mirror.

I was wrong.

The md mirrors actually performed better across the board (though negligibly better with SATA1 drives attached) — and the amazing thing was that CPU utilization was extremely low.

Now, let me underscore here that the LSI 150-4 controller is a PCI-X (64-bit) controller that I’m running as PCI (32-bit); and the LSI 150-4 represents technology that’s about six years old… and the LSI 150-4 controller is limited to SATA1 with no command set enhancements.

So this comparison wouldn’t hold true if I were testing md mirrors against a modern hardware RAID controller — plus the other RAID controllers I have are SAS/SATA2 PCIe and have eight and sixteen channels (more spindles means more performance).

Also, I haven’t tested md RAID5 performance at all.

My findings at present are that you can build a fairly high performance Linux based server for a small investment. You don’t need really high end hardware, you don’t need to invest in hardware RAID controllers, and you don’t need to buy software licenses — you can effectively run a small business or home office environment with confidence.

Originally posted 2010-06-24 02:00:09.

VirtualBox – Linux Desktop, Real Performance

The other day I installed VirtualBox OSE on my Ubuntu machine so that I could migrate over a Windows Server 2003 machine.  I wasn’t really expecting great performance since I was putting the virtual disks on a single spindle…

Sometimes you get a good surprise.

When I started up the virtual instance, it seemed very fast — so I shut it down and started it again.  Then I performed a few quick tests and I realized that not only was VirtualBox on an Ubuntu 10.04 LTS Linux machine substantially faster than on a Windows 7 machine (with a faster hard disk and faster processor), but it was faster than on a Windows Server 2008 machine running Hyper-V.

The really incredible thing was that Hyper-V was running on a disk array with fifteen spindles versus a single spindle for VirtualBox.

I really didn’t have any way to do a set of rigorous tests, but what I found was that as long as the disk wasn’t saturated, VirtualBox was able to handily outperform Hyper-V on every test (read or write) that I performed… it was only when I started to push near to the limits of the drive that VirtualBox and Hyper-V had similar disk IO performance.

I didn’t evaluate how VirtualBox performed on Linux with a disk array, but my guess is that it’s simply much more efficient at scheduling disk IO than Hyper-V; and likely Linux is more efficient at disk IO than Windows period.

I’m a huge fan of VirtualBox; and had I known eighteen months ago what I know now about Hyper-V, I would have avoided it like the plague and simply used VirtualBox or Xen as a virtualization solution.

I’ll put a more thorough investigation of disk IO and VirtualBox versus Hyper-V performance on my “TO-DO” list; but I don’t expect it’ll float to the top until this winter at the earliest; until then my advice is choose VirtualBox (or Xen).

Originally posted 2010-08-24 02:00:27.