Discussion:
ZFS - Sudden decrease in write performance
Louis
2010-11-15 23:22:25 UTC
Hey all!

Recently I've decided to implement OpenSolaris as a target for BackupExec.

The server I've converted into a "Storage Appliance" is an IBM x3650 M2 w/ ~4TB of onboard storage via ~10 local SATA drives, and I'm using OpenSolaris snv_134. I'm using a QLogic 4Gb FC HBA w/ the qlt driver and presented an 8TB sparse volume to the host, since dedup and compression are turned on for the zpool.

When writes begin, I see anywhere from 4.5GB/Min to 5.5GB/Min and then it drops off quickly (I mean down to 1GB/Min or less). I've already swapped out the card, cable, and port with no results. I have since ensured that every piece of equipment on the box had its firmware updated. To do so, I installed Windows Server 2008 to flash all the firmware (IBM doesn't have a Solaris installer).

While in Server 2008, I decided to just attempt a backup via a share on the 1Gb/s copper connection. I saw speeds of up to 5.5GB/Min consistently, and they were sustained throughout 3 days of testing. Today I decided to move back to OpenSolaris with confidence. All writes began at 5.5GB/Min and quickly dropped off.

In my troubleshooting efforts, I have also dropped the Fibre Channel connection and made it an iSCSI target, with no performance gains. I have let the onboard RAID controller do the RAID portion instead of creating a zpool of multiple disks, with no performance gains. And I have created the target LUN using both rdsk and dsk paths.

I did notice today, though, that there is a direct correlation between ARC memory usage and speed. Using arcstat.pl, as soon as arcsz hits 1G (half of the c column [commit?]), my throughput hits the floor (i.e. 600MB/Min or less). I can't figure it out. I tried every configuration possible.
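
To put the rates quoted above on a common scale, here is a small sketch that converts them to MB/s and compares them with the raw line rate of a 1Gb/s link. It assumes decimal units (1 GB = 1000 MB) and ignores protocol overhead; the function name is made up for illustration.

```python
def gb_per_min_to_mb_per_s(gb_per_min):
    """Convert GB/min to MB/s, assuming decimal units (1 GB = 1000 MB)."""
    return gb_per_min * 1000 / 60

# Raw line rate of a 1 Gb/s link in MB/s (1000 Mb/s divided by 8 bits per byte).
GIGE_MB_PER_S = 1000 / 8

# Rates taken from the post above; 0.6 GB/min is the "600MB/Min" floor.
for label, rate in [("peak", 5.5), ("low start", 4.5),
                    ("after drop", 1.0), ("floor", 0.6)]:
    mb_s = gb_per_min_to_mb_per_s(rate)
    share = 100 * mb_s / GIGE_MB_PER_S
    print(f"{label:>10}: {rate} GB/min = {mb_s:.1f} MB/s "
          f"({share:.0f}% of 1 Gb/s raw)")
```

Under those assumptions the peak figure sits close to what a single 1Gb/s link can carry, while the floor is an order of magnitude lower.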
--
This message posted from opensolaris.org
Richard Elling
2010-11-16 08:35:47 UTC
I'm not sure what you mean by "every configuration possible," but did you try not using
dedup? Dedup needs plenty of RAM for performance, and since you didn't mention
how much RAM you have, we can only guess that it isn't much.
-- richard
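
As a rough illustration of why dedup wants a lot of RAM, the sketch below estimates the size of the dedup table (DDT) for a given amount of unique data. It assumes the commonly quoted rule of thumb of roughly 320 bytes per in-core DDT entry and one entry per unique block. The 4 TiB figure is only a stand-in for the pool described above, and the block sizes are illustrative (8 KiB is, as far as I know, the default volblocksize for a zvol). The actual numbers for a real pool will differ.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3

def ddt_ram_bytes(unique_data_bytes, avg_block_bytes, bytes_per_entry=320):
    """Estimate DDT size: one entry per unique block.

    bytes_per_entry=320 is a rule-of-thumb assumption, not a measured value.
    """
    entries = -(-unique_data_bytes // avg_block_bytes)  # ceiling division
    return entries * bytes_per_entry

# Hypothetical case: 4 TiB of unique data at a few average block sizes.
for block_kib in (8, 64, 128):
    need = ddt_ram_bytes(4 * TIB, block_kib * 1024)
    print(f"{block_kib:>3} KiB blocks: ~{need / GIB:.0f} GiB of DDT")
```

Even at the largest block size this estimate is several times the roughly 2G ARC target reported above, so once the table no longer fits in memory, writes would be expected to wait on DDT lookups from disk.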
