November 17

iSCSI MPIO with Nimble

With the implementation of a new Nimble storage array, HCL is moving its storage strategy away from Fibre Channel to iSCSI. If you have not looked at a Nimble array, you really should. Fantastic!

The Nimble provides four Ethernet ports for iSCSI traffic. To get the most bandwidth and redundancy out of them, MPIO needs to be configured on the system that communicates with the Nimble.

Target (SAN)

  • Nimble Storage Array CS220-X2
  • Discovery IP: 172.16.2.10
  • Data IPs: 172.16.2.11, 172.16.2.12, 172.16.2.13, 172.16.2.14

Initiator (Client)

  • Ubuntu 12.04 LTS
  • Data IP: 10.2.10.46
  • iSCSI IP: 172.16.2.50

Software Prerequisite

# sudo apt-get install open-iscsi open-iscsi-utils multipath-tools

IQN

iSCSI uses an IQN to refer to targets and initiators. Once you install the open-iscsi package, an IQN will be created for you. This can be found in the /etc/iscsi/initiatorname.iscsi file.

# cat /etc/iscsi/initiatorname.iscsi
## DO NOT EDIT OR REMOVE THIS FILE!
## If you remove this file, the iSCSI daemon will not start.
## If you change the InitiatorName, existing access control lists
## may reject this initiator.  The InitiatorName must be unique
## for each iSCSI initiator.  Do NOT duplicate iSCSI InitiatorNames.
InitiatorName=iqn.1993-08.org.debian:01:48a7e07cd57c

Use this initiator IQN to configure your volume on the Nimble array and to create your initiator group. As a practice, we have decided to build our initiator groups based on IQN rather than the IP address of the initiator systems.

Set iSCSI startup to automatic

# sudo ./chconfig /etc/iscsi/iscsid.conf node.startup automatic

chconfig is a small bash script that runs a sed command to set a configuration property to a specific value. It is useful for configuration files written in the property=value form. It is available on GitHub.
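If you would rather not pull the script from GitHub, a rough sketch of the same idea (hypothetical, not the actual script) looks like this:

#!/bin/bash
# chconfig <file> <property> <value>
# Rewrite a "property = value" style line in a config file.
FILE="$1"
PROP="$2"
VALUE="$3"

# Replace an existing (possibly commented-out) setting in place...
sed -i "s|^[#[:space:]]*${PROP}[[:space:]]*=.*|${PROP} = ${VALUE}|" "$FILE"

# ...and show the resulting line.
grep "^${PROP}" "$FILE"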

Discover the Target

# sudo iscsiadm -m discovery -t sendtargets -p 172.16.2.10
172.16.2.12:3260,2460 iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0
172.16.2.11:3260,2460 iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0
172.16.2.13:3260,2460 iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0
172.16.2.14:3260,2460 iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0

If everything is running correctly up to this point, you will see all four paths to the Nimble in the output along with the IQNs of the volumes that you have created. In my case, the volume name is ubuntutest.
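Discovery also stores node records on the initiator, which you can list at any time with:

# sudo iscsiadm -m node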

Configure Multipath

It is important to do this step before logging into each of the storage paths.

The first step is to log into one of the data targets.

# sudo iscsiadm -m node --targetname "iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0" --portal "172.16.2.11:3260" --login

Once you are logged in, you will be able to get the wwid of the drive. You will need this for /etc/multipath.conf, the file that configures all of your multipath preferences. To get the wwid…

# sudo multipath -ll
202e7bcc950e534c26c9ce900a0588a97 dm-2 Nimble,Server
size=5.0G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 3:0:0:0 sdb 8:16 active ready running

In my case, the wwid is 202e7bcc950e534c26c9ce900a0588a97. Now, open /etc/multipath.conf in your favorite editor and edit the file so it matches something like this…

defaults {
    udev_dir /dev
    polling_interval 10
    prio_callout /bin/true
    path_checker readsector0
    prio const
    failback immediate
    user_friendly_names yes
}

devices {
    device {
            vendor "Nimble*"
            product "*"

            path_grouping_policy multibus

            path_selector "round-robin 0"
            # path_selector "queue-length 0"
            # path_selector "service-time 0"
    }
}

multipaths {
    multipath {
            wwid 202e7bcc950e534c26c9ce900a0588a97
            alias data
    }
}

Now would be a good point to reload the multipath service.

# sudo service multipath-tools reload

Continue logging into the remaining iSCSI targets:

# sudo iscsiadm -m node --targetname "iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0" --portal "172.16.2.12:3260" --login

# sudo iscsiadm -m node --targetname "iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0" --portal "172.16.2.13:3260" --login

# sudo iscsiadm -m node --targetname "iqn.2007-11.com.nimblestorage:ubuntutest-v681ac6f7ff909e57.0000000a.978a58a0" --portal "172.16.2.14:3260" --login

Once you have finished logging into each target, you can verify your multipath configuration.

# sudo multipath -ll
data (202e7bcc950e534c26c9ce900a0588a97) dm-2 Nimble,Server
size=5.0G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 14:0:0:0 sdb 8:16 active ready  running
  |- 12:0:0:0 sdc 8:32 active ready  running
  |- 13:0:0:0 sdd 8:48 active ready  running
  `- 11:0:0:0 sde 8:64 active ready  running

The drive will be available at /dev/mapper/data.
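If you want a quick sanity check at this point, you can drop a throwaway filesystem onto the multipathed device and mount it (purely a test; unmount it before moving on, and /mnt/data is just an example mount point):

# sudo mkfs.ext4 /dev/mapper/data
# sudo mkdir -p /mnt/data
# sudo mount /dev/mapper/data /mnt/data
# df -h /mnt/data
# sudo umount /mnt/data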

Next up will be creating an LVM volume and formatting it with OCFS2 for shared storage in a cluster.
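As a rough preview of that next post (the exact layout there may differ, and vg_nimble and lv_shared are just placeholder names), the LVM side would start with something like:

# sudo pvcreate /dev/mapper/data
# sudo vgcreate vg_nimble /dev/mapper/data
# sudo lvcreate -l 100%FREE -n lv_shared vg_nimble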

November 17

Building a redundant iSCSI and NFS cluster with Debian

In this part of the series, we’ll configure an iSCSI client (“initiator”), connect it to the storage servers and set up multipathing. Note: Debian Lenny has been released since this series of articles started, so that’s the version we’ll use for the client.

If you refer back to part one to refresh your memory of the network layout, you can see that the storage client (“badger” in that diagram) should have three network interfaces:

  • eth0: 172.16.7.x for the management interface; this is what you’ll use to SSH into it.

It also needs two storage interfaces. As the storage servers (“targets”) are using 192.168.x.1 and .2, I’ve given this client the following addresses:

  • eth1: 192.168.1.10
  • eth2: 192.168.2.10

Starting at .10 on each range keeps things clear – I’ve found it can help to have a policy of servers being in a range of, say, 1 to 10, and clients being above this. Before we continue, make sure that these interfaces are configured and that you can ping the storage server over both of them, e.g. try pinging 192.168.1.1 and 192.168.2.1.
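If the storage interfaces aren’t configured yet, the relevant stanzas in /etc/network/interfaces would look something like this (a minimal sketch; adjust the device names and netmasks to your network):

auto eth1
iface eth1 inet static
        address 192.168.1.10
        netmask 255.255.255.0

auto eth2
iface eth2 inet static
        address 192.168.2.10
        netmask 255.255.255.0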

Assuming the underlying networking is configured and working, the first thing we need to do is install open-iscsi (which is the “initiator” – the iSCSI client). This is done with a simple:

# aptitude install open-iscsi

You should see the package get installed, and the service started:

Setting up open-iscsi (2.0.870~rc3-0.4) ...
Starting iSCSI initiator service: iscsid.
Setting up iSCSI targets:
iscsiadm: No records found!

At this point, we have all we need to start setting up some connections.

There are two ways we can “discover” targets on a server (well, three actually, if you include iSNS, but that’s beyond the scope of this article).

  • We can use “send targets” – this logs into an iSCSI target server, and asks it to send the initiator a list of all the available targets.
  • We can use manual discovery, where we tell the initiator explicitly what targets to connect to.

For this exercise, I’ll first show how “send targets” works, then we’ll delete the records so we can add them back manually later. Sendtargets can be useful if you’re not sure what targets your storage server offers, but you can end up with a lot of stale or unused records if you don’t trim down the ones you’re not using.

So, to get things rolling, we’ll query the targets available on one of the interfaces we’re going to use (192.168.1.1) – we’ll set up multipathing later. Run the following as root:

iscsiadm -m discovery -t st -p 192.168.1.1

And you should see the following output returned:

192.168.1.1:3260,1 iqn.2009-02.com.example:test

This shows that your initiator has successfully queried the storage server, which has returned a list of targets – if you haven’t changed anything since the last article, this should just be the one “iqn.2009-02.com.example:test” target. You can always see which nodes are available to your initiator at any time by simply running:

iscsiadm -m node

A few things have happened behind the scenes that are worth checking out at this point. After discovering an available target, the initiator will have created a node record for it under /etc/iscsi/nodes. If you take a look in that directory, you’ll see the following file:

/etc/iscsi/nodes/iqn.2009-02.com.example:test/192.168.1.1,3260,1/default

Which is a file that contains specific configuration details for that iSCSI node. Some of these settings are influenced by the contents of /etc/iscsi/iscsid.conf, which governs the overall behaviour of the iSCSI initiator (e.g. settings in iscsid.conf apply to all nodes). We’ll investigate a few of these settings later.
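You can also inspect or tweak an individual node record with iscsiadm rather than editing these files by hand; for example, to print the record and switch just this node to automatic startup:

iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test
iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test -o update -n node.startup -v automatic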

For now, though, all your initiator has done is discover a set of available targets; we can’t actually make use of them without “logging in”. So, now run the following as root:

iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test -l

The arguments to this command are largely self-explanatory – we’re performing an operation on a node (“-m node”), are using the portal we queried earlier (“-p 192.168.1.1”), are running the operation on a specific target (“-T iqn.2009-02.com.example:test”) and are logging in to it (“-l”).

You can use the longer form of these arguments if you want – for instance, you could use “--login” instead of “-l” if you feel it makes things clearer (see the man page for iscsiadm for more details). Anyway, you should see the following output after running that command:

Logging in to [iface: default, target: iqn.2009-02.com.example:test, portal: 192.168.1.1,3260]
Login to [iface: default, target: iqn.2009-02.com.example:test, portal: 192.168.1.1,3260]: successful

If you now check the output from “dmesg”, you’ll see output similar to the following in your logs:

[3688756.079470] scsi0 : iSCSI Initiator over TCP/IP
[3688756.463218] scsi 0:0:0:0: Direct-Access     IET      VIRTUAL-DISK     0    PQ: 0 ANSI: 4
[3688756.580379]  sda: unknown partition table
[3688756.581606] sd 0:0:0:0: [sda] Attached SCSI disk

The last line is important – it tells us the device node that the iSCSI disk has been created under. You can also query this information by running:

iscsiadm -m session -P3

Which will display a lot of information about your iSCSI session, including the device it has created for you.
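For example, to pick out just the disk that was attached for this session (the exact wording may vary slightly between open-iscsi versions):

iscsiadm -m session -P3 | grep "Attached scsi disk"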

If you go back to your storage server now, you can see your client has connected and logged in to the target:

# cat /proc/net/iet/session
tid:1 name:iqn.2009-02.com.example:test
        sid:562949974196736 initiator:iqn.1993-08.org.debian:01:16ace3ba949f
                cid:0 ip:192.168.1.10 state:active hd:none dd:none

You now have a device on your iSCSI client that you can partition and format, just as if it were a locally attached disk. Give it a try: fire up fdisk on it, create some partitions, format and mount them. You should find it behaves just the same as a local disk, although the speed will be limited by the capacity of your link to the storage server.
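For example, assuming the disk came up as /dev/sda as in the dmesg output above, a quick test might look like this:

fdisk /dev/sda          # interactively create a single primary partition, sda1
mke2fs -j /dev/sda1     # put an ext3 filesystem on it
mount /dev/sda1 /mnt
df -h /mnt
umount /mnt             # unmount again before carrying on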

Once you’ve finished, make sure any filesystem you have created on the volume is unmounted; we’ll then log out of the node and delete its record:

# iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test --logout
Logging out of session [sid: 1, target: iqn.2009-02.com.example:test, portal: 192.168.1.1,3260]
Logout of [sid: 1, target: iqn.2009-02.com.example:test, portal: 192.168.1.1,3260]: successful
# iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test -o delete

You should now find that the record for it has been removed from /etc/iscsi/nodes.

Multipathing

We’ll now manually log into the target on both paths to our storage server, and combine the two devices into one multipathed, fault-tolerant device that can handle the failure of one path.

Before we start, you’ll want to change a few of the default settings in /etc/iscsi/iscsid.conf – if you want any nodes you’ve added to be logged back into automatically when the client reboots, you’ll want to change

node.startup = manual

to

node.startup = automatic

The default timeouts are also far too high when we’re using multipathing – you’ll want to set the following values:

node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 10
node.session.timeo.replacement_timeout = 15
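
Make sure you restart open-iscsi so these changes get picked up; on Lenny that’s simply:

/etc/init.d/open-iscsi restart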

We can then manually log into both paths to the storage server:

iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test -o new
iscsiadm -m node -p 192.168.1.1 -T iqn.2009-02.com.example:test -l
iscsiadm -m node -p 192.168.2.1 -T iqn.2009-02.com.example:test -o new
iscsiadm -m node -p 192.168.2.1 -T iqn.2009-02.com.example:test -l

Note the use of “-o new” to manually specify and add the node, instead of using sendtargets discovery. After this, you should find that you have two devices created – in my case, these were /dev/sda and /dev/sdb. We now need to combine these using multipathing.

First, install “multipath-tools”:

aptitude install multipath-tools

And then create a default configuration file under /etc/multipath.conf with the following contents:

defaults {
        udev_dir                /dev
        polling_interval        10
        selector                "round-robin 0"
        path_grouping_policy    multibus
        getuid_callout          "/lib/udev/scsi_id -g -u -s /block/%n"
        prio_callout            /bin/true
        path_checker            readsector0
        rr_min_io               100
        rr_weight               priorities
        failback                immediate
        no_path_retry           fail
        user_friendly_names     no
}
blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z][[0-9]*]"
}

The first section sets some defaults for the multipath daemon, including how it should identify devices. The blacklist section lists devices that should not be multipathed so the daemon can ignore them – you can see it’s using regular expressions to exclude a number of entries under /dev, including anything starting with “hd”. This will exclude internal IDE devices, for instance. You may need to tune this to your needs, but it should work OK for this example.

Restart the daemon with

/etc/init.d/multipath-tools restart

And check what it can see with the command “multipath -ll”:

# multipath -ll
149455400000000000000000001000000c332000011000000 dm-0 IET     ,VIRTUAL-DISK
[size=1.0G][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
 \_ 1:0:0:0 sda  8:0    [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:0 sdb  8:16   [active][ready]

That long number on the first line of output is the WWID of the multipathed device, which is similar to a MAC address in networking. It’s a unique identifier for this device, and you can see the components below it. You’ll also have a new device created under /dev/mapper:

/dev/mapper/149455400000000000000000001000000c332000011000000

Which is the multipathed device. You can access this the same as you would the individual devices, but I always find that long WWID a little too cumbersome. Fortunately, you can assign short names to multipathed devices. Just edit /etc/multipath.conf, and add the following section (replacing the WWID with your value):

multipaths {
        multipath {
                wwid 149455400000000000000000001000000c332000011000000
                alias mpio
        }
}

And restart multipath-tools. When you next run “multipath -ll”, you should see the following:

mpio (149455400000000000000000001000000c332000011000000) dm-0 IET     ,VIRTUAL-DISK

And you can now access your volume through /dev/mapper/mpio.

Failing a path

To see what happens when a path fails, try creating a filesystem on your multipathed device (you may wish to partition it first, or you can use the whole device) and then mounting it, e.g.:

mke2fs -j /dev/mapper/mpio
mount /dev/mapper/mpio /mnt

While the volume is mounted, try unplugging one of the storage switches – in this case, I tried pulling the power supply from the switch on the 192.168.2.x network. I then ran “multipath -ll”, which paused for a short time (due to the timeout values set above), and then I saw the following:

sdb: checker msg is "directio checker reports path is down"
mpio (149455400000000000000000001000000c332000011000000) dm-0 IET     ,VIRTUAL-DISK
[size=1.0G][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:0:0 sda  8:0    [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:0 sdb  8:16   [active][faulty]

So, one path to our storage is unavailable – you can see it marked above as faulty. However, as the 192.168.1.x network path is still available, IO can continue to the remaining “sda” component of the device. The volume was still mounted, and I could carry on copying data to and from it. I then plugged the switch back in, and after a short pause, multipathd showed both paths as active again:

# multipath -ll
mpio (149455400000000000000000001000000c332000011000000) dm-0 IET     ,VIRTUAL-DISK
[size=1.0G][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:0:0 sda  8:0    [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 4:0:0:0 sdb  8:16   [active][ready]
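
If you want to watch the path states flip in real time while pulling and restoring cables, something as simple as this works:

watch -n 5 'multipath -ll'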

You now have a resilient, fault-tolerant iSCSI SAN!

That’s it for this part – in the next part, I’ll add an NFS server to the mix, tie off a few loose ends, and discuss some performance tuning issues, as well as post some scripts I’ve written to automate some of this.
