Tuesday, 22 December 2015

ZFS, like a work of art

A subtle thing with ZFS is that the drive LEDs flash quite differently from those on typical storage arrays; once you understand more of what happens under the hood, you'll know why. Just by looking around a DC you could spot which servers are running it. You can see this type of effect here - https://www.youtube.com/watch?v=LS3cfl-7n-4

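Of course, that's ZFS on Linux. If it is the zfs-fuse port, it runs in user space through FUSE and so is less efficient than a filesystem in kernel space, as elaborated across various posts (the separate ZFS on Linux project runs as a native kernel module). Some examples: https://lkml.org/lkml/2007/4/16/133 , https://lkml.org/lkml/2007/4/16/83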

Here is an example pool using raidz2 with hot spares, which will automatically replace a failed drive (or two). Creating it with brace expansion like this is always easier - c4t{0..1}d0. You also have to get the order of the arguments right, with the raidz2 keyword before the devices it groups, or you may end up second-guessing yourself...

# zpool create data c0t50004CF210AD1C22d0 c0t50004CF210BE51F1d0 c0t50004CF210BE51F3d0 c0t50004CF210BE5214d0 c4t{0..1}d0 raidz2
Unable to build pool from specified devices: invalid vdev specification: raidz2 requires at least 3 devices

# zpool create -O atime=off -O compression=lz4 data raidz2 c0t50004CF210AD1C22d0 c0t50004CF210BE51F1d0 c0t50004CF210BE51F3d0 c0t50004CF210BE5214d0 c4t{0..1}d0
# zpool add data spare c4t3d0 c5t3d0
# zpool status
  pool: data
 state: ONLINE
  scan: none requested

        NAME                       STATE     READ WRITE CKSUM
        data                       ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t50004CF210AD1C22d0  ONLINE       0     0     0
            c0t50004CF210BE51F1d0  ONLINE       0     0     0
            c0t50004CF210BE51F3d0  ONLINE       0     0     0
            c0t50004CF210BE5214d0  ONLINE       0     0     0
            c4t0d0                 ONLINE       0     0     0
            c4t1d0                 ONLINE       0     0     0
          c4t3d0                   AVAIL  
          c5t3d0                   AVAIL
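
One way to avoid second-guessing the device list is to build it and print the full command for review before running anything. The sketch below is only an illustration: disk_range is a hypothetical helper (not part of ZFS), written in plain POSIX sh so it behaves the same in shells without {0..1} brace expansion, and the echoed command leaves out the other four disks for brevity.

```shell
# Build "c4t0d0 c4t1d0 ..." for one controller.
# Hypothetical helper: disk_range CONTROLLER FIRST_TARGET LAST_TARGET
disk_range() {
    ctrl=$1
    t=$2
    last=$3
    out=""
    while [ "$t" -le "$last" ]; do
        # Add a separating space only once the list is non-empty.
        out="$out${out:+ }c${ctrl}t${t}d0"
        t=$((t + 1))
    done
    printf '%s\n' "$out"
}

# Print the command for review instead of running it.
# Note the vdev keyword (raidz2) comes before the devices it groups.
echo "zpool create data raidz2 $(disk_range 4 0 1)"
```

zpool create also accepts -n, which displays the configuration that would be used without actually creating the pool.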

Then, as always, test the assumption. I've got hot-swap capability, so I pulled a drive out to simulate a failure and then tried writing some data; the spare kicked in and it looks to have worked as expected.

# zpool status -xv
  pool: data
 state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or 'fmadm repaired', or replace the device
    with 'zpool replace'.
  scan: resilvered 136K in 1s with 0 errors on Wed Dec 23 05:53:44 2015


    NAME                         STATE     READ WRITE CKSUM
    data                         DEGRADED     0     0     0
      raidz2-0                   DEGRADED     0     0     0
        c0t50004CF210AD1C22d0    ONLINE       0     0     0
        c0t50004CF210BE51F1d0    ONLINE       0     0     0
        spare-2                  DEGRADED     0     0     0
          c0t50004CF210BE51F3d0  UNAVAIL      0    24     0
          c4t3d0                 ONLINE       0     0     0
        c0t50004CF210BE5214d0    ONLINE       0     0     0
        c4t0d0                   ONLINE       0     0     0
        c4t1d0                   ONLINE       0     0     0
      c4t3d0                     INUSE  
      c5t3d0                     AVAIL  

device details:

    c0t50004CF210BE51F3d0      UNAVAIL       too many errors
    status: FMA has faulted this device.
    action: Run 'fmadm faulty' for more information. Clear the errors
        using 'fmadm repaired'.
       see: http://support.oracle.com/msg/ZFS-8000-FD for recovery

Saturday, 12 December 2015

Night Shifts....

For several years now I've had to do night shifts. It isn't something I ever wanted to do, as I understand it is simply not good for your health (neither is sitting in a chair for almost 12 hours a day or night). But in this industry it is common, and in my case it is part of the job, so overall it seems to be the best choice. You can't always get everything, so it's a matter of picking and choosing what's important.

If you have to do night shifts, here is how you can be better prepared and avoid some of the issues I've faced.

When I first started I had trouble even remembering things, to the point that I'd forget stuff within minutes. I had problems staying awake even with adequate sleep, to the point that I had my head on the desk, struggling against an overpowering desire to sleep. Those things can lead, and have led, to silly, unnecessary mistakes.

I was told early on "drink lots of coffee :)" and of course to sleep enough in the day. I tried this several times (some tea and/or coffee), then one shift I was so tired I decided it might be a good idea to eat raw instant coffee. That just tasted bad and had limited effect.

One colleague told me over a beer that scientific studies comparing tea, coffee and sugar intake in tests of staying awake for long periods showed sugar to be the most effective. Caffeine has a half-life of around 6 hours (subject to type, volume etc.), but it only suppresses sleepiness at first: it binds to the receptors in the brain that adenosine would ordinarily bind to, and it is adenosine binding there that reduces neuron activity and makes you feel drowsy. After the caffeine wears off the adenosine is still present, and relying on caffeine to keep the neurons firing may prove less effective over time.

Long story short, having sugar seems the most effective way to stay awake, as glucose is the main fuel source for your brain and body (glycogen is how it is stored). If you keep putting in fuel the engine still runs. I found it hard to believe at first, as it meant I had been wrong for a few years, and the coffee idea is commonly shown even on TV: we need to stay awake, so drink coffee... one episode of Stargate SG-1 springs to mind. When I've tried sugar instead I've stayed awake more easily and for much longer, even with less sleep beforehand (I was still a wreck though).

On the sleep side, the vast majority of people have sleep cycles of about 90 mins, though they can range from 70-120 mins, which is less common. Waking at the end of a cycle matters for the entire day. If you are woken mid-cycle you're just zombified, and sometimes it doesn't go away no matter what you try. So in most cases you want 4-6 of these cycles, waking right at the lightest stage, to feel much more awake in a way that lasts the entire day.
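
As a quick sanity check on how much sleep those cycles add up to, here is a small sketch. The 90-minute default is the typical figure from above; sleep_length is just a hypothetical helper name, and you would pass your own cycle length if yours differs.

```shell
# Total sleep for a number of full cycles.
# Usage: sleep_length CYCLES [CYCLE_MINUTES]   (cycle length defaults to 90)
sleep_length() {
    total=$(( $1 * ${2:-90} ))
    printf '%dh%02dm\n' $((total / 60)) $((total % 60))
}

# The 4-6 cycle range suggested above.
for cycles in 4 5 6; do
    echo "$cycles cycles: $(sleep_length "$cycles")"
done
```

So counting back from the alarm, 5 cycles means being asleep 7h30m beforehand.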

A useful trick here, day or night, if you are able: drink a tea or coffee, then have a nap of between 5 and 20 mins (so 5, 10, 15 or 20 as a target). While you digest the caffeine, the nap reduces the amount of adenosine in your brain, and the caffeine kicks in just afterwards. A good time to do this is in the middle of your waking hours (so midday or midnightish). That may look contradictory to the earlier point about sugar versus tea/coffee, but the point is to test things personally and find what works: rely on less caffeine and primarily increase sugar intake if you need to. The more coffee/tea you drink on a daily basis the weaker its effect, which is why you want to save it for when it is required.

Get an alarm such as the Philips Wake-up Light. I first saw this on The Gadget Show; it wakes you with simulated natural light that gradually increases in brightness, which is much preferable to loud, repetitive beeping. Always force yourself to get up on the alarm and not snooze it! Snoozing conditions your brain not to react to the alarm, and you drop back into sleep, making it harder to wake afterwards. I always have the wake-up light as the primary alarm, and then a backup alarm in case it does not go off for any reason. Best alarm I've ever used. If you time this right you should just wake up properly; if it is hard to get out of bed, that implies the sleep cycles you planned haven't aligned right.

Another useful point that was brought to my attention: buy a blackout blind. During the day it completely stops sunlight getting through, and they are relatively cheap and easy to install. For me it helped a bit; other people who invested in one said it made a great difference. Seeing sunlight naturally wakes us up through chemical reactions in the brain, and likewise when it gets dark we react by feeling more tired/sleepy. Even if you're in bed with your eyes shut, light shining in will still have an effect, as you can detect it to some degree, which is why you want to limit it or cut it off outright.

Key points summary.

  • Have some caffeine if needed, but primarily you want more sugar.
  • Buy and use a blackout blind to cut off sunlight when sleeping in the day.
  • Get a wake-up alarm that simulates natural sunlight, and get up on the first alarm!
  • Drink a tea/coffee then nap for 5-20 mins as a temporary boost.
  • Target 4-6 sleep cycles of 90 mins and wake in the lightest sleep state.
  • Experiment to find what works and make minor adjustments, gradually.

Friday, 6 November 2015

knowing the unknowns

Known knowns, known unknowns & unknown unknowns.

When I first heard this in a video I was watching from Brendan on DTrace, I asked myself: what does that even mean?

I found other quotes that expand on this thought i.e.

"There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know."
- Donald Rumsfeld

which led me down more thoughts and questions...
such as how can I be more aware of things I don't even have awareness about?
Do I know someone with much more experience & understanding than me?

so why post this?

It is important to be more aware of this (it seems kind of obvious in some regards), but you need others to point things out to you, or to ask questions that make you think a little differently and question what you are doing and why. This immediately gives you another angle on things, allowing you to learn in another way.

For example, last year I learnt something I didn't realize was possible. On Windows, Linux, the BSDs, OS X etc., when you reboot you must wait for the BIOS and POST before the system loads. This can take some time, so is it possible to bypass these and boot into the OS much faster? Can this also be done after something like a host kernel upgrade?

On Solaris this is what the default reboot command does (it is the same as reboot -f). As per the man page - "Fast reboot, bypassing firmware and boot loader. The new kernel will be loaded into memory by the running kernel, and control will be transferred to the newly loaded kernel. If disk or kernel arguments are specified, they must be specified before other boot arguments"

This allows a reboot within seconds. In my case I have a desktop with an nvidia graphics card whose device driver does not implement quiesce, which would normally block fast reboot. Nevertheless it can be forced anyway, provided the nvidia graphics are the only issue. I did so using the following:

# set it on the running kernel
echo "force_fastreboot/W 1" | mdb -kw
# make it persist across reboots
echo "set force_fastreboot = 1" >> /etc/system

then done.

I have since found out it is possible to do a similar thing on Linux using kexec; however, it does not look stable, so I'm uneasy about using it, but I could test it out.
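
For reference, a kexec test might look roughly like the sketch below. It is untested here and rests on assumptions: kexec-tools is installed, the host uses systemd, and the kernel and initrd live at Debian/Ubuntu-style paths under /boot. kexec_plan is a hypothetical helper that only prints the commands rather than running them.

```shell
# Print (not run) the commands for a kexec-based fast reboot.
# Usage: kexec_plan [KERNEL_VERSION]   (defaults to the running kernel)
# The /boot paths are an assumption; adjust them for your distro.
kexec_plan() {
    kver=${1:-$(uname -r)}
    # Stage the kernel and initrd, reusing the current kernel command line.
    echo "kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd.img-$kver --reuse-cmdline"
    # Shut services down cleanly, then jump into the staged kernel.
    echo "systemctl kexec"
}

kexec_plan
```

kexec -e would jump straight into the loaded kernel without shutting services down, so systemctl kexec is the gentler option where systemd is available.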

Tuesday, 3 November 2015

ZFS born in Zion

Interesting vids from the recent OpenZFS Summit 2015. Recommend you watch these - https://www.youtube.com/watch?v=dcV2PaMTAJ4&index=6&list=PLaUVvul17xSedlXipesHxfzDm74lXj0ab

As Jeff Bonwick explains, from around the time of its conception ZFS has had links to The Matrix. That's why the Oracle documentation has things in there about Neo, Trinity, tank and Morpheus. Amazing film with memorable quotes:

Morpheus: "You're faster than this. Don't think you are, know you are."
Morpheus: "I'm trying to free your mind, Neo. But I can only show you the door. You're the one that has to walk through it"

Let's not forget he was also Cowboy Curtis - https://www.youtube.com/watch?v=3jsCxNK4vAc 

Laurence and Samuel aren't the same person....

Sunday, 1 November 2015

Hardware or Software RAID?

About 4-5 years ago, when I first made a start on learning and using Linux, one of my questions was about RAID. Given there is more than one way to skin a cat, so to speak, which way to skin it?
I was told by a manager (and he said this with 100% certainty) "hardware RAID IS the best RAID". I have yet to see this proven.

Loose Background

Years ago hardware RAID used to be the better option. CPUs were considerably slower, so software RAID, which is constantly running, would consume a fair amount of CPU resources (thus additional overhead). Combined with the lack of well-designed software RAID (or, for example, firmware RAID on older motherboards), this meant you would be better off paying for a dedicated card. Such a card also has things like a BBU and cache, so it can reorganise write operations before flushing them to disk, while keeping pending writes ready to be flushed even if power is temporarily lost, to maintain a consistent state.

Questions arose and can be asked, such as:
What if the hardware RAID card fails?
If software RAID is improved can we spend less money on HW?
Can rebuilds be done faster through software than hardware RAID?
Perhaps we should integrate the LVM and VFS layers together?
Should software RAID be done in user space or kernel space?
Is it possible to have software reorganize I/Os like hardware?
What happens to the state of the array if the cache is gone after 72 hours?

Linux mdadm is now quite a lot better, and you can also use Btrfs or ZFS. I've played around with removing drives, rebuilding etc. using mdadm. I no longer bother, as I just use ZFS for all my storage needs.

In short, software RAID is now at a stage where it can be faster than hardware RAID. With ZFS it provides end-to-end checksumming (so silent data corruption is detected, and repaired where redundancy exists), organizes writes to convert random writes into sequential writes (whilst providing dynamic block allocation) and can be very efficient in terms of its resource usage.
A test that compares software and hardware RAID, by Robert - http://milek.blogspot.co.uk/2006/08/hw-raid-vs-zfs-software-raid-part-ii.html
and as also referenced in "UNIX and Linux System Administration Handbook, Fourth Edition"
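
To illustrate the idea behind the checksumming mentioned above, here is a toy sketch: record a checksum when the data is written, then compare it on every read. This is not how ZFS implements it (ZFS keeps checksums in its block pointers and uses algorithms such as fletcher4 or SHA-256); the POSIX cksum CRC and the verify helper are stand-ins chosen for the example.

```shell
# Compare a file's current checksum with the one recorded at write time.
# Hypothetical helper: verify FILE STORED_CHECKSUM
verify() {
    if [ "$(cksum < "$1")" = "$2" ]; then
        echo ok
    else
        echo corrupt
    fi
}

block=$(mktemp)
printf 'important data' > "$block"
stored=$(cksum < "$block")          # checksum recorded at write time

verify "$block" "$stored"           # data unchanged

printf 'Important data' > "$block"  # simulate silent corruption (one byte differs)
verify "$block" "$stored"           # mismatch is caught on read

rm -f "$block"
```

A RAID layer that only sees blocks cannot make this comparison, which is why a flipped bit can otherwise pass through unnoticed.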

Saturday, 31 October 2015

Microsoft is Evil!

This link is funny


and on it within the links is my favorite message

from - http://toastytech.com/evil/errwindows.html

you never know, maybe messages like that could exist!

The best saying about Storage

When I read this quote I quite liked it.

"There are two things about hard drives, either they are going to fail, or they have failed."

Thinking of it in that way means you won't (or shouldn't) rely on some known % failure rate statistic, or on thinking "my RAID has such a low chance of failing that I will be fine", as you know at some point they will fail. Enterprise quality or not.

It is all well and good if you have a RAID array that can survive several drives failing at the same time, with spares ready to rebuild, but have you asked what happens if another one fails before the rebuild completes? What if they all fail? Ask this because, in my and others' experience, when something goes wrong it just so happens to be when you need it most (I think this is known as Murphy's Law). I've heard stories from people telling me the chances are so low... followed by "but it just so happened on this one occasion and...". Also, recently I had several drives fail within one month of one another after about 5-6 years of use (more on that in another post).

Friday, 30 October 2015

NVMe (focus on M.2) the latest paradigm shift

I heard about this a few months back from my adviser, and only just yesterday Samsung released the 950 PRO NVMe M.2 SSD, in 256GB and 512GB versions. This emerging tech has dramatic effects for the industry. Others don't appear to have realized, or even to be aware of, the implications of NVMe (based on the lack of comments in the posts I follow and from people I've spoken with), but then again I haven't checked everywhere.

This is why I've got myself a motherboard with 2 such M.2 slots to utilize this (ASRock X99 Extreme11), probably for use as L2ARC... I'll just hold off a bit longer as prices will most certainly drop. (The 512GB version is about £300)

What will it cause?

The next generation of laptops, smart phones and other devices will integrate this (in fact the iPhone 6S already has it). This allows next-gen hardware to probably be 10x faster than existing tech, based on the fact that what machines wait on for most operations is storage I/O. Because this form factor is so small, it will replace more and more existing SSDs, such as the 2.5" SATA-based ones, as it grows more commonplace (why would you not want something much faster and more power efficient?). Because it is very efficient from a wattage point of view, running costs at larger scales will also be lower. The space required is much less too, as additional layers are stacked in the silicon as opposed to the older planar/flat methods. Just compare the size of your typical 3.5" or 2.5" storage device to something the size of a large stick of chewing gum, which at some point will hold TBs.

What is the future?

I am aware that more production facilities are in the making to produce this on a larger scale with additional layers. Next year Samsung will almost certainly release a 1TB model with faster speeds, not to mention that other vendors will be in direct competition. For starters, laptops not using this will be phased out. My question is: what is the maximum number of layers that can be added?