Thursday, October 21, 2010

Regular Expressions

Slides from my talk at last night's St. Louis Perl Mongers meeting:

Tuesday, September 28, 2010

String::Random

I mentioned my one module on CPAN (String::Random) to a friend yesterday, and got the response "You wrote that?"  Honestly, I was shocked that he'd heard of it.  (There are so many modules on CPAN that I doubt most Perl programmers have heard of 99% of them.)

I decided to Google for the module a bit to see if there were many mentions (fully expecting to find some "I looked at the code, and my eyes are still bleeding" comments), and I was pleasantly surprised to find this rather old tutorial on using the module: http://www.perlmonks.org/?node_id=88021

I also found a Ruby port on GitHub: http://github.com/repeatedly/ruby-string-random

So far, I haven't found anyone ripping into it, but I'm sure those comments are out there...

Thursday, August 19, 2010

Ohio Linux Fest

It's time again for the Ohio Linux Fest (September 10-12, 2010 in Columbus, Ohio).  I'm going.  So should you.  :-)

I'm teaching an abbreviated version of my Data Recovery class (for an abbreviated price, I might add) as part of the OLFU program on Friday, September 10.  If you are responsible for any hard drives, I highly recommend the class.

Speaking of classes, we also just announced a tag-team vi and vim class lineup that Bill Odom and I will be doing late next month.  Anyone in the St. Louis area who uses vi and thinks it is a burden (or avoids it like the plague) should take my vi basics class.  Everyone should take Bill's vim class.  (It's amazing stuff.)

Thursday, July 29, 2010

Generating ssh keys in PuTTY

I had to fire up Windows today to explain to someone how to generate ssh keys (for use with PuTTY).  I figured since I went to all that trouble, I should share...

Note: The Linux version of puttygen is all command-line, so these instructions will only work with the Windows version.


When you first run puttygen, the default key type (along the bottom) should be "SSH-2 RSA". If not, select that. The key size (box at the very bottom right) should default to 1024 bits, which is fine.


Now, hit "Generate". It will ask you to move the mouse around a bit to generate some randomness. When that is done, it will generate the key. Put your email address in the "Key comment" field. Then select the key in the box at the top (under "Public key for pasting into OpenSSH authorized_keys file:"), copy it, and paste it into the .ssh/authorized_keys file on the system you want to be able to log in to with that key.


Fill in the box next to "Key passphrase" with something long that you'll remember. (It's fine to use a full sentence or something. Remember, it's a passphrase, not a password.) Enter the passphrase again in the "Confirm passphrase" box.


Next, hit the "Save public key" button and save that half to a file with "public" in the file name. Then hit the "Save private key" button and save that half to a file with "private" in the file name.


In PuTTY, in the configuration dialog, expand "Connection" in the left pane (if it isn't already), then expand "SSH". Click on "Auth". Next to the box that says "Private key file for authentication", hit "Browse" and select the "private" file you just saved. Be sure to save your settings so you don't have to feed this in every time. (Click on "Session" at the top of the left pane, then under "Saved Sessions" click on "Default Settings" and hit "Save".)


Now when you try to log in to the system you previously dropped your public key on, you should be prompted for the passphrase for your key rather than the password for your account on the system.

If I may add a little editorializing here, I do have to point out that this is all much easier with ssh-keygen on Linux.  And, IMHO, if you're doing Linux administration from a Windows PC, you're doing it wrong.  But that's just me.  :-)
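
For comparison, the ssh-keygen route mentioned above can be sketched roughly as follows. This is only a sketch under some assumptions: the temporary directory stands in for ~/.ssh, and the email address and passphrase are placeholders. In practice you would leave off -N and let ssh-keygen prompt for the passphrase.

```shell
# Sketch only: the key directory, comment, and passphrase are placeholders.
keydir=$(mktemp -d)                 # stand-in for ~/.ssh
keyfile="$keydir/id_rsa"

# Generate an RSA key pair: private key in $keyfile, public key in $keyfile.pub.
# -C sets the key comment, -N sets the passphrase (omit -N to be prompted).
ssh-keygen -q -t rsa -C "you@example.com" \
    -N "a long passphrase you will remember" -f "$keyfile"

# This is the line to append to .ssh/authorized_keys on the remote system.
cat "$keyfile.pub"
```

On systems that ship ssh-copy-id, running it as "ssh-copy-id user@host" should append the public key to the remote authorized_keys file for you.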

Friday, February 26, 2010

Stupid git tricks

I had a directory in CVS that I used as a catch-all for random scripts. (For example, this is where cpanspec lived before I moved it to SourceForge CVS.) Now that I'm using GitHub, I'm trying to split these scripts up into separate git repos. This is the procedure I've come up with...

First I use git cvsimport to pull in the whole CVS tree:
git cvsimport -d :ext:user@host:/cvsroot -C myscript cvs_module
This will create a directory named myscript. Next, go into that directory and use git filter-branch to remove everything but the file(s) we care about (in this case, myscript again).
git filter-branch --prune-empty --tree-filter 'find . -maxdepth 1 -type f \! -name myscript -delete' HEAD
This ends up leaving some stale objects. To clean them up, remove everything other than master in .git/refs/heads/, the entire directory .git/refs/original/, and any unrelated tags in .git/refs/tags/ (at least in my example with no branches and such). Then finish with a few git commands:
git gc --aggressive
git prune
git repack -a -d

The total number of objects listed by git gc and git repack should be much smaller than the original number git cvsimport reported. (I also confirmed that git fsck --unreachable doesn't find anything.)
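
The manual cleanup of .git/refs/ described above can also be approximated with git's own plumbing commands. This is a rough sketch rather than a drop-in replacement for that procedure: cleanup_filtered_repo is a hypothetical helper name, it only handles the refs/original/ backups that git filter-branch leaves behind, and it assumes you have already removed any extra branches and unrelated tags yourself.

```shell
# Hypothetical helper: run it from inside the repository created by git cvsimport.
cleanup_filtered_repo() {
    # Delete the backup refs that git filter-branch stores under refs/original/.
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done

    # Expire reflog entries so the old commits are no longer reachable,
    # then drop the unreachable objects right away.
    git reflog expire --expire=now --all
    git gc --quiet --prune=now
}
```

Afterwards, git fsck --unreachable is still a reasonable sanity check.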

[Update] Apparently I had found this answer to my problem a while back and forgot about it. Oops.

Monday, February 22, 2010

Where do we go next with RAID?

So a friend of mine sent me a link to this blog post.  A couple of things jumped out at me...
"When a drive fails in a 7 drive, 2 TB SATA disk RAID 5, you’ll have 6 remaining 2 TB drives. As the RAID controller is reconstructing the data it is very likely it will see an URE. At that point the RAID reconstruction stops."
And later...
"RAID proponents assumed that disk failures are independent events, but long experience has shown this is not the case: 1 drive failure means another is much more likely."
That sounds an awful lot like what I've been saying for 8 or 9 years now...  (Well, not specifically about 2TB drives, but you know what I mean...  :-)

So the myth that I've been hearing for the last 15 years or so is that you get speed and data security with RAID 5.  The fact is that the speed of an intact array is terrible, and using the word "speed" in regard to a degraded array would be an oxymoron.  Add that to the odds of a failure of one of your "good" drives during a rebuild, and you get one big pile of fail.
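
To put a rough number on the rebuild risk in the quoted example: rebuilding means reading all 6 surviving 2 TB drives end to end. The sketch below assumes an unrecoverable read error (URE) rate of 1 per 10^14 bits read, a figure often quoted for consumer SATA drives but not taken from the text above. It also treats errors as independent, which is a simplification. ure_probability is a hypothetical helper name.

```shell
# Hypothetical helper: probability of at least one URE while reading
# $1 drives of $2 bytes each, given $3 errors per bit read.
# Uses the approximation P = 1 - exp(-bits_read * rate).
ure_probability() {
    awk -v drives="$1" -v bytes="$2" -v rate="$3" 'BEGIN {
        bits = drives * bytes * 8
        printf "%.2f\n", 1 - exp(-bits * rate)
    }'
}

# 6 surviving drives, 2 TB (2e12 bytes) each, assumed rate of 1e-14 per bit.
ure_probability 6 2e12 1e-14
```

With those assumptions it prints 0.62, so the rebuild is more likely than not to hit a URE somewhere along the way.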

The advantage of RAID 5 is capacity.  Period.  Any other RAID solution costs more in terms of raw storage capacity.  RAID 6 gives you one fewer drive's worth of capacity in exchange for improving your odds of a successful rebuild, but as you all know, I still don't trust it for anything that we don't have a mirror of somewhere.
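
The capacity trade-off can be made concrete with a little arithmetic. This is a minimal sketch, assuming n identical drives and ignoring controller overhead and hot spares. raid_usable_drives is a hypothetical helper name, and the 8-drive array is just an example.

```shell
# Hypothetical helper: how many drives' worth of usable capacity you get
# for RAID level $1 built from $2 identical drives.
raid_usable_drives() {
    case "$1" in
        5)   echo $(($2 - 1)) ;;   # one drive's worth of parity
        6)   echo $(($2 - 2)) ;;   # two drives' worth of parity
        1+0) echo $(($2 / 2)) ;;   # every drive is mirrored
        *)   echo "unsupported level: $1" >&2; return 1 ;;
    esac
}

for level in 5 6 1+0; do
    echo "RAID $level with 8 drives: $(raid_usable_drives "$level" 8) drives usable"
done
```

For 8 drives that works out to 7 usable drives with RAID 5, 6 with RAID 6, and 4 with RAID 1+0.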

We've been doing a lot of RAID 1 and RAID 1+0, which is fine, but ultimately you have the same problem there: a second failure is likely while you are trying to rebuild the array.  You also get the added bonus problem that errors may go undetected.  It may kill performance, but the parity on RAID 5 and 6 does give you an added safety net, since you can detect corrupted data.

For some of our largest arrays, we've been doing mirrored (or rsync'd) RAID 5 or 6, which, while extraordinarily wasteful in terms of storage space, gives us very good odds of recovery from catastrophic hardware failure.

I have to wonder if the real answer here might ultimately be to add parity to a stripe/mirror set, so that any combination of drive failures in an array of n drives that leaves you with at least (n-2)/2 working drives is easily recoverable...  (Maybe doing RAID 6 over pairs of mirrored drives would be sufficient.  I have to think on that a bit...)

Thursday, February 18, 2010

Open-Source Point of Sale?

Dear Lazyweb,

I need an open-source POS solution for a client. They have a small cafeteria-type restaurant + gift shop. Currently they are using a craptastic closed-source commercial solution that offers no support despite requiring a huge service contract.

Lots of bonus points for something web-based, since the POS terminals they have are rather low-end.

FWIW, we've tried the following:
Posterita comes the closest to being what we want (web-based, AJAX-y, etc.), but development has gone closed-source apparently.
OpenBravo POS is probably the most functional, but it's difficult to figure out how to do much of anything with it.
OFBiz has a nice, simple POS app, but it's horribly buggy (and rather slow too).

Given an infinite amount of free time, I'd probably hack on the Adempiere + Posterita (last open-source release) combo, but, well, time is not on my side here...