DelphiFeeds.com

RAID, files and cloud storage

Marshall Fryman, 12 years ago, in Coding, Delphi
RAID is a method that lets a system group independent drives together for data protection (redundancy or parity), speed, extra storage space or all three. One of the long-time stalwarts of the RAID environment is RAID 5. In RAID 5, you need at least 3 identically sized disks. They are combined so that the usable storage space is N-1 drives (i.e., in a 3-drive system, total space is 2x drive size). The remaining drive's worth of space holds parity. (A real RAID 5 array spreads the parity blocks across all of the drives rather than dedicating one disk to them, but a single parity drive is easier to picture and is what the code here uses.) With parity, you can remove any one of the drives and still have access to your data. If you remove two or more of the drives though, you'd better have a good backup.

How does this work? Through the magic of XOR. The following statements are all true:

A XOR B = PAR
PAR XOR B = A
A XOR PAR = B
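
As a quick sanity check, here is a minimal console sketch of those three identities. The helper name XorBytes and the sample values are made up for illustration; nothing here comes from a RAID library.

```pascal
program XorParityDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper for this sketch: XOR of two bytes. The same operation
// creates the parity byte and rebuilds a lost byte from the survivor.
function XorBytes(X, Y: Byte): Byte;
begin
  Result := X xor Y;
end;

var
  A, B, Par: Byte;
begin
  A := $5A; // arbitrary sample values
  B := $C3;
  Par := XorBytes(A, B); // A XOR B = PAR
  WriteLn('PAR XOR B gives A back: ', XorBytes(Par, B) = A);
  WriteLn('A XOR PAR gives B back: ', XorBytes(A, Par) = B);
end.
```

Both comparisons hold for any pair of bytes, because XORing a value with the same operand twice cancels out.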


That's how parity lets you lose one disk and still recover your data. The same rules apply to larger sets: XOR the parity with every surviving drive and you get the missing drive back, so PAR simply takes the missing drive's place in the equation. With five data drives plus a parity drive, it looks like:

A XOR B XOR C XOR D XOR E = PAR
PAR XOR B XOR C XOR D XOR E = A
A XOR PAR XOR C XOR D XOR E = B
A XOR B XOR PAR XOR D XOR E = C
A XOR B XOR C XOR PAR XOR E = D
A XOR B XOR C XOR D XOR PAR = E
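
The same idea can be sketched for any number of drives. In the sketch below, the helper XorAll and the sample byte values are assumptions for illustration; it XORs everything except one position, which is all that is needed to rebuild whichever drive went missing.

```pascal
program ParityRebuildDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper for this sketch: XOR every element of Values together,
// skipping the index Skip (pass -1 to skip nothing).
function XorAll(const Values: array of Byte; Skip: Integer): Byte;
var
  ix: Integer;
begin
  Result := 0; // 0 is the identity for XOR, so it is a safe starting value
  for ix := Low(Values) to High(Values) do
    if ix <> Skip then
      Result := Result xor Values[ix];
end;

var
  Data: array[0..4] of Byte = (10, 20, 30, 40, 50); // A..E, sample values
  Par, Rebuilt: Byte;
  Lost: Integer;
begin
  Par := XorAll(Data, -1); // A XOR B XOR C XOR D XOR E = PAR
  for Lost := 0 to 4 do
  begin
    // XOR of the surviving bytes and the parity byte gives the lost byte.
    Rebuilt := XorAll(Data, Lost) xor Par;
    WriteLn('Drive ', Lost, ' rebuilt correctly: ', Rebuilt = Data[Lost]);
  end;
end.
```

Note that the parity has to be accumulated across all of the bytes. XORing only the last pair of bytes works for two data drives but silently breaks with three or more.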


The other interesting thing to note is that the amount stored on each drive is 1/(n-1) of the total file size, where n is the number of drives including parity. Thus a 100 byte file on the six-drive system above (five data drives plus parity) stores only 20 bytes on each drive. This is where the space increase comes from.
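
The arithmetic can be sketched as a small function. BytesPerDrive is a hypothetical name, and rounding up is an assumption: a file that doesn't divide evenly needs its last stripe padded.

```pascal
program StripeSizeDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper for this sketch: bytes stored on each drive when a
// file of FileSize bytes is split across DriveCount drives, one of which
// holds parity. Rounds up so that an uneven file still fits.
function BytesPerDrive(FileSize: Int64; DriveCount: Integer): Int64;
var
  DataDrives: Integer;
begin
  DataDrives := DriveCount - 1; // one drive's worth of space goes to parity
  Result := (FileSize + DataDrives - 1) div DataDrives;
end;

begin
  WriteLn(BytesPerDrive(100, 6)); // 100 byte file, five data drives plus parity
  WriteLn(BytesPerDrive(100, 3)); // the same file on a three-drive set
end.
```

Multiplying the result by the drive count gives the total space consumed, which is always n/(n-1) times the file size: that extra fraction is the overhead paid for the parity.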

Now, the question is, why do we care? Other than it's nice to know how something works, this technique could be applied to cloud storage. If you follow some of the events that have happened with online storage providers, you may have seen a number of them come and go. The problem is, if they disappear or lose your data, what happens then?

If you are using them as a convenient off-site data storage pool for large amounts of infrequently used data, you could implement a RAID 5-style data split. Keep in mind there's a bit of overhead in doing so, but the point of this exercise is to reduce your dependence on any one provider. Coincidentally, it need not cost more than using a single cloud storage provider: in a three-way split with one piece kept locally, the two pieces you upload add up to the size of the original file.

Take for instance Amazon's S3 service and Rackspace's Mosso Cloud Files. Both services charge per gigabyte per month. For argument's sake, let's assume we have a 10GB RAR file (BIGFILE.RAR) that we want to back up. If we split it into a RAID 5, 3 drive format, that would leave us with BIGFILE.RAR.0, BIGFILE.RAR.1 and BIGFILE.RAR.raid. Each file will be 5GB (10GB / (3-1)) in size. I can upload one file to Amazon, one file to Rackspace and retain one file on my hard drive. I've now backed up the file in an online form that can be recovered even if I lose my hard drive or if Amazon or Rackspace has an outage at the moment I need access to my data. So long as I have access to any two parts of the file, I can recreate the original file.

Obviously, Amazon and Rackspace are large enough that it is unlikely they'd actually lose the data over a longer period of time. The same can't be said of some of the smaller players in the market. Companies like Streamload managed to wipe out about half of their customers' data during a reorganization before finally closing their doors. Anyone caught unaware lost all of their data.

I should also note that you could use a PAR or PAR2 system. The easiest-to-use implementation is probably QuickPAR. It works on a similar principle to what I've shown here (PAR2 uses Reed-Solomon coding rather than plain XOR, so it can recover more than one missing block), but the source is not very conducive to Delphi developers. From what I can tell, it was originally developed to push binary files around Usenet, but it would work equally well for cloud storage. If you're just looking for a good off-the-shelf tool, QuickPAR is probably the way to go. If you're interested in developing a parity solution that you can embed in your code, the source is included below.



//requires SysUtils and Classes in the uses clause
procedure File2RaidFiles(fileName:string; raidLength:integer);
var FS:TFileStream;
    outputFS:array of TFileStream;
    byteArray:array of byte;
    ix:integer;
begin
  //test to make sure we were called correctly
  if not FileExists(fileName) then
    raise Exception.Create(Format('File %s doesn''t exist', [fileName]));
  if raidLength<2 then
    raise Exception.Create(Format('Raid Length must be greater than 1. Given value was %d',[raidLength]));

  FS:=TFileStream.Create(fileName,fmOpenRead);
  try
    setLength(outputFS, raidLength+1); //+1 for the parity stream; new elements start out as nil
    setLength(byteArray, raidLength+1); //+1 for the parity byte
    try
      for ix := 0 to raidLength do //create an output file for each stripe (named .0, .1, ...) and the parity file (named .raid)
      begin
        if ix=raidLength then
          outputFS[ix]:=TFileStream.Create(fileName+'.raid',fmCreate)
        else
          outputFS[ix]:=TFileStream.Create(fileName+'.'+IntToStr(ix),fmCreate);
      end;
      while FS.Position<FS.Size do //while we haven't hit the end of the file
      begin
        for ix := 0 to raidLength do //zero the buffer so a short final read is padded with zeros and the parity byte starts at 0
          byteArray[ix]:=0;
        FS.Read(byteArray[0],raidLength); //read in the bytes to the byteArray
        for ix := 0 to raidLength-1 do //calc the parity byte by XORing every data byte into it
          byteArray[raidLength]:=byteArray[raidLength] xor byteArray[ix];
        for ix := 0 to raidLength do //write out the bytes to the respective stripes (stripes are < raidLength) and the parity file (outputFS[raidLength])
          outputFS[ix].Write(byteArray[ix],1);
      end;
    finally
      for ix := 0 to raidLength do //clean up the output streams (Free is safe on nil)
        outputFS[ix].Free;
    end;
  finally
    FS.Free; //clean up the input stream
  end;
end;

procedure RaidFiles2File(fileName, outputName:string; raidLength:integer);
var FS:TFileStream;
    inputFS:array of TFileStream;
    testFS:TFileStream;
    byteArray:array of byte;
    ix, damage:integer;
    countOfMissing:integer;
    checkByte:byte;
    checkFails:boolean;
begin
  //test to make sure we were called correctly
  if FileExists(outputName) then
    raise Exception.Create(Format('File %s already exists', [outputName]));
  if raidLength<2 then
    raise Exception.Create(Format('Raid Length must be greater than 1. Given value was %d',[raidLength]));
  //init the basics
  checkFails:=false;
  FS:=nil;
  testFS:=nil;
  countOfMissing:=0;
  setLength(inputFS, raidLength+1); //+1 is the parity stream
  setLength(byteArray, raidLength+1); //+1 is the parity byte
  for ix := 0 to raidLength do
    inputFS[ix]:=nil;

  try
    //create the input file streams. make sure we pick up the count of missing streams and a pointer to an input stream for use later on (any stripe will do, just testing for eof)
    for ix := 0 to raidLength do
    begin
      if ix=raidLength then //this is the parity stream
      begin
        if FileExists(fileName+'.raid') then
          inputFS[ix]:=TFileStream.Create(fileName+'.raid',fmOpenRead)
      end
      else //this is a stripe stream
      begin
        if FileExists(fileName+'.'+IntToStr(ix)) then
        begin
          inputFS[ix]:=TFileStream.Create(fileName+'.'+IntToStr(ix),fmOpenRead);
          testFS:=inputFS[ix]; //this is just to test for eof. all files are the same size
        end;
      end;
      if inputFS[ix]=nil then inc(countOfMissing); //if we didn't get an input stream, add to the missing count
    end;
    //you are only allowed to have 1 missing input file
    if countOfMissing>1 then
      raise Exception.Create('Unable to recover file! You must have at least N-1 parts of the file');
    assert(testFS<>nil, 'testFS=nil? This should never happen');

    //set up the output file stream only once we know recovery is possible, so a failure above leaves no empty file behind
    FS:=TFileStream.Create(outputName,fmCreate);

    while testFS.Position<testFS.Size do //while we are not at the end of the input file
    begin
      if countOfMissing=0 then //if there are no missing streams, we can just remerge the data together
      begin
        for ix := 0 to raidLength do
          inputFS[ix].Read(byteArray[ix],1);

        //calc a checkByte by XORing all of the data bytes together to make sure it all still agrees
        checkByte:=0;
        for ix := 0 to raidLength-1 do
          checkByte:=checkByte xor byteArray[ix];
        if checkByte<>byteArray[raidLength] then //the new checkByte doesn't match the stored parity byte. That means we have a problem, the file doesn't match
          checkFails:=true;
      end
      else //if there is a missing stream, we may have to calculate the missing data
      begin //you have to reverse the XOR with the parity byte to determine the result
        damage:=-1; //-1 means no stripe is missing (only the parity file is)
        for ix := 0 to raidLength-1 do
          if inputFS[ix]<>nil then //if this is a valid stream, read its data
            inputFS[ix].Read(byteArray[ix],1)
          else
          begin //this isn't a valid stream, so remember this is damaged and read from the parity stream for this position
            damage:=ix;
            inputFS[raidLength].Read(byteArray[ix],1);
          end;
        if damage>=0 then //only rebuild when a stripe is missing; with just the parity file gone the stripes are already complete
        begin
          //XOR the surviving bytes and the parity byte together; the result is the damaged byte
          byteArray[raidLength]:=0;
          for ix := 0 to raidLength-1 do
            byteArray[raidLength]:=byteArray[raidLength] xor byteArray[ix];
          byteArray[damage]:=byteArray[raidLength]; //replace the damaged byte with the restored byte
        end;
      end;
      //write out the merged (and possibly restored) data. note: the original length isn't stored, so the output is rounded up to a multiple of raidLength
      FS.Write(byteArray[0],raidLength);
    end;
  finally
    for ix := 0 to raidLength do //clean up the memory from the input streams (Free is safe on nil)
      inputFS[ix].Free;
    FS.Free; //clean up the memory for the output stream
  end;
  if checkFails then
    raise Exception.Create(Format('This file (%s) does not match its parity checks. Damage may be present', [outputName]));
end;
