= BigData-ASAP Configs & Costs
Harry Mangalam
v1.1 - October 25th, 2013
:icons:

// fileroot="/home/hjm/nacs/BigData-ASAP-configs+costs"; asciidoc -a icons -a toc2 -b html5 -a numbered ${fileroot}.txt; scp ${fileroot}.html ${fileroot}.txt ${fileroot}.png moo:~/public_html;

== Introduction

In the following, 'Primary' or '1°' refers to the system in the OITDC; 'Secondary' or '2°' refers to the backup system in either ICS or CalIT2.

Some Abbreviations and Explanations:

*GPFS* = IBM's General Parallel File System +
*FHGFS* = Fraunhofer File System +
*MDS* = MetaData Server, required for FhGFS +
*IB* = InfiniBand +
// *LHIB* = Long Haul InfiniBand +
*QDR* = Quad Data Rate IB (40Gb/s) +
*FDR* = Fourteen Data Rate IB (54Gb/s) +
*10GbE* = 10Gb/s Ethernet +
*SMF* = Single-Mode optical Fiber +
*MMF* = Multi-Mode optical Fiber +
*I/O* = Input/Output

A Storage Server is a generic storage chassis with room for 36 disks. Each chassis can be populated incrementally in chunks of 12 disks (a 'brick'), with each brick providing 40TB of usable space. A distributed filesystem can be increased in size by adding more storage bricks, and its aggregate bandwidth can be increased by adding more chassis: 2 chassis allow I/O at 2X the maximum I/O of one, 3 chassis allow 3X the I/O of one, and so on. Hence the reasoning for initially providing 2 underpopulated chassis in (B) below instead of filling the first.
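As an illustration only, the short Python sketch below (not part of any actual tooling for this proposal) captures the brick/chassis arithmetic just described: the 40TB-per-brick and 3-bricks-per-chassis figures come from the description above, and the linear bandwidth scaling is the assumption stated there.

[source,python]
----
# Illustration of the brick/chassis arithmetic described above.
# From the text: a chassis holds 36 disks = 3 bricks of 12 disks,
# each brick provides ~40TB usable, and aggregate bandwidth scales
# roughly linearly with the number of chassis.

BRICK_TB = 40           # usable TB per 12-disk brick
BRICKS_PER_CHASSIS = 3  # 36 disks / 12 disks per brick

def capacity_tb(bricks_per_chassis):
    """Total usable TB for a list of per-chassis brick counts."""
    return sum(min(b, BRICKS_PER_CHASSIS) * BRICK_TB for b in bricks_per_chassis)

def relative_bandwidth(bricks_per_chassis):
    """Aggregate I/O relative to a single chassis (linear in chassis count)."""
    return len(bricks_per_chassis)

# Option (A): one chassis holding 2 bricks  -> 80TB, 1X bandwidth
print(capacity_tb([2]), relative_bandwidth([2]))        # 80 1
# Option (B): two chassis with 1 brick each -> 80TB, 2X bandwidth
print(capacity_tb([1, 1]), relative_bandwidth([1, 1]))  # 80 2
----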
.Range of Configurations and Costs
The 'Opt' labels on the X axis refer to the Option letters in parentheses below; the subscript refers to either their cost (c) or terabytes (t), ie the 'Et' bar refers to the size in TB of link:#OptE[Option (E) below]. The bars are colored by whether the configuration has a Full Backup (blue), No Backup (green), or Half Backup (red). The Y axis 'C.T' denotes both Cost (in $1000s) and size (in TB).

image:BigData-ASAP-configs+costs.png[Range of Configurations and Costs]

The options are listed in generally increasing cost.

== (A) 80TB 1°, no 2°, 1 chassis *(~$24K)*

- 1 storage node w/ 80TB usable (2 bricks of 40TB each in 1 server) - $19K
- room in chassis for 1 more brick of 40TB
- no bandwidth scaling (single chassis)
- integrated MDS with storage server
- no 2°
- single small IO node (~$5K)
- no IB switch required

== (B) 80TB 1°, no 2°, 2 chassis *(~$50K)*

- 2 storage nodes w/ 80TB usable (1 brick of 40TB in each server) (~$30K)
- space in each server for 2 more bricks as needed (240TB total)
- 2X the bandwidth of (A)
- separate Storage and MDS nodes (~$7K)
- no 2°
- single IO node (~$5K)
- requires an IB rack switch (~$8K)

== ( C ) 240TB 1°, no 2° *(~$66K)*

- 2 storage nodes w/ 240TB usable (6 bricks of 40TB each) (~$46K)
- 2X the bandwidth of (A)
- both storage nodes fully populated with disks
- otherwise like (B) (~$7K + ~$5K + ~$8K)

== (D) 80TB 1°, full 2° *(~$44K)*

- no bandwidth scaling with a single chassis
- like (A) (~$24K), but also:
- 2 x 10GbE & optical interfaces (~$15K)
- SM fiber from OITDC to ICSDC (~$5K)

[[OptE]]
== (E) 360TB 1°, no 2° *(~$89K)*

- like ( C ) (\~$66K) but w/ another fully populated storage node (~$23K)
- 2X the bandwidth of a single chassis

== (F) 240TB 1°, partial 2° *(~$109K)*

- like ( C ) (\~$66K), but also:
- 1 full storage node (120TB - provides recharge backup of 1/2 of 1°), w/ integrated MD server (~$23K)
- no IB rack switch req'd, but one will be req'd if the # of storage nodes increases
- 2 x 10GbE & optical interfaces (~$15K)
- SM fiber from OITDC to ICSDC (~$5K)

== (G) 480TB 1°, no 2° *(~$112K)*

- like (E) (\~$89K) but w/ another fully populated storage node (~$23K)

== (H) 240TB 1°, w/ bandwidth scaling, w/ full 2° *(~$140K)*

- like (F) (~$109K), but also:
- 1 more storage node to provide 240TB total 2° (~$23K)
- 1 IB rack switch (~$8K)

== My recommendation

In terms of progression, the best way to do this is to install a 'Primary' storage stack in the OITDC and get the interfaces up and running to establish that it works and is useful. Once the system has been shown to be useful and popular 'and' there is a demand for recharge backup, 'then' offer the 'Secondary'/backup as an option. At that point there might be Departmental pressure both to provide it and to pay for it.

However, if we do provide a simultaneous 1° and 2° and there does not seem to be institutional support for the 2° geo-replicate, we can easily merge that hardware back into the 1° storage system to increase the 1° storage.
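For a rough side-by-side comparison, the small sketch below (Python; an illustration only, not part of any existing tooling) tabulates the approximate costs and 1° capacities quoted in the options above and derives a cost per usable TB of primary storage. Note that options (D), (F), and (H) also buy 2° backup capacity, so primary $/TB alone understates what they deliver.

[source,python]
----
# Approximate primary (1°) capacity and total cost for each option,
# copied from the descriptions above, and the derived cost per usable
# TB of 1° storage.  Options D, F, and H also include 2° backup
# hardware, which this simple ratio does not credit.
options = {          # option: (1° TB, approx cost in $K)
    'A': (80,   24),
    'B': (80,   50),
    'C': (240,  66),
    'D': (80,   44),
    'E': (360,  89),
    'F': (240, 109),
    'G': (480, 112),
    'H': (240, 140),
}

for opt, (tb, cost_k) in options.items():
    print(f"({opt}) {tb:>3}TB 1°, ~${cost_k}K total, ~${1000 * cost_k / tb:.0f}/TB")
----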