Posted on Fri, Aug 20, 2010 @ 10:32 AM
Not too long ago, the terms “backup” and “archiving” were often used interchangeably. More recently, I have noticed a trend toward using the term backup to refer to the primary data replica and the term archive to refer to a long-term, secondary data replica.
Advances in technology have also brought about a paradigm shift in the target media employed for the task of copying company data: Hard disk drives have largely replaced linear tape as the primary media of choice because of their (substantially) better performance. The principal reason is the limited time available relative to the amount of data that needs to be backed up or archived. It is significantly faster to search a random-access disk drive for data than a linear tape that must wind and rewind in order to reach the required data.
Data continues to grow exponentially for a variety of reasons but we are unable to add more hours to the day as a means of accommodating the growth. Consequently, we are compelled to discover more efficient and faster methods of saving our data. It is the use of hard disks as target media that has made modern email archiving solutions a practical reality.
What is Email Archiving?
Email archiving does not replace conventional email backups, through which email servers are backed up so that they may be restored to a previous point in time in the event of a disaster.

Email archiving differs from backup in that the archive created is indexed and searchable. The email archiving software application allows an administrator to search the index of the archive based upon various criteria, such as:
- Date
- Subject
- Sender or receiver address
- Message size
- Attached files
- Forwarding status
- Combinations of the above.
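To make the idea of a searchable index concrete, here is a minimal Python sketch. The `ArchivedMessage` record, the `search` function and the sample addresses are all hypothetical, invented for illustration; a real archiving product will have its own index format and query interface.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class ArchivedMessage:
    """One entry in a hypothetical archive index (metadata only, no body)."""
    sent: date
    subject: str
    sender: str
    recipients: List[str]
    size_bytes: int
    attachments: List[str] = field(default_factory=list)
    forwarded: bool = False


def search(index, sender=None, subject_contains=None, sent_after=None,
           min_size=None, has_attachment=None):
    """Return the messages matching every criterion that was supplied."""
    results = []
    for msg in index:
        if sender is not None and msg.sender != sender:
            continue
        if (subject_contains is not None
                and subject_contains.lower() not in msg.subject.lower()):
            continue
        if sent_after is not None and msg.sent <= sent_after:
            continue
        if min_size is not None and msg.size_bytes < min_size:
            continue
        if has_attachment is not None and bool(msg.attachments) != has_attachment:
            continue
        results.append(msg)
    return results


# Sample data, made up for the example.
index = [
    ArchivedMessage(date(2010, 3, 1), "Q1 contract draft", "legal@example.com",
                    ["cfo@example.com"], 2_400_000, ["contract.pdf"]),
    ArchivedMessage(date(2010, 6, 15), "Lunch on Friday?", "pat@example.com",
                    ["sam@example.com"], 4_000),
    ArchivedMessage(date(2010, 7, 2), "FW: Q1 contract signed", "cfo@example.com",
                    ["legal@example.com"], 2_900_000, ["signed.pdf"], forwarded=True),
]

# Combine several criteria, as an e-Discovery request typically would.
hits = search(index, subject_contains="contract", sent_after=date(2010, 1, 1),
              has_attachment=True)
for msg in hits:
    print(msg.sent, msg.sender, msg.subject)
```

The point of the sketch is that the search runs against metadata held in the index, not against the mail server itself, which is what separates an archive from a plain backup.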
Archived data is removed from the main data-store, which keeps the main data-store from growing uncontrollably. The email archiving application maintains pointers to the relocated data so that it remains accessible; access is somewhat slower than from the main data-store, though usually not to a noticeable degree.
Why is Email Archiving Needed?
There are two major reasons for businesses to consider email archiving: legal compliance and data-store management.
Legal Compliance
An administrator may be asked by an attorney to conduct a search of archived email messages - commonly referred to as e-Discovery - and produce a report of the findings. The attorney would typically provide direction to the administrator as to the content of interest. The administrator would rely upon the email archiving application search function to discover the required content and to produce the report. Certain organizations (e.g., healthcare, financial and educational) are required by law to be in compliance should they ever be required to produce email information.
A few of the more often discussed laws/acts are: the Sarbanes-Oxley Act (SOX), the Securities Exchange Act of 1934, the Gramm-Leach-Bliley (GLB) Act of 1999, the Health Insurance Portability and Accountability Act (HIPAA) and the Family Educational Rights and Privacy Act (FERPA).
It is incumbent upon every organization to determine what their specific compliance requirements are and to take the appropriate required steps, or risk potential fines and other penalties. One need only search the web to find hundreds of stories of what can happen to companies that have been caught off guard.
Data-Store Management
The other major reason to consider an email archive solution is to gain control over the email data-store. Email archiving has become a necessity because end-users are not self-policing when it comes to managing the size of their own email message store. Email archive management software can use a variety of different methods to maintain control over data-store size, for example:
- End-users can be assigned quotas and receive automated warnings when size limits are approached.
- Large email attachments can be replaced with stub files that still allow functionality but relieve the main data-store of the burden of holding large file attachments.
- An administrator can set and impose a policy upon end users to effectively and automatically control the size of the email message data-stores.
In order to illustrate this point, let’s consider a “what-if?” scenario: Suppose there are two companies, each using the same email software. Company A has an email data-store over one terabyte in size and company B has an email data-store approximately 200 GB in size. If both email servers crashed and needed to be restored from bare metal, the difference in time to restore the server at company A versus company B would be significant.
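The difference can be put into rough numbers with a short Python sketch. The 100 MB/s sustained restore throughput is purely an assumption for illustration; real restore rates depend on the backup media, the network and the email software.

```python
def restore_hours(store_gb, throughput_mb_per_s):
    """Hours needed to stream store_gb back at a sustained throughput."""
    seconds = (store_gb * 1000) / throughput_mb_per_s  # 1 GB = 1000 MB here
    return seconds / 3600


ASSUMED_THROUGHPUT = 100  # MB/s, an illustrative figure, not a measurement

company_a = restore_hours(1000, ASSUMED_THROUGHPUT)  # roughly one terabyte
company_b = restore_hours(200, ASSUMED_THROUGHPUT)   # approximately 200 GB
print(f"Company A: {company_a:.1f} h, Company B: {company_b:.1f} h")
```

Whatever throughput is assumed, data transfer for company A takes five times as long as for company B, and that is before counting any index rebuild or consistency check the email software performs after the restore.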
Key Benefits of Email Archiving
For most organizations, email as a service has become a mission-critical application and that is why people are taking the time to ensure that their email servers are running as efficiently as possible. An email archiving application can assist in making certain that your email servers are running lean and mean by “cutting out the fat,” so to speak.
Your company may or may not need to be concerned with legal compliance issues, but everyone should strive to get control over the ever-growing size of their email message stores. Ultimately, the implementation of an email archive solution will help to conserve storage space, allow your email servers to run more efficiently and provide for the ability to search and find messages in an efficient manner.
How does your business manage its email?
Perry Szarka is the Datacenter Strategic Business Unit leader at MCPc. He works closely with clients to understand their business objectives and discover solutions to help them achieve their goals.
Image Credit:
http://farm4.static.flickr.com/3119/2313189596_a67b38baa6.jpg
Posted on Thu, Jul 22, 2010 @ 09:55 AM
Solid-state disks (SSD) are the next major step forward in storage technology. Think of the SSD as you would the more familiar flash drive (memory stick), but in a conventional disk drive form factor. However, even though an SSD might look physically similar to a conventional hard disk drive, the SSD does not employ spinning metal disks at all. Rather, the SSD utilizes erasable, writeable, cell-based memory chips that can store data reliably even when they’re powered off. You may see the acronym NAND used in descriptions of SSD technology [NAND = Not AND (electronic logic gate)].
Vastly improved reliability and performance are the main attractions of SSD technology. When looking at the two images below, take note that unlike the conventional disk drive on the left, the SSD has no moving parts; this is the major reason for the increased reliability offered by SSD technology.

A Little History
I remember back in the early 1980s when the company I was working for bought their first server… for approximately $10K. About half of the cost of that server was spent on the single, internal 1-GB hard disk drive (no that is not a typo, 1 GB!). It was a 5.25” form factor, full-height SCSI disk drive — which is beastly by today’s standards — and my colleagues and I wondered if we would ever use all of that space.
Of course the ubiquitous, magnetic hard disk drive has come a long way since the early days of 10-MB, 20-MB and 40-MB personal computer disk drives, which typically used the RLL (Run Length Limited) and MFM (Modified Frequency Modulation) encoding schemes that are completely foreign to most computer professionals today.
These seemingly humble beginnings, and the ever-present requirement for increased drive performance and capacity, have brought us to the comparatively astonishing hard disk drive capacities, performance, reliability and small physical size of modern storage devices. Consider for a moment that quite a few people are walking around with literally gigabytes of music, video and pictures in their pockets stored on devices that fit in the palm of their hands! Lend further consideration to the fact that many of these storage devices are not magnetic hard disk drives, but rather chips or SSDs.
Sidebar: Does anyone remember watching an old episode of the original Star Trek television show where Spock inserted a small rectangular piece of metal into his computer console, and the object was apparently a storage device with no moving parts that contained an enormous amount of data? Well guess what folks — we’re pretty much there!
Current State of Enterprise-Class Disk Storage
In the enterprise, the currently available magnetic hard disk drives are offered with FC (Fibre-Channel), SATA (Serial Advanced Technology Attachment) and SAS (Serial Attached SCSI) connectivity. Older IDE (Integrated Drive Electronics) and ATA (Advanced Technology Attachment) technology can still be found in PCs, and conventional SCSI has pretty much been superseded by SAS.
There is a variation available from a couple of manufacturers that mates an ATA drive with FC connectivity, and this is referred to as FATA. This variation in drive type and connectivity speaks to striking the balance between drive performance, reliability and cost, and is predominantly described using the Storage Tier Model.
Tier 1 is described as the highest performance, most reliable and consequently most expensive storage tier. Tiers 2 and 3 take steps down in performance and reliability, thus lowering drive cost. The savings can be significant, so it is worth making the effort to categorize data into the storage tier model.
Generally, an organization’s most critical and most frequently accessed data will reside on Tier 1 storage devices. Data that is static, infrequently accessed or judged non-critical to daily operations is generally stored on Tier 2 or Tier 3 storage devices. Software is available to assist in the automation of this ongoing categorization of data.
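As a rough illustration of how such categorization software might decide, here is a minimal Python sketch. The `assign_tier` function, the 30- and 180-day thresholds and the dataset names are all assumptions made up for the example; actual products use their own, usually richer, rules.

```python
from datetime import date


def assign_tier(last_access, critical, today):
    """Pick a storage tier from access recency and business criticality.

    The 30- and 180-day thresholds are illustrative assumptions.
    """
    age_days = (today - last_access).days
    if critical or age_days <= 30:
        return 1
    if age_days <= 180:
        return 2
    return 3


today = date(2010, 7, 22)
# name -> (date of last access, judged critical to daily operations?)
datasets = {
    "orders-db": (date(2010, 7, 21), True),
    "q1-reports": (date(2010, 4, 10), False),
    "2007-scans": (date(2008, 1, 5), False),
}
tiers = {name: assign_tier(last, crit, today)
         for name, (last, crit) in datasets.items()}
print(tiers)
```

Running a rule like this on a schedule is what turns tiering from a one-time sorting exercise into ongoing categorization.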
Solid State Disk Storage in the Enterprise
So what about the potential of using newer SSDs for enterprise-class storage requirements, and where do they fit in the storage tier model? For some time now, SSDs could only be found on the periphery of the storage market, serving those who needed significantly more performance than was available in conventional magnetic hard disk drives. (Demanding streaming video applications is one such usage that comes to mind.)
The issue with SSD technology to date has been the storage density-to-cost ratio. In other words, SSD technology is quite expensive compared to conventional spinning disks. This is changing, although more slowly than many would like. In fact, SSDs of reasonable size (100GB+) can be found as options in high-end notebook computers from several popular vendors.

SSD in a MacBook Air
When considering enterprise storage, many people use a cost-per-GB or cost-per-TB equation to make purchase decisions. As of this writing we are just on the verge of being able to justify the cost of SSD technology in enterprise-class servers based upon IOPS (Input/Output Operations Per Second) performance.
Consider that the typical, conventional enterprise-class hard disk drive can deliver roughly 150 to 300 IOPS, compared to SSDs that can deliver approximately 100,000 IOPS, and you begin to realize how some quick math can justify the cost differential for those mission-critical applications wherein a time advantage translates immediately into a business advantage. An example would be the healthcare or financial industries.
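The "quick math" can be sketched in a few lines of Python. The IOPS figures are the ones quoted above; the two prices are hypothetical placeholders chosen only to show the calculation, and the drive count ignores RAID overhead, controllers, power and rack space.

```python
import math


def drives_needed(target_iops, iops_per_drive):
    """Smallest number of drives whose combined IOPS meets the target."""
    return math.ceil(target_iops / iops_per_drive)


TARGET_IOPS = 100_000          # the SSD figure quoted above
HDD_IOPS_RANGE = (150, 300)    # the hard disk range quoted above

for hdd_iops in HDD_IOPS_RANGE:
    count = drives_needed(TARGET_IOPS, hdd_iops)
    print(f"{hdd_iops} IOPS per drive -> {count} drives to match one SSD")


def cost_per_iops(price, iops):
    """Purchase price divided by delivered IOPS."""
    return price / iops


# Hypothetical prices, chosen only to show the calculation.
HDD_PRICE, SSD_PRICE = 400.0, 8000.0
print(f"HDD: ${cost_per_iops(HDD_PRICE, 300):.2f} per IOPS")
print(f"SSD: ${cost_per_iops(SSD_PRICE, TARGET_IOPS):.2f} per IOPS")
```

Measured per gigabyte the SSD looks expensive; measured per IOPS the comparison reverses, which is why the choice of metric matters for performance-bound workloads.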
The writing is clearly on the wall — as the capacity-to-cost ratio of SSDs continues to get closer to that of conventional hard disk drives, buyers will certainly invest in the newer, faster and more reliable technology. As you may have already surmised, this transition is expected to happen at the Tier 1 storage level first, and it may take significantly longer for SSDs to make their way into the Tier 2 or 3 levels of the storage hierarchy where cost is the paramount consideration.
It is also worth mentioning that adopting SSDs is not simply a matter of swapping out existing hard disks for the new SSDs. New controllers and other electronics are required, and consequently SSDs will more than likely find their way into enterprise environments as part of a larger purchase, such as a storage array, a server or a specialty appliance.
The Advice
If you have not already implemented, or lent consideration to, storage-tiering solutions, you should soon. Why? For most businesses and organizations (and yes, even individuals!), data continues to grow at an unrelenting pace.
One reason is that significantly more transactions are done electronically versus on paper these days. For example: Medical records are now stored electronically, and radiology images are many and large in size (and are also stored electronically). Old paper documents are being converted and stored. Even email has become a mission-critical application in many business environments where it wasn’t just a few years ago, and it presents many storage challenges. Corporate databases are growing in number and size.
We could go on and on, but the point is made that storage needs are continuing to increase, and therefore we would be wise to get a handle on the management of all of this data.
Implementation of a storage management, or tiering, solution now can prepare you for the consideration of implementing SSD storage in the future for your Tier 1 or mission-critical requirements when the time is right. Consequently, it will be much easier to justify the cost if you can demonstrate that you have the data storage environment reasonably under control. Rest assured, adopting a storage management policy and getting ready for SSD in your environment is worthy of your time because technology moves fast, and SSD will be pervasive in the enterprise in just a few years. Are you ready?
Posted on Fri, Jun 04, 2010 @ 02:37 PM
If you are responsible for determining the requirements for your organization's datacenter or computer room's power and cooling capacity, then you have some work in front of you. There is a lot that goes into determining your needs that is difficult to accurately predict long-term, not the least of which is how much your business will grow over the next 5-10 years. Nobody wants to purchase more capacity than they need — especially when unsure about how long additional capacity will go unused — because there is, of course, an additional cost involved with this extra space.
How difficult is it to accurately predict datacenter needs? During initial commissioning, the typical datacenter is oversized by a factor of ten and ultimately plays out to be oversized by a factor of three at the end of its lifespan (which is usually planned to be 10 years, but is more likely 5-7 years).
To further complicate matters, downtime is no longer tolerated in today's business world. Even small companies need assurance that their critical applications will be available 24/7. Since the power and cooling systems are foundational to the systems that host those applications, proper design, implementation and ongoing management are essential to achieving this expected level of availability.
Achieving fault resiliency with traditional datacenter designs — monolithic power and cooling systems — is difficult, space consuming and expensive. Also, many older systems do not have flexible remote- or network-based management applications.
Modular Datacenters Offer an Alternative
In a word, modularity has become the proven way to overcome all of the above challenges. Indeed, we are in the midst of a transition from traditional monolithic power and cooling solutions to more adaptive and flexible modular designs.
For example, the ability to add UPS (Uninterruptible Power Supply) capacity incrementally, in modules, as demand increases means that it is no longer necessary to purchase an exceptionally large UPS system up front in anticipation of a potential future need.
Modular UPS systems are more flexible regarding physical location as well.
- They can be located closer to the equipment that they serve, resulting in further cost savings in wiring costs and associated labor expenses.
- Fault resiliency is more easily achieved because you can simply add redundant power modules as opposed to entire systems as in the traditional approach to datacenter design.
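The incremental approach can be illustrated with a small Python sketch. The 10 kW module rating and the load figures are hypothetical; the redundancy model is a simple "N+1" (one spare module beyond what the load requires), which is one common way to express it.

```python
import math


def modules_required(load_kw, module_kw, redundant=1):
    """Modules needed to carry load_kw, plus spare modules for redundancy."""
    return math.ceil(load_kw / module_kw) + redundant


MODULE_KW = 10  # hypothetical module rating

# Hypothetical load growth over the life of the room.
for year, load_kw in [(1, 18), (3, 34), (5, 52)]:
    n = modules_required(load_kw, MODULE_KW)
    print(f"Year {year}: {load_kw} kW load -> {n} modules "
          f"({n * MODULE_KW} kW installed)")
```

Capacity is purchased as the load appears, and redundancy costs one extra module at each stage rather than a second full-size system.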
Power Distribution Units (PDUs) have also become modular in design, and can now be located within the computer equipment cabinets that they serve, or very close to them. Traditional design principles dictated a large UPS system positioned against a wall with one or more power distribution and circuit breaker panels bolted to the wall nearby. This resulted in vast amounts of electrical cabling running from the PDUs to the equipment racks and cabinets, which was typically run under a raised floor (which often caused another issue of the cables preventing proper airflow). As a result, modern datacenter design recommends that power and data cabling should be run overhead.
Cooling Challenges
Modular design in computer systems and storage devices has resulted in substantially greater computer power and storage capacity in a given physical space (e.g., blade servers, high-capacity storage arrays, enterprise-class network switches, etc.). A problem with this design, however, is the concentration of heat, which must be removed in order to prevent equipment damage. Consequently, older monolithic cooling solutions are no longer adequate because they were primarily designed to provide cool air but not necessarily remove heat efficiently.
This is because traditional Computer Room Air Conditioners (CRAC units) are designed to provide cooling for an entire room, and are essentially ineffective at removing the localized heat loads generated by modern computer, storage and network equipment. Again, modular design comes to the rescue by allowing air-conditioning capacity to be acquired incrementally and be physically located close to the source of the problem — the heat-generating equipment.
Benefits of Modular Cooling
People now realize the inefficiency and waste in providing cooling to an entire room when it is possible to employ modular cooling solutions that are designed to efficiently remove heat and prevent any potential intermingling of hot and cold air. This modular datacenter design offers long-term operational savings in electricity costs which can in fact be the largest ongoing expense associated with a datacenter.
Traditionally, the power and cooling in the computer room was supplied through the building's main systems. The facilities team would contact the IT department to see what their power and cooling needs were in the computer room. Because cooling units are now modular, they can be placed in-line with the IT systems, removing the heat close to its source.
Another potential benefit of modular design that is typically overlooked or unnoticed by IT personnel, but certainly appreciated by the CFO, is that equipment located physically close to the computer and network equipment it services will probably not be considered as part of the facility, and consequently it may be possible to depreciate it as if it were IT equipment.
Modular systems are also generally more portable than traditional solutions, should the need arise — another potential cost savings.
Start with a Datacenter Assessment
The key to getting started on any datacenter design project is to perform a datacenter assessment that determines the present conditions in the room, identifies areas of concern and their priority, and ultimately offers suggestions to resolve issues and achieve greater efficiencies.
For reference, here are a few industry websites that offer further insight into power and cooling considerations for datacenters:
The Datacenter Journal
Datacenter Knowledge
Image Credits:
http://farm4.static.flickr.com/3169/3026743122_6e9f473287.jpg
http://farm2.static.flickr.com/1390/866584082_c5a8a14726.jpg
Posted on Mon, Mar 22, 2010 @ 11:24 AM
Data deduplication is arguably the most important new feature for backup and archival solutions in years. Data deduplication can save your organization time and money, and therefore it is worth investigating to determine whether it is appropriate for your data environment. Although predominantly employed for backup and archival processes today, data deduplication for primary storage is on the horizon, which underscores the significance of this technology.
Data deduplication has grown in awareness and popularity due to the continual explosion of data in the business world. For reference, some marketing pundits are now referring to this technical process as "capacity optimization." Whatever it's called, the overall goal is to perform data backups faster while still maintaining data integrity.
As the name implies, data deduplication is the process of removing redundant data. This is most common for the purpose of making the data backup process more efficient, but it can also provide benefits for data replication. The first point to understand is that deduplication can be implemented in different ways.
Levels of Data Deduplication
File Level
As the phrase implies, duplicate files are eliminated, and pointers or links are put in their place that all direct to the remaining single instance of the file. File level data deduplication is also referred to as Single Instance Storage.
This process is accomplished by comparing target files that are candidates for the backup process to files that are already archived, referencing the attributes stored in an index file. For example, imagine that we want to back up several host servers that all have the same operating system installed. There will indeed be many identical system files (static files that never get modified) which are obvious candidates for this process.
Even in this simplistic example, you can begin to understand how the math begins to work to your advantage especially if the duplicate files are numerous, or large in size. This implementation of data deduplication is typically found as a feature within a backup software application, such as EMC® Avamar®.
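The server example above can be sketched in Python. Note the assumption: this toy store identifies duplicates by hashing file contents with SHA-256, which is one common technique; the text above describes comparing attributes held in an index, and commercial products may work differently. The `SingleInstanceStore` class and the host names are invented for illustration.

```python
import hashlib


class SingleInstanceStore:
    """Toy file-level deduplicating store keyed by content hash."""

    def __init__(self):
        self.blobs = {}     # digest -> file content, stored once
        self.catalog = {}   # (host, path) -> digest, the "pointer"

    def backup(self, host, path, content):
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self.blobs:        # only new content consumes space
            self.blobs[digest] = content
        self.catalog[(host, path)] = digest

    def restore(self, host, path):
        return self.blobs[self.catalog[(host, path)]]

    def stored_bytes(self):
        return sum(len(b) for b in self.blobs.values())


store = SingleInstanceStore()
system_file = b"kernel-image" * 1000            # identical on every host
hosts = ("web01", "web02", "web03")
for host in hosts:
    store.backup(host, "/boot/kernel", system_file)
    store.backup(host, "/etc/hostname", host.encode())

logical = len(hosts) * len(system_file) + sum(len(h) for h in hosts)
print(f"logical bytes: {logical}, stored bytes: {store.stored_bytes()}")
```

Six files are catalogued but only four distinct contents are stored: the shared system file is kept once, and each host keeps its own unique file.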

Block Level
The block level is also referred to as the sub-file level because it is underneath the file layer and therefore it is a more granular approach. In this regard, you will hear people refer to the parts of the file as blocks, chunks or segments.
For example, you may have a situation where three files are mostly identical except for one or two data blocks. In comparison to file level data deduplication where entire files are dealt with, using a block-level process allows us to achieve much greater space savings because only the unique blocks are processed, as opposed to the entire file.
As with the file-level process, an index is referenced to determine whether a block of data is a candidate. What we're actually dealing with here are rather complex mathematical algorithms. This can all get rather technical, and different vendors employ unique techniques for each circumstance depending upon the data being evaluated.
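A minimal Python sketch of the three-file example follows. It assumes fixed-size blocks and SHA-256 digests as the index key; both are simplifications chosen for readability, and as noted above, vendors use their own (often variable-size) chunking techniques. The function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe(files):
    """Return (block_store, recipes): unique blocks plus per-file digest lists."""
    block_store, recipes = {}, {}
    for name, data in files.items():
        recipe = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            block_store.setdefault(digest, block)   # store each block once
            recipe.append(digest)
        recipes[name] = recipe
    return block_store, recipes


def rebuild(name, block_store, recipes):
    """Reassemble a file from its list of block digests."""
    return b"".join(block_store[d] for d in recipes[name])


files = {
    "report_v1": b"AAAABBBBCCCCDDDD",
    "report_v2": b"AAAABBBBXXXXDDDD",   # one block changed
    "report_v3": b"AAAABBBBCCCCYYYY",   # a different block changed
}
block_store, recipes = dedupe(files)
print(f"blocks referenced: {sum(len(r) for r in recipes.values())}, "
      f"unique blocks stored: {len(block_store)}")
```

File-level deduplication would keep all three files in full because no two are identical; at the block level only six of the twelve referenced blocks need to be stored.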

Byte Level
The byte level method of data deduplication is even more granular than the block level method, with one caveat: the more granular the solution, the more processing power it requires. For this reason, byte level data deduplication is usually implemented in the form of a purpose-built appliance.
Using this method, the data stream is analyzed and compared byte-by-byte to a previously stored stream. This method performs data deduplication post-process, meaning that your data is backed up in its native state onto a disk appliance, and once it is all there the deduplication process begins. A sufficient amount of hard disk space is required on the appliance as it needs room to perform its work. However, once it is finished the working disk space is made available again.
Some vendors will actually retain an unmodified, full backup so that a potential full restore can be performed quickly without the added overhead of un-deduplicating the data as it gets restored.
Proponents of the byte level methodology suggest that disk space is inexpensive and consequently we should leverage this fact to make great gains in speed by not deduplicating in-line or on-the-fly, both of which add significant processing overhead. This is not a trivial point and if backing up as quickly as possible is your primary goal, then you should most definitely consider a purpose-built appliance that does post-processing of your data.
In-Line Versus Post-Processing Deduplication
As stated above, if the main objective is to make your data backups as fast as possible, then post-process is what you should consider. Conversely, if you are more concerned about conserving space on your backup disk target, then in-line is the better choice.
If you are considering the replication of data from one location to another over a WAN link, then in-line processing may be a better choice. If your data is mostly static files that do not get modified very often (if at all), then a file-level solution may meet your requirements, and would probably cost less than either a block-level or byte-level appliance.
In any case, you should endeavor to clean up your data first as a best practice — there is no sense in working with data that truly does not need to be saved, regardless of the chosen backup methodology.
Ready to get started? General advice for the data deduplication process:
1) Evaluate and clean up your data first.
Your first step should be to evaluate your data. Does your data consist mostly of common Microsoft Office files? This type of data is a good candidate for a data deduplication solution.
Is your data mostly comprised of video, audio, image, imaging database and encrypted files? These types of files do not deduplicate very well (or at all as is the case with encrypted files).
Cleaning up your data usually requires setting one or more policies regarding the data itself. For example, a company may have a policy that any files with an .mp3 extension will be automatically deleted and therefore not retained. A data retention policy can combine a multitude of criteria, e.g., size of file, type of file, date of last file access, file extension, archive status and more. If this sorting out of data can be accomplished prior to backup or replication, the time required for the process decreases accordingly.
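A policy that combines several criteria might look like the following Python sketch. The `RetentionPolicy` class, its thresholds and the sample files are hypothetical; the sketch only filters what goes to backup and does not delete anything.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: date


@dataclass
class RetentionPolicy:
    """Hypothetical policy; every threshold here is an illustrative assumption."""
    excluded_extensions: tuple = (".mp3",)
    max_size_bytes: int = 500 * 1024 * 1024
    max_idle_days: int = 365 * 3

    def keep(self, f, today):
        """True if the file passes every criterion and should be backed up."""
        if f.path.lower().endswith(self.excluded_extensions):
            return False
        if f.size_bytes > self.max_size_bytes:
            return False
        if (today - f.last_access).days > self.max_idle_days:
            return False
        return True


today = date(2010, 3, 22)
candidates = [
    FileInfo("/share/budget.xls", 120_000, date(2010, 2, 1)),
    FileInfo("/share/song.MP3", 5_000_000, date(2010, 3, 1)),
    FileInfo("/share/old-notes.doc", 40_000, date(2004, 6, 30)),
]
policy = RetentionPolicy()
to_backup = [f.path for f in candidates if policy.keep(f, today)]
print(to_backup)
```

Only the spreadsheet survives the filter: the music file is excluded by extension and the old document by its last-access date.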
2) Don't be too complicated.
Sometimes people think that the more stuff they buy, the better off they are. You may be tempted to think that compressing your data before you send it over to the data deduplication appliance would be a good idea, but this is often not the case. The appliance needs to "see" the data in an unaltered state (and most appliances will already include some form of compression). Try to avoid layering solutions on top of one another, unless you are absolutely certain of the compatibility aspects.
3) Ask for a demo.
Vendors that are certain of the suitability of their solution for your data environment should be willing to provide a demo system for your evaluation. Furthermore, they should be willing to assist with the installation and configuration of the solution. You should also test more than one solution to get a complete idea of options, regardless of how impressed you may be with the first one that you evaluate.
Posted on Tue, Feb 23, 2010 @ 09:50 PM
Historically, storage arrays were developed with only minor differences in basic design. However, in today's world of storage virtualization, there are several options to consider, including: traditional storage, modular storage and Intelligent-Clustered Storage (ICS), the next generation of modular storage.
When evaluating these critical data storage systems, everyone seems to agree on the key desired attributes:
- Acceptable Levels of Performance
- Easy to Manage (including Remote Administration)
- Snapshots or Continuous Data Protection (CDP)
- Ability to Replicate Data to Remote Locations (i.e., Disaster Recovery Planning)
- Non-Disruptive Additions
- Non-Disruptive Scheduled Maintenance
- Pay-As-You-Grow (Substantial Scalability)
- Industry Standard Technology (non-proprietary)
- Realistically Priced
- Support for Virtualization Platforms (e.g., VMware, Citrix, Microsoft, Linux)
- BONUS Feature: Automated Tiered Storage & Automatic RAID
- BONUS Feature: Support for SSD (Solid State Disk) Drives
- BONUS Feature: Support for Data Encryption or Enhanced Security
- BONUS Feature: Power Friendly (idles drives not being accessed to conserve power)
So, how do different data storage solutions stack up?
Traditional Storage Arrays
Traditional storage arrays are comprised of a head unit that typically holds one or two storage controllers, the various interfaces to the storage network, and connectivity for storage shelves or cabinets. This design can scale only to a certain point and consequently a specific level of performance. Once those limits are reached, a forklift upgrade to a larger system is required.
For this reason, one must usually estimate the projected storage growth and acquire a system much larger than what is required to meet current storage needs.
Often people do not like the idea of buying more than they need in anticipation of future developments, nor are they interested in expensive proprietary systems since "open" alternatives are plentiful.
In data storage, nothing is more "open" than a truly virtualized storage environment because of the flexibility afforded by the basic premise of the design.
Modular (Virtualized) Storage
A more modern approach, offered by storage manufacturers like DELL EqualLogic, Compellent and Hewlett Packard (LeftHand P4000), is the modular storage node architecture reminiscent of a grid-computing paradigm. Each storage node includes controllers, interfaces and drive space; therefore, scalability is much greater than the traditional storage array architecture.

A significant contrast with traditional designs is that one need only purchase the amount of storage required to meet current needs, because expansion is accommodated simply by adding another storage node. Due to the integrated virtualization layer in modular storage units, adding storage is usually accomplished without the complicated reconfiguration required by traditional storage arrays.
In further contrast to traditional designs, the modular storage node approach actually realizes a performance increase as it is expanded. This is a result of each node having controllers and dedicated storage network connections, in addition to disk space.
Also worth noting, day-to-day management, system upgrades and important features like Snapshots, Thin-Provisioning and Replication (synchronous and asynchronous) are more easily implemented in the modular design — typically at a much lower cost and decreased levels of downtime.
Note: Some manufacturers offer storage virtualization solutions in the form of front-end appliances or host-based software solutions that are positioned in front of these traditional storage arrays. This offers an opportunity to upgrade an otherwise outdated system without a complete overhaul.
The Latest Option: Intelligent Clustered Storage (ICS)
Some people view ICS technology as being so significant that they refer to it as a paradigm shift in the way we'll work with data storage going forward. This intriguing storage technology has roots in the General Parallel File Systems (GPFS) that were created several years ago by IBM in partnership with Intel. In fact, this technology is pervasive in the majority of production super-computer platforms deployed today.
Essentially, an ICS solution comprises a number of physically independent storage nodes that have complete awareness of one another and present themselves as a single, logical pool of storage.

The storage nodes are referred to as intelligent because each node has processors, controllers, memory, network interfaces and hard disk drives along with the enabling software. This could be considered to be storage virtualization software, because what is presented to the host operating system is not the true physical storage, but rather a logical representation of the storage that we want the operating system to see or have access to.
Fault resilience is inherent in this design, as data is automatically saved to multiple intelligent storage nodes. Consequently, a single disk failure, or the failure of an entire storage node, will not disrupt service. This is because data is split into blocks that are striped across all of the nodes within the intelligent storage cluster. In fact, some manufacturers can provide protection from multiple storage-node failures in their design.
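The striping idea described above can be illustrated with a minimal Python sketch. The function names `place_blocks` and `read_back`, the node names, the tiny block size and the choice of two copies per block are all assumptions made for illustration; they do not represent any particular manufacturer's design. The sketch splits data into blocks, stores each block on two different nodes in round-robin order, and then reads everything back with one node marked as failed.

```python
def place_blocks(data, nodes, block_size=4, copies=2):
    """Split data into blocks and store each block on `copies` distinct nodes."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    # One dictionary per node, mapping block index -> block contents.
    placement = {node: {} for node in nodes}
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        for replica in range(copies):
            # Round-robin: consecutive replicas land on consecutive nodes.
            node = nodes[(index + replica) % len(nodes)]
            placement[node][index] = block
    return placement, len(blocks)


def read_back(placement, block_count, failed=()):
    """Reassemble the data using only nodes that have not failed."""
    result = b""
    for index in range(block_count):
        for node, stored in placement.items():
            if node not in failed and index in stored:
                result += stored[index]
                break
        else:
            # No surviving node holds this block.
            raise IOError(f"block {index} is unavailable")
    return result


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
data = b"email archive payload"
placement, count = place_blocks(data, nodes)
recovered = read_back(placement, count, failed={"node-b"})
```

With two copies per block, losing any single node leaves at least one copy of every block available. Surviving the loss of several nodes at once would require more copies or a parity scheme, which this sketch does not attempt.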
While the storage system attributes mentioned above are difficult, expensive or unavailable in traditional storage array solutions, they are commonplace in the ICS design.
Virtual Storage Options Offer Increased Business Agility
Bringing server virtualization together with storage virtualization enables an environment that can respond quickly and effectively to business changes that require more computing power and/or more storage. Furthermore, it is no longer necessary to buy too much capacity as a hedge against the dynamic nature of business, which saves money both upfront and on an ongoing basis.
In the early adoption phase of this virtual technology wave, management of virtualized storage solutions may have been labeled complex or cumbersome. Those days are behind us as manufacturers have made great improvements in easing management techniques and interfaces, as well as simplifying underlying architecture.
Today, virtualized storage options are affordable, scalable and easy to manage, and are worth consideration if you are looking for a new, dependable storage solution that can grow with your business.
Perry Szarka is the Datacenter Strategic Business Unit leader at MCPc. He works closely with clients to understand their business objectives and discover solutions to help them achieve their goals.
Posted on Tue, Feb 16, 2010 @ 09:38 AM
Practically everyone concerned with keeping a business computer network up and running has heard of server virtualization, but the concept of storage virtualization is not as well known or understood. Ironically, the benefits of storage virtualization can be appreciated whether or not your servers are currently virtualized, though server virtualization can be a catalyst for considering storage virtualization.
In medium to large data center environments, it is inefficient for each server to be configured with independent storage because of the amount of resulting unused, and therefore wasted, hard disk drive space. This per-server arrangement is referred to as DAS, or Direct Attached Storage. The storage can either be internal to the server, or housed externally and connected directly to the server via SCSI, SAS or FC (Fibre Channel).

In comparison, a shared storage environment consists of one or more storage arrays and a dedicated storage area network (SAN). The centralized pool of storage can be shared amongst the servers as determined by the storage administrator, ensuring that unused or wasted space is kept to a minimum.
Basic benefits of a SAN include:
- Improved reliability due to its redundant design with no single points of failure
- Reduced cost of backup because of the centralized design
- Improved scalability and performance because of the dedicated storage network and the shared design
- Simplified storage provisioning because of the centralized approach
- Improved data availability because of the ability to facilitate server clusters and virtual servers, as well as to provide for either storage array-based or host-based data replication
Furthermore, storage can be tiered.
Tier-One storage typically consists of volumes of data that are either frequently accessed or deemed critical to the daily operation of the business. It is therefore housed on the most expensive, most reliable and best performing SAS or FC type hard disk drives. In comparison, Tier-Two storage is data that is not accessed frequently or is of secondary importance, and therefore it can be housed on less expensive SATA disk drives.

Do you remember time sharing?
As many already know, virtualization as a concept is not new. For example, although the names and references in the micro-computer world may differ, virtualization of computing resources has been a feature on IBM Mainframe computers for quite some time - it just used to be called time sharing. Whether we are considering servers, storage devices or the networks that connect everything, virtualization is all about achieving greater levels of efficiency and optimization.
Consider your portable music player when thinking about storage virtualization.
The driver behind this consideration is the continued data growth that we are all experiencing. It is not just companies and organizations, but individuals as well. As a point of reference, consider your portable music player. Do you ever have enough space?
It is not so farfetched to say that people are walking around with mobile phones that typically have more storage capacity than the average server did only ten years ago! Data is growing so rapidly that proactive management has become paramount for controlling costs while continuing to provide practical levels of access to the data. In this regard, storage virtualization as a concept speaks to the notion of operational value. That is to say, proactively managing storage resources keeps costs in check, and a technology that facilitates this is of value from an ongoing operational cost perspective.
Automation is key in proactively managing resources.
Traditional approaches to managing storage are becoming too restrictive and cumbersome because of the dynamic nature of data these days. Storage managers need to be able to pass judgment on data and move it around as needed based upon factors such as file size, type of file, age of file, date of last access and other criteria.
Critical data needs to be analyzed, replicated and archived in a timely manner, and having storage resources virtualized more easily facilitates all of this. How? When we virtualize, we are creating a layer of abstraction between the otherwise strict and hard rules of the storage resources. This layer of abstraction provides us with the flexibility that we need to control the storage resources in a more effective manner.
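The rule-based placement described above can be sketched in a few lines of Python. Everything here is an illustrative assumption: the `FileInfo` fields, the `choose_tier` function, the 90-day idle threshold and the sample files. A real policy engine would weigh more criteria, such as file type and size. The sketch keeps critical or recently accessed data on Tier-One storage and sends everything else to Tier-Two.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileInfo:
    name: str
    size_mb: float
    last_access: date
    critical: bool = False  # flagged by the storage manager


def choose_tier(info, today, max_idle_days=90):
    """Return 'tier-1' for critical or recently used data, otherwise 'tier-2'."""
    idle_days = (today - info.last_access).days
    if info.critical or idle_days <= max_idle_days:
        return "tier-1"
    return "tier-2"


today = date(2010, 2, 16)
files = [
    FileInfo("orders.db", 2048, date(2010, 2, 15), critical=True),
    FileInfo("q3-2008-report.pdf", 12, date(2008, 11, 3)),
    FileInfo("team-photo.jpg", 4, date(2010, 1, 30)),
]
tiers = {f.name: choose_tier(f, today) for f in files}
```

Because the policy is just a function of file attributes, it can be run on a schedule, which is the kind of automation the abstraction layer makes practical.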
Has your organization implemented a storage virtualization strategy? What challenges have you encountered as you worked through the process?
Posted on Thu, Jan 21, 2010 @ 08:33 PM
Today there are a few different choices for server form factors, and buyers need to know how to make the best decision based upon several criteria. Below are some standard server form factor options, along with some information to help you determine which type may best suit your needs.

Tower Servers
Full-size Tower servers are a thing of the past, except for use in very specialized applications. Modern Tower servers are more midsize in height and therefore physically similar to Tower-Style Desktop PCs. The extensibility of Tower servers is limited, and therefore they are best positioned for departmental use, small remote office locations or to serve the needs of small businesses with limited growth projected for server requirements. Choose tower server configurations when cost is a concern, your needs are simple and projected growth is very limited.

Tower Rack-Mount Conversion
These days, Tower-to-Rack conversion systems are predominantly used in situations where a single server requires a substantial amount of internal storage. These are typically specialized application servers, and having the ability to rack mount them is advantageous. Choose this arrangement when you need to dedicate a large amount of storage to either a specialized application server or an independent database server and when your need for external network and storage connections is limited.
Rack Mount Servers
Rack-mount servers are still the most popular server form factor acquired these days. Standard rack enclosures are 42U in height (1U = 1.75") and can therefore accommodate up to forty-two 1U servers, which are often referred to as pizza-box servers. The rack enclosure is typically equipped to integrate power and data cabling and to provide a measure of physical security. Rack enclosures are also usually an important component of airflow management, helping to provide adequate cooling for the server environment.
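The rack-unit arithmetic above is easy to check with a short Python sketch. The function names and the `reserved_units` parameter (space set aside for switches or power distribution, for example) are assumptions added for illustration.

```python
RACK_UNIT_INCHES = 1.75  # height of 1U, as stated above


def rack_height_inches(units):
    """Usable vertical mounting space for a rack of the given size."""
    return units * RACK_UNIT_INCHES


def servers_per_rack(rack_units, server_units, reserved_units=0):
    """Whole servers of a given height that fit after reserving some space."""
    return (rack_units - reserved_units) // server_units


height = rack_height_inches(42)                            # 73.5 inches
full_1u = servers_per_rack(42, 1)                          # 42 servers
with_reserve = servers_per_rack(42, 2, reserved_units=4)   # 19 servers
```

In practice, power and cooling limits often cap the server count well before the mounting space runs out, so these figures are an upper bound.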
The popularity of rack-mount servers will likely decrease (eventually) as the benefits of the blade server chassis configuration are well proven. Choose this server form factor when your overall configuration will consist of an average to large number of nodes, and each node is still required to have substantial capacity versus the comparable capacity available in a blade server.

Blade Server Chassis
Blade server chassis are a somewhat newer server form factor that further integrates power and data cabling and achieves greater computing power density in a given space. This increased ratio of computing power per square foot requires additional attention to cooling, and many early adopters who disregarded this fact learned a hard lesson in airflow management. It is important to note that blade servers still do not have the internal capacity that rack-mount servers do, even though four-processor blades are available and further technological advancements in consolidation continue to be made.
The blade server chassis concept has been such a success that Hewlett Packard has designed a smaller chassis that accommodates eight or fewer servers in only 6U of space. IBM has a small chassis that can hold six server blades in a 6U chassis and also includes space for a significant amount of shared storage. Additional server manufacturers like DELL and Fujitsu also continue to invest in further refining the blade server form factor. The point is that a reasonable ROI (Return-on-Investment) can be achieved even with these smaller server densities because of the many benefits offered by the blade form factor.
Blade server chassis allow us to achieve much greater computing power density in a given space. This design eases management, greatly reduces required cabling, is substantially more efficient with regard to power usage and cooling requirements and achieves an ROI in a realistic period of time. Choose the blade server chassis form factor when each server blade or node can be of a relatively small capacity and when you need to achieve great server densities and scalability. You should also give strong consideration to the blade server chassis because less infrastructure is simpler to manage.


Hybrid Blade/Rack Grid Chassis
The IBM System x® iDataPlex™ and the Hewlett Packard ProLiant SL Scalable System represent the latest thinking in extremely large scale-out computing. It is reasonable to label these solutions as hybrids, as they blend design aspects of conventional rack-mount servers and blade server chassis. These systems represent the current state of the art in x86 compute power density. In order to achieve these great densities, certain compromises had to be made. For example, each server node typically does not have redundant power. This is acceptable given the intended usage of these machines: web server farms or other cluster-aware applications. In this scenario, the failure of a single node, or even a few nodes, does not halt application availability. In fact, it is claimed that server density can be increased by more than 200% over conventional rack-mount server designs.
This concept is the newest twist on server form factor. It is intended for large-scale solutions and can help address constraints in power, cooling and physical space. You should consider these systems if your needs call for truly massive scale-out computing for applications such as Web 2.0, HPC (High-Performance Computing) or very substantial corporate data processing.
What's Next?
Some people say that the challenge going forward seems to be concerns with rolling out and managing large-scale server deployments. In that regard, Unified Computing seems to be the next frontier, at least according to Cisco and Microsoft. Thanks for reading.
Images courtesy of IBM and Hewlett Packard.
Posted on Sat, Jan 09, 2010 @ 03:36 PM
Author: Perry Szarka
Modern data centers and computer rooms house critical business applications and data. It is therefore important to make every reasonable effort to provide a resilient support infrastructure through proactive data center management.
Despite the use of high quality hardware and careful software configuration, problems can still occur if unseen infrastructure weaknesses exist. This is why comprehensive data center assessments can be of great benefit to many organizations.
The goal of a thorough data center assessment is to confirm that the data center environment is running smoothly, and providing a safe and secure home to your organization's most sensitive data and information.
There are, of course, wide-ranging differences between data centers, and therefore assessments should be tailored to the particular data center being evaluated.
However, there are several standard procedures for determining the overall quality of your data center:
- Inspection of the physical space and Thermal Assessment
As menial as it may sound, it's important to inspect the data center's physical environment, including the floors, walls, ceilings and the space under a raised floor (if present). The intent here is to ensure that the space is suitable for the machinery it houses and that no obvious structural concerns exist.
A suboptimal environment may have cooling or airflow problems, and a detailed thermal assessment may be needed to test the airflow. Proper airflow keeps the temperature at a moderate level, which in turn keeps your machines operating smoothly. If physical security is a concern, the physical assessment will also help to determine any additional security needs.
- Inspections of power components
UPS systems, generators, power distribution units (PDUs), transformers, batteries and cabling should be inspected to ensure that a proper balance of power usage is maintained. Load ratings and other data measurements can help provide key metrics for this purpose. Without a proper balance, a power surge or shortage could damage expensive equipment, and stall data storage and backups.
- Equipment Census and Review
Servers, storage arrays, network equipment, appliances, tape devices and any other IT equipment should be analyzed in specific cases, such as for organizations using a virtualized storage solution or when server consolidation is being considered.
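As one illustration of how load ratings from the power inspection can be turned into a metric, the Python sketch below compares measured current against rated capacity for each circuit. The `load_report` function, the circuit names, the amperage figures and the 80% warning threshold are all assumptions chosen for the example, not values from any standard or from a real site.

```python
def load_report(circuits, warn_fraction=0.8):
    """Return (name, fraction_used, over_threshold) for each circuit."""
    report = []
    for name, measured_amps, rated_amps in circuits:
        fraction = measured_amps / rated_amps
        report.append((name, round(fraction, 2), fraction > warn_fraction))
    return report


# Hypothetical readings: (circuit name, measured amps, rated amps)
circuits = [
    ("pdu-a", 12.0, 30.0),
    ("pdu-b", 27.0, 30.0),
]
report = load_report(circuits)
```

Here the second circuit is flagged because it is running at 90% of its rating, while the first sits at 40%, which suggests the load could be rebalanced between them.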
Ultimately, after assessing your environment, you will have all the information you need to accurately determine the stability and strength of your current data center, and identify any areas of weakness that require updates or improvements.
A complete data center assessment will help to ensure that your organization's data is regularly saved, housed in a location that gets the most out of your technology systems, and that sensitive and vital information remains safe and secure.
Image courtesy of The Planet.