The RASSP in IT Architecture

Quality attributes define how a system should perform over a business cycle or time period. Fundamental to good IT design is forecasting the business peaks and troughs and deciding how the system should behave across them. Under peak load, a system should scale automatically and keep meeting its service level agreements. However, there have been several cases in the past where such quality attributes were ignored, leading to loss of revenue.

Quality attributes can be either run-time or non-run-time attributes. From a business standpoint, the run-time considerations are of prime importance, as no one wants to lose business because of system downtime or a performance lag. This blog post focuses on the run-time considerations an architect should adhere to when making architecture decisions.

The following are considered the most important run-time attributes when designing new systems: Reliability (R), Availability (A), Scalability (S), Security (S), and Performance (P). These key attributes are easy to remember with the acronym RASSP.

Reliability:
Reliability refers to the ability of the system to perform its required functions over the business cycle or time period. The more complex the system, the harder it is to keep it reliable. Reliability often refers to messaging reliability and transaction reliability. Messaging reliability refers to the reliable delivery of messages to their destination. It is addressed using JMS queues with store-and-forward, WS-ReliableMessaging, WS-Reliability, etc. Transaction reliability refers to a system performing all-or-none operations. In the event of failure, the system should roll back and preserve data integrity.
Key Metrics: Probability of Data Loss (PDL), Mean Time To Data Loss (MTTDL). Simulating the system with large volumes of data and realistic scenarios should give a fair idea of its reliability. Nevertheless, the system should ALWAYS have a ROLL-BACK feature in case of an emergency.
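
The all-or-none behaviour described above can be illustrated with a small sketch. It assumes an in-memory SQLite database and a hypothetical `accounts` table; the function names are made up for this example and are not taken from any particular product.

```python
import sqlite3

def make_demo_db():
    """Create a throwaway in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-none unit of work."""
    try:
        # Used as a context manager, the connection commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            row = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                               (src,)).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("unknown account or insufficient funds")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        # The debit made above has already been rolled back at this point.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

A transfer that would overdraw the source account leaves both balances exactly as they were, which is the integrity guarantee a rollback is meant to provide.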

Availability:
Availability refers to the proportion of time the system is available for business use. It is usually expressed as a percentage. A 99.99% availability corresponds to roughly 53 minutes of unplanned downtime in a given year; the often-quoted figure of less than 6 minutes requires 99.999%. Some of the key metrics used in availability measurement are Mean Time To Repair (MTTR), Mean Time Between Failures (MTBF), and failover recovery. Self-healing systems, feedback loops, and hybrid clouds help keep systems available throughout the time period. Availability should be considered at each tier in a 3-tier or N-tier architecture. Failover infrastructure and disaster recovery systems should be in place to deal with emergencies.
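
The availability arithmetic is easy to check with a few lines of code. The sketch below assumes a 365-day year, the standard steady-state relation availability = MTBF / (MTBF + MTTR), and tiers that fail independently and are all required for the system to work; the function names are illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year

def downtime_minutes_per_year(availability_percent):
    """Downtime allowed per year for a given availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR

def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def serial_availability(tier_availabilities):
    """Availability of a chain of tiers that must all be up (independent failures assumed)."""
    result = 1.0
    for availability in tier_availabilities:
        result *= availability
    return result

if __name__ == "__main__":
    for percent in (99.9, 99.99, 99.999):
        print(f"{percent}% -> {downtime_minutes_per_year(percent):.1f} minutes/year")
    print(f"3 tiers at 99.9% each -> {serial_availability([0.999] * 3):.4%}")
```

The last line shows why each tier matters: three tiers that are each 99.9% available give an end-to-end figure below 99.9%.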

Scalability:
Scalability refers to the ability of a system to handle an increase in the number of transactions. Systems can be scaled vertically or horizontally. Vertical scaling (scaling up) refers to adding more resources, such as CPUs, to a single node, whereas horizontal scaling (scaling out) refers to adding more nodes to the existing architecture. Scaling up has limitations from both a budget and a performance perspective.
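
For horizontal scaling, a rough first estimate of how many nodes a forecast peak requires can be sketched as below. This assumes throughput grows linearly with node count, which real systems rarely achieve exactly; the parameter names and the 75% utilization target are hypothetical.

```python
import math

def nodes_needed(peak_tps, per_node_tps, target_utilization=0.75):
    """Estimate the node count for a peak load, keeping each node below a utilization target.

    Assumes linear scale-out: every added node contributes its full capacity.
    """
    if per_node_tps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("per_node_tps must be positive and utilization in (0, 1]")
    usable_tps_per_node = per_node_tps * target_utilization
    return max(1, math.ceil(peak_tps / usable_tps_per_node))

if __name__ == "__main__":
    # Hypothetical forecast: 4,500 transactions/s at peak, 1,000 transactions/s per node.
    print(nodes_needed(4500, 1000))
```

Coordination overhead and shared bottlenecks such as the database usually push the real number higher, so the result is best treated as a lower bound to validate with load tests.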

Security:
Security is of prime concern to both individuals and corporations. Given the amount of information exchanged over the internet, there is an incentive for attackers to intercept data and monetize it. Hence, security should be a top priority when designing a solution. Common security requirements include proper authentication and authorization systems, digital verification (signatures), encryption of data at rest and in transit, necessary firewalls, constant monitoring for malicious code and malware, etc. Besides the privacy issue, data loss carries a huge cost, including government penalties.

Performance:
Performance refers to the responsiveness of the system and is usually measured in terms of latency, throughput, and bandwidth. Any business would prefer a low-latency, high-throughput, high-bandwidth system, but the cost of implementing such systems is very high. Cost tends to rise steeply for each marginal increase in performance squeezed out of the system. An architect should educate and advise the business users on the trade-off between cost and performance. For example, high-frequency trading systems require very high performance to profit from momentary mispricing in the market. Such systems should be tuned from the network up to the application to squeeze out the last bit of performance.
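
Latency and throughput can be measured with a very small harness. The sketch below times repeated calls to an arbitrary operation and reports median latency, 95th-percentile latency (nearest-rank method), and throughput; the `measure` function and its result keys are illustrative names, and a single-threaded loop like this is only a starting point, not a substitute for a proper load test.

```python
import math
import statistics
import time

def measure(operation, runs=200):
    """Time `runs` sequential calls and summarize latency and throughput."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    latencies = []
    start = time.perf_counter()
    for _ in range(runs):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    p95_index = math.ceil(0.95 * runs) - 1  # nearest-rank percentile
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "throughput_per_s": runs / elapsed,
    }

if __name__ == "__main__":
    # Hypothetical workload standing in for a real request handler.
    print(measure(lambda: sum(range(10_000))))
```

Reporting a high percentile alongside the median matters because service level agreements are usually broken by the slow tail, not by the typical request.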

In summary, remember RASSP when designing a new system or enhancing an existing IT system.


