Scalability
Scalability is the property of a system to handle a growing amount of work by adding resources to the system.
In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a package delivery system is scalable because more packages can be delivered by adding more delivery vehicles. However, if all packages had to first pass through a single warehouse for sorting, the system would not be scalable, because one warehouse can handle only a limited number of packages.
In computing, scalability is a characteristic of computers, networks, algorithms, networking protocols, programs and applications. An example is a search engine, which must support growing numbers of both users and indexed topics. Webscale is a computer architectural approach that brings the capabilities of large-scale cloud computing companies into enterprise data centers.
In mathematics, scalability mostly refers to closure under scalar multiplication.
Examples
The Incident Command System is used by emergency response agencies in the United States. ICS can scale resource coordination from a single-engine roadside brushfire to an interstate wildfire. The first resource on scene establishes command, with authority to order resources and delegate responsibility. As an incident expands, more senior officers assume command.
Dimensions
Scalability can be measured over multiple dimensions, such as:
- Administrative scalability: The ability for an increasing number of organizations or users to access a system.
- Functional scalability: The ability to enhance the system by adding new functionality without disrupting existing activities.
- Geographic scalability: The ability to maintain effectiveness during expansion from a local area to a larger region.
- Load scalability: The ability of a distributed system to expand and contract to accommodate heavier or lighter loads, including the ease with which a system or component can be modified, added, or removed to accommodate changing loads.
- Generation scalability: The ability of a system to scale by adopting new generations of components.
- Heterogeneous scalability: The ability to adopt components from different vendors.
Domains
- A routing protocol is considered scalable with respect to network size if the size of the necessary routing table on each node grows as O(log N), where N is the number of nodes in the network. Some early peer-to-peer (P2P) implementations of Gnutella had scaling issues. Each node flooded its queries to all other nodes, so the demand on each peer increased in proportion to the total number of peers, quickly overrunning their capacity. Other P2P systems like BitTorrent scale well because the demand on each peer is independent of the number of peers. Nothing is centralized, so the system can expand indefinitely without any resources other than the peers themselves.
- A scalable online transaction processing system or database management system is one that can be upgraded to process more transactions by adding new processors, devices and storage, and which can be upgraded easily and transparently without shutting it down.
- The distributed nature of the Domain Name System allows it to work efficiently, serving billions of hosts on the worldwide Internet.
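The contrast between flooding and bounded per-node state in the first item above can be made concrete with a small model. This is a minimal sketch, not a description of any real protocol implementation: the function names are hypothetical, the flooding model assumes every peer sends each query to every other peer, and the routing table model assumes a logarithmic bound (base 2) on table size.

```python
import math


def flooding_load_per_peer(num_peers: int, queries_per_peer: float) -> float:
    """Queries one peer must handle if every query is flooded to all other peers.

    Assumption: each of the other (num_peers - 1) peers forwards all of its
    queries to this peer, so the load grows linearly with the network size.
    """
    return (num_peers - 1) * queries_per_peer


def log_routing_table_size(num_nodes: int) -> int:
    """Routing table entries per node under an assumed O(log N) scheme."""
    return max(1, math.ceil(math.log2(num_nodes)))


# Compare how the two quantities grow as the network grows.
for n in (10, 1_000, 100_000):
    load = flooding_load_per_peer(n, queries_per_peer=1.0)
    table = log_routing_table_size(n)
    print(f"N={n}: flooding load per peer={load:.0f}, routing table entries={table}")
```

Under these assumptions the flooding load grows by a factor of about 10,000 between the smallest and largest network, while the routing table less than quintuples.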
Horizontal (Scale Out) and Vertical Scaling (Scale Up)
Horizontal or Scale Out
Scaling horizontally means adding more nodes to a system, such as adding a new computer to a distributed software application. An example might involve scaling out from one web server to three. High-performance computing applications such as seismic analysis and biotechnology workloads scaled horizontally to support tasks that once would have required expensive supercomputers. Other workloads, such as large social networks, exceed the capacity of the largest supercomputer and can only be handled by scalable systems. Exploiting this scalability requires software for efficient resource management and maintenance.
Vertical or Scale Up
Scaling vertically means adding resources to a single node, typically involving the addition of CPUs, memory or storage to a single computer.
Scaling out, by contrast, has costs: a larger number of elements increases management complexity and requires more sophisticated programming to allocate tasks among resources and to handle issues such as throughput and latency across nodes. Some applications do not scale horizontally.
Note that network function virtualization defines these terms differently: scaling out/in is the ability to scale by adding or removing resource instances, whereas scaling up/down is the ability to scale by changing allocated resources.
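The example of scaling out from one web server to three can be sketched with a round-robin dispatcher. This is an illustrative model only: the server names (`web-1` and so on), the request labels and the `dispatch` helper are hypothetical, and round-robin is just one of several possible ways to allocate tasks among nodes.

```python
from collections import Counter
from itertools import cycle


def dispatch(requests, servers):
    """Assign each request to a server in round-robin order.

    Returns a list of (request, server) pairs. Assumes `servers` is non-empty.
    """
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


requests = [f"req-{i}" for i in range(6)]

# Before scaling out: a single node receives every request.
one_server = Counter(server for _, server in dispatch(requests, ["web-1"]))

# After scaling out: the same requests are spread over three nodes.
three_servers = Counter(
    server for _, server in dispatch(requests, ["web-1", "web-2", "web-3"])
)

print(one_server)
print(three_servers)
```

Scaling up would leave the server list unchanged and instead increase what a single node can handle, which this model does not represent.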
Database scalability
Scalability for databases requires that the database system be able to perform additional work given greater hardware resources, such as additional servers, processors, memory and storage. Workloads have continued to grow and demands on databases have followed suit.
Algorithmic innovations include row-level locking and table and index partitioning. Architectural innovations include shared-nothing and shared-everything architectures for managing multi-server configurations.
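Table partitioning in a shared-nothing configuration can be illustrated with hash partitioning, where each row is assigned to exactly one server based on its key. This is a minimal sketch under assumptions: the `shard_for` and `partition` helpers are hypothetical, rows are modeled as `(key, value)` pairs, and hashing the key is only one of several partitioning strategies (range and list partitioning are others).

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a row key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() so that the mapping is
    the same across processes and runs.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(rows, num_shards: int):
    """Split (key, value) rows into num_shards disjoint lists."""
    shards = [[] for _ in range(num_shards)]
    for key, value in rows:
        shards[shard_for(key, num_shards)].append((key, value))
    return shards


rows = [(f"customer-{i}", {"balance": i * 10}) for i in range(20)]
shards = partition(rows, num_shards=4)
for index, shard in enumerate(shards):
    print(f"shard {index}: {len(shard)} rows")
```

A design consequence of this simple modulo scheme is that changing `num_shards` remaps most keys, so adding a server means moving a large share of the data.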
Strong versus eventual consistency (storage)
In the context of scale-out data storage, scalability is defined as the maximum storage cluster size which guarantees full data consistency, meaning there is only ever one valid version of stored data in the whole cluster, independent of the number of redundant physical data copies. Clusters which provide "lazy" redundancy by updating copies in an asynchronous fashion are called "eventually consistent". This type of scale-out design is suitable when availability and responsiveness are rated higher than consistency, which is true for many web file hosting services or web caches. For all classical transaction-oriented applications, this design should be avoided.
Many open source and even commercial scale-out storage clusters, especially those built on top of standard PC hardware and networks, provide eventual consistency only. The same applies to some NoSQL databases, such as CouchDB. Write operations invalidate other copies, but often do not wait for their acknowledgements. Read operations typically do not check every redundant copy prior to answering, potentially missing the preceding write operation. The large amount of metadata signal traffic needed to do so would require specialized hardware and short distances to be handled with acceptable performance.
Whenever strong data consistency is expected, look for these indicators:
- the use of InfiniBand, Fibre Channel or similar low-latency networks to avoid performance degradation with increasing cluster size and number of redundant copies.
- short cable lengths and limited physical extent, avoiding signal runtime performance degradation.
- majority / quorum mechanisms to guarantee data consistency whenever parts of the cluster become inaccessible.
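Indicators of an eventually consistent design, which is not suitable for transactional applications, are:
- write performance increases linearly with the number of connected devices in the cluster.
- while the storage cluster is partitioned, all parts remain responsive. There is a risk of conflicting updates.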
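The majority / quorum mechanism named in the list above can be illustrated with simple arithmetic on replica counts. This is a minimal sketch, not the behavior of any particular storage product: the helper names are hypothetical, and the model assumes a fixed set of replicas where a write must reach a strict majority and a read quorum must overlap every write quorum.

```python
def majority(num_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return num_replicas // 2 + 1


def can_write(reachable_replicas: int, num_replicas: int) -> bool:
    """A partition may accept writes only if it holds a majority of replicas."""
    return reachable_replicas >= majority(num_replicas)


def quorums_overlap(read_quorum: int, write_quorum: int, num_replicas: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    return read_quorum + write_quorum > num_replicas


# A 5-replica cluster split by a network partition into groups of 3 and 2.
NUM_REPLICAS = 5
print(can_write(3, NUM_REPLICAS))  # the larger side keeps accepting writes
print(can_write(2, NUM_REPLICAS))  # the smaller side must refuse writes
print(quorums_overlap(read_quorum=3, write_quorum=3, num_replicas=NUM_REPLICAS))
```

Because at most one side of a partition can hold a majority, two sides can never both accept conflicting writes. A design in which all parts stay responsive during a partition gives up exactly this guarantee.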
Performance tuning versus hardware scalability
Substituting the value for this example, using 4 processors we get
If we double the compute power to 8 processors we get
Doubling the processing power has improved the speedup by only roughly one-fifth. If the whole problem were parallelizable, the speedup would also double. Therefore, adding more hardware is not necessarily the optimal approach.
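The diminishing return described here is the behavior predicted by Amdahl's law, under which the maximum speedup on P processors is 1 / (α + (1 − α) / P), where α is the sequential fraction of the work. The following is a minimal sketch under an assumption: the sequential fraction of 0.3 is a hypothetical value chosen for illustration because it yields a gain of about one-fifth when going from 4 to 8 processors.

```python
def amdahl_speedup(sequential_fraction: float, processors: int) -> float:
    """Maximum speedup under Amdahl's law.

    sequential_fraction: share of the work that cannot be parallelized (0..1).
    processors: number of processors the parallel share is spread over.
    """
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / processors)


ALPHA = 0.3  # assumed sequential fraction (illustrative, hypothetical value)

s4 = amdahl_speedup(ALPHA, 4)
s8 = amdahl_speedup(ALPHA, 8)
print(f"4 processors: {s4:.3f}, 8 processors: {s8:.3f}, gain: {s8 / s4 - 1:.1%}")
# prints: 4 processors: 2.105, 8 processors: 2.581, gain: 22.6%
```

With a sequential fraction of zero, the same function returns a speedup equal to the processor count, which corresponds to the fully parallelizable case in which doubling the hardware doubles the speedup.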
Weak versus strong scaling
In the context of high performance computing there are two common notions of scalability:
- The first is strong scaling, which is defined as how the solution time varies with the number of processors for a fixed total problem size.
- The second is weak scaling, which is defined as how the solution time varies with the number of processors for a fixed problem size per processor.
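Both notions are commonly summarized as an efficiency figure computed from measured run times. The sketch below uses the usual definitions: for strong scaling, efficiency is t1 / (N × tN), and for weak scaling it is t1 / tN, where t1 is the time on one processor and tN the time on N processors. The function names and the timings in the usage lines are hypothetical values for illustration, not measurements.

```python
def strong_scaling_efficiency(t1: float, tn: float, processors: int) -> float:
    """Efficiency for a fixed total problem size.

    1.0 means the run time fell in exact proportion to the processor count.
    """
    return t1 / (processors * tn)


def weak_scaling_efficiency(t1: float, tn: float) -> float:
    """Efficiency for a fixed problem size per processor.

    1.0 means the run time stayed constant as processors and work grew together.
    """
    return t1 / tn


# Hypothetical timings in seconds.
print(strong_scaling_efficiency(t1=100.0, tn=30.0, processors=4))  # same problem, 4 processors
print(weak_scaling_efficiency(t1=100.0, tn=110.0))  # 4x the problem on 4 processors
```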