Gluster


Gluster Inc. was a software company that provided an open source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineering center in Bangalore, India. Gluster was funded by Nexus Venture Partners and Index Ventures. Gluster was acquired by Red Hat on October 7, 2011.

History

The name Gluster comes from the combination of the terms GNU and cluster. Despite the similarity in names, Gluster is not related to the Lustre file system and does not incorporate any Lustre code.
Gluster based its product on GlusterFS, an open-source software-based network-attached filesystem that deploys on commodity hardware. The initial version of GlusterFS was written by Anand Babu Periasamy, Gluster's founder and CTO.
In May 2010 Ben Golub became the president and chief executive officer.
Red Hat became the primary author and maintainer of the GlusterFS open-source project after acquiring the Gluster company in October 2011.
The product was first marketed as Red Hat Storage Server, but in early 2015 renamed to be Red Hat Gluster Storage since Red Hat has also acquired the Ceph file system technology.

Architecture

The GlusterFS architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage is considered to be a node. Capacity is scaled by adding additional nodes or adding additional storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.

Public cloud deployment

For public cloud deployments, GlusterFS offers an Amazon Web Services Amazon Machine Image, which is deployed on Elastic Compute Cloud instances rather than physical servers and the underlying storage is Amazon's Elastic Block Storage. In this environment, capacity is scaled by deploying more EBS storage units, performance is scaled by deploying more EC2 instances, and availability is scaled by n-way replication between AWS availability zones.

Private cloud deployment

A typical on-premises, or private cloud deployment will consist of GlusterFS installed as a virtual appliance on top of multiple commodity servers running hypervisors such as KVM, Xen, or VMware; or on bare metal.

GlusterFS

GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks. GlusterFS was developed originally by Gluster, Inc. and then by Red Hat, Inc., as a result of Red Hat acquiring Gluster in 2011.
In June 2012, Red Hat Storage Server was announced as a commercially supported integration of GlusterFS with Red Hat Enterprise Linux. Red Hat bought Inktank Storage in April 2014, which is the company behind the Ceph distributed file system, and re-branded GlusterFS-based Red Hat Storage Server to "Red Hat Gluster Storage".

Design

GlusterFS aggregates various storage servers over Ethernet or Infiniband RDMA interconnect into one large parallel network file system. It is free software, with some parts licensed under the GNU General Public License v3 while others are dual licensed under either GPL v2 or the Lesser General Public License v3. GlusterFS is based on a stackable user space design.
GlusterFS has a client and server component. Servers are typically deployed as storage bricks, with each server running a glusterfsd daemon to export a local file system as a volume. The glusterfs client process, which connects to servers with a custom protocol over TCP/IP, InfiniBand or Sockets Direct Protocol, creates composite virtual volumes from multiple remote servers using stackable translators. By default, files are stored whole, but striping of files across multiple remote volumes is also possible. The client may mount the composite volume using a GlusterFS native protocol via the FUSE mechanism or using NFS v3 protocol using a built-in server translator, or access the volume via the gfapi client library. The client may re-export a native-protocol mount, for example via the kernel NFSv4 server, SAMBA, or the object-based OpenStack Storage protocol using the "UFO" translator.
Most of the functionality of GlusterFS is implemented as translators, including file-based mirroring and replication, file-based striping, file-based load balancing, volume failover, scheduling and disk caching, storage quotas, and volume snapshots with user serviceability.
The GlusterFS server is intentionally kept simple: it exports an existing directory as-is, leaving it up to client-side translators to structure the store. The clients themselves are stateless, do not communicate with each other, and are expected to have translator configurations consistent with each other. GlusterFS relies on an elastic hashing algorithm, rather than using either a centralized or distributed metadata model. The user can add, delete, or migrate volumes dynamically, which helps to avoid configuration coherency problems. This allows GlusterFS to scale up to several petabytes on commodity hardware by avoiding bottlenecks that normally affect more tightly coupled distributed file systems.
GlusterFS provides data reliability and availability through various kinds of replication: replicated volumes and Geo-replication. Replicated volumes ensure that there exists at least one copy of each file across the bricks, so if one fails, data is still stored and accessible. Geo-replication provides a master-slave model of replication, where volumes are copied across geographically distinct locations. This happens asynchronously and is useful for availability in case of a whole data center failure.
GlusterFS has been used as the foundation for academic research
and a survey article.
Red Hat markets the software for three markets: "on-premises", public cloud and "private cloud".