Java Cluster File System Overview

1. What is Java Cluster File System ?

Java Cluster File System (JCFS) is a distributed file system platform which is accessible by means of a simple Java API.

From the point of view of developers JCFS is very simple, you have only to replace java.io.File usages with Rfile and it's all done !

Java 6 class/method	JCFS corrispective	notes
java.io.File	RFile
java.io.FileInputStream	RFileInputStream
java.io.FileOutputStream	RFileOutputStream	"append mode" supported
java.io.file.length()	RFile.length()
java.io.file.delete()	RFile.delete()

Client applications for the filesystem can use this simple API and let the definition of transactionality, replication-requirements and quality-of-service direcly on server-side configuration files.

Client application can deal directly with JCFS API and specify each option at every write.

2. Quality of Service

JCFS offers a reliable and scalable filesystem which can be configured in order to suit every need.

In particular most important aspects are transactional writes and replication constrants.

2.1 Transactional Writes

When writing to a RFileOutputStream you can require that changes to the file MUST be applied only on a successful "close" on the stream (we cannot add a "commit" method in order not to modify the simple semantics of FileOutputStream).

Each client of the filesystem will see changes only after the close of the stream.

A particular case is when the file has to be overwritten and multiple copies of the file are present on different nodes.

Default write mode is "best-effort": using this simple mode files are immediatly readble by other clients (as it happens where you open a file in write mode with one application and in read mode with another)

2.2 Replication constraints

When writing on a file clients (or server-side configuration) can requite that the write has to be considered "done" (and "successful") only when the file has been replicated on at least N other nodes of the cluster.

It is possibile to let the system replicate the file asynchronously during idle time

3. Load balancing

As you will see in the architecture the filesystem is implemented as a network of nodes (peers), each client can be configured to read files from any node and to write files to any node.

This simple but effective property let the architecture be load-balanced naturally.

4. General Architecture

Figure: TODO

Server-side architecture:

1.network of peer-nodes
2.each node uses UDP multicast/broadcast to discover other peers
- 3.static configuration is possible anyway
3.each node uses TCP to exchange files
4.some informations can be propagated with UDP (such as "delete file")

Client-side architecture:

1.JCFS client can discover server-nodes using UDP multicast/broadcast
- 2.static configuration is possible anyway
3.JCFS client use TCP to exchange files with server nodes
4.JCFS client use connection-pool to reduce TCP connection-overhead
5.JCFS client search for files on some random node and use UDP to search files on other nodes in case of file-not-found response
6.JCFS streams are directly mapped to something like streams on server-side nodes and the only synchronization points are on "COMMIT" points (on close)

There is a "loopback" mode in order to let clients work on a real-filesystem, usually for tests or for simple application modes.

5. Usual topologies

Java cluster file system can be configurare in order to obtain:

1.load balanced reads
2.failsafe high available file system

But replicating files on each node could be expensive (disk-space, UDP packets....) and it is not so sane to place a platform which cost is greater than the return !!

This topologies will help you create a network suitable for your needs.

5.1 Loopback mode

There is a single JCFS client which is configured with a local directory that is accessed throught the RFile API.

This mode is useful only for:

1.testing
2.simple configuration of your application when you do not need JCFS but your application is designed to use it !!

5.2 Load balanced mode

There are N server node configured to talk to each other using UDP broadcast (or multicast).

Each client writes and reads randomly on any node (random writes, random reads with udp search in case of file-not-found).

Each file is not replicated on any node.

Variant: configure asynchronous replication of files

Warning: in case of failure of one node some files are not accessible (not-found)

5.3 High-availability mode

There are N server node configured to talk to each other using UDP broadcast (or multicast).

Each client writes and reads randomly on any node (random writes, random reads with udp search in case of file-not-found).

Every write requires replication of each file to at least N nodes.

In case of failure of one node files are guaranteed to be available.

5.4 Master-slave mode

One node is tagged as "master" and other node are tagged as "slaves".

Each client is configured to write only to "master" and to read from any node (or always from master).

Writes are replicated to at least 2 nodes (master + 1 slave or more) in order to assure read-availability in case of master-crash.

Note: the fact the one node is a "master" is not written anywhere ! It is on client-side configuration that you declare to use only that node and not to discover nodes.

Overview

Project Documentation