IonFS CLI: Integrating MongoDB as a Metadata Store

Iain Sutherland

Engineer @ Ionburst Cloud

Background

Ionburst Cloud offers a revolutionary way to store data securely and privately in the Cloud, beyond the reach of hackers and unwanted surveillance. Data is transformed and persisted as redundant fragments across collections of storage nodes called Cloudlets™.

Overview

In this post, we look at integrating MongoDB as an additional metadata store for our IonFS CLI.

IonFS was created as an example application that uses the Ionburst SDK to store objects, keeping metadata about those objects in a customer-owned AWS S3 bucket. However, this metadata could equally be stored on any number of platforms, so this post will cover the work done to integrate MongoDB as a metadata store.

First steps

The first step was to clone the IonFS Git repository and add the MongoDB client library, which is available on NuGet.

Once that was done, the coding changes could start.

The MongoDB metadata class

In IonFS there was already a class to deal with S3 interactions:

public class MetadataS3 : IIonFSMetadata

So, we create a new class to deal with MongoDB interactions:

public class MetadataMongoDB : IIonFSMetadata

The MetadataS3 class has a private member that is an instance of the S3Wrapper class, which wraps the Amazon S3 client. Following the same pattern, we create a MongoDBWrapper class that does the same thing for MongoDB and holds the IMongoDatabase instance that the MetadataMongoDB class can access:

public class MongoDBWrapper
{
    internal IMongoDatabase MongoDB;
}

The MongoDBWrapper class will use the application configuration file to determine the details used to connect to the MongoDB instance.
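
As a rough sketch, assuming the connection string lives in appsettings.json under a key such as IonFS:MongoDB:ConnectionString (the key name and constructor shape are assumptions, not the actual IonFS implementation), the wrapper might look something like this:

using Microsoft.Extensions.Configuration;
using MongoDB.Driver;

public class MongoDBWrapper
{
    internal IMongoDatabase MongoDB;

    // Sketch: read the connection string from the application configuration
    // and open the named database. The configuration key is an assumption.
    public MongoDBWrapper(string databaseName)
    {
        IConfiguration configuration = new ConfigurationBuilder()
            .AddJsonFile("appsettings.json", optional: true)
            .Build();

        string connectionString = configuration["IonFS:MongoDB:ConnectionString"];
        MongoClient client = new MongoClient(connectionString);
        MongoDB = client.GetDatabase(databaseName);
    }
}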

What about buckets?

The MetadataS3 class has a single-argument constructor that defines the name of the S3 bucket used to store the metadata; this is passed on to the S3Wrapper class constructor. The bucket name is read from the IonFS configuration file.

Now, MongoDB does not have buckets. The options we have for partitioning data are databases or collections. I decided to use a database for the IonFS metadata, acting as the equivalent of an S3 bucket. The MetadataMongoDB class constructor has a single string argument that is passed on to the MongoDBWrapper class constructor and used as the database name.
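
To make that symmetry explicit, a sketch of the MetadataMongoDB constructor might look like this (the parameter name is an assumption, and the remaining interface members are omitted):

public class MetadataMongoDB : IIonFSMetadata
{
    private readonly MongoDBWrapper _mongo;

    // Sketch: the single string argument names the MongoDB database,
    // just as the MetadataS3 argument names the S3 bucket.
    // Remaining IIonFSMetadata members are omitted from this sketch.
    public MetadataMongoDB(string dataStoreName)
    {
        _mongo = new MongoDBWrapper(dataStoreName);
    }
}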

See the symmetry?

This means that the IonFS program can instantiate MetadataS3 and MetadataMongoDB in the same way, passing a single string as a parameter to the constructor:

IIonFSMetadata md = new MetadataS3("bucket-name");

Or,

IIonFSMetadata md = new MetadataMongoDB("database-name");

In fact, the name fed into either constructor can just be the same entry in the configuration file since it is only a label to partition data. S3 bucket or MongoDB database, it looks the same to the IonFS program.
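
As an illustration, a single configuration value could name the data store and be handed to whichever implementation is selected (a sketch; the configuration key and the useMongo flag are assumptions):

// Sketch: the same label works as an S3 bucket name or a MongoDB database name.
// The configuration key "IonFS:DataStore" and the useMongo flag are assumptions.
string dataStore = configuration["IonFS:DataStore"];

IIonFSMetadata md;
if (useMongo)
{
    md = new MetadataMongoDB(dataStore);
}
else
{
    md = new MetadataS3(dataStore);
}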

What is written to the metadata store?

S3 is a straightforward key/value object store, but MongoDB is a little different. So we need a new class to represent what will be stored in the MongoDB document. With S3, the IonFSMetadata object is merely JSON serialized and written to the S3 bucket. For MongoDB, we need a wrapper class that will handle the ObjectId that is common to all MongoDB documents and contain the JSON serialized metadata.

[BsonIgnoreExtraElements]
[BsonDiscriminator("MongoDBIonFSMetadata")]
public class MongoDBIonFSMetadata : IMongoDBIonFSMetadata
{
    [BsonId]
    public ObjectId _id { get; set; }
    [BsonElement("Key")]
    public string Key { get; set; }
    [BsonElement("Metadata")]
    public string Metadata { get; set; } // JSON serialized metadata
    [BsonElement("DateAdded")]
    public DateTime DateAdded { get; set; }
}

Some additional decoration of the attributes was required to allow the MongoDB SDK functions to operate. I opted to have two main elements, Key and Metadata, that map to the S3 key and value pair.

Implementing the functions

Next, since MetadataMongoDB implements IIonFSMetadata, there are a number of functions that we have to implement using MongoDB instead of an Amazon S3 bucket.

The MetadataS3 class uses the AWS SDK to interact with S3. The typical sequence is: populate a request object, pass it to the SDK function, and receive a response object that can be used to determine the outcome.

With the MongoDB library that we installed from NuGet, the operations are built around accessing a collection.

With S3, operations with the .NET SDK involve constructing a request object and sending that, such as:

s3.GetObjectRequest getRequest = new s3.GetObjectRequest
{
    BucketName = s3.GetBucket(),
    Key = file.FullName
};
using s3.GetObjectResponse response = await s3.S3.GetObjectAsync(getRequest);

With the .NET MongoDB SDK, the process is to establish a collection and then apply a filter to achieve the same thing. Within a MongoDB database, data is partitioned by collection name, so I defined a constant string to ensure that all the metadata stored in the MongoDB database belongs to the same collection.

private const string COLLECTION_NAME = "IonFSMetadata";

So, to get a specific entry, we get the collection and filter it to a single key:

IMongoCollection<IMongoDBIonFSMetadata> objectMetadata = _mongo.MongoDB.GetCollection<IMongoDBIonFSMetadata>(COLLECTION_NAME);
var filter = Builders<IMongoDBIonFSMetadata>.Filter.Eq("Key", file.FullName);
List<IMongoDBIonFSMetadata> metadataList = await objectMetadata.Find(filter).ToListAsync();
// We expect our filter will have only 1 item in the list
IMongoDBIonFSMetadata metadataEntry = metadataList[0];

At this point we have to remember that, while the value for a key in the S3 implementation is a complete serialization of the IonFSMetadata class, our MongoDB implementation encapsulates that within a wrapper class that deals with some MongoDB-specific details. We therefore need to deserialize the Metadata element of the object returned from MongoDB:

data = JsonConvert.DeserializeObject<IonFSMetadata>(metadataEntry.Metadata);
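
Putting those pieces together, a get operation might look roughly like this (a sketch; the method signature and parameter type are assumptions based on the snippets above):

// Sketch: fetch the wrapper document by key, then deserialize the embedded metadata.
public async Task<IonFSMetadata> GetMetadata(IonFSObject file)
{
    IMongoCollection<IMongoDBIonFSMetadata> objectMetadata =
        _mongo.MongoDB.GetCollection<IMongoDBIonFSMetadata>(COLLECTION_NAME);
    var filter = Builders<IMongoDBIonFSMetadata>.Filter.Eq("Key", file.FullName);

    List<IMongoDBIonFSMetadata> metadataList = await objectMetadata.Find(filter).ToListAsync();
    // We expect the filter to match exactly one document
    IMongoDBIonFSMetadata metadataEntry = metadataList[0];

    return JsonConvert.DeserializeObject<IonFSMetadata>(metadataEntry.Metadata);
}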

All that is required now is to implement the public functions that the MetadataS3 class has, but using the MongoDB .NET SDK instead of the AWS .NET SDK. In each case, the MongoDB code has to establish a collection, sometimes filter it down to a single item, and call the appropriate function. For a PUT, a filter is not required:

IMongoCollection<IMongoDBIonFSMetadata> objectMetadata = _mongo.MongoDB.GetCollection<IMongoDBIonFSMetadata>(COLLECTION_NAME);
await objectMetadata.InsertOneAsync(obj);

Here, obj is an instance of the MongoDBIonFSMetadata class described earlier. A delete requires a filter and is simply:

IMongoCollection<IMongoDBIonFSMetadata> objectMetadata = _mongo.MongoDB.GetCollection<IMongoDBIonFSMetadata>(COLLECTION_NAME);
var filter = Builders<IMongoDBIonFSMetadata>.Filter.Eq("Key", fSObject.FullName);
await objectMetadata.DeleteOneAsync(filter);
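
For reference, the obj passed to InsertOneAsync above might be assembled along these lines (a sketch; the metadata variable is an assumed IonFSMetadata instance):

// Sketch: wrap the JSON-serialized IonFS metadata in the MongoDB document class.
MongoDBIonFSMetadata obj = new MongoDBIonFSMetadata
{
    Key = fSObject.FullName,
    Metadata = JsonConvert.SerializeObject(metadata),
    DateAdded = DateTime.UtcNow
};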

Once complete, all that remains is to make the IIonFSMetadata instance created by the IonFS program an instance of MetadataMongoDB instead of MetadataS3:

IIonFSMetadata md = new MetadataMongoDB("database-name");

The program can now be run using a MongoDB database as a metadata store instead of an Amazon S3 bucket.

Going to town with C#

At this point, the program could be redirected in code to use MongoDB instead of S3, essentially locking into a choice at compile time. So why not go a little bit further and make it configurable at runtime, or even allow both to be used at the same time? With C# we get some interesting possibilities.

The approach used is to define the possible storage repositories in the standard .NET appsettings.json file:

"Repositories": [{
"Name": "first-S3",
"Class": "Ionburst.Apps.IonFS.MetadataS3",
"DataStore": "app1-metadata"
},
{
"Name": "second-S3",
"Class": "Ionburst.Apps.IonFS.MetadataS3",
"DataStore": "app2-metadata"
},
{
"Name": "first-mongo",
"Class": "Ionburst.Apps.IonFS.MetadataMongoDB",
"DataStore": "app2-metadata"
}
]

Here, DataStore is a generic name for something specific to the storage platform: for S3 it represents a bucket name, and for MongoDB it represents a database name. The Class attribute is the fully qualified type name of the C# class that implements the storage; the class names seen here, MetadataS3 and MetadataMongoDB, are the available metadata store types.

The IonFS program can read this configuration to build a list of metadata repositories, then use whichever entry in the list is appropriate for the operation at hand.

List<IonFSRepositoryConfiguration> configuredRepositories = new List<IonFSRepositoryConfiguration>();
config.GetSection("IonFS:Repositories").Bind(configuredRepositories);

// Build the repository collection from configuration
Repositories = new List<IonFSRepository>();
foreach (IonFSRepositoryConfiguration configuredRepository in configuredRepositories)
{
    IonFSRepository newRepository = new IonFSRepository
    {
        Repository = configuredRepository.Name,
        DataStore = configuredRepository.DataStore
    };
    Type t = Type.GetType(configuredRepository.Class);
    newRepository.Metadata = (IIonFSMetadata)Activator.CreateInstance(t, configuredRepository.DataStore);
    Repositories.Add(newRepository);
}

The class name in the config file has to be fully qualified so that reflection can instantiate it as an object with Activator.CreateInstance and assign it to the IonFSRepository property Metadata, which is declared as:

public IIonFSMetadata Metadata { get; set; }

We can now fetch a specific metadata store from the repository list depending on what is required:

IIonFSMetadata md = Repositories.Find(r => r.Repository == "second-S3").Metadata;

Finishing up

Given the wide variety of NoSQL technologies currently available, it made sense to explore integrating one with the IonFS application. As we have seen above, it was relatively straightforward to implement a MongoDB metadata store that holds the metadata associated with objects stored in Ionburst Cloud.

You can get started with IonFS and MongoDB here or check out the source code on our GitLab.