Abstracting the Database in Go

Go as a language is relatively young compared to older and more ubiquitous languages such as Java, C#, Ruby and Python. The great thing about a new language is that developers aren’t bound to patterns of writing code that may have become entrenched in other languages due to the familiarity of the most common frameworks in those languages. For example, in a simple Java Spring Web application, one has the controllers, the JPA repositories, the entities, and the if-else statements in between representing the application logic. The difficulty with a relatively new language is that there isn’t a commonly agreed-upon way to do things, making everyone a potential proselyte, tossed back and forth between the waves of various newly evangelized dogma.

Despite this difficulty, in this storming/norming phase of the language use, there’s a lot that can be learnt from the strengths and mistakes of other languages, frameworks and practice communities. This is also my attempt to solidify my own thinking and learning, and add to the general body of knowledge (or noise).

In Go, it’s easy to write absolutely everything in one main.go file. Consider the example of writing an application which records completely hypothetical trade wars between world superpowers¹:

package main

import (
	"context"
	"encoding/json"
	"net/http"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

type TradeWar struct {
  Name string `json:"name" bson:"name"`	
  CountryA string `json:"countryA" bson:"countryA"`
  CountryB string `json:"countryB" bson:"countryB"`
}

func main() {
  mux := http.NewServeMux()
  opts := options.Client().ApplyURI("mongodb://localhost:27017")
  mongoClient, _ := mongo.Connect(context.Background(), opts)
  mux.Handle("/tradewar", http.HandlerFunc(func(writer http.ResponseWriter, request *http.Request) {	
    var postedTradeWar TradeWar		
    json.NewDecoder(request.Body).Decode(&postedTradeWar)		
    mongoClient.Database("tradewars").Collection("wars").InsertOne(context.Background(), postedTradeWar)	
  }))	
  http.ListenAndServe(":4000", mux)
}

To explain what the above code does: it first declares a type called TradeWar, giving the property names in both JSON and BSON for saving in a Mongo database. Within the main function, the mux router is created, which will be used to route the HTTP requests. A mongoClient is also created to save the details in the Mongo database. Lastly, the request handler is declared for the route /tradewar. Within this function, the JSON is decoded into the TradeWar struct created earlier and saved in the database through the mongoClient. Note that, to keep the example short, the errors returned by these calls are ignored. All simple and well. Except that, though applications normally start simple, they don’t always remain this simple.

If one wanted to change the database (e.g. deciding to use PostgreSQL instead of MongoDB), would the TradeWar struct now have to change, along with every place in the application that accesses the database? In large applications, that is a lot of risky change. It helps if there are tests to mitigate the riskiness of the change, but if the various parts of the application aren’t abstracted, the tests are probably tightly coupled to the chosen database implementation.

To make future changes easier and to keep the code maintainable, the data layer can be abstracted, so that the application logic depends on that abstraction instead of on the concrete implementation.

In the above example, we could create a TradeWarsStore², which is the abstraction for storing the trade war data.

type TradeWarsStore interface {
  Save(ctx context.Context, tradeWar TradeWar) error
}

The TradeWar struct will no longer have bson details, as the underlying implementation may not be a Mongo database. Also, even if it remains a Mongo database, the collections we choose to store it in may change. In implementing such a change, we only need to change the code in one place, with the rest of the code oblivious to this detail.

Lastly, in the Save function, the context is added to allow greater control over cancelling operations or applying timeouts when needed, and it can be used across various database implementations.
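As a rough sketch of what the context buys us, the snippet below bounds a single Save call with a timeout. The slowStore type, the saveWithTimeout helper and the two durations are hypothetical names invented for this illustration; they simulate a slow database rather than talking to a real one.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// TradeWar and TradeWarsStore mirror the types from the article.
type TradeWar struct {
	Name     string
	CountryA string
	CountryB string
}

type TradeWarsStore interface {
	Save(ctx context.Context, tradeWar TradeWar) error
}

// slowStore is a hypothetical store that simulates a slow database call.
type slowStore struct {
	delay time.Duration
}

// Save waits for the simulated delay, but gives up as soon as ctx is done.
func (s slowStore) Save(ctx context.Context, tradeWar TradeWar) error {
	select {
	case <-time.After(s.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Illustrative durations, chosen only for this sketch.
const (
	slowDelay    = 200 * time.Millisecond
	shortTimeout = 10 * time.Millisecond
)

// saveWithTimeout bounds a single Save call with a deadline.
func saveWithTimeout(store TradeWarsStore, tradeWar TradeWar, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return store.Save(ctx, tradeWar)
}

func main() {
	err := saveWithTimeout(slowStore{delay: slowDelay}, TradeWar{Name: "steel"}, shortTimeout)
	if errors.Is(err, context.DeadlineExceeded) {
		fmt.Println("save abandoned: the store took too long")
	}
}
```

Because the deadline travels inside the context, the caller does not need to know which database sits behind the interface; any implementation that honours ctx can be cancelled the same way.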

When refactoring the above code, we can move the save logic to the new implementation of the interface:

type MongoTradeWarsStore struct {
  mongoClient *mongo.Client
}

func NewMongoTradeWarsStore(client *mongo.Client) TradeWarsStore {
  return &MongoTradeWarsStore{mongoClient: client}
}

func (store *MongoTradeWarsStore) Save(ctx context.Context, tradeWar TradeWar) error {
  database := store.mongoClient.Database("tradewars")
  collection := database.Collection("wars")
  _, err := collection.InsertOne(ctx, tradeWar)
  return err
}

Within the Save function, we can map the TradeWar struct to some kind of entity struct that better represents the database schema (a word I’m using loosely), so that the rest of the application logic is not closely coupled to the database representation of that logic. We could also have directly created a bson structure that would have represented what we wanted the data structure to look like.
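A minimal sketch of that mapping, assuming a hypothetical tradeWarEntity struct and toEntity helper (neither appears in the code above): the domain struct carries no storage tags, and only the entity knows the field names used in the database.

```go
package main

import "fmt"

// TradeWar is the domain type; it carries no database-specific tags.
type TradeWar struct {
	Name     string
	CountryA string
	CountryB string
}

// tradeWarEntity is a hypothetical persistence-only struct. Its bson tags
// describe how the document is stored and are invisible to the rest of
// the application.
type tradeWarEntity struct {
	Name     string `bson:"name"`
	CountryA string `bson:"countryA"`
	CountryB string `bson:"countryB"`
}

// toEntity converts the domain value into its stored representation.
// Inside Save, the result would be handed to InsertOne in place of the
// domain struct.
func toEntity(tradeWar TradeWar) tradeWarEntity {
	return tradeWarEntity{
		Name:     tradeWar.Name,
		CountryA: tradeWar.CountryA,
		CountryB: tradeWar.CountryB,
	}
}

func main() {
	entity := toEntity(TradeWar{Name: "steel", CountryA: "A", CountryB: "B"})
	fmt.Printf("%+v\n", entity)
}
```

The two structs look identical today, which can feel like duplication; the benefit only shows up once the stored shape and the domain shape start to diverge.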

The handler function within the main function now doesn’t need to know about the underlying implementation of the database. It just needs to know that there is a store and that it saves stuff.

func main() {	
  mux := http.NewServeMux()
  opts := options.Client().ApplyURI("mongodb://localhost:27017")
  mongoClient, _ := mongo.Connect(context.Background(), opts)
  store := NewMongoTradeWarsStore(mongoClient)
  mux.Handle("/tradewar", http.HandlerFunc(func(writer http.ResponseWriter, request *http.Request) {
    var postedTradeWar TradeWar	
    json.NewDecoder(request.Body).Decode(&postedTradeWar)	
    store.Save(context.Background(), postedTradeWar)	
  }))	
  http.ListenAndServe(":4000", mux)
}

Abstractions make the code more testable, because unit tests do not need to rely on concrete implementations but only on the abstract interfaces, which allows the creation of mocks representing the abstracted types. All of this is in an effort to make the code more maintainable, and easier and less risky to change.
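To make the testability point concrete, here is a hedged sketch of exercising the handler without any database. The fakeTradeWarsStore type and the newTradeWarHandler and postTradeWar helpers are assumptions made for this example, as are the status codes and error handling, which the handler above does not have.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type TradeWar struct {
	Name     string `json:"name"`
	CountryA string `json:"countryA"`
	CountryB string `json:"countryB"`
}

type TradeWarsStore interface {
	Save(ctx context.Context, tradeWar TradeWar) error
}

// fakeTradeWarsStore is a hypothetical test double that records what it
// was asked to save instead of talking to a database.
type fakeTradeWarsStore struct {
	saved []TradeWar
}

func (f *fakeTradeWarsStore) Save(ctx context.Context, tradeWar TradeWar) error {
	f.saved = append(f.saved, tradeWar)
	return nil
}

// newTradeWarHandler builds the handler around any TradeWarsStore.
func newTradeWarHandler(store TradeWarsStore) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var posted TradeWar
		if err := json.NewDecoder(r.Body).Decode(&posted); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		if err := store.Save(r.Context(), posted); err != nil {
			http.Error(w, "could not save", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusCreated)
	})
}

// postTradeWar sends one in-memory POST to the handler and returns the
// resulting HTTP status code.
func postTradeWar(store TradeWarsStore, body string) int {
	request := httptest.NewRequest(http.MethodPost, "/tradewar", strings.NewReader(body))
	recorder := httptest.NewRecorder()
	newTradeWarHandler(store).ServeHTTP(recorder, request)
	return recorder.Code
}

func main() {
	fake := &fakeTradeWarsStore{}
	status := postTradeWar(fake, `{"name":"steel","countryA":"A","countryB":"B"}`)
	fmt.Println(status, len(fake.saved))
}
```

In a real project the body of main would live in a _test.go file; the point is only that nothing here needs a running MongoDB instance.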

This may all look familiar to people who’ve done so-called Enterprise Development and have used what was termed the Repository Pattern. In this pattern, you created an interface for a repository for the thing you are saving and retrieving, e.g. UserRepository (or IUserRepository), and your application code then depended on these abstractions, while your IoC Container mapped them to the concrete implementations, or, as in the case of Spring and JPA, the framework did the work for you.

The problem with how the Repository Pattern was implemented most of the time, was that each repository represented a table / collection in the underlying database, therefore not abstracting the implementation of the database. So, if you have a UserRepository, a TradeWarsRepository and a CountryRepository, you know you have User, TradeWars and Country tables or collections in your database. If you change the database structure, then all the code that depends on these repositories also has to change. This high coupling reduces the flexibility of the code, making changes more risky and therefore the code base harder to maintain.

Abstractions should represent domain abstractions and not technical abstractions, something that domain-driven design got right. Therefore, our structs could look like the ones below and still need only one store:

type Country struct {	
  ISOCode string
  Name    string
}

type TradeWar struct {
  Name string
  CountryA Country
  CountryB Country
}

When retrieving a TradeWar, we wouldn’t get all TradeWar items from the TradeWarsStore and all Country items from a CountryStore, and then map them together in our application logic. Instead, when a TradeWar is retrieved from the TradeWarsStore, it comes with its Country items already included. The mapping of the database structure to the domain-specific object is a responsibility of the database-specific packages / modules (the Store in our example), and not of the application logic.
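A sketch of what such a retrieval could look like, under some loud assumptions: the Find method is not part of the TradeWarsStore interface defined earlier, the inMemoryTradeWarsStore type and its two maps merely stand in for two database tables, and the country codes and names are made up.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type Country struct {
	ISOCode string
	Name    string
}

type TradeWar struct {
	Name     string
	CountryA Country
	CountryB Country
}

// tradeWarRow imitates a stored row that references countries by code.
type tradeWarRow struct {
	name         string
	countryACode string
	countryBCode string
}

var errNotFound = errors.New("trade war not found")

// inMemoryTradeWarsStore is a hypothetical store; its maps play the role
// of a "wars" table and a "countries" table.
type inMemoryTradeWarsStore struct {
	wars      map[string]tradeWarRow
	countries map[string]string // ISO code -> country name
}

// Find assembles the full domain object, so callers never see the rows.
func (s *inMemoryTradeWarsStore) Find(ctx context.Context, name string) (TradeWar, error) {
	row, ok := s.wars[name]
	if !ok {
		return TradeWar{}, errNotFound
	}
	return TradeWar{
		Name:     row.name,
		CountryA: s.country(row.countryACode),
		CountryB: s.country(row.countryBCode),
	}, nil
}

func (s *inMemoryTradeWarsStore) country(code string) Country {
	return Country{ISOCode: code, Name: s.countries[code]}
}

// newExampleStore seeds the store with made-up data for the sketch.
func newExampleStore() *inMemoryTradeWarsStore {
	return &inMemoryTradeWarsStore{
		wars:      map[string]tradeWarRow{"steel": {"steel", "AA", "BB"}},
		countries: map[string]string{"AA": "Country A", "BB": "Country B"},
	}
}

// describe is application-level code: it only ever handles TradeWar values.
func describe(store *inMemoryTradeWarsStore, name string) string {
	war, err := store.Find(context.Background(), name)
	if err != nil {
		return err.Error()
	}
	return fmt.Sprintf("%s: %s vs %s", war.Name, war.CountryA.Name, war.CountryB.Name)
}

func main() {
	fmt.Println(describe(newExampleStore(), "steel"))
}
```

If the countries later move into the same table as the wars, or into a different database altogether, only Find changes; describe and the rest of the application logic stay as they are.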

What are your thoughts on abstracting the database?

¹ This was written during the time of the China-United States trade war in 2019.
² I thought of using the word Repository instead of Store, but that word has different baggage depending on whether one comes from the Domain-Driven Design school of thought, or is familiar with the Java JPA libraries for data storage. There is also the Data Access Object, but that also has baggage from other frameworks. From my limited experience, Store is only used for a Redux store in front-end development, which is different enough from this that its associated patterns won’t cause confusion.

Published by Pat Kayongo
