#1 Reason enterprises are leaning toward Opensource Database stacks


Lot of enterprises are becoming more innovative in terms of using the open source technologies on the pay-as-you-go platforms making it convenient for their organization to optimize their IT costs. In this the database stacks are playing majority of the role, I would name few DB stacks PostGreSQL, MariaDB, MySQL, MongoDB etc.With this you’ll be paying only for the computing capacity which means for the underlying Infrastructure or for the maintenance of the hardware.

  • Why enterprises are becoming more Nimble & leaning toward open source DB stacks?

Without going in-depth in to more technical concepts I would try to convince you guys here, making it easy for you to gain the high level knowledge.

  1. Easy to install & configure
  2. Availability of documents in public domain
  3. Connectivity in both Native / JDBC and ODBC
  4. Rich client side command lines
  5. Easy to maintain the DB instances with small team 


  • What are the pain areas using the open source DB stack like PostGreSQL ?

But there are certain pain areas with respect to using this open source DB stacks, INSERT, UPDATE, DELETE DML statements creates bloats in these databases and there would be dead tuples occupying the storage. When DELETE operation is being done the jobs dead tuples would be created utilizing much storage causing the space unavailability for newly INSERTED & UPDATED record set.

In order to clean up the dead tuples and bloats there needs to be a VACUUM activity which needs to be done for each table on the database schema. Think that there is a Datawarehouse schema which is  having the capacity of 5 TB storage, having 1000s of tables with larger data set. It is time killing task to hit the VACUUM command manually on each of the table. Well, there is not only a vacuum task however, there are so many other pre-vacuum and post vacuum activities those needs to be done. Perhaps with my own experience the entire maintenance activity would take 7 to 8 hours per week on each database instance.

Think if you have 16 databases instances for variety of applications including the Build, Test and Prod environments. How much time your team is going to spend on this every week ? surprising right !!


  • What precaution models we have to build to avoid those pain areas ?

Well, nevertheless to say with the right strategic process and governance set, with the extremely high technology mindset. Such above scenario could be tackled by automating those. While doing the vacuum activity several steps are being carried.

  • Extract all the tables names with bloat ratio before the Vacuum
  • Extract all the tables names with Dead Tuples
  • List the tables those needs to be Vacuumed
  • Run the Vacuum on all the tables
  • Extract all the tables names with bloat ratio after the Vacuum
  • Extract all the tables names with after Dead Tuples



  • What is our suggestion 

Incorporate everything in a single intelligent program that runs on the WAN network to Vacuum the N-Number of PostGreSQL RDS instances. Since most of the larger enterprises are the combination of both OnPremise and Cloud by creating a Hybrid cloud economics, I’m bringing in here the concept of WAN. You can also ask me “Why do you want your program to run on the WAN network & not directly on the cloud or EC2 instance ?” Well, the answer is very simple, it is based on the organization’s application and network architectures.





No comments:

Post a Comment

Featured Post

Cloud : A Trendsetter Technology