In 2005, Google made an astonishing announcement: it would keep increasing Gmail’s email storage by the second for as long as it had enough space on its servers. Gmail currently provides more than 7338 MB of free storage. It is intriguing to ask how Google can provide this ‘infinite’ storage space to its entire subscriber base. Is it possible at all? After all, only a ‘finite’ amount of storage is available. So how can we get to ‘infinite’ storage?
If you search, you will find plenty of discussion and speculation about the nature of Google’s physical servers, the amount of storage (in petabytes) it owns, and the type of storage it might be using, such as holographic, network, or distributed storage. The discussion also revolves around the idea that most email users will use less than 25%-30% of the available storage, so Google will technically never run out of physical space, or that Google will simply keep buying servers to keep pace with demand.
But for some reason there is not much discussion of the very interesting math and science behind this concept. So I have decided to talk about a simple mathematical model that can be used (based on my understanding, and I am no math whiz kid!) to describe this ‘unlimited’ growth of storage space. If you have observed carefully, the storage space on Gmail kept increasing at a fast pace in the beginning and then slowed down considerably.
“On October 12, 2007 the rate of increase was 5.37 MB per hour.
Approximately a week later, the rate decreased to 1.12 MB per hour, on January 4, 2008 further down to about 3.35 MB per day, or 0.14 MB per hour, and in October 2008 further down to about 353.9 KB per day.” – Wikipedia
How can we achieve this kind of behavior? That is, how do we write a software program (an algorithm) that starts with an initial value, increments it at a fast rate for some period of time, and then slows down (increments at a slower rate)?
To answer that question we need to understand the concept of ‘function growth’, which in simple terms describes how fast the value of a function grows as its input grows. Different families of functions grow at different rates: you can have constant growth O(1), linear growth O(n), exponential growth O(2^n), logarithmic growth O(log n), and so on. Of these, logarithmic growth is the one we are most interested in. Why? Because a log function keeps growing forever, but ever more slowly, which is very similar to the growth observed in the ‘unlimited’ email storage.
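To make the difference concrete, here is a small Python sketch (the helper name `step_increase` is made up for this illustration) that measures how much a function grows when its input goes from n to n + 1. A linear function always gains the same amount, while the gain of a log function keeps shrinking:

```python
import math

def step_increase(f, n):
    """How much f grows when its input goes from n to n + 1."""
    return f(n + 1) - f(n)

def identity(x):
    """Plain linear growth: f(n) = n."""
    return x

for n in (1, 10, 100, 1000):
    linear_step = step_increase(identity, n)
    log_step = step_increase(math.log, n)
    print(f"n={n:5d}  linear step={linear_step}  log step={log_step:.5f}")
```

The log step never reaches zero, so the value keeps climbing, just at an ever slower pace.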
What I am going to do next is create a program that simulates the ‘unlimited’ growth of email storage. Let us make some basic assumptions first. We will assume our initial storage starts at 5000 MB (5 GB). We will increase the storage every second by some ‘factor’. The simulation will run until the storage reaches 10000 MB (10 GB). We will then observe how long it takes to get from 5 GB to 10 GB.
Since we are simulating the growth “every second”, we work with the total number of seconds in a day, which is 60 * 60 * 24 = 86400. We start from second 1, and once we have reached the 86400th second we consider a day to have gone by and start again from second 1. Each second the storage is incremented by the following function: fn = c * log(s) / (s * d), where c is some constant, s is the current second of the day, and d is the current day.
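It may help to see why this function produces the slowdown. Here is a rough derivation, assuming log is the natural logarithm and approximating sums by integrals. The symbols G_d (growth during day d), N = 86400 and D (number of days elapsed) are introduced here for the purpose:

```latex
\begin{align*}
G_d &= \sum_{s=1}^{N} \frac{c \,\ln s}{s\, d}
      \approx \frac{c}{d}\int_{1}^{N}\frac{\ln s}{s}\,ds
      = \frac{c\,(\ln N)^2}{2d}, \qquad N = 86400, \\
\mathrm{size}(D) &= \mathrm{size}(0) + \sum_{d=1}^{D} G_d
      \approx \mathrm{size}(0) + \frac{c\,(\ln N)^2}{2}\,(\ln D + \gamma),
      \qquad \gamma \approx 0.577.
\end{align*}
```

So the daily growth shrinks like 1/d, and the total grows like the logarithm of the number of days: it never stops, but it keeps getting slower.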
The simple code is as follows:
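A minimal Python sketch of this simulation, not necessarily the exact program behind the results discussed below. It assumes the natural logarithm and c = 10; both are my assumptions, chosen because they roughly reproduce the figures quoted in this post (about 8000 MB after 60 days, 10 GB after about 1300 days). Since d is fixed within a day, the sum over the 86400 seconds is computed once and scaled by 1/d, which gives the same result as looping over every second of every day:

```python
import math

SECONDS_PER_DAY = 60 * 60 * 24  # 86400

def simulate(start_mb=5000.0, target_mb=10000.0, c=10.0):
    """Return one (day, size_mb, daily_growth_mb, total_growth_mb) row per day."""
    # Per-second increment is c * log(s) / (s * d). Within one day d is
    # constant, so the day's total is (c / d) * sum(log(s) / s).
    per_day_sum = sum(math.log(s) / s for s in range(1, SECONDS_PER_DAY + 1))
    rows = []
    size = start_mb
    day = 0
    while size < target_mb:
        day += 1
        daily_growth = c * per_day_sum / day
        size += daily_growth
        rows.append((day, size, daily_growth, size - start_mb))
    return rows

rows = simulate()
# Show the first three days and the day the target is reached.
for day, size, daily, total in rows[:3] + rows[-1:]:
    print(f"{day:5d} {size:10.2f} {daily:8.2f} {total:10.2f}")
```

Each printed row holds the day, the storage size at the end of that day, the growth during that day, and the growth since the start.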
And if you run the program you will see the following output:
The first column is the "day", the second column shows the "storage size" at the end of the day, the third column displays the daily growth, and the last shows the overall growth. If you now plot a graph of Day vs Size you will get something like this:
Can you see now what's going on? Starting from day 1, it will take about 1300 days, i.e. approximately 3.5 years, to reach 10 GB. By changing the value of the constant 'c', you can control the overall rate. We also notice that the storage grew by almost 60% (up to 8000 MB) in the first 60 days. Then it slowed down considerably and grew at a much slower pace.
So effectively what we are seeing is that, though the growth happens every second and gives the illusion that we are marching towards infinity, in practical terms it could take years before we run out of physical storage space. And who knows, by then we might have found a way to really have infinite storage.
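To get a feel for how strongly 'c' controls the rate, here is a rough closed-form estimate in Python. It is a sketch under my own assumptions: natural logarithm, each day's growth approximated as c * (ln 86400)^2 / (2 * d), and the sum 1 + 1/2 + ... + 1/D approximated as ln(D) + 0.5772 (the Euler-Mascheroni constant). The function name `approx_days_to_target` is made up for this illustration:

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant
SECONDS_PER_DAY = 86400

def approx_days_to_target(c, start_mb=5000.0, target_mb=10000.0):
    """Estimate the number of days needed to grow from start_mb to target_mb."""
    # Approximate growth on day 1; day d grows by this amount divided by d.
    day_one_growth = c * math.log(SECONDS_PER_DAY) ** 2 / 2
    # We need 1 + 1/2 + ... + 1/D >= harmonic_needed, and that sum is
    # roughly ln(D) + gamma, so D is roughly exp(harmonic_needed - gamma).
    harmonic_needed = (target_mb - start_mb) / day_one_growth
    return math.exp(harmonic_needed - EULER_GAMMA)

for c in (5, 10, 20):
    print(f"c={c:2d}  about {approx_days_to_target(c):,.0f} days")
```

With c = 10 the estimate lands close to the 1300 days mentioned above. Because c ends up inside an exponent, halving or doubling it changes the answer by orders of magnitude rather than by a factor of two.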


