The Big O notation what is the explanation?

$\begingroup$

I studying discrete mathematics to use it's knowledge further in programming. Unfortunately one of the most complex things is the littlemost of what is explained.

The definition of the theorem is given as:

Let f and g be the functions from the set of integers or the set of
real numbers to the set of real numbers.
We say that f(x) is O(g(x)) if there are constants C and k such that
|f(x)| <= C|g(x)|, when x > k. 

There are confusions here in itself:

I know that the big O notation is used to deduce the number of steps required to solve a problem by an algorithm, and hence can be used to compare two algorithms and find their relative efficiency. So,

1) What are f(x) and g(x), are they the algorithms we are comparing?

2) What is this C and k in the definition?

3)What is the validity of the definition?

Furthermore, it gives an examples like:

Show that f(x) = x2+2x+1 is O(x2)

I apologize, but I've got to write the solution given:

It says,
"We observe that we can readily estimate the size of f(x) when x > 1 whenever x < x2 and 1 < x2 when x > 1. It follows that, 0<= x2+2x+1 <= 4x2, when x > 1. So, we can take c=4 and k = 1 as witnesses."

This solution is all a mystery to me.(And this is not my homework, really).
What I don't understand:

1)Why not take x = 0? We could still estimate the value of f(x), right?

2) "x < x2 and 1 < x2" when x > 1 , where I think it should had been "x <= x2" and where did the second even come from?

3) where did the 0 ≤ x2 +2x+1, even come from?

4) And most importantly How in the world does it even tell me how many steps do I need to take to solve any problem?(f(x) maybe?)

$\endgroup$

3 Answers

$\begingroup$

Suppose we have two algorithms each of which sorts an array of length $n$. Suppose the first takes $f(n) = n^2 + n + 1 $ steps and the second takes $g(n) = 1000n \log(n) + 200n$ steps. Which is a better (faster) algorithm?

Well for small values of $n$ the first one is faster, while for large values of $n$, which is what matters in a lot of computing situations, the second one will be faster.

This is the essence of what the "big-O" notation conveys.

Other answers may connect this idea to the formal definition. (I might later today.)

$\endgroup$ $\begingroup$

Broadly speaking, $f(x)$ is the runtime of some algorithm as a function of the "size" of the input $x$. You're interested in a brief, approximate description of the behavior of $f(x)$ for large $x$ (because in computing we're frequently dealing with large problems). Big O notation provides such a description, by saying that $f$ is no more than $Cg(x)$ if $x$ is large, for some $C$.

For your specific questions:

  1. $f$ and $g$ are arbitrary functions. In the algorithm context, $x$ is some descriptor of the size of the input to the algorithm, $f$ is the runtime of your algorithm as a function of the size of the input, and $g$ is a reference function that you're comparing $f$ to.

  2. $C$ and $k$ are arbitrary constants. The role of $k$ intuitively is to restrict attention to "large $x$". The role of $C$ is to allow for a hidden constant factor, so that for example $5x=O(x)$.

  3. I don't really know what this means, so I can't help you there.

As for the solution:

  1. Big O notation is all about comparing two functions in some limit (in your context the limit is as $x \to \infty$). Accordingly, small values of the input (less than some fixed constant) may be ignored.

  2. The distinction between $<$ and $\leq$ in this context does not really matter, but indeed if $x>1$ then $x^2>x>1$.

  3. All the terms in $x^2+2x+1$ are positive if $x$ is positive.

  4. $x^2+2x+1$ is the exact number of steps required for some unspecified algorithm to run. The point of Big O notation is to give a more approximate expression (usually because such an exact expression is not available).

$\endgroup$ 8 $\begingroup$

Broadly speaking what Big Oh provides (in the context of algorithm estimation) is an estimate for how long a program is going to take to execute to arrive at a particular result. The formal notation provided simply says.

Some function f(x), in this case the algorithm's runtime, will be no greater than some function g(x) multiplied by some arbitrarily large constant factor for all values x greater than k.

This is often done by taking n to be equal to the size of the input, and then express the runtime of an algorithm as O(f(n)) for some function f(n). For example:

Let's take the following code:

function example1(inputArray) { inputArray.forEach(function(item) { return item + 1 })
}

We would say that this code is O(n), as for an array of size n, the number of distinct computations taken is also n, and therefore, the function best representing the number of distinct computations is f(n) = n

Now let us take another example:

function example1(inputArray) { for(var x = 0; x <= inputArray.length; x+=2) { console.log(item + 2) }
}

In this case we would say that this algorithm if O($\frac{n}{2}$) as the number of computations taken is obviously going to be equivalen the half the length of the input array. Except that that statement is a lie! We would actually still say, that this algorithm is O(n). Why? This is where the C in the definition comes in. Not all operations take the same amount of time! If our code looked something like this:

function example1(inputArray) { inputArray.forEach(function(item) { return incrediblySlowDatabaseOperation })
}

The distinction between n and $\frac{n}{2}$ would stop being significant as the constant error inherent in doing some slow operation would come to dominate any possible error we would accrue by dropping the constant factor $\frac{1}{2}$.

However, if we take something like

function example1(inputArray) { inputArray.forEach(function(item) { inputArray.forEach(function(item) { return item + 1 }) })
}

We would now have O(n^2) which would no longer be a negligible difference.

So to reiterate.

f(x) is a function representing how long your code takes to run for input of size x. g(x) is a function representing how many pure computational steps your code has to take for an input of size x. C is some constant multiplier, representing non algorithmically time bound things your program has to do. k simply implies that your upper bound estimate (may) only start working beyond some minimum input size.

$\endgroup$

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

You Might Also Like