Tuesday 27 January 2015

Algorithm Design and Complexity Analysis

Algorithm
- A sequence of statements which solves the problem logically
- Need not belong to one particular language
- A sequence of English statements can also be an algorithm
- It is not a computer program
- A problem can have many algorithms
Properties of an algorithm –
- Input: zero or more quantities are externally supplied.
- Definiteness: no ambiguity; each statement has a unique meaning.
- Effectiveness: statements should be basic, not complex.
- Finiteness: execution must finish after a finite number of steps.
- Output: the desired output must be produced.
Program design using algorithm

A two-phase process:
1. Problem analysis:
   Analyze and understand the user-defined problem.
   Write the algorithm using the basic algorithm constructs.
2. Implementation:
   Translate the algorithm into the desired programming language.

- An algorithm is just a logical form of the solution to a problem, which can be implemented in any programming language.
- An algorithm always takes some input and produces some desired output.


Program vs. Software
What is a computer program?
A set of instructions to perform some specific task.

Is a program itself software?
No, a program is a small part of software. Software comprises a group of programs (source code), documentation, test cases, input/output descriptions, etc.

Approaches for designing algorithms
Greedy approach –
  At each step, it selects the best possible choice.
  The "greedy" in "greedy algorithm" comes from the idea that we do not revisit the decisions made in previous stages; we simply make the locally best choice at every single stage, and in this way we attempt to arrive at the overall optimal solution.
  Examples: shortest-path algorithms, Kruskal's algorithm and Prim's algorithm for finding minimum spanning trees.
Divide and conquer – (see the sketch after this list)
  a. Divide the problem into smaller subproblems.
  b. Solve each smaller subproblem.
  c. Combine the results of the smaller subproblems.
  Examples: quick sort, merge sort.
Non-recursive –
  Recursion is very powerful, but some compilers do not support this feature. This approach therefore uses iteration to design the algorithm.
Randomized –
  The algorithm is designed around a random feature of the problem, such as randomness in the input data. Example: quick sort.
Modular programming approach –
  Divide the big problem into smaller problems (modules).
  Solve each smaller problem using a suitable algorithm design.
  Combine them to design the algorithm for the big problem.
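
To make the divide-and-conquer steps concrete, here is a minimal merge sort sketch in Java (the class name and sample data are illustrative, not from the original notes):

import java.util.Arrays;

public class MergeSortDemo {
    static void mergeSort(int[] a, int lo, int hi) {
        if (hi - lo <= 1) return;            // base case: 0 or 1 element
        int mid = (lo + hi) / 2;
        mergeSort(a, lo, mid);               // divide: solve the left half
        mergeSort(a, mid, hi);               // divide: solve the right half
        merge(a, lo, mid, hi);               // combine the two sorted halves
    }

    static void merge(int[] a, int lo, int mid, int hi) {
        int[] tmp = new int[hi - lo];
        int i = lo, j = mid, k = 0;
        while (i < mid && j < hi)
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < mid) tmp[k++] = a[i++];
        while (j < hi)  tmp[k++] = a[j++];
        System.arraycopy(tmp, 0, a, lo, tmp.length);
    }

    public static void main(String[] args) {
        int[] a = {33, 44, 67, 205, 23};
        mergeSort(a, 0, a.length);
        System.out.println(Arrays.toString(a)); // [23, 33, 44, 67, 205]
    }
}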

Constructs of algorithm
1. Identifying numbers – each statement should be assigned a number.
2. Comments – comments may be used to make the algorithm easily readable; they are not assigned a step number.
3. Variables – variable names generally use capital letters.
4. Input/output – the Read construct takes input from the user; the Print/Write construct displays a message or a variable's value.
5. Control structures – controlling statements may be of three types:
   Sequential logic – numbered steps execute one after another unless a statement directs otherwise.
   Selection logic – If-Else and its variations select one statement from among many.
   Iteration logic – loops such as while, for and do…while repeat statements.
 
Searching an element in array (Version 1)
Algorithm – An array ARR of N elements is given. K is the element to be searched; C holds the index of K in ARR if it is found, and I is a simple loop variable.
Step 1:   Initialize C = -1
Step 2:   for I=0 to N-1
Step 3:   If ARR[I] == K then,
Step 4:  C = I
  [End of step 3]
  [End of step 2]
Step 5:   If C >= 0 then,
Step 6:   Print “Element is at index C in array ARR”
Step 7:   Else
Step 8:   Print “Element is not found”
  [End of step 5]
Step 9:   Exit
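
A direct Java rendering of Version 1 may look like the following (a minimal sketch; the class name and sample data are illustrative):

public class LinearSearchV1 {
    public static void main(String[] args) {
        int[] arr = {33, 44, 67, 205, 23};
        int k = 67;                      // element to be searched
        int c = -1;                      // index of K if found
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] == k)
                c = i;                   // remember the match
        }
        if (c >= 0)
            System.out.println("Element is at index " + c + " in array arr");
        else
            System.out.println("Element is not found");
    }
}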

Searching an element in array (Version 2)
Algorithm – An array ARR of N elements is given. K is the element to be searched; C holds the index of K in ARR if it is found, and I is a simple loop variable.
Step 1:   Initialize C = -1, I = 0
Step 2:   while I <N
Step 3:   If ARR[I] == K then,
Step 4:  C = I
   [End of step 3]
Step 5:    I = I+1
  [End of step 2]
Step 6:   If C >= 0 then,
Step 7:   Print “Element is at index C in array ARR”
Step 8:   Else
Step 9:   Print “Element is not found”
  [End of step 6]
Step 10:   Exit
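
Version 2 in the same style, with the while loop and explicit increment (again a sketch with illustrative names and data):

public class LinearSearchV2 {
    static int search(int[] arr, int k) {
        int c = -1, i = 0;
        while (i < arr.length) {         // Step 2: loop while I < N
            if (arr[i] == k)             // Step 3
                c = i;                   // Step 4
            i = i + 1;                   // Step 5: advance the index
        }
        return c;
    }

    public static void main(String[] args) {
        int[] arr = {978, 45, 2, 6667};
        int c = search(arr, 45);
        if (c >= 0)
            System.out.println("Element is at index " + c + " in array arr");
        else
            System.out.println("Element is not found");
    }
}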

Concept of Sub-algorithm
Sub-algorithm –
  Instead of writing one algorithm for a large problem, we can decompose the original problem into subproblems; the algorithms for these subproblems are called sub-algorithms. A sub-algorithm is called from the main algorithm by passing the required arguments.
Example –
For binary search, we follow two steps (see the sketch below):
1. Sort the given list.
2. Search for the element or data.
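
A minimal sketch of the idea in Java: the main algorithm calls two sub-algorithms, a sort (here the library's Arrays.sort, for brevity) and a binary search. Names and data are illustrative:

import java.util.Arrays;

public class BinarySearchDemo {
    // Sub-algorithm: binary search on a sorted array; returns index or -1.
    static int binarySearch(int[] a, int k) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            if (a[mid] == k) return mid;
            if (a[mid] < k) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] list = {205, 23, 67, 33, 44};
        Arrays.sort(list);                 // Step 1: sort sub-algorithm
        int idx = binarySearch(list, 67);  // Step 2: search sub-algorithm
        System.out.println("Found at index " + idx);
    }
}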

Algorithm analysis
Aim –
  To understand how to select an algorithm for a given problem.
Assumptions –
1. The algorithms we wish to compare are correct.
2. The algorithm that takes less time is the better one.
Time –
- We do not bother about the exact time taken on any machine,
  since the computational/storage power of different machines varies.
Therefore, let us assume that:
- the cost of executing a statement is some abstract cost ci;
- a constant amount of time is required to execute each line;
- that constant time is unity.

Measuring the running time
Experimental studies –
- Implement the solution; no assumptions are made.
- Write a program implementing the algorithm.
- Run the program with inputs of varying size and composition.
- Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time, as in the sketch below.
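
A minimal sketch of such an experiment (the input sizes, seed, and the linear search used as the algorithm under test are all illustrative):

import java.util.Random;

public class TimingDemo {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        // Double the input size each round and time the same algorithm.
        for (int n = 100_000; n <= 1_600_000; n *= 2) {
            int[] arr = rnd.ints(n).toArray();
            long start = System.currentTimeMillis();
            int c = -1;
            for (int i = 0; i < arr.length; i++)
                if (arr[i] == 12345) c = i;          // linear search
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("n = " + n + ", time = " + elapsed + " ms");
        }
    }
}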
Limitations of experimental studies:
- It is necessary to implement the algorithm, which may be difficult.
- Results may not be indicative of the running time on other inputs not included in the experiment.
- In order to compare two algorithms, the same hardware and software environments must be used.
Asymptotic analysis –
- The growth rate of the algorithm's running time as a function of the input size.
- Expressed using asymptotic notations such as Big O, Omega and Theta.
Beyond experimental studies –
First we develop a high-level description of the algorithm, and we use this description to figure out the running time. This frees us from the limitations of the experimental approach:
1. There is no need to implement the algorithm on any system.
2. The methodology takes into account all possible input instances.
3. It allows us to evaluate the efficiency of the algorithm in a way that is independent of the platform we are using.

Complexity 
- Complexity: the time requirement, generally a function of the size of the input.
- If space is fixed, then only the running time is considered when obtaining the complexity of an algorithm.
- We do not bother about the exact time taken on any machine.
We basically analyze 3 cases for the complexity of algorithms –
1. Best case – Ω (Omega)
2. Average case – θ (Theta)
3. Worst case – O (Big O)

How do we analyze algorithms?
1. First we identify the primitive operations in our pseudo-code.
   A primitive operation is a low-level operation. The following are primitive operations:
   - data movement (assignment)
   - control (branch, subroutine call, return)
   - arithmetic and logical operations (e.g. addition, comparison)
2. Then we count the number of primitive operations executed by the algorithm.

Illustration 1: Searching an element in array
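
The original illustration was an image assigning a cost Ci to each line of the search algorithm. The sketch below is a plausible reconstruction, with counts matching the expression used in the "Relook" section:

public class Illustration1 {
    // Linear search with per-line costs Ci annotated; the worst case
    // (K not present) is assumed for the counts.
    static void search(int[] arr, int n, int k) {
        int c = -1;                                        // C1: 1 time
        for (int i = 0; i < n; i++) {                      // C2 (test): N+1 times; C5 (increment): N times
            if (arr[i] == k)                               // C3: N times
                c = i;                                     // C4: counted N times below
        }
        if (c >= 0)                                        // C6: 1 time
            System.out.println("Element is at index " + c);   // C7: 1 time
        else
            System.out.println("Element is not found");       // C8: 1 time
    }

    public static void main(String[] args) {
        search(new int[]{33, 44, 67, 205, 23}, 5, 100);    // worst case: K is absent
    }
}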

  
Illustration 1 (cont…), array with distinct elements

When the array elements are distinct, the assignment C = I (cost C4) executes at most once, which is why the corresponding expression in the relook below charges 1·C4 instead of N·C4.
Illustration 2: Largest Element in array 
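
Again the original figure was an image; a plausible reconstruction, with costs matching the expression in the "Relook" section:

public class Illustration2 {
    // Largest element with per-line costs Ci annotated.
    static int largest(int[] arr, int n) {
        int max = arr[0];                                  // C1: 1 time
        for (int i = 1; i < n; i++) {                      // C2: n times (as counted below)
            if (arr[i] > max)                              // C3: n-1 times
                max = arr[i];                              // C4: 0 times (descending input) to n-1 times (ascending input)
        }
        return max;                                        // C5: 1 time
    }

    public static void main(String[] args) {
        System.out.println(largest(new int[]{2, 45, 978, 6667}, 4)); // ascending input: C4 runs each iteration
    }
}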


Observations
- Each operation takes a certain amount of time, depending upon the computer system you have. C1, C2, C3, C4, C5, C6, etc. just represent the amounts of time taken by these operations, and they can be in any units.
- But it is often difficult to get precise measurements of the Ci's.
- The running time of an algorithm differs with the nature of the input instance.
- The running time of an algorithm is almost always expressed as a function of the input size.
Note: an instance is a set or sequence of numbers that the user enters.
E.g. 33, 44, 67, 205, 23 is one instance; 978, 45, 2, 6667 is another instance.

Relook on complexity of above algorithms
A relook at the time complexity expressions we have obtained so far
- Searching an element in array
Time complexity
  = 1·C1 + (N+1)·C2 + N·C3 + N·C4 + N·C5 + 1·C6 + 1·C7 + 1·C8
  = C1 + N·C2 + C2 + N·C3 + N·C4 + N·C5 + C6 + C7 + C8
  = N + N + N + N + 5      (taking each Ci = 1)
  = 4N + 5
- Searching an element in array (distinct elements)
Time complexity
  = 1·C1 + (N+1)·C2 + N·C3 + 1·C4 + N·C5 + 1·C6 + 1·C7 + 1·C8
  = C1 + N·C2 + C2 + N·C3 + C4 + N·C5 + C6 + C7 + C8
  = N + N + N + 6          (taking each Ci = 1)
  = 3N + 6

- Largest element in array
Descending order (the input is in descending order, so the first element is already the largest and C4 never executes) –
Time complexity = 1·C1 + n·C2 + (n-1)·C3 + 1·C5
  = C1 + n·C2 + n·C3 - C3 + C5
  = 2n + 1                 (taking each Ci = 1)
Ascending order (the input is in ascending order, so C4 executes n-1 times) –
Time complexity = 1·C1 + n·C2 + (n-1)·C3 + (n-1)·C4 + 1·C5
  = C1 + n·C2 + n·C3 - C3 + n·C4 - C4 + C5
  = 3n                     (taking each Ci = 1)

Asymptotic Notations
- Asymptotic notation is a way of expressing the cost of an algorithm.
- The goal of asymptotic notation is to simplify analysis by getting rid of unneeded information.
Big-O (Worst Case) –
Definition:
  For given functions f(n) and g(n), we say that f(n) = O(g(n)) if there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0.
- Defines a tight upper bound for a function; it tells you how long your algorithm is going to take in the worst case.
- Allows us to keep track of the leading term while ignoring smaller terms.
- f(n) = O(g(n)) implies that g(n) grows at least as fast as f(n), or that f(n) is of order at most g(n).


Big-O Notation – Example 1
Suppose f(n) = 5n and g(n) = n.
- To show that f(n) = O(g(n)), we have to show the existence of a constant C as given earlier. Clearly 5 is such a constant, since f(n) = 5·g(n).
- We could choose a larger C, such as 6, because the definition states that f(n) must be less than or equal to C·g(n), but we usually try to find the smallest one.
Therefore, a constant C exists (we only need one), and f(n) = O(g(n)).

Big-O Notation – Example 2
Show that f(n) = 4n + 5 = O(n).
Solution: to show that f(n) = 4n + 5 = O(n), we need to produce constants C and n0 such that
  f(n) ≤ C·n for all n ≥ n0.
If we try C = 4, it doesn't work, because 4n + 5 is never less than or equal to 4n.
If n0 = 1, C has to be 9, although C could be smaller for greater values of n (for n = 100, C = 5 works). Since the chosen C must work for all n ≥ n0, we use 9:
  4n + 5 ≤ 4n + 5n = 9n for all n ≥ 1.
Since we have produced constants C = 9 and n0 = 1 that work, we can conclude:
  f(n) = O(n)

Big-O Notation – Example 3
If f(n) = n², prove that f(n) ≠ O(n).
To do this, we must show that there cannot exist a constant C that satisfies the big-O definition. We prove this by contradiction.
Suppose there is a constant C that works; then, by the definition of big-O, n² ≤ C·n for all n ≥ n0.
Now let n be any number greater than both C and n0; then n·n > C·n, that is, n² > C·n.
So there exists an n ≥ n0 such that n² > C·n. This contradicts the supposition, so the supposition is false: there is no C that can work for all n.
  f(n) ≠ O(n) when f(n) = n²

More Examples:
1. 3n + 2 = O(n)
3n + 2 ≤ 4n for all n ≥ 2.
2. 3n + 3 = O(n)
3n + 3 ≤ 4n for all n ≥ 3.
3. 100n + 6 = O(n)
100n + 6 ≤ 101n for all n ≥ 6.


Worst Case Analysis (Usually Done)
- In worst-case analysis, we calculate an upper bound on the running time of an algorithm.
- We must know the case that causes the maximum number of operations to be executed.
- For linear search, the worst case happens when the element to be searched (K in our code) is not present in the array. When K is not present, the search compares it with all the elements of ARR one by one. Therefore, the worst-case time complexity of linear search is O(n).
Big-Ω Notation
Big-Ω (Best Case) –
Definition:
  For given functions f(n) and g(n), we say that f(n) = Ω(g(n)) if there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
- Defines a tight lower bound for a function.
- Allows us to keep track of the leading term while ignoring smaller terms.
- f(n) = Ω(g(n)) implies that g(n) grows at most as fast as f(n), or that f(n) is of order at least g(n).


Examples:
1. 3n + 2 = Ω(n)
3n + 2 ≥ 3n for all n ≥ 1.

2. 3n + 3 = Ω(n)
3n + 3 ≥ 3n for all n ≥ 1.

3. 100n + 6 = Ω(n)
100n + 6 ≥ 100n for all n ≥ 0.

Best Case Analysis (Bogus) 
- In best-case analysis, we calculate a lower bound on the running time of an algorithm.
- We must know the case that causes the minimum number of operations to be executed.
- In the linear search problem, the best case occurs when K is present at the first location.
- The number of operations in the best case is constant (not dependent on n), so the best-case time complexity is O(1).

Big-θ Notation
Big-θ (Average Case) –
Definition:
  For given functions f(n) and g(n), we say that f(n) = θ(g(n)) if there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
- f(n) = θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)).
- Defines a tight lower bound and a tight upper bound for a function at the same time.

Example:
 3n + 2 = Θ(n)
3n + 2 ≥ 3n for all n ≥ 2.
3n + 2 ≤ 4n for all n ≥ 2.
So, c1 = 3, c2 = 4 and n0 = 2.

Average Case Analysis (Sometimes done) 
- In average-case analysis, we take all possible inputs and calculate the computing time for each input.
- We then sum all the calculated values and divide the sum by the total number of inputs.
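
For example (a sketch assuming the searched element K is equally likely to be at each of the n positions, and ignoring the not-found case): linear search costs i comparisons when K is at position i, so the average cost is
  (1 + 2 + … + n) / n = (n(n+1)/2) / n = (n + 1) / 2,
which grows linearly, i.e. the average case is θ(n).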

For more understanding, go through the document lec1_Complexity.


