•        This plot shows the performance of the learning algorithm for one serve over 20,000 trials.
  •         Plot a, where
      •     a = 1 refers to successful shots: the ball is reached
      •     a = 2 refers to completely successful shots: the ball is reached and a correct shot is hit
    • Here
      • Gamma = 0.9
      • Lambda = 0.95
      • Reward function:
          • 10 for complete success
          • 5 for partial success
          • -1 for failure
          • 0 otherwise
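
The reward scheme and parameters listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used to produce the plot: the names `Outcome`, `reward`, and `discounted_return` are hypothetical, and Gamma and Lambda are assumed to play their conventional roles as discount factor and trace-decay parameter, which the slide does not state explicitly.

```python
from enum import Enum

GAMMA = 0.9    # value from the slide; assumed to be the discount factor
LAMBDA = 0.95  # value from the slide; assumed to be the trace-decay parameter (unused below)


class Outcome(Enum):
    """Hypothetical labels for the result of one step of a serve."""
    COMPLETE_SUCCESS = "complete_success"
    PARTIAL_SUCCESS = "partial_success"
    FAILURE = "failure"
    OTHER = "other"


# Reward values exactly as listed on the slide.
REWARDS = {
    Outcome.COMPLETE_SUCCESS: 10,
    Outcome.PARTIAL_SUCCESS: 5,
    Outcome.FAILURE: -1,
}


def reward(outcome):
    """Return the slide's reward for an outcome; anything unlisted earns 0."""
    return REWARDS.get(outcome, 0)


def discounted_return(rewards, gamma=GAMMA):
    """Sum a reward sequence, discounting each later step by gamma."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

For example, `discounted_return([reward(Outcome.OTHER), reward(Outcome.OTHER), reward(Outcome.COMPLETE_SUCCESS)])` shows how a complete success two steps ahead is worth less than its face value of 10 once Gamma = 0.9 is applied at each step.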