In the lecture we defined TD_gamma: the sum of the distances needed to shift the "bad" points to the correct side of the margin. This quantity was used to bound the number of mistakes.

If we look at the update rule, it remains the same. Do we assume the "bad" points (inside the margin or on the wrong side of it) will not be given to the algorithm after some time t? If we don't, we can repeatedly feed the perceptron a bad point until the decision boundary becomes completely wrong..
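To make the concern concrete, here is a minimal sketch (with a hypothetical weight vector and a single mislabeled point, not anything from the lecture) showing that repeatedly presenting one bad point keeps triggering the standard perceptron update until the boundary flips:

```python
import numpy as np

# Hypothetical setup: w currently classifies the rest of the data well.
w = np.array([1.0, 0.0])       # decision boundary: sign(w . x)
x_bad = np.array([1.0, 0.0])   # lies on the positive side of w
y_bad = -1                     # but carries the opposite label

for t in range(5):
    if y_bad * np.dot(w, x_bad) <= 0:  # perceptron mistake condition
        w = w + y_bad * x_bad          # standard perceptron update
    print(t, w)

# After a couple of repetitions w has flipped sign, so the boundary is
# now wrong on the points it previously classified correctly.
```

After two updates here w goes from [1, 0] to [-1, 0]; the bad point is then "correct" from the perceptron's perspective, at the cost of everything else.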

In short: in the unrealizable case, is the perceptron still guaranteed to converge?