you have an interesting point of view, and some of the things you have said are correct, but if you try to use gradient descent on a function from, say, ℤ → ℝ, you are going to be a very sad xanda. i would indeed describe such a function as being discontinuous not just at π but everywhere, at least with the usual definition of continuity (though there is a sense in which such a function could be, for example, scott-continuous)
even in the case of a single discontinuity in the derivative, like in relu', you lose the intermediate value theorem and everything that follows from it; it's not an inconsequential or marginally relevant fact
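to make that concrete, here is a minimal sketch in python. the names `relu_prime` and `bisect` are made up for illustration, and setting relu'(0) = 0 is just a convention i'm assuming (the derivative does not actually exist at 0). it shows that relu' never takes the value 0.5 even though it takes both 0 and 1, so a bisection search that leans on the intermediate value theorem homes in on a point that is not a solution:

```python
def relu_prime(x: float) -> float:
    # assumed convention: relu'(0) = 0; the true derivative is undefined at 0
    return 1.0 if x > 0 else 0.0


def bisect(f, target: float, lo: float, hi: float, steps: int = 60) -> float:
    # plain bisection: assumes f(lo) < target <= f(hi) and halves the bracket
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# relu' only ever takes two values on a sample of [-1, 1]
xs = [i / 1000 for i in range(-1000, 1001)]
values = {relu_prime(x) for x in xs}

# bisection narrows in on the jump at 0, but there is no root to find there
x_star = bisect(relu_prime, 0.5, -1.0, 1.0)

print(values)               # the set of values relu' takes on the sample
print(x_star)               # a point extremely close to 0
print(relu_prime(x_star))   # still not 0.5
```

for a continuous function the same loop would converge to an actual solution of f(x) = 0.5; here it just converges to the location of the jump.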