With Newton's method, you solve the linear system Hx = -g at each iteration (H = Hessian of f, g = gradient of f, x = the Newton step). For a large number of variables N, building H takes on the order of N^2 work, and solving the system takes on the order of N^3 with a standard dense solver. N^2 and N^3 are enormous when N is already large. I believe the reason is as simple as that. It isn't that writing down the formulas or the code to compute H is tedious or difficult; it is simply too costly, computationally speaking. There is also an increased memory cost, since you may have to store all N^2 entries of H.
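To make the cost concrete, here is a minimal sketch of a single dense Newton step using NumPy. The quadratic test problem, the helper names `newton_step` and `dense_hessian_bytes`, and the choice N = 200 are all assumptions made up for illustration; they are not from any particular library or from the question.

```python
import numpy as np

def newton_step(grad, hess, x):
    """One Newton step: solve H p = -g for the step p, then return x + p."""
    g = grad(x)                  # length-N gradient
    H = hess(x)                  # dense N x N Hessian: O(N^2) memory
    p = np.linalg.solve(H, -g)   # dense factorization and solve: O(N^3) time
    return x + p

def dense_hessian_bytes(n):
    """Memory needed to store an n x n Hessian in float64 (8 bytes per entry)."""
    return 8 * n * n

# Hypothetical test problem: f(x) = 0.5 x^T A x - b^T x,
# so the gradient is A x - b and the Hessian is A.
N = 200
rng = np.random.default_rng(0)
M = rng.standard_normal((N, N))
A = M @ M.T + N * np.eye(N)      # symmetric positive definite
b = rng.standard_normal(N)

x1 = newton_step(lambda x: A @ x - b, lambda x: A, np.zeros(N))
residual = np.linalg.norm(A @ x1 - b)   # gradient norm after one step
```

On a quadratic, one step lands on the minimizer, so `residual` should be at round-off level. The point is the scaling: `dense_hessian_bytes(10**6)` is 8 * 10^12 bytes, roughly 8 TB, before any N^3 solve has even started.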

When people do have a way around this problem (for instance, when H is sparse or has structure that can be exploited), they do use Newton's method for large-scale problems.
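As one illustration of such a workaround (an example of my choosing, not the only one): if you can compute the product Hv cheaply without ever forming H, you can solve the Newton system iteratively with conjugate gradients. The sketch below assumes H is symmetric positive definite; the function names and the tridiagonal test Hessian (4 on the diagonal, -1 on the off-diagonals) are hypothetical.

```python
import numpy as np

def cg_newton_direction(hess_vec, g, tol=1e-10, max_iter=1000):
    """Approximately solve H p = -g using only products v -> H v.

    Assumes H is symmetric positive definite. H is never stored.
    """
    p = np.zeros_like(g)
    r = -g                     # residual of H p = -g at p = 0 (a new array)
    d = r.copy()               # first search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Hd = hess_vec(d)
        alpha = rs / (d @ Hd)
        p += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return p

def hess_vec(v):
    """Product with a hypothetical tridiagonal Hessian, in O(N) time and memory."""
    out = 4.0 * v
    out[1:] -= v[:-1]
    out[:-1] -= v[1:]
    return out

N = 100_000
rng = np.random.default_rng(0)
g = rng.standard_normal(N)
p = cg_newton_direction(hess_vec, g)
check = np.linalg.norm(hess_vec(p) + g)   # should be close to zero
```

Each iteration here costs O(N) instead of O(N^3), and a dense H of this size (about 80 GB in float64) is never built. How well this works in practice depends on the conditioning of H.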