This is the kind of thing that works until it spectacularly does not. Parsing XML with regex is fine for simple, well-controlled cases, but it breaks as soon as you hit edge cases. We learned this the hard way trying to parse security questionnaire exports: we started with regex and ended up rewriting everything with a proper XML parser after hitting too many weird formatting issues.
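To make that concrete, here is a rough sketch of the failure mode. The sample XML, the element and attribute names (`questionnaire`, `question`, `answer`, `id`), and both helper functions are made up for illustration and are not taken from any real export format. It only assumes Python's standard `re` and `xml.etree.ElementTree` modules.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical export snippet; element and attribute names are invented.
# All three questions are valid XML, but they are formatted differently.
SAMPLE = """<questionnaire>
  <question id="q1"><answer>Yes</answer></question>
  <question
      id="q2"><answer><![CDATA[No, see <policy> doc]]></answer></question>
  <question id='q3'><answer>Yes &amp; audited</answer></question>
</questionnaire>"""

# Naive pattern: assumes double quotes, a single space before the
# attribute, and plain-text answers with no CDATA or nested markup.
NAIVE = re.compile(r'<question id="(\w+)"><answer>([^<]*)</answer>')


def parse_with_regex(text):
    """Return {id: answer} for whatever the naive pattern happens to match."""
    return dict(NAIVE.findall(text))


def parse_with_parser(text):
    """Return {id: answer} using a real XML parser."""
    root = ET.fromstring(text)
    return {q.get("id"): q.findtext("answer") for q in root.iter("question")}


if __name__ == "__main__":
    print(parse_with_regex(SAMPLE))   # only q1 survives
    print(parse_with_parser(SAMPLE))  # all three, with CDATA and &amp; decoded
```

The regex silently drops two of the three questions (a line break inside the tag, single-quoted attributes, CDATA) without raising anything, which is roughly why these bugs take a while to notice. The parser handles all three and decodes the entity for you.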