Probably the main justification is that the analysis and transformation steps needed to compute the vjp and jvp pullbacks of a function (which correspond to reverse- and forward-mode automatic differentiation) require enough of the other machinery of a compiler that they are best done WITHIN a compiler. Then other things become quite natural, too, like producing the tangent vector versions of data structures like tuples and maps!
Moreover, a statically typed language like Swift is a much better starting point for this kind of effort than Python. Array shapes and dimensions are already a type system - you might as well go the whole distance and get all the other safety, readability, and efficiency benefits!
PS shout out for named array axes as the future of array-based (and hence differentiable) programming... see http://nlp.seas.harvard.edu/NamedTensor for a good rationale
Moreover, a statically typed language like Swift is a much better starting point for this kind of effort than Python. Array shapes and dimensions are already a type system - you might as well go the whole distance and get all the other safety, readability, and efficiency benefits!
PS shout out for named array axes as the future of array-based (and hence differentiable) programming... see http://nlp.seas.harvard.edu/NamedTensor for a good rationale