I think you’re actually massively overthinking it.

The state of a neural network is described entirely by its parameters, which usually consist of a long array (well, a matrix, or a tensor, or whatever…) of floating point numbers. What is being optimised when a network is trained is these parameters and nothing else.

When you evaluate a neural network on some input (often called performing ‘inference’), that is when the functions we’re talking about are used. You start with the input vector, and you apply all of those functions in order and you get the output vector of the network. The training process also uses these functions, because to train a network you have to perform evaluation repeatedly in between tweaking those parameters to make it better approximate the desired output for each input. Importantly, the functions do not change. They are constant; it’s the parameters that change. The functions are the architecture — not the thing being learned.

Essentially what the parameters represent is how likely each neuron is to be activated (have a high value) if others in the previous layer are. So you can think of the parameters as encoding strengths of connections between each pair of neurons in consecutive layers. Thinking about ‘what path to take through the neural layers’ is way too sophisticated — it’s not doing anything like that.
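To make that concrete, here is a rough sketch of a tiny two-layer network (illustrative TypeScript, not from any real library; the names, shapes and numbers are made up). The functions below are the fixed architecture; training would only ever rewrite the numbers stored in the params object:

    // Illustrative only: a network with one hidden layer.
    // The functions (matVec, relu, forward) never change;
    // training only changes the numbers inside a Params value.
    type Params = { W1: number[][]; b1: number[]; W2: number[][]; b2: number[] };

    // Matrix-vector product plus bias: one layer's worth of 'connection strengths'.
    function matVec(W: number[][], x: number[], b: number[]): number[] {
      return W.map((row, i) => row.reduce((sum, w, j) => sum + w * x[j], b[i]));
    }

    // A fixed nonlinearity applied elementwise.
    function relu(v: number[]): number[] {
      return v.map(x => Math.max(0, x));
    }

    // Inference: apply the same functions, in the same order, every time.
    function forward(p: Params, input: number[]): number[] {
      const hidden = relu(matVec(p.W1, input, p.b1));
      return matVec(p.W2, hidden, p.b2);
    }

Training loops over examples, calls forward, measures the error, and nudges the numbers in W1, b1, W2 and b2; it never swaps out forward, relu or matVec for different functions.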

> Though fundamentally I don't think there is distinction between functions in computer science and mathematics. The program as a whole is effectively a function.

You’re pretty much right about that, but there are two important problems/nitpicks:

(1) We can’t prove (in general) that a given program will halt and evaluate to something (rather than just looping forever) on a given input, so the ‘entire program’ is instead what’s called a partial function. This means that it’s still a function on its domain — but we can’t know what its precise domain is. Given an input, it may or may not produce an output. If it does, though, it’s well defined because it’s a deterministic process.
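A toy illustration of that (my own example, nothing deep): the function below halts for every positive integer anyone has tried, but nobody has proved that it halts for all of them, so we genuinely cannot write down its exact domain.

    // Collatz-style iteration: conjectured, but not proven, to terminate
    // for every positive integer n. As a mathematical object this is a
    // partial function whose precise domain is unknown.
    function collatzSteps(n: number): number {
      let steps = 0;
      while (n !== 1) {
        n = n % 2 === 0 ? n / 2 : 3 * n + 1;
        steps++;
      }
      return steps;
    }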

(2) You’re right to qualify that it’s the whole program that is (possibly) a function. If you take a function from some program that depends on some state in that same program, then clearly that function won’t be a proper ‘mathematical’ function. Sure, if you incorporate that extra state as one of your inputs, it might be, but that’s a different function. You have to remember that in mathematics, unlike in programming, a function consists essentially of three pieces of data: a domain, a codomain, and a ‘rule’. If you want to be set-theoretic and formal about it, this rule is just a subset of the cartesian product of its domain and codomain (it’s a set of pairs of the form (x, f(x))). If you change either of these sets, it’s technically a different function and there are good reasons for distinguishing between these. So it’s not right to say that mathematical functions and functions in a computer program are exactly the same.
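In code terms, roughly (an informal sketch of my own, using finite sets so the pairs can actually be written out): the same squaring rule over two different domains gives two different functions, because as sets of pairs they are not equal.

    // A finite mathematical function represented literally as its set of
    // pairs (x, f(x)). Same rule (squaring), different domains, hence
    // two distinct functions.
    const squareOnThree = new Map([1, 2, 3].map(x => [x, x * x] as [number, number]));
    const squareOnFive  = new Map([1, 2, 3, 4, 5].map(x => [x, x * x] as [number, number]));
    // They agree wherever both are defined, but they are not the same
    // set of pairs, so mathematically they are not the same function.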



I appreciate your responses; sorry, I hope I don't seem like I'm arguing for the sake of arguing.

> Essentially what the parameters represent is how likely each neuron is to be activated (have a high value) if others in the previous layer are. So you can think of the parameters as encoding strengths of connections between each pair of neurons in consecutive layers. Thinking about ‘what path to take through the neural layers’ is way too sophisticated — it’s not doing anything like that.

I'm a little confused. The discussion thus far has been about how neural networks are essentially just compositions of functions, but you are now saying that the function is static and only the parameters change.

But that aside, if these parameters change which neurons are activated, and this activation affects which neurons are activated in the next layer, are these parameters effectively not changing the path taken through the layers?

> Sure, if you incorporate that extra state as one of your inputs, it might be, but that’s a different function.

So say we have this program: "let c = 2; function sum3(a, b) { return a + b + c; } let d = sum3(3, 4);"

I believe you are saying, if we had constructed this instead as

"function(a,b,c) { return a+b+c } let d = 3sum(3,4,2) "

then, this is a different function.

Certainly, these are different in a sense, but at a fundamental level, when you compile this all down and run it, there is an equivalence in the transformation that is happening. That is, the two functions equivalently take some input state A (composed of a, b, c) and return the same output state B, while applying the same intermediary steps (add a to b, then add c to the result). Really, in the first case, where c is defined outside the scope of the function block, the interpreter is effectively producing the function sum3(x, y, c), as it has to, at some point, one way or another, inject c into a + b + c.
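Spelling out the equivalence I have in mind (just an illustrative sketch):

    // Closure form: c is captured from the enclosing scope.
    const c = 2;
    function sum3Closed(a: number, b: number): number {
      return a + b + c;
    }

    // Explicit form: c is passed in as an argument.
    function sum3Open(a: number, b: number, c: number): number {
      return a + b + c;
    }

    // Both perform the same transformation of the state (a, b, c):
    console.log(sum3Closed(3, 4));  // 9
    console.log(sum3Open(3, 4, 2)); // 9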

Similarly, I won't argue that the current formal definition of a function in mathematics is exactly that of a function as it's generally defined in programming.

Rather, what I am saying is that there is an equivalent way to think about and study functions that applies equally to both fields. That is, a function is simply a transformation from A to B, where A and B can be anything, whether bits, numbers, or any other construction in any system. The only primitive distinction to make here is whether A and B are the same thing or different.



