Local and global scope | Intro to CS - Python | Khan Academy
What do you think happens when I run this program? Does it print zero, four, or raise some kind of error? To find out, let's explore variable scope.
The scope of a variable describes the region of the program where we can access it. When we run this program, the computer will store the instructions for the get_fall_damage
function in short-term memory. Then, it'll store the variables max_damage
, damage
, and height
. These all exist in what we call global scope because we define them at the top level of the program. The computer stores them in that base level in its short-term memory so they can be accessed anywhere in the program. They're global.
Then, we execute this function call. So, the computer starts off by creating a new stack frame, which remember is just a separate region of short-term memory. The parameters height
and damage
get stored in that new stack frame. Notice that these don't get stored in that base global level, so we don't overwrite the value of this existing global damage. Instead, we're creating new damage
and height
variables in a different scope. We call this local scope because it's local to this region of memory.
As soon as the function call returns, that stack frame gets popped off, so all those local variables disappear. By the time we get to this last print statement, we only have the global damage available, so this prints zero. What does this tell us? Anytime we assign a variable as a part of a function call, that variable gets stored on that local stack frame. That means we can't use functions to update global state. The only reliable way we can communicate information out of a function is through the return value.
So, instead of trying to update the global variable damage
, we can just return out the result of our damage calculation. When we exit out of that function call, we're back in global scope, so we can store that return value. Okay, but why is Python like this? Technically, we can force Python to access the variable in global scope. Instead, we put the keyword global
and the name of the variable, but we should almost never do this.
Python defaults to local scope for a reason. By the time we're finished, our program may have tons and tons of function definitions. If every variable in every function was in global scope, then every variable and every function parameter would have to have a globally unique name. We couldn't reuse anything. That becomes incredibly difficult to enforce. If I have thousands of lines of code spread across tons of different files, anytime I want to create a new variable, I'd have to first check if that name is used anywhere else in the program.
Otherwise, my function and that other function will start conflicting, where we're overwriting each other's values, which is not what we wanted here. Local scope lets us avoid this problem because we don't need to worry about the larger program state. We can reuse whatever names we want inside our local function scope.
Let's say I change the value of height
to 100. We still have the same four names in global scope and the same two names in local scope. So, what happens when we access max_damage
inside the function? Well, it turns out while we can't modify the value of a global variable from local scope, we can view the value. When the computer looks up a name, it starts its search in the closest scope. Here, it searches the local stack frame first for the name max_damage
.
If it doesn't find it in memory there, it moves its search to the next closest enclosing scope. So, now it'll try to look up max_damage
in global scope, and once it finds it, it uses that value. If it hadn't found it, it would raise a name error when it ran out of scopes to search.
Now, this can be convenient, but it's still not recommended. This function can't guarantee that there exists a global variable named max_damage
, and callers of this function may not realize that's a requirement. So, we're just asking for bugs. It's much safer if I only rely on information passed in through the function parameters. Instead of accessing the global max_damage
, I can add a second parameter that takes in that max_damage
threshold. That way, I can enforce the callers of my function think about what value max_damage
should have instead of just hoping that some global variable has been set correctly.
Our key takeaways: names defined at the top level of a program are in global scope and names defined inside a function are in local scope. There are ways to cross these boundaries, but in good programming practice, we should avoid it as much as possible. Instead, think of your functions like input-output machines. We can communicate information in from an enclosing scope at the beginning through the function arguments, and we can communicate information out to the enclosing scope at the end through the return value. Anything that happens in the middle is a local secret.