The process of importing a module/package (after locating it):
- First checks if it is cached. If not, continue
- It creates a
ModuleTypeobject with that name - Cache the module in
sys.modules - Executes the source code inside the module (first prefixing it with .py and then assign
__file__)- In the case of the package/subpackage, it assign it the
__init__.pyfile - It also executes all the
__init__.pyon the path
- In the case of the package/subpackage, it assign it the
- Assign a variable to the module object
Below is in roughly what it is done in Python code:
import sys, types
def import_module(modname):
# Check if it is in the cache first
if modname in sys.modules:
return sys.modules[modname]
sourcepath = modname + '.py'
with open(sourcepath, 'r') as f:
sourcecode = f.read()
mod = types.ModuleType(modname)
mod.__file__ = sourcepath
# Cache the module
sys.modules[modname] = mod
# Convert it to Python ByteCode
code = compile(sourcecode, sourcepath, 'exec')
# Execute the code in the module from top to bottom
# And update the state (globals) in the module's dictionary
exec(code, mod.__dict__)
# We return the cached one in case there is some patching inside the module
return sys.modules[modname]Finally, Python puts a lock when importing a module until it is done so that we don’t have multiple threads trying to import the same module at the same time.
As a result, if a module is imported in different modules in the sample application, it would be imported ONLY once. This is very useful for config files that sets up application’s configuration such as loggers. Importing the configuration module in many places would only lead to executing the module once.