The process of importing a module/package (after locating it):
- First, check whether it is already cached in sys.modules. If it is, return the cached module; if not, continue.
- Create a ModuleType object with that name.
- Cache the module in sys.modules.
- Execute the source code inside the module (first appending .py to the module name to get the source path, then assigning that path to __file__).
  - In the case of a package/subpackage, __file__ is assigned the package's __init__.py file.
  - It also executes the __init__.py of every parent package on the path.
- Assign a variable to the module object.
Below is roughly how this is done in Python code:
import sys, types

def import_module(modname):
    # Check if it is in the cache first
    if modname in sys.modules:
        return sys.modules[modname]

    sourcepath = modname + '.py'
    with open(sourcepath, 'r') as f:
        sourcecode = f.read()
    mod = types.ModuleType(modname)
    mod.__file__ = sourcepath

    # Cache the module
    sys.modules[modname] = mod

    # Convert it to Python ByteCode
    code = compile(sourcecode, sourcepath, 'exec')

    # Execute the code in the module from top to bottom
    # And update the state (globals) in the module's dictionary
    exec(code, mod.__dict__)

    # We return the cached one in case there is some patching inside the module
    return sys.modules[modname]
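The final return of sys.modules[modname] matters when a module replaces its own cache entry while it is executing. The sketch below illustrates this with the real import machinery; the module name selfpatching_demo and the class _Replacement are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module that swaps its own sys.modules entry while it runs.
SOURCE = textwrap.dedent("""
    import sys

    class _Replacement:
        answer = 42

    # Patch the cache: later lookups see this object instead of the module.
    sys.modules[__name__] = _Replacement()
""")

with tempfile.TemporaryDirectory() as tmpdir:
    Path(tmpdir, "selfpatching_demo.py").write_text(SOURCE)
    sys.path.insert(0, tmpdir)
    try:
        importlib.invalidate_caches()
        # The import returns whatever is in sys.modules after execution.
        patched = importlib.import_module("selfpatching_demo")
    finally:
        sys.path.remove(tmpdir)

print(type(patched).__name__, patched.answer)
```

The object handed back to the importer is the replacement instance, not the ModuleType object that was created before the code ran.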
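Finally, Python holds a lock while a module is being imported and releases it only when the import is done, so that multiple threads cannot import the same module at the same time. A second thread that asks for the module waits, then receives the finished module from sys.modules.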
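A small experiment can make the locking visible. This is a sketch under assumptions: the module name slow_import_demo, the runs.log file, and the 0.2 second sleep are all made up for the demonstration. Four threads request the same slow module at once, and the log records how many times its body actually ran.

```python
import importlib
import sys
import tempfile
import textwrap
import threading
from pathlib import Path

# Hypothetical slow module: logs each execution of its body, then sleeps.
SLOW_SOURCE = textwrap.dedent("""
    import time
    from pathlib import Path

    with open(Path(__file__).with_name("runs.log"), "a") as log:
        log.write("executed\\n")

    time.sleep(0.2)  # keep the import in progress while other threads arrive
    READY = True     # only set once the module body has finished
""")

results = []

def worker():
    # Every thread asks for the same module at roughly the same time.
    module = importlib.import_module("slow_import_demo")
    results.append(module.READY)

with tempfile.TemporaryDirectory() as tmpdir:
    Path(tmpdir, "slow_import_demo.py").write_text(SLOW_SOURCE)
    sys.path.insert(0, tmpdir)
    try:
        importlib.invalidate_caches()
        threads = [threading.Thread(target=worker) for _ in range(4)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        run_count = len(Path(tmpdir, "runs.log").read_text().splitlines())
    finally:
        sys.path.remove(tmpdir)

print(run_count, results)
```

The module body should run once, and every thread should see READY set, because the waiting threads only get the module after the first import has completed.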
As a result, if a module is imported by several different modules in the same application, its code is executed ONLY once; every later import just returns the cached module from sys.modules. This is very useful for configuration modules that set up application-wide state such as loggers: importing the configuration module in many places still executes it only once.
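To see this behaviour directly, here is a minimal sketch with a made-up configuration module (app_config_demo, LOG_LEVEL and runs.log are hypothetical names). Its top-level code has a side effect, so we can count how many times it was executed.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical config module: its top-level code records each execution.
CONFIG_SOURCE = textwrap.dedent("""
    from pathlib import Path

    # Side effect that would repeat if the module body ran more than once.
    with open(Path(__file__).with_name("runs.log"), "a") as log:
        log.write("configured\\n")

    LOG_LEVEL = "INFO"
""")

with tempfile.TemporaryDirectory() as tmpdir:
    Path(tmpdir, "app_config_demo.py").write_text(CONFIG_SOURCE)
    sys.path.insert(0, tmpdir)
    try:
        importlib.invalidate_caches()
        first = importlib.import_module("app_config_demo")   # executes the module body
        second = importlib.import_module("app_config_demo")  # served from sys.modules
        run_count = len(Path(tmpdir, "runs.log").read_text().splitlines())
    finally:
        sys.path.remove(tmpdir)

print(first is second, run_count)
```

Both imports return the very same module object, and the log contains a single entry, so any setup done at the top level of the module happens once per process.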