Python Modules and Packages

In this article, we will explain Modules and Packages in Python.

1. Introduction

Python is an easy-to-learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.

The Python interpreter and the extensive standard library are freely available in source or binary form for all major platforms from the Python Website, and may be freely distributed. The same site also contains distributions of and pointers to many free third-party Python modules, programs and tools, and additional documentation.

2. Modules

If you quit the Python interpreter and start it again, the functions and variables you defined are lost. For a longer program, it is better to write the code in a file and run the interpreter with that file as input. This is known as creating a script. As the program grows, you can split it into several files that are easier to maintain. Suppose you wrote a utility function that is needed in several places: keeping it in its own file lets each program import the definition instead of copy-pasting the function everywhere.

To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (the collection of variables that you have access to in a script executed at the top level and in calculator mode).

A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module’s name (as a string) is available as the value of the global variable __name__.
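As a small illustration of __name__, the sketch below reads the attribute from a standard library module (math is used only as a convenient example) and then prints the name of the code that is currently running. The second value depends on how the snippet is started, so no output is shown for it.

```python
import math

# Every imported module carries its own name as a string.
module_name = math.__name__
print(module_name)  # math

# The same global variable exists in the code that is running right now;
# its value depends on whether this code was imported or run directly.
print(__name__)
```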

3. Example

In this section, we will see some working examples. We will create a module and then use it. Let us create a file called square.py with the following content:

def square(num):
  return num ** 2

This file has one function called square() which takes a number and returns its square. Now let’s enter the Python interpreter.

~/study/python$ python3
Python 3.8.1 (v3.8.1:1b293b6006, Dec 18 2019, 14:08:53)
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

Now try to call this newly created function. You will get an error:

>>> square.square(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'square' is not defined
>>>

The name square is not defined in the interpreter session yet, because the module has not been imported. Now import it and run the same command:

>>> import square
>>> square.square(5)
25
>>>

Note that the import statement names only the module (the file name without the .py suffix), not the function. If you intend to use the function often, you can assign it to a local name:

>>> sq = square.square
>>> sq(6)
36
>>>

A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed only the first time the module name is encountered in an import statement. They are also run if the file is executed as a script. Each module has its own private symbol table, which is used as the global symbol table by all functions defined in the module. Thus, the author of a module can use global variables in the module without worrying about accidental clashes with a user’s global variables. If you want, you can also refer to a module’s global variables from outside, using the same notation as for its functions: modname.itemname.
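The sketch below shows that a module's top-level statements run only on the first import. It writes a throwaway module to a temporary directory (the name init_demo and its contents are made up for this illustration), imports it twice while capturing what it prints, and then reads one of its global variables.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module: one executable statement and one global variable.
    Path(tmp, "init_demo.py").write_text(
        "print('initializing init_demo')\n"
        "greeting = 'hello'\n"
    )
    sys.path.insert(0, tmp)          # make the temporary directory importable
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            import init_demo         # first import: top-level statements run
            import init_demo         # second import: module is reused as-is
    finally:
        sys.path.remove(tmp)

run_count = captured.getvalue().count("initializing")
print(run_count)            # 1
print(init_demo.greeting)   # hello
```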

The good thing is that a module can also import other modules. Generally, it’s a good idea to keep all the import statements in one place – mostly at the top, so it’s easier to see the dependency of the module. The imported module names are placed in the importing module’s global symbol table.

In the previous example we imported the whole module, but you can also import the names directly. Let’s modify our existing example: we will rename the file to squareAndCube.py and will introduce a cube function as well:

def square(num):
  return num ** 2

def cube(num):
  return num ** 3

Now let us see how to import names directly:

>>> from squareAndCube import square, cube
>>> square(4)
16
>>> cube(4)
64
>>>

We can also import all the names that a module defines using the following statement:

from squareAndCube import *

This imports all names except those beginning with an underscore (_). It is generally not advised to import everything because it introduces an unknown set of names into the interpreter, possibly hiding some already defined things.
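To see which names a star import brings in, the following sketch builds a small temporary module (star_demo, a name invented for this example) with one public and one underscore-prefixed name, runs the star import inside a separate namespace dictionary, and lists what arrived there.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module with a public function and a private helper value.
    Path(tmp, "star_demo.py").write_text(
        "def square(num):\n"
        "    return num ** 2\n"
        "_helper = 'internal'\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    namespace = {}
    try:
        # Run the star import in its own namespace so the result is easy to inspect.
        exec("from star_demo import *", namespace)
    finally:
        sys.path.remove(tmp)

# exec() adds __builtins__ to the namespace, so dunder names are filtered out.
star_names = sorted(name for name in namespace if not name.startswith("__"))
print(star_names)  # ['square']
```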

We can use as to bind an imported name to a different local name:

>>> from squareAndCube import square as sq
>>> sq(8)
64
>>>

For efficiency reasons, each module is only imported once per interpreter session. Therefore, if you change your modules, you must restart the interpreter. If it is just one module you want to test interactively, you can instead call importlib.reload() on the module object.
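A minimal sketch of this behaviour, assuming a throwaway module named reload_demo created only for the demonstration: the source file is edited after the first import, a repeated import statement has no effect, and importlib.reload() picks up the change. Bytecode caching is switched off here so that a quick edit within the same second cannot be masked by a stale cache entry.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "reload_demo.py")       # hypothetical module
    source.write_text("VERSION = 1\n")
    sys.path.insert(0, tmp)
    previous_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True             # avoid a stale .pyc in this demo
    importlib.invalidate_caches()
    try:
        import reload_demo
        first = reload_demo.VERSION

        source.write_text("VERSION = 2\n")     # edit the module on disk
        import reload_demo                     # no effect: already imported
        second = reload_demo.VERSION

        importlib.reload(reload_demo)          # re-executes the module source
        third = reload_demo.VERSION
    finally:
        sys.dont_write_bytecode = previous_flag
        sys.path.remove(tmp)

print(first, second, third)  # 1 1 2
```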

4. Scripting

In this section we will see how to run our Python module as a script. Let’s create a file sq.py as below:

def square(num):
  print(num ** 2)

if __name__ == "__main__":
  import sys
  square(int(sys.argv[1]))

We can run this file as a script by running the below command:

python3 sq.py 9

When you run a Python module as a script, the code in the module is executed just as if you had imported it, but with __name__ set to "__main__". With the guard shown above, the file can be used both as a script and as an importable module, because the code under the guard runs only when the file is executed directly.
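The effect of the guard can be observed without a command line. The sketch below uses the standard runpy module to execute the same temporary file twice, once under the name "__main__" and once under an ordinary module name. The file name_demo.py is invented for this example, and runpy is only a stand-in here for "run as a script" versus "import".

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "name_demo.py")   # hypothetical file
    script.write_text("ran_as_script = (__name__ == '__main__')\n")

    # run_path() executes the file and returns its global variables as a dict.
    as_script = runpy.run_path(str(script), run_name="__main__")
    as_module = runpy.run_path(str(script), run_name="name_demo")

print(as_script["ran_as_script"])  # True
print(as_module["ran_as_script"])  # False
```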

5. Compilation

To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file. Python checks the modification date of the source against the compiled version to see if it’s out of date and needs to be recompiled. This is a completely automatic process. Also, the compiled modules are platform-independent, so the same library can be shared among systems with different architectures.
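The naming scheme can be inspected from Python itself. This sketch asks importlib where the compiled form of square.py would be stored. The file does not need to exist, and the exact result depends on the interpreter and its version, so no output is shown.

```python
import importlib.util
import sys

# Tag that identifies the interpreter and version used in cached file names.
print(sys.implementation.cache_tag)

# Path where the compiled version of square.py would be cached.
pyc_path = importlib.util.cache_from_source("square.py")
print(pyc_path)
```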

Python does not check the cache in two circumstances. First, it always recompiles and does not store the result for the module that’s loaded directly from the command line. Second, it does not check the cache if there is no source module. To support a non-source (compiled only) distribution, the compiled module must be in the source directory, and there must not be a source module.

6. Standard Modules

Python comes with a library of standard modules. Some modules are built into the interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built-in, either for efficiency or to provide access to operating system primitives such as system calls. sys is one such module and is built into every Python interpreter. It provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.

The built-in function dir() is used to find out which names a module defines. It returns a sorted list of strings:

>>> import sys, sq
>>> dir(sq)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'square']
>>> dir(sys)
['__breakpointhook__', '__displayhook__', '__doc__', '__excepthook__', '__interactivehook__', '__loader__', '__name__', '__package__', '__spec__', '__stderr__', '__stdin__', '__stdout__', '__unraisablehook__', '_base_executable', '_clear_type_cache', '_current_frames', '_debugmallocstats', '_framework', '_getframe', '_git', '_home', '_xoptions', 'abiflags', 'addaudithook', 'api_version', 'argv', 'audit', 'base_exec_prefix', 'base_prefix', 'breakpointhook', 'builtin_module_names', 'byteorder', 'call_tracing', 'callstats', 'copyright', 'displayhook', 'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix', 'executable', 'exit', 'flags', 'float_info', 'float_repr_style', 'get_asyncgen_hooks', 'get_coroutine_origin_tracking_depth', 'getallocatedblocks', 'getcheckinterval', 'getdefaultencoding', 'getdlopenflags', 'getfilesystemencodeerrors', 'getfilesystemencoding', 'getprofile', 'getrecursionlimit', 'getrefcount', 'getsizeof', 'getswitchinterval', 'gettrace', 'hash_info', 'hexversion', 'implementation', 'int_info', 'intern', 'is_finalizing', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1', 'ps2', 'pycache_prefix', 'set_asyncgen_hooks', 'set_coroutine_origin_tracking_depth', 'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit', 'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout', 'thread_info', 'unraisablehook', 'version', 'version_info', 'warnoptions']
>>>

Without arguments, dir() lists the names you have defined currently. dir() does not list the names of built-in functions and variables. If you want a list of those, they are defined in the standard module builtins.
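For instance, the following short sketch checks that a built-in such as len appears in dir(builtins) but not in the list of names defined in the current scope:

```python
import builtins

# Names provided by the interpreter itself.
builtin_names = dir(builtins)
print("len" in builtin_names)    # True
print("print" in builtin_names)  # True

# Names defined in the current scope; built-ins are not listed here.
local_names = dir()
print("len" in local_names)      # False
```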

7. Packages

Packages are a way of structuring Python’s module namespace by using dotted module names. For example, the module name ABC.XYZ designates a submodule named XYZ in a package named ABC. Just like the use of modules saves the authors of different modules from having to worry about each other’s global variable names, the use of dotted module names saves the authors of multi-module packages from having to worry about each other’s module names.

Let us say you are building a system for a supermarket. The code structure might look something like the one below. Please note that this is just an example, not a real application.

com
  __init__.py
  mysupermarket
    __init__.py
    domain
      __init__.py
      order.py
      customer.py
    controller
      __init__.py
      OrderController.py
      CustomerController.py
    service
      __init__.py
      OrderService.py
      CustomerService.py

When importing the package, Python searches through the directories on sys.path looking for the package subdirectory. The __init__.py files are required to make Python treat directories containing the file as regular packages (a directory without one can still be imported as a namespace package, a more advanced feature). This prevents directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file.

Users of the package can import individual modules from the package, for example:

import com.mysupermarket.controller.OrderController

An alternative way of importing the submodule is:

from com.mysupermarket.controller import OrderController

Yet another variation is to import the desired function or variable directly:

from com.mysupermarket.controller.OrderController import placeOrder

This also loads the submodule OrderController, and it makes the function placeOrder() directly available. Note that when using from <package> import <item>, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, like a function, class or variable. The import statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an ImportError exception is raised.
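The three import forms can be tried on a trimmed copy of the layout above. The sketch below creates only the controller branch in a temporary directory; the body of placeOrder() is a placeholder invented for this example, since the article does not define it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    controller = Path(tmp, "com", "mysupermarket", "controller")
    controller.mkdir(parents=True)
    # One empty __init__.py per package level.
    for package_dir in (controller, controller.parent, controller.parent.parent):
        (package_dir / "__init__.py").write_text("")
    # Placeholder implementation, assumed only for this illustration.
    (controller / "OrderController.py").write_text(
        "def placeOrder(item):\n"
        "    return 'ordered ' + item\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import com.mysupermarket.controller.OrderController
        from com.mysupermarket.controller import OrderController
        from com.mysupermarket.controller.OrderController import placeOrder
    finally:
        sys.path.remove(tmp)

results = (
    com.mysupermarket.controller.OrderController.placeOrder("milk"),  # full dotted name
    OrderController.placeOrder("milk"),                               # submodule name
    placeOrder("milk"),                                               # function name
)
print(results)  # ('ordered milk', 'ordered milk', 'ordered milk')
```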

Conversely, when using syntax like import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.
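The standard library offers an easy way to observe this rule. In the sketch below, json is a package, json.decoder is a module inside it, and JSONDecoder is a class defined in that module, so the dotted import fails while the from form works.

```python
try:
    # The last item is a class, not a module or package, so this fails.
    import json.decoder.JSONDecoder
except ModuleNotFoundError as exc:
    dotted_import_error = str(exc)
    print(dotted_import_error)

# The from ... import form accepts any name defined in the module.
from json.decoder import JSONDecoder

decoded = JSONDecoder().decode("[1, 2, 3]")
print(decoded)  # [1, 2, 3]
```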

8. Importing *

If a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it if they don’t see a use for importing * from their package. For example, the file com/mysupermarket/controller/__init__.py could contain the following code:

__all__ = ["OrderController"]

This would mean that from com.mysupermarket.controller import * would import only the one named submodule of the controller package, OrderController.
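A minimal sketch of this behaviour, using a small stand-in package called controller_demo (a name invented here so that it does not clash with anything else): its __init__.py lists only OrderController in __all__, and the star import is run in a separate namespace so the imported names can be listed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp, "controller_demo")   # hypothetical package
    package.mkdir()
    (package / "__init__.py").write_text('__all__ = ["OrderController"]\n')
    (package / "OrderController.py").write_text("NAME = 'orders'\n")
    (package / "CustomerController.py").write_text("NAME = 'customers'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    namespace = {}
    try:
        # Only the submodules listed in __all__ are loaded and bound.
        exec("from controller_demo import *", namespace)
    finally:
        sys.path.remove(tmp)

# exec() adds __builtins__ to the namespace, so dunder names are filtered out.
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['OrderController']
```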

If __all__ is not defined, the statement from com.mysupermarket.controller import * does not import all submodules from the package com.mysupermarket.controller into the current namespace; it only ensures that the package com.mysupermarket.controller has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py.

Mohammad Meraj Zia

Senior Java Developer