Dissecting the Django Source (setup, runserver, request lifecycle)

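At work I often have to use third-party modules I'm not familiar with, and most of the time I get them working by reading the docs, searching Baidu or Google, or reading the source. After a few years on the job I've read a fair amount of source code, but when an interviewer asked me about Django's lifecycle, all I could say was that, per the WSGI protocol, a request goes through the application callable; I had no idea how the response is produced after that. So I set aside some time to read the Django source.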

I. django.setup()

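Anyone who has used Celery in a Django project knows that before Celery starts you must set "DJANGO_SETTINGS_MODULE" in the OS environment and then call django.setup() to prepare the Django environment; only then does Celery start properly. So what does django.setup() actually do? Here is its source first: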

def setup(set_prefix=True):
    """
    Configure the settings (this happens as a side effect of accessing the
    first setting), configure logging and populate the app registry.
    Set the thread-local urlresolvers script prefix if `set_prefix` is True.
    """
    from django.apps import apps
    from django.conf import settings
    from django.urls import set_script_prefix
    from django.utils.log import configure_logging

    configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)
    if set_prefix:
        set_script_prefix(
            '/' if settings.FORCE_SCRIPT_NAME is None else settings.FORCE_SCRIPT_NAME
        )
    apps.populate(settings.INSTALLED_APPS)

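Start with configure_logging. I won't paste its source. When settings.LOGGING_CONFIG is set (the default is logging.config.dictConfig), it first applies Django's built-in default logging configuration and then, if a LOGGING dict is defined in settings, configures logging from that dict.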
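The behaviour described above can be imitated with the standard library alone. This is only a sketch under assumptions: `DEFAULT_LOGGING_SKETCH`, `PROJECT_LOGGING` and `configure_logging_sketch` are hypothetical names, the "demo" logger is made up, and LOGGING_CONFIG is assumed to be left at its default so that `logging.config.dictConfig` is the function being applied.

```python
import logging
import logging.config

# Hypothetical stand-in for Django's built-in default logging configuration.
DEFAULT_LOGGING_SKETCH = {
    "version": 1,
    "disable_existing_loggers": False,
    "loggers": {"demo": {"level": "INFO"}},
}

# Hypothetical project-level LOGGING dict, as it might appear in settings.py.
PROJECT_LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "loggers": {"demo": {"level": "WARNING"}},
}


def configure_logging_sketch(logging_settings):
    """Apply the defaults first, then layer the project's LOGGING on top."""
    logging.config.dictConfig(DEFAULT_LOGGING_SKETCH)
    if logging_settings:
        logging.config.dictConfig(logging_settings)


# Without a LOGGING dict only the defaults apply.
configure_logging_sketch(None)
level_with_defaults = logging.getLogger("demo").level

# With a LOGGING dict the project's values win.
configure_logging_sketch(PROJECT_LOGGING)
level_with_project = logging.getLogger("demo").level
```

The point of the two-step order is that a project's LOGGING only has to describe what differs from the defaults.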

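Next is set_script_prefix, which doesn't matter much for this walkthrough: a conditional (ternary) expression sets _prefixes.value to "/" or to the value of settings.FORCE_SCRIPT_NAME.

Last is apps.populate(settings.INSTALLED_APPS), whose source is as follows: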

def populate(self, installed_apps=None):
    """
    Load application configurations and models.

    Import each application module and then each model module.

    It is thread-safe and idempotent, but not reentrant.
    """
    if self.ready:
        return

    # populate() might be called by two threads in parallel on servers
    # that create threads before initializing the WSGI callable.
    with self._lock:
        if self.ready:
            return

        # An RLock prevents other threads from entering this section. The
        # compare and set operation below is atomic.
        if self.loading:
            # Prevent reentrant calls to avoid running AppConfig.ready()
            # methods twice.
            raise RuntimeError("populate() isn't reentrant")
        self.loading = True

        # Phase 1: initialize app configs and import app modules.
        for entry in installed_apps:
            if isinstance(entry, AppConfig):
                app_config = entry
            else:
                app_config = AppConfig.create(entry)
            if app_config.label in self.app_configs:
                raise ImproperlyConfigured(
                    "Application labels aren't unique, "
                    "duplicates: %s" % app_config.label)

            self.app_configs[app_config.label] = app_config
            app_config.apps = self

        # Check for duplicate app names.
        counts = Counter(
            app_config.name for app_config in self.app_configs.values())
        duplicates = [
            name for name, count in counts.most_common() if count > 1]
        if duplicates:
            raise ImproperlyConfigured(
                "Application names aren't unique, "
                "duplicates: %s" % ", ".join(duplicates))

        self.apps_ready = True

        # Phase 2: import models modules.
        for app_config in self.app_configs.values():
            app_config.import_models()

        self.clear_cache()

        self.models_ready = True

        # Phase 3: run ready() methods of app configs.
        for app_config in self.get_app_configs():
            app_config.ready()

        self.ready = True
        self.ready_event.set()

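Look first at Apps.__init__(), which initializes a number of attributes, including the app_configs dict that will hold the information for each installed app. The apps object used here is the global app registry (an instance of the Apps class).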

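Now the populate function. It iterates over settings.INSTALLED_APPS and passes each entry (normally a string) to AppConfig.create to build an app config object (app_config). It then adds that object to the global registry's app_configs dict under the key app_config.label, and sets app_config.apps to the global registry. The source of AppConfig.create() is as follows: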

@classmethod
def create(cls, entry):
    """
    Factory that creates an app config from an entry in INSTALLED_APPS.
    """
    # create() eventually returns app_config_class(app_name, app_module).
    app_config_class = None
    app_config_name = None
    app_name = None
    app_module = None

    # If import_module succeeds, entry points to the app module.
    try:
        app_module = import_module(entry)
    except Exception:
        pass
    else:
        # If app_module has an apps submodule that defines a single
        # AppConfig subclass, use it automatically.
        # To prevent this, an AppConfig subclass can declare a class
        # variable default = False.
        # If the apps module defines more than one AppConfig subclass,
        # the default one can declare default = True.
        if module_has_submodule(app_module, APPS_MODULE_NAME):
            mod_path = '%s.%s' % (entry, APPS_MODULE_NAME)
            mod = import_module(mod_path)
            # Check if there's exactly one AppConfig candidate,
            # excluding those that explicitly define default = False.
            app_configs = [
                (name, candidate)
                for name, candidate in inspect.getmembers(mod, inspect.isclass)
                if (
                    issubclass(candidate, cls) and
                    candidate is not cls and
                    getattr(candidate, 'default', True)
                )
            ]
            if len(app_configs) == 1:
                app_config_class = app_configs[0][1]
                app_config_name = '%s.%s' % (mod_path, app_configs[0][0])
            else:
                # Check if there's exactly one AppConfig subclass,
                # among those that explicitly define default = True.
                app_configs = [
                    (name, candidate)
                    for name, candidate in app_configs
                    if getattr(candidate, 'default', False)
                ]
                if len(app_configs) > 1:
                    candidates = [repr(name) for name, _ in app_configs]
                    raise RuntimeError(
                        '%r declares more than one default AppConfig: '
                        '%s.' % (mod_path, ', '.join(candidates))
                    )
                elif len(app_configs) == 1:
                    app_config_class = app_configs[0][1]
                    app_config_name = '%s.%s' % (mod_path, app_configs[0][0])

        # If app_module specifies a default_app_config, follow the link.
        # default_app_config is deprecated, but still takes over the
        # automatic detection for backwards compatibility during the
        # deprecation period.
        try:
            new_entry = app_module.default_app_config
        except AttributeError:
            # Use the default app config class if we didn't find anything.
            if app_config_class is None:
                app_config_class = cls
                app_name = entry
        else:
            message = (
                '%r defines default_app_config = %r. ' % (entry, new_entry)
            )
            if new_entry == app_config_name:
                message += (
                    'Django now detects this configuration automatically. '
                    'You can remove default_app_config.'
                )
            else:
                message += (
                    "However, Django's automatic detection %s. You should "
                    "move the default config class to the apps submodule "
                    "of your application and, if this module defines "
                    "several config classes, mark the default one with "
                    "default = True." % (
                        "picked another configuration, %r" % app_config_name
                        if app_config_name
                        else "did not find this configuration"
                    )
                )
            warnings.warn(message, RemovedInDjango41Warning, stacklevel=2)
            entry = new_entry
            app_config_class = None

    # If import_string succeeds, entry is an app config class.
    if app_config_class is None:
        try:
            app_config_class = import_string(entry)
        except Exception:
            pass
    # If both import_module and import_string failed, it means that entry
    # doesn't have a valid value.
    if app_module is None and app_config_class is None:
        # If the last component of entry starts with an uppercase letter,
        # then it was likely intended to be an app config class; if not,
        # an app module. Provide a nice error message in both cases.
        mod_path, _, cls_name = entry.rpartition('.')
        if mod_path and cls_name[0].isupper():
            # We could simply re-trigger the string import exception, but
            # we're going the extra mile and providing a better error
            # message for typos in INSTALLED_APPS.
            # This may raise ImportError, which is the best exception
            # possible if the module at mod_path cannot be imported.
            mod = import_module(mod_path)
            candidates = [
                repr(name)
                for name, candidate in inspect.getmembers(mod, inspect.isclass)
                if issubclass(candidate, cls) and candidate is not cls
            ]
            msg = "Module '%s' does not contain a '%s' class." % (mod_path, cls_name)
            if candidates:
                msg += ' Choices are: %s.' % ', '.join(candidates)
            raise ImportError(msg)
        else:
            # Re-trigger the module import exception.
            import_module(entry)

    # Check for obvious errors. (This check prevents duck typing, but
    # it could be removed if it became a problem in practice.)
    if not issubclass(app_config_class, AppConfig):
        raise ImproperlyConfigured(
            "'%s' isn't a subclass of AppConfig." % entry)

    # Obtain app name here rather than in AppClass.__init__ to keep
    # all error checking for entries in INSTALLED_APPS in one place.
    if app_name is None:
        try:
            app_name = app_config_class.name
        except AttributeError:
            raise ImproperlyConfigured(
                "'%s' must supply a name attribute." % entry
            )

    # Ensure app_name points to a valid module.
    try:
        app_module = import_module(app_name)
    except ImportError:
        raise ImproperlyConfigured(
            "Cannot import '%s'. Check that '%s.%s.name' is correct." % (
                app_name,
                app_config_class.__module__,
                app_config_class.__qualname__,
            )
        )

    # Entry is a path to an app config class.
    return app_config_class(app_name, app_module)

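app_config_class is each app's config class (the class in the app's apps.py that inherits from AppConfig); if the app defines none, AppConfig itself is used. In the end, create() simply instantiates that class through AppConfig's __init__ and returns the instance.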

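Back in populate: it iterates over the global registry's app_configs.values() to get each app config object and calls its import_models() method. In essence this imports the app's models module (models.py in the app directory), if one exists, and assigns the imported module to the app config's models_module attribute. The relevant part of populate: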

# Phase 2: import models modules.
for app_config in self.app_configs.values():
    app_config.import_models()

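Continuing, populate iterates over the app configs again and calls each one's ready() method. The official Django documentation mentions ready() in its introduction to signals: it is the recommended place to register signal handlers, so that when, for example, a model in the app is saved or deleted and the corresponding signal is sent, the receiver runs its logic. Finally, populate calls self.ready_event.set() (ready_event is a threading.Event), which sets the event flag to True so that anything blocked in ready_event.wait() stops waiting.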
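The overall shape of populate (check the flag, take the lock, check again, refuse reentrant calls, run the three phases, then set the event) can be sketched without Django. `RegistrySketch` and its `calls` list are hypothetical names used only for illustration; the real Apps class does much more in each phase.

```python
import threading


class RegistrySketch:
    """Hypothetical, stripped-down imitation of the populate() control flow."""

    def __init__(self):
        self.ready = False
        self.loading = False
        self.ready_event = threading.Event()
        self._lock = threading.RLock()
        self.calls = []  # records what each phase would have done

    def populate(self, entries):
        if self.ready:  # fast path: already populated
            return
        with self._lock:
            if self.ready:  # another thread finished while we waited
                return
            if self.loading:  # same thread re-entered populate()
                raise RuntimeError("populate() isn't reentrant")
            self.loading = True

            for entry in entries:  # phase 1: build app configs
                self.calls.append(("create", entry))
            for entry in entries:  # phase 2: import models
                self.calls.append(("import_models", entry))
            for entry in entries:  # phase 3: run ready()
                self.calls.append(("ready", entry))

            self.ready = True
            self.ready_event.set()  # wake up anything waiting on the event


registry = RegistrySketch()
registry.populate(["app_a", "app_b"])
registry.populate(["app_a", "app_b"])  # second call is a no-op
```

Calling populate twice leaves six recorded steps rather than twelve, which is what "idempotent" means in the docstring quoted earlier.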

So django.setup() does three things:

  1. Configure global logging
  2. Call set_script_prefix
  3. Prepare the applications in INSTALLED_APPS: create the app config objects -> import each app's models -> call each app config's ready() method

 

II. runserver

runserver is started with python manage.py runserver, so we begin with manage.py, whose source is as follows:

def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'ws_demo.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        raise ImportError(
            "Couldn't import Django. Are you sure it's installed and "
            "available on your PYTHONPATH environment variable? Did you "
            "forget to activate a virtual environment?"
        ) from exc
    execute_from_command_line(sys.argv)


if __name__ == '__main__':
    main()

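Step into execute_from_command_line: it first creates a ManagementUtility instance and then calls the instance's execute method. ManagementUtility's __init__ stores sys.argv and the program name, i.e. the basename of argv[0] (here manage.py). The source of execute is as follows: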

def execute(self):
    """
    Given the command-line arguments, figure out which subcommand is being
    run, create a parser appropriate to that command, and run it.
    """
    try:
        subcommand = self.argv[1]
    except IndexError:
        subcommand = 'help'  # Display help if no arguments were given.

    # Preprocess options to extract --settings and --pythonpath.
    # These options could affect the commands that are available, so they
    # must be processed early.
    parser = CommandParser(
        prog=self.prog_name,
        usage='%(prog)s subcommand [options] [args]',
        add_help=False,
        allow_abbrev=False,
    )
    parser.add_argument('--settings')
    parser.add_argument('--pythonpath')
    parser.add_argument('args', nargs='*')  # catch-all
    try:
        options, args = parser.parse_known_args(self.argv[2:])
        handle_default_options(options)
    except CommandError:
        pass  # Ignore any option errors at this point.

    try:
        settings.INSTALLED_APPS
    except ImproperlyConfigured as exc:
        self.settings_exception = exc
    except ImportError as exc:
        self.settings_exception = exc

    if settings.configured:
        # Start the auto-reloading dev server even if the code is broken.
        # The hardcoded condition is a code smell but we can't rely on a
        # flag on the command class because we haven't located it yet.
        if subcommand == 'runserver' and '--noreload' not in self.argv:
            try:
                autoreload.check_errors(django.setup)()
            except Exception:
                # The exception will be raised later in the child process
                # started by the autoreloader. Pretend it didn't happen by
                # loading an empty list of applications.
                apps.all_models = defaultdict(dict)
                apps.app_configs = {}
                apps.apps_ready = apps.models_ready = apps.ready = True

                # Remove options not compatible with the built-in runserver
                # (e.g. options for the contrib.staticfiles' runserver).
                # Changes here require manually testing as described in
                # #27522.
                _parser = self.fetch_command('runserver').create_parser('django', 'runserver')
                _options, _args = _parser.parse_known_args(self.argv[2:])
                for _arg in _args:
                    self.argv.remove(_arg)

        # In all other cases, django.setup() is required to succeed.
        else:
            django.setup()

    self.autocomplete()

    if subcommand == 'help':
        if '--commands' in args:
            sys.stdout.write(self.main_help_text(commands_only=True) + '\n')
        elif not options.args:
            sys.stdout.write(self.main_help_text() + '\n')
        else:
            self.fetch_command(options.args[0]).print_help(self.prog_name, options.args[0])
    # Special-cases: We want 'django-admin --version' and
    # 'django-admin --help' to work, for backwards compatibility.
    elif subcommand == 'version' or self.argv[1:] == ['--version']:
        sys.stdout.write(django.get_version() + '\n')
    elif self.argv[1:] in (['--help'], ['-h']):
        sys.stdout.write(self.main_help_text() + '\n')
    else:
        self.fetch_command(subcommand).run_from_argv(self.argv)

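I'll skip the parser setup at the top, which only handles a few command-line options. Accessing settings.INSTALLED_APPS calls LazySettings.__getattr__, after which settings._wrapped is a Settings instance. (settings is lazily loaded: any earlier import that actually reads a setting would already have set settings._wrapped to a Settings instance.) settings.configured is therefore true and we enter that branch. Because the subcommand is runserver and --noreload was not passed, django.setup() is called, wrapped in autoreload.check_errors. Next comes autocomplete; since DJANGO_AUTO_COMPLETE is not set in the environment, it returns immediately. Finally we reach self.fetch_command(subcommand).run_from_argv(self.argv), where subcommand is runserver. Step into fetch_command, whose source is as follows: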

def fetch_command(self, subcommand):
    """
    Try to fetch the given subcommand, printing a message with the
    appropriate command called from the command line (usually
    "django-admin" or "manage.py") if it can't be found.
    """
    # Get commands outside of try block to prevent swallowing exceptions
    commands = get_commands()
    try:
        app_name = commands[subcommand]
    except KeyError:
        if os.environ.get('DJANGO_SETTINGS_MODULE'):
            # If `subcommand` is missing due to misconfigured settings, the
            # following line will retrigger an ImproperlyConfigured exception
            # (get_commands() swallows the original one) so the user is
            # informed about it.
            settings.INSTALLED_APPS
        elif not settings.configured:
            sys.stderr.write("No Django settings specified.\n")
        possible_matches = get_close_matches(subcommand, commands)
        sys.stderr.write('Unknown command: %r' % subcommand)
        if possible_matches:
            sys.stderr.write('. Did you mean %s?' % possible_matches[0])
        sys.stderr.write("\nType '%s help' for usage.\n" % self.prog_name)
        sys.exit(1)
    if isinstance(app_name, BaseCommand):
        # If the command is already loaded, use it directly.
        klass = app_name
    else:
        klass = load_command_class(app_name, subcommand)
    return klass

Step into get_commands, whose source is as follows:

@functools.lru_cache(maxsize=None)
def get_commands():
    """
    Return a dictionary mapping command names to their callback applications.

    Look for a management.commands package in django.core, and in each
    installed application -- if a commands package exists, register all
    commands in that package.

    Core commands are always included. If a settings module has been
    specified, also include user-defined commands.

    The dictionary is in the format {command_name: app_name}. Key-value
    pairs from this dictionary can then be used in calls to
    load_command_class(app_name, command_name)

    If a specific version of a command must be loaded (e.g., with the
    startapp command), the instantiated module can be placed in the
    dictionary in place of the application name.

    The dictionary is cached on the first call and reused on subsequent
    calls.
    """
    commands = {name: 'django.core' for name in find_commands(__path__[0])}

    if not settings.configured:
        return commands

    for app_config in reversed(list(apps.get_app_configs())):
        path = os.path.join(app_config.path, 'management')
        commands.update({name: app_config.name for name in find_commands(path)})

    return commands

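functools.lru_cache is a memoization decorator: it caches results keyed by the call arguments, and a later call with arguments it has already seen returns the cached result directly. get_commands takes no arguments, so its result is computed once and reused.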
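A small standard-library example makes the caching visible. `get_commands_sketch` and the `calls` list are hypothetical; the returned dict is a made-up stand-in for the real command mapping.

```python
import functools

calls = []  # one entry is appended each time the function body really runs


@functools.lru_cache(maxsize=None)
def get_commands_sketch():
    """Hypothetical stand-in for get_commands(): expensive, argument-free."""
    calls.append("computed")
    return {"runserver": "django.core"}


first = get_commands_sketch()
second = get_commands_sketch()  # served from the cache, body not re-run
```

Note that both calls return the very same dict object, so a caller that mutates the result also changes what later callers see.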

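commands is a dict whose keys are the names of all modules under django/core/management/commands that do not start with "_" (the .py file names, which include runserver) and whose values are all django.core.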
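How such a list of module names can be collected is sketched below with pkgutil.iter_modules. `find_commands_sketch` and `demo` are hypothetical helpers, the file names are invented, and the result is sorted only to make the demo deterministic.

```python
import os
import pkgutil
import tempfile


def find_commands_sketch(management_dir):
    """List non-package modules in <management_dir>/commands, skipping "_" names."""
    command_dir = os.path.join(management_dir, "commands")
    return sorted(
        name
        for _, name, is_pkg in pkgutil.iter_modules([command_dir])
        if not is_pkg and not name.startswith("_")
    )


def demo():
    """Build a throwaway management/commands directory and scan it."""
    with tempfile.TemporaryDirectory() as management_dir:
        command_dir = os.path.join(management_dir, "commands")
        os.makedirs(command_dir)
        for filename in ("runserver.py", "migrate.py", "_private.py", "__init__.py"):
            open(os.path.join(command_dir, filename), "w").close()
        return find_commands_sketch(management_dir)


found = demo()
```

Here `_private.py` is filtered out by the underscore rule and `__init__.py` is not reported as a module at all, so only the two command names remain.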

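Next it iterates over the app config objects built from settings.INSTALLED_APPS (see django.setup above), looks in each app's <app directory>/management/commands, and adds every module name there that does not start with "_" as a key, with the app config's name attribute as the value. Then commands is returned. Back in fetch_command, app_name is django.core here (assuming no installed app ships its own runserver command; django.contrib.staticfiles does when it is installed, and its Command subclasses the core one). It is a string rather than a BaseCommand instance, so load_command_class is called. Its source is as follows: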

def load_command_class(app_name, name):
    """
    Given a command name and an application name, return the Command
    class instance. Allow all errors raised by the import process
    (ImportError, AttributeError) to propagate.
    """
    module = import_module('%s.management.commands.%s' % (app_name, name))
    return module.Command()

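So klass is an instance of the Command class in django.core.management.commands.runserver. Back in execute, we call its run_from_argv method. Command does not define run_from_argv, so we go to its parent class BaseCommand and find the method there. Its source is as follows: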

def run_from_argv(self, argv):
    """
    Set up any environment changes requested (e.g., Python path
    and Django settings), then run this command. If the
    command raises a ``CommandError``, intercept it and print it sensibly
    to stderr. If the ``--traceback`` option is present or the raised
    ``Exception`` is not ``CommandError``, raise it.
    """
    self._called_from_command_line = True
    parser = self.create_parser(argv[0], argv[1])

    options = parser.parse_args(argv[2:])
    cmd_options = vars(options)
    # Move positional args out of options to mimic legacy optparse
    args = cmd_options.pop('args', ())
    handle_default_options(options)
    try:
        self.execute(*args, **cmd_options)
    except CommandError as e:
        if options.traceback:
            raise

        # SystemCheckError takes care of its own formatting.
        if isinstance(e, SystemCheckError):
            self.stderr.write(str(e), lambda x: x)
        else:
            self.stderr.write('%s: %s' % (e.__class__.__name__, e))
        sys.exit(e.returncode)
    finally:
        try:
            connections.close_all()
        except ImproperlyConfigured:
            # Ignore if connections aren't setup at this point (e.g. no
            # configured settings).
            pass

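Everything up to handle_default_options(options) prepares options and the environment. We then enter Command.execute, which sets the environment variable DJANGO_COLORS to nocolor if --no-color was passed and then calls BaseCommand.execute. That in turn calls Command.handle, which finally calls Command.run. Its source is as follows: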

def run(self, **options):
    """Run the server, using the autoreloader if needed."""
    use_reloader = options['use_reloader']

    if use_reloader:
        autoreload.run_with_reloader(self.inner_run, **options)
    else:
        self.inner_run(None, **options)

use_reloader is False when the server was started with --noreload and True otherwise. In the latter case we enter autoreload.run_with_reloader(self.inner_run, **options), whose source is as follows:

def run_with_reloader(main_func, *args, **kwargs):
    signal.signal(signal.SIGTERM, lambda *args: sys.exit(0))
    try:
        if os.environ.get(DJANGO_AUTORELOAD_ENV) == 'true':
            reloader = get_reloader()
            logger.info('Watching for file changes with %s', reloader.__class__.__name__)
            start_django(reloader, main_func, *args, **kwargs)
        else:
            exit_code = restart_with_reloader()
            sys.exit(exit_code)
    except KeyboardInterrupt:
        pass

On the first pass the environment variable named by DJANGO_AUTORELOAD_ENV (RUN_MAIN) is not set, so restart_with_reloader is called. Its source is as follows:

def restart_with_reloader():
    new_environ = {**os.environ, DJANGO_AUTORELOAD_ENV: 'true'}
    args = get_child_arguments()
    while True:
        p = subprocess.run(args, env=new_environ, close_fds=False)
        if p.returncode != 3:
            return p.returncode

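This works the same way as werkzeug: it copies os.environ, sets DJANGO_AUTORELOAD_ENV to true, and inside an infinite loop starts a new child process with subprocess.run. The child runs the same command again and this time reaches start_django, whose source is as follows: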

def start_django(reloader, main_func, *args, **kwargs):
    ensure_echo_on()

    main_func = check_errors(main_func)
    django_main_thread = threading.Thread(target=main_func, args=args, kwargs=kwargs, name='django-main-thread')
    django_main_thread.setDaemon(True)
    django_main_thread.start()

    while not reloader.should_stop:
        try:
            reloader.run(django_main_thread)
        except WatchmanUnavailable as ex:
            # It's possible that the watchman service shuts down or otherwise
            # becomes unavailable. In that case, use the StatReloader.
            reloader = StatReloader()
            logger.error('Error connecting to Watchman: %s', ex)
            logger.info('Watching for file changes with %s', reloader.__class__.__name__)

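start_django starts a thread that runs main_func and marks it as a daemon thread. main_func is Command.inner_run, which is what actually starts the backend application: it calls the run function in django.core.servers.basehttp to serve requests. I won't paste that source.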

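Back in start_django, the reloader's run method is called. In werkzeug, the choice between the stat-based and the watchdog-based reloader depends on whether watchdog can be imported; in Django, get_reloader() returns a WatchmanReloader if Watchman is available and a StatReloader otherwise. Since this barely affects the flow I didn't look into it further, and I'll use StatReloader as the example. StatReloader does not define run, so we go to the run method of its parent class BaseReloader, whose source is as follows: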

def run(self, django_main_thread):
    logger.debug('Waiting for apps ready_event.')
    self.wait_for_apps_ready(apps, django_main_thread)
    from django.urls import get_resolver

    # Prevent a race condition where URL modules aren't loaded when the
    # reloader starts by accessing the urlconf_module property.
    try:
        get_resolver().urlconf_module
    except Exception:
        # Loading the urlconf can result in errors during development.
        # If this occurs then swallow the error and continue.
        pass
    logger.debug('Apps ready_event triggered. Sending autoreload_started signal.')
    autoreload_started.send(sender=self)
    self.run_loop()

run calls run_loop, which drives StatReloader's tick method. Its source is as follows:

def tick(self):
    mtimes = {}
    while True:
        for filepath, mtime in self.snapshot_files():
            old_time = mtimes.get(filepath)
            mtimes[filepath] = mtime
            if old_time is None:
                logger.debug('File %s first seen with mtime %s', filepath, mtime)
                continue
            elif mtime > old_time:
                logger.debug('File %s previous mtime: %s, current mtime: %s', filepath, old_time, mtime)
                self.notify_file_changed(filepath)

        time.sleep(self.SLEEP_TIME)
        yield
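One pass of the mtime comparison in tick can be isolated into a plain function. This is a sketch only: `changed_files` is a hypothetical helper, and the timestamps are set explicitly with os.utime so the demo does not depend on real clock time.

```python
import os
import tempfile


def changed_files(paths, mtimes):
    """One polling pass: return the paths whose mtime grew since the last pass."""
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime
        old_time = mtimes.get(path)
        mtimes[path] = mtime
        if old_time is None:
            continue  # first time this file is seen, nothing to compare
        if mtime > old_time:
            changed.append(path)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    watched = os.path.join(tmp, "views.py")
    open(watched, "w").close()
    seen = {}

    os.utime(watched, (1000, 1000))           # pretend it was last saved at t=1000
    first_pass = changed_files([watched], seen)

    os.utime(watched, (2000, 2000))           # pretend the file was edited
    second_pass = changed_files([watched], seen)

    third_pass = changed_files([watched], seen)  # nothing new since last pass
```

As in tick, a file seen for the first time is only recorded, and a change is reported exactly once because the stored mtime is updated on every pass.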

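StatReloader decides whether a file has been modified by comparing modification times. When a file has changed, the reloader's notify_file_changed method calls trigger_reload, which exits the process with status code 3; since the backend application runs in a daemon thread, it exits too. Control then returns to restart_with_reloader: the child process started by subprocess.run has exited with status code 3, so a new child process is started with subprocess.run, and so on. When the child exits with any status code other than 3, restart_with_reloader returns that code to run_with_reloader, which passes it to sys.exit(), and the whole project exits. Here is the source of notify_file_changed and trigger_reload: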

def notify_file_changed(self, path):
    results = file_changed.send(sender=self, file_path=path)
    logger.debug('%s notified as changed. Signal results: %s.', path, results)
    if not any(res[1] for res in results):
        trigger_reload(path)
        
def trigger_reload(filename):
    logger.info('%s changed, reloading.', filename)
    sys.exit(3)
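The parent/child protocol around exit code 3 can be tried out with a tiny supervisor. Everything here is a hypothetical sketch: `supervise`, `demo` and `CHILD_SOURCE` are invented names, and the child simply exits with 3 on its first two runs to imitate two file changes before a normal exit.

```python
import os
import subprocess
import sys
import tempfile

RELOAD_EXIT_CODE = 3  # the status code trigger_reload() exits with

# Child script: exits with 3 on its first two runs, then exits with 0.
CHILD_SOURCE = """
import sys
path = sys.argv[1]
with open(path) as f:
    runs = int(f.read())
with open(path, "w") as f:
    f.write(str(runs + 1))
sys.exit(3 if runs < 2 else 0)
"""


def supervise(args):
    """Re-run the child for as long as it asks for a reload."""
    launches = 0
    while True:
        launches += 1
        completed = subprocess.run(args)
        if completed.returncode != RELOAD_EXIT_CODE:
            return completed.returncode, launches


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        counter = os.path.join(tmp, "runs.txt")
        with open(counter, "w") as f:
            f.write("0")
        return supervise([sys.executable, "-c", CHILD_SOURCE, counter])


exit_code, launches = demo()
```

The supervisor itself never watches files; it only interprets the child's exit status, which is why the parent process stays alive across reloads.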

 

III. The Django request lifecycle

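As the name suggests, a lifecycle would cover the whole process from Django starting up to finally stopping. Once started, Django does not stop by itself unless there is a fatal error or someone shuts it down, so when people ask about Django's lifecycle they almost always mean the lifecycle of handling a request. I'll leave Django's startup flow for later.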

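Under the WSGI protocol, when a request arrives the server calls the application callable with the request information (environ) and a start_response callback, and the application returns the response.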
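To make the calling convention concrete, here is a minimal WSGI application together with a tiny driver that plays the server's role. `application` here is a toy, not Django's; `call_app` is a hypothetical helper, and wsgiref.util.setup_testing_defaults is used only to fill in a plausible environ.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Toy WSGI app: report the requested path."""
    body = ("Hello from %s" % environ["PATH_INFO"]).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # an iterable of bytes


def call_app(app, path):
    """Play the server: build environ, call the app, collect the result."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(application, "/ping")
```

Django's WSGIHandler follows the same contract: its __call__ takes environ and start_response and returns the response as the iterable body.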

Django's application is defined in the project's wsgi.py, where it is created by get_wsgi_application(); it is an instance of django.core.handlers.wsgi.WSGIHandler. The source:

def get_wsgi_application():
    """
    The public interface to Django's WSGI support. Return a WSGI callable.

    Avoids making django.core.handlers.WSGIHandler a public API, in case the
    internal WSGI implementation changes or moves in the future.
    """
    django.setup(set_prefix=False)
    return WSGIHandler()

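Step into that class. Its __init__ loads the middleware, and you can see that the middleware list is traversed in reverse order. As a result, process_view hooks end up in MIDDLEWARE order (each is inserted at the front), while process_template_response and process_exception hooks end up in reverse order (each is appended). The source: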

class WSGIHandler(base.BaseHandler):
    request_class = WSGIRequest

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.load_middleware()

# base.BaseHandler.load_middleware()
def load_middleware(self, is_async=False):
    """
    Populate middleware lists from settings.MIDDLEWARE.

    Must be called after the environment is fixed (see __call__ in subclasses).
    """
    self._view_middleware = []
    self._template_response_middleware = []
    self._exception_middleware = []

    get_response = self._get_response_async if is_async else self._get_response
    handler = convert_exception_to_response(get_response)
    handler_is_async = is_async
    for middleware_path in reversed(settings.MIDDLEWARE):
        middleware = import_string(middleware_path)
        middleware_can_sync = getattr(middleware, 'sync_capable', True)
        middleware_can_async = getattr(middleware, 'async_capable', False)
        if not middleware_can_sync and not middleware_can_async:
            raise RuntimeError(
                'Middleware %s must have at least one of '
                'sync_capable/async_capable set to True.' % middleware_path
            )
        elif not handler_is_async and middleware_can_sync:
            middleware_is_async = False
        else:
            middleware_is_async = middleware_can_async
        try:
            # Adapt handler, if needed.
            adapted_handler = self.adapt_method_mode(
                middleware_is_async, handler, handler_is_async,
                debug=settings.DEBUG, name='middleware %s' % middleware_path,
            )
            mw_instance = middleware(adapted_handler)
        except MiddlewareNotUsed as exc:
            if settings.DEBUG:
                if str(exc):
                    logger.debug('MiddlewareNotUsed(%r): %s', middleware_path, exc)
                else:
                    logger.debug('MiddlewareNotUsed: %r', middleware_path)
            continue
        else:
            handler = adapted_handler

        if mw_instance is None:
            raise ImproperlyConfigured(
                'Middleware factory %s returned None.' % middleware_path
            )

        if hasattr(mw_instance, 'process_view'):
            self._view_middleware.insert(
                0,
                self.adapt_method_mode(is_async, mw_instance.process_view),
            )
        if hasattr(mw_instance, 'process_template_response'):
            self._template_response_middleware.append(
                self.adapt_method_mode(is_async, mw_instance.process_template_response),
            )
        if hasattr(mw_instance, 'process_exception'):
            # The exception-handling stack is still always synchronous for
            # now, so adapt that way.
            self._exception_middleware.append(
                self.adapt_method_mode(False, mw_instance.process_exception),
            )

        handler = convert_exception_to_response(mw_instance)
        handler_is_async = middleware_is_async

    # Adapt the top of the stack, if needed.
    handler = self.adapt_method_mode(is_async, handler, handler_is_async)
    # We only assign to this when initialization is complete as it is used
    # as a flag for initialization being complete.
    self._middleware_chain = handler

self._middleware_chain is used later, so let's work out what it ends up being. First look at convert_exception_to_response, whose source is as follows:

def convert_exception_to_response(get_response):
    """
    Wrap the given get_response callable in exception-to-response conversion.

    All exceptions will be converted. All known 4xx exceptions (Http404,
    PermissionDenied, MultiPartParserError, SuspiciousOperation) will be
    converted to the appropriate response, and all other exceptions will be
    converted to 500 responses.

    This decorator is automatically applied to all middleware to ensure that
    no middleware leaks an exception and that the next middleware in the stack
    can rely on getting a response instead of an exception.
    """
    @wraps(get_response)
    def inner(request):
        try:
            response = get_response(request)
        except Exception as exc:
            response = response_for_exception(request, exc)
        return response
    return inner

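It is a decorator built with functools.wraps. I won't go into functools.wraps in detail: it copies the wrapped function's metadata (__name__, __doc__ and so on) onto the wrapper, so what comes back is still the ordinary inner function, a closure over get_response, not a partial object. convert_exception_to_response therefore catches any exception raised when the wrapped callable handles the request: if there is none, the callable's result is returned; otherwise response_for_exception is called with the request and the exception, and its result is returned. Back in load_middleware, the handler returned by the first call to convert_exception_to_response is that inner function wrapping self._get_response; calling handler(request) is equivalent to calling self._get_response(request) with exceptions converted into responses.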
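The same decorator shape can be tried in isolation. In this sketch `convert_exception_to_response_sketch` and `view` are hypothetical, and the "500: ..." string is a made-up stand-in for whatever response_for_exception would build.

```python
import functools


def convert_exception_to_response_sketch(get_response):
    """Wrap a callable so that exceptions become responses."""

    @functools.wraps(get_response)  # copies __name__, __doc__, sets __wrapped__
    def inner(request):
        try:
            return get_response(request)
        except Exception as exc:
            # Hypothetical stand-in for response_for_exception(request, exc).
            return "500: %s" % exc

    return inner


def view(request):
    """Pretend view that fails for one particular path."""
    if request == "/boom":
        raise ValueError("boom")
    return "200: %s" % request


handler = convert_exception_to_response_sketch(view)
ok = handler("/ok")
failed = handler("/boom")
```

`handler` is a plain function whose `__name__` is "view" and whose `__wrapped__` attribute points back at the original, which is all functools.wraps contributes here.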

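Continuing, load_middleware walks the MIDDLEWARE list in reverse, imports each entry from its dotted path, and instantiates it with handler to get mw_instance. Taking CsrfViewMiddleware as an example, its __init__ lives in django.utils.deprecation.MiddlewareMixin, whose source is as follows (the __call__ method is included too, since it matters later):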

class MiddlewareMixin:
    def __init__(self, get_response=None):
        self.get_response = get_response
        super().__init__()

    def __call__(self, request):
        response = None
        if hasattr(self, 'process_request'):
            response = self.process_request(request)
        response = response or self.get_response(request)
        if hasattr(self, 'process_response'):
            response = self.process_response(request, response)
        return response

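__init__ stores the handler it was given as self.get_response (for the last middleware in the list, that is the wrapped self._get_response). Back in load_middleware, handler = convert_exception_to_response(mw_instance) is where the nesting starts. Writing wrap for convert_exception_to_response, the final handler looks roughly like wrap(FirstMiddleware(... wrap(LastMiddleware(wrap(self._get_response))))), where FirstMiddleware and LastMiddleware stand for the first and last entries in MIDDLEWARE (SecurityMiddleware comes first in the default settings). self._middleware_chain is this nested handler.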
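The nesting is easier to see with toy middleware that only record when they run. This is a sketch under assumptions: `make_middleware`, `build_chain`, `view` and the names A, B, C are all hypothetical, and the exception-wrapping step is left out to keep the order of calls visible.

```python
log = []  # records the order in which the pieces run


def make_middleware(name):
    """Build a toy middleware class that logs around the next handler."""

    class Middleware:
        def __init__(self, get_response):
            self.get_response = get_response

        def __call__(self, request):
            log.append("%s.process_request" % name)
            response = self.get_response(request)
            log.append("%s.process_response" % name)
            return response

    return Middleware


def view(request):
    log.append("view")
    return "response for %s" % request


# Hypothetical equivalent of settings.MIDDLEWARE, in declared order.
MIDDLEWARE = [make_middleware("A"), make_middleware("B"), make_middleware("C")]


def build_chain(middleware_classes, get_response):
    """Wrap from the inside out by walking the list in reverse."""
    handler = get_response
    for middleware in reversed(middleware_classes):
        handler = middleware(handler)
    return handler


chain = build_chain(MIDDLEWARE, view)
result = chain("/index")
```

Because the list is wrapped in reverse, the first entry ends up outermost: the request passes through A, B, C on the way in and C, B, A on the way out.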

When a request arrives, calling application really means calling the __call__ method of the WSGIHandler class, whose source is as follows:

def __call__(self, environ, start_response):
    set_script_prefix(get_script_name(environ))
    signals.request_started.send(sender=self.__class__, environ=environ)
    request = self.request_class(environ)
    response = self.get_response(request)

    response._handler_class = self.__class__

    status = '%d %s' % (response.status_code, response.reason_phrase)
    response_headers = [
        *response.items(),
        *(('Set-Cookie', c.output(header='')) for c in response.cookies.values()),
    ]
    start_response(status, response_headers)
    if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
        # If `wsgi.file_wrapper` is used the WSGI server does not call
        # .close on the response, but on the file wrapper. Patch it to use
        # response.close instead which takes care of closing all files.
        response.file_to_stream.close = response.close
        response = environ['wsgi.file_wrapper'](response.file_to_stream, response.block_size)
    return response

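__call__ first calls set_script_prefix and sends the request_started signal through signals.request_started.send. It then instantiates the WSGIRequest class as request and produces response through the get_response method of base.BaseHandler, whose source is as follows: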

def get_response(self, request):
    """Return an HttpResponse object for the given HttpRequest."""
    # Setup default url resolver for this thread
    set_urlconf(settings.ROOT_URLCONF)
    response = self._middleware_chain(request)
    response._closable_objects.append(request)

    # If the exception handler returns a TemplateResponse that has not
    # been rendered, force it to be rendered.
    if not getattr(response, 'is_rendered', True) and callable(getattr(response, 'render', None)):
        response = response.render()

    if response.status_code >= 400:
        log_response(
            '%s: %s', response.reason_phrase, request.path,
            response=response,
            request=request,
        )

    return response

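Adding print statements shows that the middleware process_request, process_view and process_response hooks all run inside the single line response = self._middleware_chain(request). self._middleware_chain was covered in load_middleware() above: each middleware's __call__ is invoked with request, in the order of the middleware list in settings. As the __call__ method of MiddlewareMixin shows, process_request runs in list order and its return value is stored in response. The process_request methods of Django's built-in middleware normally return None, so self.get_response(request) is executed next, which at the innermost layer is django.core.handlers.base.BaseHandler._get_response(). Its source is as follows: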

    def _get_response(self, request):
        """
        Resolve and call the view, then apply view, exception, and
        template_response middleware. This method is everything that happens
        inside the request/response middleware.
        """
        response = None

        if hasattr(request, 'urlconf'):
            urlconf = request.urlconf
            set_urlconf(urlconf)
            resolver = get_resolver(urlconf)
        else:
            resolver = get_resolver()

        resolver_match = resolver.resolve(request.path_info)
        callback, callback_args, callback_kwargs = resolver_match
        request.resolver_match = resolver_match

        # Apply view middleware
        for middleware_method in self._view_middleware:
            response = middleware_method(request, callback, callback_args, callback_kwargs)
            if response:
                break

        if response is None:
            wrapped_callback = self.make_view_atomic(callback)
            try:
                response = wrapped_callback(request, *callback_args, **callback_kwargs)
            except Exception as e:
                response = self.process_exception_by_middleware(e, request)

        # Complain if the view returned None (a common error).
        if response is None:
            if isinstance(callback, types.FunctionType):    # FBV
                view_name = callback.__name__
            else:                                           # CBV
                view_name = callback.__class__.__name__ + '.__call__'

            raise ValueError(
                "The view %s.%s didn't return an HttpResponse object. It "
                "returned None instead." % (callback.__module__, view_name)
            )

        # If the response supports deferred rendering, apply template
        # response middleware and then render the response
        elif hasattr(response, 'render') and callable(response.render):
            for middleware_method in self._template_response_middleware:
                response = middleware_method(request, response)
                # Complain if the template response middleware returned None (a common error).
                if response is None:
                    raise ValueError(
                        "%s.process_template_response didn't return an "
                        "HttpResponse object. It returned None instead."
                        % (middleware_method.__self__.__class__.__name__)
                    )
            try:
                response = response.render()
            except Exception as e:
                response = self.process_exception_by_middleware(e, request)

        return response

    def process_exception_by_middleware(self, exception, request):
        """
        Pass the exception to the exception middleware. If no middleware
        return a response for this exception, raise it.
        """
        for middleware_method in self._exception_middleware:
            response = middleware_method(request, exception)
            if response:
                return response
        raise

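The URL resolution at the top is skipped here. The loop for middleware_method in self._view_middleware runs each middleware's process_view method, and the loop for middleware_method in self._template_response_middleware runs each middleware's process_template_response method when the response has a callable render. When an exception is raised, process_exception_by_middleware runs the process_exception methods one by one until one of them returns a response. Back in MiddlewareMixin.__call__, self.get_response(request) now has a return value, so the process_response methods run in turn, in reverse list order as the nested calls unwind.

_get_response itself runs only once per request. No functools.lru_cache is involved, and this is not a feature of functools.wraps either: each middleware calls its own self.get_response exactly once, and because the handlers are nested, the call is passed down layer by layer and reaches the innermost _get_response a single time.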
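The call order can be observed with a toy version of the __call__ method shown earlier. This is a simplified sketch under stated assumptions: ToyMixin copies the logic of MiddlewareMixin.__call__, while Recording, run and the view function are hypothetical, strings stand in for responses, and process_view is left out because it runs inside _get_response.

```python
class ToyMixin:
    """Same control flow as MiddlewareMixin.__call__ in the excerpt above."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = None
        if hasattr(self, "process_request"):
            response = self.process_request(request)
        response = response or self.get_response(request)
        if hasattr(self, "process_response"):
            response = self.process_response(request, response)
        return response


class Recording(ToyMixin):
    """Hypothetical middleware that records which hooks were called."""

    def __init__(self, get_response, name, log, block=False):
        super().__init__(get_response)
        self.name = name
        self.log = log
        self.block = block

    def process_request(self, request):
        self.log.append(f"{self.name}.process_request")
        # Returning a truthy value short-circuits everything further in.
        return "blocked" if self.block else None

    def process_response(self, request, response):
        self.log.append(f"{self.name}.process_response")
        return response


def run(block_outer=False):
    log = []

    def view(request):
        # Plays the role of _get_response: the innermost callable.
        log.append("view")
        return "ok"

    inner = Recording(view, "B", log)
    outer = Recording(inner, "A", log, block=block_outer)
    return outer("/"), log


print(run())
print(run(block_outer=True))
```

In the first run the innermost callable appears exactly once in the log, between the process_request calls going in and the process_response calls coming out. In the second run the outer middleware returns a response from process_request, so neither the inner middleware nor the view is reached.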

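Back in the __call__ method of WSGIHandler, the next step is start_response(status, response_headers). Printing it shows that, under runserver, start_response is the start_response method of the wsgiref.handlers.BaseHandler class, whose source is as follows: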

def start_response(self, status, headers, exc_info=None):
    """'start_response()' callable as specified by PEP 3333"""

    if exc_info:
        try:
            if self.headers_sent:
                # Re-raise original exception if headers sent
                raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
        finally:
            exc_info = None  # avoid dangling circular ref
    elif self.headers is not None:
        raise AssertionError("Headers already set!")

    self.status = status
    self.headers = self.headers_class(headers)
    status = self._convert_string_type(status, "Status")
    assert len(status) >= 4, "Status must be at least 4 characters"
    assert status[:3].isdigit(), "Status message must begin w/3-digit code"
    assert status[3] == " ", "Status message must have a space after code"

    if __debug__:
        for name, val in headers:
            name = self._convert_string_type(name, "Header name")
            val = self._convert_string_type(val, "Header value")
            assert not is_hop_by_hop(name), \
                f"Hop-by-hop header, '{name}: {val}', not allowed"

    return self.write

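According to the docstring, this is the start_response callable specified by PEP 3333. Back in the __call__ method of WSGIHandler, response is finally returned to the server.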
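To make the contract concrete, here is a minimal WSGI application driven by a hand-written start_response, using only the standard library. It is a sketch of the protocol rather than of Django or of wsgiref's server: application and call_app are hypothetical names, and the fake start_response only records what it receives.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Like WSGIHandler.__call__: build the status line and headers,
    # hand them to start_response, then return an iterable of bytes.
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    status = "200 OK"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call_app(app, path="/"):
    # Fill in the environ keys a WSGI server would normally provide.
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path

    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server validates and stores these; here they are only recorded.
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # the legacy write() callable

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


status, headers, body = call_app(application, "/ping")
print(status)
print(headers)
print(body)
```

The server owns start_response and the application owns the body: the application must call start_response before the body is sent, which matches the order of the last lines of WSGIHandler.__call__.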


  • Author: 合十
  • Published: June 23, 2022, 23:22
  • Updated: October 10, 2024, 02:25
  • Category: 我用Python
