[Translation] Emiller's Guide to Nginx Module Development (Part 2)

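Because of formatting problems, this post may be uncomfortable to read. The same text is also available as a Google Doc, in a Chinese version and a side-by-side Chinese-English version.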

2. Components of an Nginx Module


 

As I said, you have a lot of flexibility when it comes to making an Nginx module. This section will describe the parts that are almost always present. It's intended as a guide for understanding a module, and a reference for when you think you're ready to start writing a module.

(Translator's note: this section is the most important one; in later development work it can be used like a dictionary.)

 

2.1. Module Configuration Struct(s)


Modules can define up to three configuration structs, one each for the main, server, and location contexts. Most modules just need a location configuration. The naming convention for these is ngx_http_<module name>_(main|srv|loc)_conf_t. Here's an example, taken from the dav module (translator's note: the dav module is regarded as the "hello world" of Nginx module development and is required reading):

 

typedef struct {
    ngx_uint_t  methods;
    ngx_flag_t  create_full_put_path;
    ngx_uint_t  access;
} ngx_http_dav_loc_conf_t;

 

Notice that Nginx has special data types (ngx_uint_t and ngx_flag_t); these are just aliases for the primitive data types you know and love (cf. core/ngx_config.h if you're curious).

 

The elements in the configuration structs are populated by module directives.

(Translator's note: in other words, each member corresponds to a directive in the configuration file.)

 

2.2. Module Directives


 

A module's directives appear in a static array of ngx_command_t structs. Here's an example of how they're declared, taken from a small module I wrote:

 

static ngx_command_t  ngx_http_circle_gif_commands[] = {
    { ngx_string("circle_gif"),
      NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
      ngx_http_circle_gif,
      NGX_HTTP_LOC_CONF_OFFSET,
      0,
      NULL },

    { ngx_string("circle_gif_min_radius"),
      NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1,
      ngx_conf_set_num_slot,
      NGX_HTTP_LOC_CONF_OFFSET,
      offsetof(ngx_http_circle_gif_loc_conf_t, min_radius),
      NULL },
      ...
      ngx_null_command
};

(Translator's note: the string passed to ngx_string() is the directive name as it appears in the configuration file.)

 

And here is the declaration of ngx_command_t (the struct we're declaring), found in core/ngx_conf_file.h:

 

/* The struct tag is ngx_command_s; it is typedef'd as ngx_command_t. */
struct ngx_command_s {
    ngx_str_t             name;
    ngx_uint_t            type;
    char               *(*set)(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);
    ngx_uint_t            conf;
    ngx_uint_t            offset;
    void                 *post;
};

 

It seems like a bit much, but each element has a purpose.

 

The name is the directive string, no spaces. The data type is an ngx_str_t, which is usually instantiated with just (e.g.) ngx_string("proxy_pass"). Note: an ngx_str_t is a struct with a data element, which is a string, and a len element, which is the length of that string. Nginx uses this data structure most places you'd expect a string.

(Translator's note: the original guide wrote ngx_str here; the macro is ngx_string.)
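To make the length-plus-data layout concrete, here is a minimal standalone sketch. The names demo_str_t, demo_string, and demo_str_equal are hypothetical stand-ins invented for this example; they only imitate the shape of ngx_str_t and ngx_string and are not the Nginx definitions themselves.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ngx_str_t: a length plus a pointer to the bytes. */
typedef struct {
    size_t          len;
    unsigned char  *data;
} demo_str_t;

/* Static initializer in the spirit of ngx_string(): the length is computed
 * at compile time from a string literal, excluding the trailing NUL. */
#define demo_string(literal)  { sizeof(literal) - 1, (unsigned char *) literal }

/* Compare by length first, then by bytes; never relies on a NUL terminator. */
static int
demo_str_equal(const demo_str_t *a, const demo_str_t *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

Because the length travels with the pointer, code working with such strings should not assume the data is NUL-terminated.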

 

type is a set of flags that indicate where the directive is legal and how many arguments the directive takes. Applicable flags, which are bitwise-OR'd, are:

  • NGX_HTTP_MAIN_CONF: directive is valid in the main config
  • NGX_HTTP_SRV_CONF: directive is valid in the server (host) config
  • NGX_HTTP_LOC_CONF: directive is valid in a location config
  • NGX_HTTP_UPS_CONF: directive is valid in an upstream config
  • NGX_CONF_NOARGS: directive can take 0 arguments
  • NGX_CONF_TAKE1: directive can take exactly 1 argument
  • NGX_CONF_TAKE2: directive can take exactly 2 arguments
  • ...
  • NGX_CONF_TAKE7: directive can take exactly 7 arguments
  • NGX_CONF_FLAG: directive takes a boolean ("on" or "off")
  • NGX_CONF_1MORE: directive must be passed at least one argument
  • NGX_CONF_2MORE: directive must be passed at least two arguments

 

There are a few other options, too; see core/ngx_conf_file.h.
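As a rough illustration of how a bitwise-OR'd type field can be tested, consider the standalone sketch below. The DEMO_* bit values and both helper functions are assumptions made up for this example; the real constants and the real argument check live in the Nginx sources and may work differently.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch; the real constants
 * are defined in the Nginx headers and may differ. */
#define DEMO_CONF_NOARGS  0x01u
#define DEMO_CONF_TAKE1   0x02u
#define DEMO_CONF_TAKE2   0x04u
#define DEMO_CONF_1MORE   0x08u
#define DEMO_LOC_CONF     0x10u
#define DEMO_SRV_CONF     0x20u

/* Is the directive allowed in the given context (one DEMO_*_CONF bit)? */
static bool
demo_context_ok(unsigned type, unsigned context)
{
    return (type & context) != 0;
}

/* Does the argument count satisfy at least one of the arity flags? */
static bool
demo_args_ok(unsigned type, unsigned nargs)
{
    if ((type & DEMO_CONF_NOARGS) && nargs == 0) return true;
    if ((type & DEMO_CONF_TAKE1) && nargs == 1) return true;
    if ((type & DEMO_CONF_TAKE2) && nargs == 2) return true;
    if ((type & DEMO_CONF_1MORE) && nargs >= 1) return true;
    return false;
}
```

Combining DEMO_CONF_NOARGS with DEMO_CONF_TAKE1, for instance, would describe a directive that accepts either zero or one argument.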

 

The set struct element is a pointer to a function for setting up part of the module's configuration; typically this function will translate the arguments passed to this directive and save an appropriate value in its configuration struct. This setup function will take three arguments:

  • a pointer to an ngx_conf_t struct, which contains the arguments passed to the directive
  • a pointer to the current ngx_command_t struct
  • a pointer to the module's custom configuration struct

 

This setup function will be called when the directive is encountered. Nginx provides a number of functions for setting particular types of values in the custom configuration struct. These functions include:

  • ngx_conf_set_flag_slot: translates "on" or "off" to 1 or 0
  • ngx_conf_set_str_slot: saves a string as an ngx_str_t
  • ngx_conf_set_num_slot: parses a number and saves it to an ngx_int_t
  • ngx_conf_set_size_slot: parses a data size ("8k", "1m", etc.) and saves it to a size_t

 

There are several others, and they're quite handy (see core/ngx_conf_file.h). Modules can also put a reference to their own function here, if the built-ins aren't quite good enough.

 

How do these built-in functions know where to save the data? That's where the next two elements of ngx_command_t come in, conf and offset. conf tells Nginx whether this value will get saved to the module's main configuration, server configuration, or location configuration (with NGX_HTTP_MAIN_CONF_OFFSET, NGX_HTTP_SRV_CONF_OFFSET, or NGX_HTTP_LOC_CONF_OFFSET). offset then specifies which part of this configuration struct to write to.
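The offset mechanism can be sketched in plain C without any Nginx headers. In the example below, demo_loc_conf_t and demo_set_num_slot are hypothetical names used only for illustration; the point is that a generic setter can fill in a field it knows nothing about, given only the struct's base address and the byte offset produced by offsetof.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical configuration struct, loosely modeled on the circle_gif example. */
typedef struct {
    long  min_radius;
    long  max_radius;
} demo_loc_conf_t;

/* Generic setter: knows nothing about demo_loc_conf_t, only the byte offset
 * of the field it should fill in. Returns 0 on success, -1 on a bad number. */
static int
demo_set_num_slot(void *conf, size_t offset, const char *arg)
{
    char  *base = conf;
    long  *field = (long *) (base + offset);
    char  *end;
    long   value;

    value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0') {
        return -1;   /* not a complete decimal number */
    }

    *field = value;
    return 0;
}
```

A call such as demo_set_num_slot(&conf, offsetof(demo_loc_conf_t, max_radius), "20") would write 20 into max_radius and leave min_radius alone, which is the same division of labor the conf and offset members express.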

 

Finally, post is just a pointer to other crap the module might need while it's reading the configuration. It's often NULL.

 

The commands array is terminated with ngx_null_command as the last element.

 

2.3. The Module Context


 

This is a static ngx_http_module_t struct, which just has a bunch of function references for creating the three configurations and merging them together. Its name is ngx_http_<module name>_module_ctx. In order, the function references are:

  • preconfiguration
  • postconfiguration
  • creating the main conf (i.e., do a malloc and set defaults)
  • initializing the main conf (i.e., override the defaults with what's in nginx.conf)
  • creating the server conf
  • merging it with the main conf
  • creating the location conf
  • merging it with the server conf

 

These take different arguments depending on what they're doing. Here's the struct definition, taken from http/ngx_http_config.h, so you can see the different function signatures of the callbacks:

typedef struct {
    ngx_int_t   (*preconfiguration)(ngx_conf_t *cf);
    ngx_int_t   (*postconfiguration)(ngx_conf_t *cf);

    void       *(*create_main_conf)(ngx_conf_t *cf);
    char       *(*init_main_conf)(ngx_conf_t *cf, void *conf);

    void       *(*create_srv_conf)(ngx_conf_t *cf);
    char       *(*merge_srv_conf)(ngx_conf_t *cf, void *prev, void *conf);

    void       *(*create_loc_conf)(ngx_conf_t *cf);
    char       *(*merge_loc_conf)(ngx_conf_t *cf, void *prev, void *conf);
} ngx_http_module_t;

 

You can set functions you don't need to NULL, and Nginx will figure it out.
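One plausible reading of "Nginx will figure it out" is that the caller checks each function pointer before using it. The sketch below shows that pattern in isolation; demo_module_ctx_t, demo_create_all, and demo_create are hypothetical names for this example and are not taken from the Nginx sources.

```c
#include <stddef.h>

/* Hypothetical, reduced version of a module context: two optional callbacks. */
typedef struct {
    void  *(*create_loc_conf)(void);
    int    (*postconfiguration)(void);
} demo_module_ctx_t;

/* Calls each module's create_loc_conf if it has one; a NULL slot is skipped.
 * Returns the number of configurations created, or -1 if a callback fails. */
static int
demo_create_all(const demo_module_ctx_t *mods, size_t n, void **confs)
{
    size_t  i;
    int     created = 0;

    for (i = 0; i < n; i++) {
        confs[i] = NULL;

        if (mods[i].create_loc_conf == NULL) {
            continue;   /* module opted out of this callback */
        }

        confs[i] = mods[i].create_loc_conf();
        if (confs[i] == NULL) {
            return -1;  /* allocation failed inside the callback */
        }

        created++;
    }

    return created;
}

/* Example callback used to exercise the loop. */
static void *
demo_create(void)
{
    static int  conf_storage;

    return &conf_storage;
}
```

With this arrangement a module that only cares about location configuration leaves every other slot NULL and pays nothing for the callbacks it does not use.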

 

Most handlers just use the last two: a function to allocate memory for location-specific configuration (called ngx_http_<module name>_create_loc_conf), and a function to set defaults and merge this configuration with any inherited configuration (called ngx_http_<module name>_merge_loc_conf). The merge function is also responsible for producing an error if the configuration is invalid; these errors halt server startup.

 

Here's an example module context struct:

static ngx_http_module_t  ngx_http_circle_gif_module_ctx = {
    NULL,                          /* preconfiguration */
    NULL,                          /* postconfiguration */

    NULL,                          /* create main configuration */
    NULL,                          /* init main configuration */

    NULL,                          /* create server configuration */
    NULL,                          /* merge server configuration */

    ngx_http_circle_gif_create_loc_conf,  /* create location configuration */
    ngx_http_circle_gif_merge_loc_conf    /* merge location configuration */
};

 

Time to dig in a little deeper. These configuration callbacks look quite similar across all modules and use the same parts of the Nginx API, so they're worth knowing about.

 

2.3.1. create_loc_conf


 

Here's what a bare-bones create_loc_conf function looks like, taken from the circle_gif module I wrote (see the source). It takes a directive struct (ngx_conf_t) and returns a newly created module configuration struct (in this case ngx_http_circle_gif_loc_conf_t).

 

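static void *
ngx_http_circle_gif_create_loc_conf(ngx_conf_t *cf)
{
    ngx_http_circle_gif_loc_conf_t  *conf;

    /* zero-filled allocation from the configuration pool */
    conf = ngx_pcalloc(cf->pool, sizeof(ngx_http_circle_gif_loc_conf_t));
    if (conf == NULL) {
        /* create callbacks signal failure with NULL, which the caller checks */
        return NULL;
    }

    /* mark both fields as "not set" so the merge step can fill them in */
    conf->min_radius = NGX_CONF_UNSET_UINT;
    conf->max_radius = NGX_CONF_UNSET_UINT;
    return conf;
}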

 

First thing to notice is Nginx's memory allocation; it takes care of the free'ing as long as the module uses ngx_palloc (a malloc wrapper) or ngx_pcalloc (a calloc wrapper).

(Translator's note: these allocations come from a pool, here cf->pool.)
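The idea of "allocate from a pool, never free individually" can be shown with a deliberately simplified sketch. Everything here (demo_pool_t, demo_pcalloc, demo_destroy_pool) is hypothetical and is not how Nginx implements its pools internally; it only demonstrates why a module author does not need to call free on each allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the pool can find it again.
 * The union keeps the memory that follows suitably aligned for any type. */
typedef union demo_chunk_u {
    union demo_chunk_u  *next;
    max_align_t          align;
} demo_chunk_t;

typedef struct {
    demo_chunk_t  *chunks;   /* every allocation made from this pool */
    size_t         count;    /* number of live allocations */
} demo_pool_t;

/* calloc-like: returns zeroed memory that is owned by the pool. */
static void *
demo_pcalloc(demo_pool_t *pool, size_t size)
{
    demo_chunk_t  *c = malloc(sizeof(demo_chunk_t) + size);

    if (c == NULL) {
        return NULL;
    }

    c->next = pool->chunks;
    pool->chunks = c;
    pool->count++;

    memset(c + 1, 0, size);
    return c + 1;
}

/* Releases every allocation at once; returns how many were freed. */
static size_t
demo_destroy_pool(demo_pool_t *pool)
{
    demo_chunk_t  *c = pool->chunks, *next;
    size_t         freed = 0;

    while (c != NULL) {
        next = c->next;
        free(c);
        c = next;
        freed++;
    }

    pool->chunks = NULL;
    pool->count = 0;
    return freed;
}
```

The trade-off is that memory lives as long as the pool does, so the choice of pool determines the lifetime of everything allocated from it.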

 

The possible UNSET constants are NGX_CONF_UNSET_UINT, NGX_CONF_UNSET_PTR, NGX_CONF_UNSET_SIZE, NGX_CONF_UNSET_MSEC, and the catch-all NGX_CONF_UNSET. UNSET tells the merging function that the value should be overridden.

 

2.3.2. merge_loc_conf


 

Here's the merging function used in the circle_gif module:

 

static char *
ngx_http_circle_gif_merge_loc_conf(ngx_conf_t *cf, void *parent, void *child)
{
    ngx_http_circle_gif_loc_conf_t *prev = parent;
    ngx_http_circle_gif_loc_conf_t *conf = child;

    ngx_conf_merge_uint_value(conf->min_radius, prev->min_radius, 10);
    ngx_conf_merge_uint_value(conf->max_radius, prev->max_radius, 20);

    if (conf->min_radius < 1) {
        ngx_conf_log_error(NGX_LOG_EMERG, cf, 0,
            "min_radius must be equal or more than 1");
        return NGX_CONF_ERROR;
    }
    if (conf->max_radius < conf->min_radius) {
        ngx_conf_log_error(NGX_LOG_EMERG, cf, 0,
            "max_radius must be equal or more than min_radius");
        return NGX_CONF_ERROR;
    }

    return NGX_CONF_OK;
}

 

Notice first that Nginx provides nice merging functions for different data types (ngx_conf_merge_<data type>_value); the arguments are:

  1. this location's value
  2. the value to inherit if #1 is not set
  3. the default if neither #1 nor #2 is set

 

The result is then stored in the first argument. Available merge functions include ngx_conf_merge_size_value, ngx_conf_merge_msec_value, and others. See core/ngx_conf_file.h for a full list.

 

Trivia question: How do these functions write to the first argument, since the first argument is passed in by value?

 

Answer: these functions are defined by the preprocessor (so they expand to a few "if" statements and assignments before reaching the compiler).
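A standalone sketch of such a macro makes the trick visible. The names DEMO_CONF_UNSET_UINT, demo_conf_merge_uint_value, demo_radius_conf_t, and demo_merge are hypothetical, written for this example in the spirit of the description above rather than copied from core/ngx_conf_file.h.

```c
/* Hypothetical sentinel meaning "no value was configured". */
#define DEMO_CONF_UNSET_UINT  ((unsigned long) -1)

/* Because this is a macro, "conf" is substituted textually, so the
 * assignment below modifies the caller's own variable or struct member. */
#define demo_conf_merge_uint_value(conf, prev, dflt)                         \
    do {                                                                     \
        if ((conf) == DEMO_CONF_UNSET_UINT) {                                \
            (conf) = ((prev) == DEMO_CONF_UNSET_UINT) ? (dflt) : (prev);     \
        }                                                                    \
    } while (0)

typedef struct {
    unsigned long  min_radius;
} demo_radius_conf_t;

/* Child value wins if set, then the parent's value, then the default of 10. */
static void
demo_merge(demo_radius_conf_t *parent, demo_radius_conf_t *child)
{
    demo_conf_merge_uint_value(child->min_radius, parent->min_radius, 10);
}
```

Had demo_conf_merge_uint_value been an ordinary function taking its first argument by value, the assignment would have changed only a local copy.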

 

Notice also how errors are produced; the function writes something to the log file, and returns NGX_CONF_ERROR. That return code halts server startup. (Since the message is logged at level NGX_LOG_EMERG, the message will also go to standard out; FYI, core/ngx_log.h has a list of log levels.)

 

2.4. The Module Definition


 

Next we add one more layer of indirection, the ngx_module_t struct. The variable is called ngx_http_<module name>_module. This is where references to the context and directives go, as well as the remaining callbacks (exit thread, exit process, etc.). The module definition is sometimes used as a key to look up data associated with a particular module. The module definition usually looks like this:

 

ngx_module_t  ngx_http_<module name>_module = {
    NGX_MODULE_V1,
    &ngx_http_<module name>_module_ctx, /* module context */
    ngx_http_<module name>_commands,   /* module directives */
    NGX_HTTP_MODULE,               /* module type */
    NULL,                          /* init master */
    NULL,                          /* init module */
    NULL,                          /* init process */
    NULL,                          /* init thread */
    NULL,                          /* exit thread */
    NULL,                          /* exit process */
    NULL,                          /* exit master */
    NGX_MODULE_V1_PADDING
};

...substituting <module name> appropriately. Modules can add callbacks for process/thread creation and death, but most modules keep things simple. (For the arguments passed to each callback, see core/ngx_conf_file.h.)
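The remark about using the module definition as a lookup key can be pictured as an index into a per-module array. The sketch below assumes each module carries a numeric index; demo_module_t, demo_get_module_conf, and demo_lookup are hypothetical names for illustration, not the Nginx API.

```c
#include <stddef.h>

/* Hypothetical, reduced module definition: just enough to act as a key. */
typedef struct {
    size_t       ctx_index;   /* position of this module's data in per-module arrays */
    const char  *name;
} demo_module_t;

/* Macro form: pick this module's entry out of an array of configurations. */
#define demo_get_module_conf(confs, module)  ((confs)[(module).ctx_index])

/* Function form of the same lookup, taking the module by pointer. */
static void *
demo_lookup(void **confs, const demo_module_t *module)
{
    return confs[module->ctx_index];
}
```

Under this scheme a handler never searches for its configuration by name; it hands over its own module definition and gets its own struct back.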

 

2.5. Module Installation


 

The proper way to install a module depends on whether the module is a handler, filter, or load-balancer; so the details are reserved for those respective sections.

(Translator's note: did the author write this section just so I would curse at him?)
