Nginx Internals Joshua Zhu 09/19/2009
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
Source Code Layout  Files  $ find . -name "*.[hc]" -print | wc –l 234  $ ls src core event http mail misc os  Lines of code  $ find . -name "*.[hc]" -print | xargs wc -l | tail -n1 110953 total
Code Organization  core/  The backbone and infrastructure  event/  The event-driven engine and modules  http/  The HTTP server and modules  mail/  The Mail proxy server and modules  misc/  C++ compatibility test and the Google perftools module  os/  OS dependent implementation files
Nginx Architecture  Non-blocking  Event driven  Single threaded[*]  One master process and several worker processes  Resource efficient  Highly modular
The Big Picture
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
Memory Pool  Avoid memory fragmentation  Avoid memory leak  Allocation and deallocation can be very fast  Lifetime and pool size  Cycle  Connection  Request
Memory Pool (cont’d)  ngx_pool_t  Small blocks  Large blocks  Free chain list  Cleanup handler list  API  ngx_palloc  memory aligned  ngx_pnalloc  ngx_pcalloc
Memory Pool Example (1 Chunk)
Memory Pool Example (2 Chunks)
Buffer Management  Buffer  Pointers  memory  start/pos/last/end  file  file_pos/file_last/file  Flags  last_buf  last_in_chain  flush  in_file  memory  …
Buffer Management (cont’d)  Buffer chain  Singly-linked list of buffers  Output chain  Context  in/free/busy chains  Output filter  Chain writer  Writer context
String Utilities  ngx_str_t  data  len  sizeof() - 1  Memory related  String formatting  String comparison  String search  Base64 encoding/decoding  URI escaping/unescaping  UTF-8 decoding  String-to-number conversion
Data Structures  Abstract data types  Array  List  Queue  Hash table  Red black tree  Radix tree  Characteristic  Set object values after added  keep interfaces clean  Chunked memory (part)  efficient
Logging  Error log  Level  Debug  Access log  Multiple logs  Log format  variables  Per location  Rotation
Configuration File  Directive  name  type  set  conf  offset  post  Parsing  ngx_conf_parse  Values  init  merge
Configuration File (cont’d)  Block  events  http  server  upstream  location  if  Variables  Buildins  Other types  http_  sent_http_  upstream_http_  cookie_  arg_
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
Master and Workers  Master  Monitor workers, respawn when a worker dies  Handle signals and notify workers  exit  reconfiguration  update  log rotation  …  Worker  Process client requests  handle connections  Get cmd from master
Master Process Cycle
Worker Process Cycle
Inter-process Communication  Signals  Channel  socketpair  command  Shared memory  Connection counter  Stat  Atomic & spinlock  Mutex
Event  ngx_event_t  Read  Write  Timeout  Callbacks  Handlers  ngx_event_accept  ngx_process_events_and_timers  ngx_handle_read_event  ngx_handle_write_event  Posted events  Posted accept events queue  Posted events queue
Time Cache  The overhead of gettimeofday()  Time cache variables  ngx_cached_time  ngx_current_msec  Time strings  ngx_cached_err_log_time  ngx_cached_http_time  ngx_cached_http_log_time  Timer resolution  Interval timer  setitimer()
Events and Timers Processing
Timer Management  Actions  Add a timer  Delete a timer  Get the minimum timer  Red black tree[*]  O(log n) complexity
Accept Mutex  Thundering herd  Serialize accept()  Lock/unlock  Listening sockets  Delay
I/O  Multiplexing  kqueue/epoll  NGX_USE_CLEAR_EVENT (edge triggered)  select/poll/dev/poll  NGX_USE_LEVEL_EVENT (level triggered)  …  Advanced I/O  sendfile()  writev()  direct I/O  mmap()  AIO  TCP/IP options  TCP_CORK/TCP_NODELAY/TCP_DEFER_ACCEPT
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
Important Structures  Connection  ngx_connection_t  HTTP connection  ngx_http_connection_t  HTTP request  ngx_http_request_t  headers_in  headers_out  …
Virtual Servers  Address  Port  Server names  Core server conf
Locations  Location tree  Static  Regex  = ^~ ~ ~*  Per-location configuration  Value  inheritance  override  Handler  Named location  try_files/post_action/error_page
HTTP Contexts  Types  main_conf  srv_conf  loc_conf  Request  ngx_http_get_module_main_conf  ngx_http_get_module_srv_conf  ngx_http_get_module_loc_conf  Parse conf file  ngx_http_conf_get_module_main_conf  ngx_http_conf_get_module_srv_conf  ngx_http_conf_get_module_loc_conf  Module context  ngx_http_get_module_ctx  ngx_http_set_ctx
HTTP Handling  Receive data  Parse the request  Find the virtual server  Find the location  Run phase handlers  Generate the response  Filter response headers  Filter the response body  Send out the output to the client
Request Parsing  Request line  Headers  Interesting tricks  Finite state machine  ngx_strX_cmp
Phases and Handlers  Phases  POST_READ  SERVER_REWRITE  FIND_CONFIG  REWRITE  POST_REWRITE  PREACCESS  ACCESS  POST_ACCESS  TRY_FILES  CONTENT  LOG  Phase handler  Checker  Handler  Next
Phases and Handlers (cont’d)  Phase engine  Handlers  server_rewrite_index  location_rewrite_index  r->phase_handler  Default checkers  ngx_http_core_generic_phase  ngx_http_core_find_config_phase  ngx_http_core_post_rewrite_phase  ngx_http_core_access_phase  ngx_http_core_post_access_phase  ngx_http_core_try_files_phase  ngx_http_core_content_phase
Phases and Handlers (cont’d) phase modules POST_READ realip SERVER_REWRITE rewrite REWRITE rewrite PREACCESS limit_req, limit_zone, realip ACCESS access, auth_basic CONTENT autoindex, dav, gzip, index, random_index, static LOG log
Filter Chain  Singly-linked list like (CoR)  Filter response only  Header filter  Body filter  Send out the response  ngx_http_send_header  top_header_filter  ngx_http_output_filter  ngx_http_top_body_filter  ngx_http_header_filter  ngx_http_copy_filter  ngx_http_write_filter  Process order
Filter Chain Example
HTTP Handling Example  curl -i http://localhost/
HTTP Keep-Alive  Request memory reuse  Connection memory shrink  Keep-alive timeout  Request count
Subrequest  Filters  Addition filter  SSI filter  Maximum subrequests
Internal Redirect  Return a different URL than originally requested  Examples  try_files  index/random_index  post_action  send_error_page  upstream_process_headers
Upstream  Hooks  input_filter_init  input_filter  create_request  reinit_request  process_header  abort_request  finalize_request  rewrite_redirect  Modules  FastCGI  Proxy  Memcached  Event pipe  Load balancer
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
Mail Proxy  Sequence diagram
Mail Proxy (cont’d)  Mail session  Command parsing  Packets relay  Things you can do  Load balancing  Authentication rewriting  Black lists/white lists
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
General Module Interface  Context  index & ctx_index  Directives  Type  core/event/http/mail  Hooks  init_master  called at master process initialization  init_module  called when the module is loaded  init_process  called at worker process initialization  exit_process  called at worker process termination  exit_master  called at master process termination
Core Module Interface  Name  Hooks  create_conf  init_conf  Examples  Core  Events  Log  HTTP
Event Module Interface  Name  Hooks  create_conf  init_conf  event_actions  add  del  enable  disable  add_conn  del_conn  process_changes  process_events  init  done
Mail Module Interface  Protocol  type  init_session  init_protocol  parse_command  auth_state  create_main_conf  init_main_conf  create_srv_conf  merge_srv_conf
HTTP Module Interface  Hooks  preconfiguration  postconfiguration  create_main_conf  init_main_conf  create_srv_conf  merge_srv_conf  create_loc_conf  merge_loc_conf
A “Hello World” HTTP Module • Creating a hello world! module  Files  ngx_http_hello_module.c  config  Build  ./configure –add-module=/path/to/hello/module  Configuration  location & directive
Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
Auto Scripts  Handle the differences  OS  Compiler  Data types  Libraries  Module enable/disable  Modules order
Reconfiguration
Hot Code Swapping
Thank You!  My site: http://www.zhuzhaoyuan.com  My blog: http://blog.zhuzhaoyuan.com

Nginx Internals

  • 1.
    Nginx Internals Joshua Zhu 09/19/2009
  • 2.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 3.
    Source Code Layout  Files  $ find . -name "*.[hc]" -print | wc –l 234  $ ls src core event http mail misc os  Lines of code  $ find . -name "*.[hc]" -print | xargs wc -l | tail -n1 110953 total
  • 4.
    Code Organization  core/  The backbone and infrastructure  event/  The event-driven engine and modules  http/  The HTTP server and modules  mail/  The Mail proxy server and modules  misc/  C++ compatibility test and the Google perftools module  os/  OS dependent implementation files
  • 5.
    Nginx Architecture  Non-blocking  Event driven  Single threaded[*]  One master process and several worker processes  Resource efficient  Highly modular
  • 6.
  • 7.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 8.
    Memory Pool  Avoid memory fragmentation  Avoid memory leak  Allocation and deallocation can be very fast  Lifetime and pool size  Cycle  Connection  Request
  • 9.
    Memory Pool (cont’d)  ngx_pool_t  Small blocks  Large blocks  Free chain list  Cleanup handler list  API  ngx_palloc  memory aligned  ngx_pnalloc  ngx_pcalloc
  • 10.
  • 11.
  • 12.
    Buffer Management  Buffer  Pointers  memory  start/pos/last/end  file  file_pos/file_last/file  Flags  last_buf  last_in_chain  flush  in_file  memory  …
  • 13.
    Buffer Management (cont’d)  Buffer chain  Singly-linked list of buffers  Output chain  Context  in/free/busy chains  Output filter  Chain writer  Writer context
  • 14.
    String Utilities  ngx_str_t  data  len  sizeof() - 1  Memory related  String formatting  String comparison  String search  Base64 encoding/decoding  URI escaping/unescaping  UTF-8 decoding  String-to-number conversion
  • 15.
    Data Structures  Abstract data types  Array  List  Queue  Hash table  Red black tree  Radix tree  Characteristic  Set object values after added  keep interfaces clean  Chunked memory (part)  efficient
  • 16.
    Logging  Error log  Level  Debug  Access log  Multiple logs  Log format  variables  Per location  Rotation
  • 17.
    Configuration File  Directive  name  type  set  conf  offset  post  Parsing  ngx_conf_parse  Values  init  merge
  • 18.
    Configuration File (cont’d)  Block  events  http  server  upstream  location  if  Variables  Buildins  Other types  http_  sent_http_  upstream_http_  cookie_  arg_
  • 19.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 20.
    Master and Workers  Master  Monitor workers, respawn when a worker dies  Handle signals and notify workers  exit  reconfiguration  update  log rotation  …  Worker  Process client requests  handle connections  Get cmd from master
  • 21.
  • 22.
  • 23.
    Inter-process Communication  Signals  Channel  socketpair  command  Shared memory  Connection counter  Stat  Atomic & spinlock  Mutex
  • 24.
    Event  ngx_event_t  Read  Write  Timeout  Callbacks  Handlers  ngx_event_accept  ngx_process_events_and_timers  ngx_handle_read_event  ngx_handle_write_event  Posted events  Posted accept events queue  Posted events queue
  • 25.
    Time Cache  The overhead of gettimeofday()  Time cache variables  ngx_cached_time  ngx_current_msec  Time strings  ngx_cached_err_log_time  ngx_cached_http_time  ngx_cached_http_log_time  Timer resolution  Interval timer  setitimer()
  • 26.
  • 27.
    Timer Management  Actions  Add a timer  Delete a timer  Get the minimum timer  Red black tree[*]  O(log n) complexity
  • 28.
    Accept Mutex  Thundering herd  Serialize accept()  Lock/unlock  Listening sockets  Delay
  • 29.
    I/O  Multiplexing  kqueue/epoll  NGX_USE_CLEAR_EVENT (edge triggered)  select/poll/dev/poll  NGX_USE_LEVEL_EVENT (level triggered)  …  Advanced I/O  sendfile()  writev()  direct I/O  mmap()  AIO  TCP/IP options  TCP_CORK/TCP_NODELAY/TCP_DEFER_ACCEPT
  • 30.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 31.
    Important Structures  Connection  ngx_connection_t  HTTP connection  ngx_http_connection_t  HTTP request  ngx_http_request_t  headers_in  headers_out  …
  • 32.
    Virtual Servers  Address  Port  Server names  Core server conf
  • 33.
    Locations  Location tree  Static  Regex  = ^~ ~ ~*  Per-location configuration  Value  inheritance  override  Handler  Named location  try_files/post_action/error_page
  • 34.
    HTTP Contexts  Types  main_conf  srv_conf  loc_conf  Request  ngx_http_get_module_main_conf  ngx_http_get_module_srv_conf  ngx_http_get_module_loc_conf  Parse conf file  ngx_http_conf_get_module_main_conf  ngx_http_conf_get_module_srv_conf  ngx_http_conf_get_module_loc_conf  Module context  ngx_http_get_module_ctx  ngx_http_set_ctx
  • 35.
    HTTP Handling  Receive data  Parse the request  Find the virtual server  Find the location  Run phase handlers  Generate the response  Filter response headers  Filter the response body  Send out the output to the client
  • 36.
    Request Parsing  Request line  Headers  Interesting tricks  Finite state machine  ngx_strX_cmp
  • 37.
    Phases and Handlers  Phases  POST_READ  SERVER_REWRITE  FIND_CONFIG  REWRITE  POST_REWRITE  PREACCESS  ACCESS  POST_ACCESS  TRY_FILES  CONTENT  LOG  Phase handler  Checker  Handler  Next
  • 38.
    Phases and Handlers(cont’d)  Phase engine  Handlers  server_rewrite_index  location_rewrite_index  r->phase_handler  Default checkers  ngx_http_core_generic_phase  ngx_http_core_find_config_phase  ngx_http_core_post_rewrite_phase  ngx_http_core_access_phase  ngx_http_core_post_access_phase  ngx_http_core_try_files_phase  ngx_http_core_content_phase
  • 39.
    Phases and Handlers(cont’d) phase modules POST_READ realip SERVER_REWRITE rewrite REWRITE rewrite PREACCESS limit_req, limit_zone, realip ACCESS access, auth_basic CONTENT autoindex, dav, gzip, index, random_index, static LOG log
  • 40.
    Filter Chain  Singly-linked list like (CoR)  Filter response only  Header filter  Body filter  Send out the response  ngx_http_send_header  top_header_filter  ngx_http_output_filter  ngx_http_top_body_filter  ngx_http_header_filter  ngx_http_copy_filter  ngx_http_write_filter  Process order
  • 41.
  • 42.
    HTTP Handling Example  curl -i http://localhost/
  • 43.
    HTTP Keep-Alive  Request memory reuse  Connection memory shrink  Keep-alive timeout  Request count
  • 44.
    Subrequest  Filters  Addition filter  SSI filter  Maximum subrequests
  • 45.
    Internal Redirect  Return a different URL than originally requested  Examples  try_files  index/random_index  post_action  send_error_page  upstream_process_headers
  • 46.
    Upstream  Hooks  input_filter_init  input_filter  create_request  reinit_request  process_header  abort_request  finalize_request  rewrite_redirect  Modules  FastCGI  Proxy  Memcached  Event pipe  Load balancer
  • 47.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 48.
    Mail Proxy  Sequence diagram
  • 49.
    Mail Proxy (cont’d)  Mail session  Command parsing  Packets relay  Things you can do  Load balancing  Authentication rewriting  Black lists/white lists
  • 50.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 51.
    General Module Interface  Context  index & ctx_index  Directives  Type  core/event/http/mail  Hooks  init_master  called at master process initialization  init_module  called when the module is loaded  init_process  called at worker process initialization  exit_process  called at worker process termination  exit_master  called at master process termination
  • 52.
    Core Module Interface  Name  Hooks  create_conf  init_conf  Examples  Core  Events  Log  HTTP
  • 53.
    Event Module Interface  Name  Hooks  create_conf  init_conf  event_actions  add  del  enable  disable  add_conn  del_conn  process_changes  process_events  init  done
  • 54.
    Mail Module Interface  Protocol  type  init_session  init_protocol  parse_command  auth_state  create_main_conf  init_main_conf  create_srv_conf  merge_srv_conf
  • 55.
    HTTP Module Interface  Hooks  preconfiguration  postconfiguration  create_main_conf  init_main_conf  create_srv_conf  merge_srv_conf  create_loc_conf  merge_loc_conf
  • 56.
    A “Hello World”HTTP Module • Creating a hello world! module  Files  ngx_http_hello_module.c  config  Build  ./configure –add-module=/path/to/hello/module  Configuration  location & directive
  • 57.
    Agenda  Source code layout  Key concepts and infrastructure  The event-driven architecture  HTTP request handling  Mail proxying process  Nginx module development  Misc. topics
  • 58.
    Auto Scripts  Handle the differences  OS  Compiler  Data types  Libraries  Module enable/disable  Modules order
  • 59.
  • 60.
  • 61.
    Thank You!  My site: http://www.zhuzhaoyuan.com  My blog: http://blog.zhuzhaoyuan.com