Implementation Details
Now it is time to explore the NSE implementation details in depth. Understanding how NSE works is useful for designing efficient scripts and libraries. The canonical reference to the NSE implementation is the source code, but this section provides an overview of key details. It should be valuable to folks trying to understand and extend the NSE source code, as well as to script authors who want to better understand how their scripts are executed.
Initialization Phase
NSE is initialized before any scanning when Nmap first starts, by the
open_nse
function. open_nse
creates a fresh Lua state that will persist across
host groups,
until the program exits.
It then loads the standard Lua libraries and compiled NSE libraries.
The standard Lua libraries are
documented in the Lua Reference
Manual. The standard Lua libraries available to NSE are
debug
,
io
,
math
,
os
,
package
,
string
, and
table
.
Compiled NSE libraries are those that are defined in a C++ file instead
of a Lua file. They include
nmap
,
pcre
,
bin
,
bit
, and
openssl
(if available).
After loading the basic libraries, open_nse
loads
the file
nse_main.lua
. The NSE core is in this
file—Lua code manages scripts and sets up the appropriate
environment. In this situation Lua really shines as a glue language.
C++ is used to provide the network framework and low-level libraries.
Lua is used to structure data, determine which scripts to load,
and schedule and execute scripts.
nse_main.lua
sets up the Lua
environment to be ready for script scanning later on. It
loads all the scripts the user has chosen and returns a function
that does the actual script scanning to
open_nse
.
The nselib
directory is added to the Lua path to
give scripts access to the standard NSE library. NSE loads replacements
for the standard coroutine functions so that yields initiated by NSE are
caught and propagated back to the NSE scheduler.
nse_main.lua
next defines classes and functions
to be used during setup. The script arguments
(--script-args
) are loaded into
nmap.registry.args
.
A script database is created if one doesn't already exist or if this
was requested with --script-updatedb
.
Finally, the scripts listed on the command line are loaded.
The get_chosen_scripts
function works to find
chosen scripts by comparing categories, filenames, and directory names.
The scripts are loaded into memory for later use.
get_chosen_scripts
works by transforming the
argument to --script
into a block of Lua code and
then executing it. (This is how the and
,
or
, and not
operators are
supported.) Any specifications that don't directly match a category or
a filename from
script.db
are checked against file and directory names. If the specification is a
regular file, it's loaded. If a directory, all the
*.nse
files within it are loaded. Otherwise, the
engine raises an error.
get_chosen_scripts
finishes by arranging the
selected scripts according to their dependencies (see
the section called “dependencies
Field”).
Scripts that have no dependencies are in runlevel 1. Scripts that
directly depend on these are in runlevel 2, and so on.
When a script scan is run, each runlevel is run separately and in
order.
nse_main.lua
defines two classes:
Script
and Thread
. These classes
are the objects that represent NSE scripts and their script threads.
When a script is loaded, Script.new
creates a new Script object. The script file is loaded into Lua
and saved for later use. These classes and their methods are intended
to encapsulate the data needed for each script and its threads.
Script.new
also contains sanity checks to ensure that the
script has required fields such as the action
function.
Script Scanning
When NSE runs a script scan, script_scan
is called
in nse_main.cc
. Since there are three script scan
phases, script_scan
accepts two arguments, a
script scan type which can be one of these values:
SCRIPT_PRE_SCAN
(Script Pre-scanning phase) or
SCRIPT_SCAN
(Script scanning phase) or
SCRIPT_POST_SCAN
(Script Post-scanning phase),
and a second argument which is a list of targets to scan if
the script scan phase is SCRIPT_SCAN
.
These targets will be passed to the nse_main.lua
main function for scanning.
The main function for a script scan generates a number of script
threads based on whether the rule
function
returns true. The generated threads are stored in a list of runlevel
lists. Each runlevel list of threads is passed separately to the
run
function. The run
function
is the main worker function for NSE where all the magic happens.
The run
function's purpose is run all the threads
in a runlevel until they all finish. Before doing this however,
run
redefines some Lua registry values that
help C code function. One such function,
_R[WAITING_TO_RUNNING]
, allows the network library
binding written in C to move a thread from the waiting queue to the
running queue. Scripts are run until the
running and waiting queues are both empty. Threads that yield are
moved to the waiting queue; threads that are ready to continue
are moved back to the running queue. The cycle continues until
the thread quits or ends in error. Along with the waiting and running
queues, there is a pending queue. It serves as a temporary location
for threads moving from the waiting queue to the running queue before
a new iteration of the running queue begins.