PF.CONF(5) OpenBSD Programmer's Manual PF.CONF(5)
NAME
pf.conf - packet filter configuration file
DESCRIPTION
The pf(4) packet filter modifies, drops, or passes packets according to
rules or definitions specified in pf.conf.
This is an overview of the sections in this manual page:
Options
Global options tune the behaviour of the packet filtering engine.
Queueing
Queueing provides rule-based bandwidth control.
Translation
Translation specifies how addresses are mapped to other addresses.
Packet Filtering
Packet filtering provides rule-based blocking or passing of packets.
Tables
Tables provide a method for dealing with large numbers of addresses.
Anchors
Anchors are containers for rules and tables.
Stateful Filtering
Stateful filtering tracks packets by state.
Traffic Normalisation
Including scrub, fragment handling, and blocking spoofed traffic.
Operating System Fingerprinting
A method for detecting a host's operating system.
Examples
Translation and filter examples.
Comments can be put anywhere in the file using a hash mark (`#'), and ex-
tend to the end of the current line. Additional configuration files can
be included with the include keyword, for example:
include "/etc/pf/sub.filter.conf"
Macros can be defined that will later be expanded in context. Macro
names must start with a letter, and may contain letters, digits and un-
derscores. Macro names may not be reserved words (for example pass, in,
out). Macros are not expanded inside quotes.
For example:
ext_if = "kue0"
all_ifs = "{" $ext_if lo0 "}"
pass out on $ext_if from any to any
pass in on $ext_if proto tcp from any to any port 25
OPTIONS
pf(4) may be tuned for various situations using the set command.
set block-policy
The block-policy option sets the default behaviour for the packet
block action:
drop Packet is silently dropped.
return A TCP RST is returned for blocked TCP packets, an ICMP
UNREACHABLE is returned for blocked UDP packets, and
all other packets are silently dropped.
set debug
Set the debug level to one of the following:
loud Generate debug messages for common conditions.
misc Generate debug messages for various errors.
none Don't generate debug messages.
urgent Generate debug messages only for serious errors.
set fingerprints
Load fingerprints of known operating systems from the given file-
name. By default fingerprints of known operating systems are au-
tomatically loaded from pf.os(5), but can be overridden via this
option. Setting this option may leave a small period of time
where the fingerprints referenced by the currently active ruleset
are inconsistent until the new ruleset finishes loading.
set hostid
The 32-bit hostid identifies this firewall's state table entries
to other firewalls in a pfsync(4) failover cluster. By default
the hostid is set to a pseudo-random value, however it may be de-
sirable to manually configure it, for example to more easily
identify the source of state table entries. The hostid may be
specified in either decimal or hexadecimal.
set limit
Sets hard limits on the memory pools used by the packet filter.
See pool(9) for an explanation of memory pools.
For example, to set the maximum number of entries in the memory
pool used by state table entries (generated by pass rules which
do not specify no state) to 20000:
set limit states 20000
To set the maximum number of entries in the memory pool used for
fragment reassembly to 20000:
set limit frags 20000
To set the maximum number of entries in the memory pool used for
tracking source IP addresses (generated by the sticky-address and
src.track options) to 2000:
set limit src-nodes 2000
To set limits on the memory pools used by tables:
set limit tables 1000
set limit table-entries 100000
The first limits the number of tables that can exist to 1000.
The second limits the overall number of addresses that can be
stored in tables to 100000.
Various limits can be combined on a single line:
set limit { states 20000, frags 20000, src-nodes 2000 }
set loginterface
Enable collection of packet and byte count statistics for the
given interface or interface group. These statistics can be
viewed using:
# pfctl -s info
In this example pf(4) collects statistics on the interface named
dc0:
set loginterface dc0
One can disable the loginterface using:
set loginterface none
set optimization
Optimize state timeouts for one of the following network environ-
ments:
aggressive
Aggressively expire connections. This can greatly reduce
the memory usage of the firewall at the cost of dropping
idle connections early.
conservative
Extremely conservative settings. Avoid dropping legiti-
mate connections at the expense of greater memory uti-
lization (possibly much greater on a busy network) and
slightly increased processor utilization.
high-latency
A high-latency environment (such as a satellite connec-
tion).
normal A normal network environment. Suitable for almost all
networks.
satellite
Alias for high-latency.
set reassemble
The reassemble option turns reassembly of fragmented packets on
or off. If no-df is given, fragments with the dont-fragment bit
set have it cleared before entering the fragment cache, and thus
the reassembled packet doesn't have dont-fragment set either.
Setting this option does not affect non-fragmented packets.
Fragment reassembly is turned on by default.
set require-order
If set to yes, pfctl(8) will enforce that statement types in the
ruleset are listed in the following order, to match the operation
of the underlying packet filtering engine: options, queueing,
translation, filtering. This option is disabled by default.
set ruleset-optimization
basic Enable basic ruleset optimization. This is the default
behaviour. Basic ruleset optimization does four things
to improve the performance of ruleset evaluations:
1. remove duplicate rules
2. remove rules that are a subset of another rule
3. combine multiple rules into a table when advanta-
geous
4. re-order the rules to improve evaluation perfor-
mance
none Disable the ruleset optimizer.
profile Uses the currently loaded ruleset as a feedback profile
to tailor the ordering of quick rules to actual network
traffic.
Note that the ruleset optimizer modifies the ruleset to improve
performance. As a side effect, per-rule accounting statistics
will have different meanings than before. If per-rule accounting
is important, for example for billing purposes, either the ruleset
optimizer should not be used or a label field should be added to
all of the accounting rules to act as optimization barriers.
Optimization can also be set as a command-line argument to
pfctl(8), overriding the settings in pf.conf.
set skip on <ifspec>
List interfaces for which packets should not be filtered. Pack-
ets passing in or out on such interfaces are passed as if pf was
disabled, i.e. pf does not process them in any way. This can be
useful on loopback and other virtual interfaces, when packet fil-
tering is not desired and can have unexpected effects.
set state-defaults
The state-defaults option sets the state options for states cre-
ated from rules without an explicit keep state. For example:
set state-defaults pflow, no-sync
set state-policy
The state-policy option sets the default behaviour for states:
if-bound States are bound to interface.
floating States can match packets on any interface (the default).
set timeout
frag Seconds before an unassembled fragment is expired.
interval Interval between purging expired states and fragments.
src.track Length of time to retain a source tracking entry after
the last state expires.
When a packet matches a stateful connection, the seconds to live
for the connection will be updated to that of the protocol and
modifier which corresponds to the connection state. Each packet
which matches this state will reset the TTL. Tuning these values
may improve the performance of the firewall at the risk of drop-
ping valid idle connections.
tcp.closed
The state after one endpoint sends an RST.
tcp.closing
The state after the first FIN has been sent.
tcp.established
The fully established state.
tcp.finwait
The state after both FINs have been exchanged and the
connection is closed. Some hosts (notably web servers on
Solaris) send TCP packets even after closing the connec-
tion. Increasing tcp.finwait (and possibly tcp.closing)
can prevent blocking of such packets.
tcp.first
The state after the first packet.
tcp.opening
The state before the destination host ever sends a pack-
et.
ICMP and UDP are handled in a fashion similar to TCP, but with a
much more limited set of states:
icmp.error
The state after an ICMP error came back in response to an
ICMP packet.
icmp.first
The state after the first packet.
udp.first
The state after the first packet.
udp.multiple
The state if both hosts have sent packets.
udp.single
The state if the source host sends more than one packet
but the destination host has never sent one back.
Other protocols are handled similarly to UDP:
other.first
other.multiple
other.single
Timeout values can be reduced adaptively as the number of state
table entries grows.
adaptive.end
When reaching this number of state entries, all timeout
values become zero, effectively purging all state entries
immediately. This value is used to define the scale fac-
tor; it should not actually be reached (set a lower state
limit, see below).
adaptive.start
When the number of state entries exceeds this value,
adaptive scaling begins. All timeout values are scaled
linearly with factor (adaptive.end - number of states) /
(adaptive.end - adaptive.start).
Adaptive timeouts are enabled by default, with an adaptive.start
value equal to 60% of the state limit, and an adaptive.end value
equal to 120% of the state limit. They can be disabled by set-
ting both adaptive.start and adaptive.end to 0.
The adaptive timeout values can be defined both globally and for
each rule. When used on a per-rule basis, the values relate to
the number of states created by the rule, otherwise to the total
number of states.
For example:
set timeout tcp.first 120
set timeout tcp.established 86400
set timeout { adaptive.start 6000, adaptive.end 12000 }
set limit states 10000
With 9000 state table entries, the timeout values are scaled to
50% (tcp.first 60, tcp.established 43200).
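As a rough illustration, several of the options described in this section
might be combined near the top of pf.conf as in the following sketch. It
is not a recommended configuration: the values are arbitrary, and lo0 is
assumed to be the loopback interface.

```
# Return a TCP RST or ICMP UNREACHABLE instead of dropping silently.
set block-policy return
# Generate debug messages only for serious errors.
set debug urgent
# Expire idle connections early to reduce memory usage.
set optimization aggressive
# Reassemble fragments and clear the dont-fragment bit on them.
set reassemble yes no-df
# Do not filter traffic on the loopback interface (assumed to be lo0).
set skip on lo0
# Bind states to the interface on which they were created.
set state-policy if-bound
# Disable adaptive timeouts by setting both values to 0.
set timeout { adaptive.start 0, adaptive.end 0 }
```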
QUEUEING
Packets can be assigned to queues for the purpose of bandwidth control.
At least two declarations are required to configure queues, and later any
packet filtering rule can reference the defined queues by name. During
the filtering component of pf.conf, the last referenced queue name is
where any packets from pass rules will be queued, while for block rules
it specifies where any resulting ICMP or TCP RST packets should be
queued. The scheduler defines the algorithm used to decide which packets
get delayed, dropped, or sent out immediately. There are three sched-
ulers currently supported:
cbq Class Based Queueing. Queues attached to an interface build a
tree, thus each queue can have further child queues. Each queue
can have a priority and a bandwidth assigned. Priority mainly
controls the time packets take to get sent out, while bandwidth
primarily affects throughput. cbq achieves both partitioning
and sharing of link bandwidth by hierarchically structured classes.
Each class has its own queue and is assigned its share of band-
width. A child class can borrow bandwidth from its parent class as
long as excess bandwidth is available (see the option borrow, be-
low).
hfsc Hierarchical Fair Service Curve. Queues attached to an interface
build a tree, thus each queue can have further child queues. Each
queue can have a priority and a bandwidth assigned. Priority main-
ly controls the time packets take to get sent out, while bandwidth
primarily affects throughput. hfsc supports both link-sharing and
guaranteed real-time services. It employs a service curve based
QoS model, and its unique feature is an ability to decouple delay
and bandwidth allocation.
priq Priority Queueing. Queues are attached directly to the interface in
a flat list, thus queues cannot have further child queues. Each queue has a unique
priority assigned, ranging from 0 to 15. Packets in the queue with
the highest priority are processed first.
The interfaces on which queueing should be activated are declared using
the altq on declaration. altq on has the following keywords:
<interface>
Queueing is enabled on the named interface.
<scheduler>
Specifies which queueing scheduler to use.
bandwidth <bw>
The maximum bitrate for all queues on an interface may be specified
using the bandwidth keyword. The value can be specified as an ab-
solute value or as a percentage of the interface bandwidth. When
using an absolute value, the suffixes b, Kb, Mb, and Gb are used to
represent bits, kilobits, megabits, and gigabits per second, re-
spectively. The value must not exceed the interface bandwidth. If
bandwidth is not specified, the interface bandwidth is used (but
take note that some interfaces do not know their bandwidth, or can
adapt their bandwidth rates).
qlimit <limit>
The maximum number of packets held in the queue. The default is
50.
tbrsize <size>
Adjusts the size, in bytes, of the token bucket regulator. If not
specified, heuristics based on the interface bandwidth are used to
determine the size.
queue <list>
Defines a list of subqueues to create on an interface.
In the following example, the interface dc0 should queue up to 5Mbps in
four second-level queues using Class Based Queueing. Those four queues
will be shown in a later example.
altq on dc0 cbq bandwidth 5Mb queue { std, http, mail, ssh }
Once interfaces are activated for queueing using the altq directive, a
sequence of queue directives may be defined. The name associated with a
queue must match a queue defined in the altq directive or, except for the
priq scheduler, in a parent queue declaration. The following keywords
can be used:
on <interface>
Specifies the interface the queue operates on. If not given, it
operates on all matching interfaces.
bandwidth <bw>
Specifies the maximum bitrate to be processed by the queue. This
value must not exceed the value of the parent queue and can be
specified as an absolute value or a percentage of the parent
queue's bandwidth. If not specified, defaults to 100% of the par-
ent queue's bandwidth. The priq scheduler does not support band-
width specification.
priority <level>
Between queues a priority level can be set. For cbq and hfsc, the
range is 0 to 7 and for priq, the range is 0 to 15. The default
for all is 1. priq queues with a higher priority are always served
first. cbq and hfsc queues with a higher priority are preferred in
the case of overload.
qlimit <limit>
The maximum number of packets held in the queue. The default is
50.
The scheduler can specify additional parameters using the format
scheduler(parameters). The parameters are:
default Packets not matched by another queue are assigned to this
one. Exactly one default queue is required.
ecn Enables Explicit Congestion Notification (ECN) on this queue.
ECN implies RED.
red Enables Random Early Detection (RED) on this queue. RED
drops packets with a probability proportional to the average
queue length.
The cbq scheduler supports an additional option:
borrow The queue can borrow bandwidth from the parent.
The hfsc scheduler supports some additional options:
linkshare <sc> The bandwidth share of a backlogged queue.
realtime <sc> The minimum required bandwidth for the queue.
upperlimit <sc> The maximum allowed bandwidth for the queue.
<sc> is an abbreviation for service curve.
The format for service curve specifications is (m1, d, m2). m2 controls
the bandwidth assigned to the queue. m1 and d are optional and can be
used to control the initial bandwidth assignment. For the first d mil-
liseconds the queue gets the bandwidth given as m1, afterwards the value
given in m2.
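A minimal sketch of a service curve follows. The interface em0 and the
queue names bulk and voip are hypothetical, and the figures are
illustrative only. The exact punctuation accepted inside the curve is an
assumption based on the (m1, d, m2) format given above; check it with
pfctl -nf before relying on it.

```
altq on em0 hfsc bandwidth 5Mb queue { bulk, voip }
queue bulk bandwidth 80% hfsc(default)
# m1 = 1Mb for the first d = 200 milliseconds, then m2 = 500Kb.
queue voip bandwidth 20% hfsc(realtime (1Mb, 200, 500Kb))
```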
Furthermore, with cbq and hfsc, child queues can be specified as in an
altq declaration, thus building a tree of queues using a part of their
parent's bandwidth.
Packets can be assigned to queues based on filter rules by using the
queue keyword. Normally only one queue is specified; when a second one
is specified it will instead be used for packets which have a TOS of
lowdelay and for TCP ACKs with no data payload.
To continue the previous example, the examples below would specify the
four referenced queues, plus a few child queues. Interactive ssh(1) ses-
sions get priority over bulk transfers like scp(1) and sftp(1). The
queues may then be referenced by filtering rules (see PACKET FILTERING
below).
queue std bandwidth 10% cbq(default)
queue http bandwidth 60% priority 2 cbq(borrow red) \
{ employees, developers }
queue developers bandwidth 75% cbq(borrow)
queue employees bandwidth 15%
queue mail bandwidth 10% priority 0 cbq(borrow ecn)
queue ssh bandwidth 20% cbq(borrow) { ssh_interactive, ssh_bulk }
queue ssh_interactive bandwidth 50% priority 7 cbq(borrow)
queue ssh_bulk bandwidth 50% priority 0 cbq(borrow)
block return out on dc0 inet all queue std
pass out on dc0 inet proto tcp from $developerhosts to any port 80 \
queue developers
pass out on dc0 inet proto tcp from $employeehosts to any port 80 \
queue employees
pass out on dc0 inet proto tcp from any to any port 22 \
queue(ssh_bulk, ssh_interactive)
pass out on dc0 inet proto tcp from any to any port 25 \
queue mail
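For comparison, a flat priq configuration might look like the following
sketch. The interface em0 and the queue names q_std and q_ack are
hypothetical. priq queues take a priority but no bandwidth, as noted
above.

```
altq on em0 priq bandwidth 5Mb queue { q_std, q_ack }
queue q_std priority 1 priq(default)
queue q_ack priority 7
# The second queue is used for lowdelay TOS packets and for TCP ACKs
# with no data payload.
pass out on em0 inet proto tcp from any to any port 22 \
    queue(q_std, q_ack)
```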
TRANSLATION
Translation rules modify either the source or destination address of the
packets associated with a stateful connection. A stateful connection is
automatically created to track packets matching such a rule as long as
they are not blocked by the filtering section of pf.conf. The transla-
tion engine modifies the specified address and/or port in the packet, re-
calculates IP, TCP, and UDP checksums as necessary, and passes it to the
packet filter for evaluation.
Since translation occurs before filtering, the filter engine will see
packets as they look after any addresses and ports have been translated.
Filter rules will therefore have to filter based on the translated ad-
dress and port number. Packets that match a translation rule are only
automatically passed if the pass modifier is given, otherwise they are
still subject to block and pass rules.
The state entry created permits pf(4) to keep track of the original ad-
dress for traffic associated with that state and correctly direct return
traffic for that connection.
Various types of translation are possible with pf:
binat
A binat rule specifies a bidirectional mapping between an external
IP netblock and an internal IP netblock.
nat A nat rule specifies that IP addresses are to be changed as the
packet traverses the given interface. This technique allows one or
more IP addresses on the translating host to support network traf-
fic for a larger range of machines on an "inside" network. Al-
though in theory any IP address can be used on the inside, it is
strongly recommended that one of the address ranges defined by RFC
1918 be used. Those netblocks are:
10.0.0.0 - 10.255.255.255 (all of net 10, i.e. 10/8)
172.16.0.0 - 172.31.255.255 (i.e. 172.16/12)
192.168.0.0 - 192.168.255.255 (i.e. 192.168/16)
rdr The packet is redirected to another destination and possibly a dif-
ferent port. rdr rules can optionally specify port ranges instead
of single ports. rdr ... port 2000:2999 -> ... port 4000 redirects
ports 2000 to 2999 (inclusive) to port 4000. rdr ... port
2000:2999 -> ... port 4000:* redirects port 2000 to 4000, 2001 to
4001, ..., 2999 to 4999.
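The three rule types, and the pass modifier described earlier, can be
sketched as follows. The macro ext_if, the interface name em0, and all
addresses are assumptions chosen for illustration; 203.0.113.0/24 is a
documentation range.

```
ext_if = "em0"      # hypothetical external interface

# Share the address of $ext_if among hosts on an RFC 1918 network.
nat on $ext_if from 192.168.1.0/24 to any -> ($ext_if)
# Redirect inbound HTTP to an internal server on a different port.
rdr on $ext_if proto tcp from any to any port 80 \
    -> 192.168.1.10 port 8080
# Map one internal host to one external address in both directions.
binat on $ext_if from 192.168.1.20 to any -> 203.0.113.20
# With the pass modifier, matching packets are passed without being
# subject to block and pass rules.
rdr pass on $ext_if proto tcp from any to any port 25 -> 192.168.1.11
```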
In addition to modifying the address, some translation rules may modify
source or destination ports for TCP or UDP connections; implicitly in the
case of nat rules and explicitly in the case of rdr rules. Port numbers
are never translated with a binat rule.
Evaluation order of the translation rules is dependent on the type of the
translation rules and the direction of a packet. binat rules are always
evaluated first. Then either the rdr rules are evaluated on an inbound
packet or the nat rules on an outbound packet. Rules of the same type
are evaluated in the same order in which they appear in the ruleset. The
first matching rule decides what action is taken.
The no option prefixed to a translation rule causes packets to remain
untranslated, much in the same way as block quick works in the packet
filter. If no rule matches the packet, it is passed to the filter engine
unmodified.
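Because the first matching translation rule decides the action, a no rule
has to appear before the rule it is meant to exempt traffic from. A
minimal sketch, assuming a macro $ext_if for the external interface and
illustrative internal networks:

```
# Leave traffic between the two internal networks untranslated.
no nat on $ext_if from 192.168.1.0/24 to 10.0.0.0/8
# Translate everything else from that network.
nat on $ext_if from 192.168.1.0/24 to any -> ($ext_if)
```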
Translation rules apply only to packets that pass through the specified
interface, and if no interface is specified, translation is applied to
packets on all interfaces. For instance, redirecting port 80 on an ex-
ternal interface to an internal web server will only work for connections
originating from the outside. Connections to the address of the external
interface from local hosts will not be redirected, since such packets do
not actually pass through the external interface. Redirections cannot
reflect packets back through the interface they arrive on; they can only
be redirected to hosts connected to different interfaces or to the
firewall itself.
Note that redirecting external incoming connections to the loopback ad-
dress will effectively allow an external host to connect to daemons bound
solely to the loopback address, circumventing the traditional blocking of
such connections on a real interface. For example:
rdr on ne3 inet proto tcp to port smtp -> 127.0.0.1 port spamd
Unless this effect is desired, any of the local non-loopback addresses
should be used instead as the redirection target, which allows external
connections only to daemons bound to this address or not bound to any ad-
dress.
For nat and rdr rules for which there is a single redirection address
which has a subnet mask smaller than 32 for IPv4 or 128 for IPv6 (more
than one IP address), a variety of different methods for assigning this
address can be used:
bitmask
The bitmask option applies the network portion of the redirection
address to the address to be modified (source with nat, destination
with rdr).
random [sticky-address]
The random option selects an address at random within the defined
block of addresses.
sticky-address can be specified to ensure that multiple connections
from the same source are mapped to the same redirection address.
Associations are destroyed as soon as there are no longer states
which refer to them; in order to make the mappings last beyond the
lifetime of the states, increase the global src.track timeout with
set timeout src.track.
round-robin [sticky-address]
The round-robin option loops through the redirection address(es).
sticky-address is as described above.
When more than one redirection address is specified, round-robin is
the only permitted pool type.
source-hash [key]
The source-hash option uses a hash of the source address to deter-
mine the redirection address, ensuring that the redirection address
is always the same for a given source. An optional key can be
specified after this keyword either in hex or as a string; by de-
fault pfctl(8) randomly generates a key for source-hash every time
the ruleset is reloaded.
static-port
With nat rules, the static-port option prevents pf(4) from modify-
ing the source port on TCP and UDP packets.
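The pool options above might be used as in the following sketch. The
macro $ext_if and all addresses are assumptions for illustration. The
more specific nat rule is listed first because the first matching
translation rule wins.

```
# Keep the original source port for one host's UDP traffic.
nat on $ext_if proto udp from 192.168.1.5 to any -> ($ext_if) static-port
# Choose the translation address from a /28 by hashing the source.
nat on $ext_if from 192.168.1.0/24 to any -> 203.0.113.0/28 source-hash
# Alternate between two servers, keeping each source on the same one.
rdr on $ext_if proto tcp from any to any port 80 \
    -> { 192.168.1.10, 192.168.1.11 } round-robin sticky-address
```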
PACKET FILTERING
pf(4) has the ability to block, pass, and match packets based on at-
tributes of their layer 3 and layer 4 headers. Filter rules determine
which of these actions are taken; filter parameters specify the packets
to which a rule applies.
For each packet processed by the packet filter, the filter rules are
evaluated in sequential order, from first to last. For block and pass,
the last matching rule decides what action is taken; if no rule matches
the packet, the default action is to pass the packet. For match, rules
are evaluated every time they match; the pass/block state of a packet re-
mains unchanged.
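The last-match behaviour can be illustrated with two rules. This is a
sketch only; the macro $ext_if is assumed to name the external interface.

```
# Rule 1 matches every inbound packet on $ext_if.
block in on $ext_if all
# Rule 2 also matches inbound SSH. It is the last matching rule for
# such packets, so they are passed.
pass in on $ext_if proto tcp from any to any port 22
```

An inbound packet to port 80 matches only the first rule and is blocked.
A packet on another interface matches neither rule and is passed by
default.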
Most parameters are optional. If a parameter is specified, the rule only
applies to packets with matching attributes. Certain parameters can be
expressed as lists, in which case pfctl(8) generates all needed rule com-
binations.
By default pf(4) filters packets statefully: the first time a packet
matches a pass rule, a state entry is created; for subsequent packets the
filter checks whether the packet matches any state. If it does, the
packet is passed without evaluation of any rules. After the connection
is closed or times out, the state entry is automatically removed.
The following actions can be used in the filter:
block
The packet is blocked. There are a number of ways in which a block
rule can behave when blocking a packet. The default behaviour is
to drop packets silently, however this can be overridden or made
explicit either globally, by setting the block-policy option, or on
a per-rule basis with one of the following options:
drop The packet is silently dropped.
return This causes a TCP RST to be returned for TCP pack-
ets and an ICMP UNREACHABLE for other types of
packets.
return-icmp
return-icmp6 This causes ICMP messages to be returned for pack-
ets which match the rule. By default this is an
ICMP UNREACHABLE message, however this can be
overridden by specifying a message as a code or
number.
return-rst This applies only to TCP packets, and issues a TCP
RST which closes the connection. An optional pa-
rameter, ttl, may be given with a TTL value.
Options returning ICMP packets currently have no effect if pf(4)
operates on a bridge(4), as the code to support this feature has
not yet been implemented.
The simplest mechanism to block everything by default and only pass
packets that match explicit rules is to specify a first filter rule
of:
block all
match
The packet is matched. This mechanism is used to provide
fine-grained filtering without altering the block/pass state of a
packet. match rules differ from block and pass rules in that
parameters are set every time a packet matches the rule, not only on
the last matching rule. For the following parameters, this means that
the parameter effectively becomes ``sticky'' until explicitly
overridden: max-mss, min-ttl, no-df, queue, random-id, reassemble tcp,
rtable, and set-tos.
log is different still, in that the action happens every time a
rule matches, i.e. a single packet can get logged more than once.
pass The packet is passed; state is created unless the no state option
is specified.
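The three actions might be combined as in the following sketch. The macro
$ext_if is an assumption, and the rules are illustrative rather than a
complete policy.

```
# Default deny; the rules below refine how packets are refused.
block all
# Answer blocked TCP with a RST and blocked UDP with an ICMP message.
block return-rst in on $ext_if proto tcp all
block return-icmp in on $ext_if proto udp all
# Log inbound SSH attempts without changing the block/pass decision.
match in log on $ext_if proto tcp from any to any port 22
# Pass DNS without creating state; replies need their own rule.
pass out on $ext_if proto udp from any to any port 53 no state
pass in on $ext_if proto udp from any port 53 to any no state
```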
The following parameters can be used in the filter:
in or out
A packet always comes in on, or goes out through, one interface.
in and out apply to incoming and outgoing packets; if neither is
specified, the rule will match packets in both directions.
log In addition to the action specified, a log message is generated.
Only the packet that establishes the state is logged, unless the
no state option is specified. The logged packets are sent to a
pflog(4) interface, by default pflog0. This interface is moni-
tored by the pflogd(8) logging daemon, which dumps the logged
packets to the file /var/log/pflog in pcap(3) binary format.
log (all)
Used to force logging of all packets for a connection. This is
not necessary when no state is explicitly specified. As with
log, packets are logged to pflog(4).
log (user)
Logs the UID and PID of the socket on the local host used to send
or receive a packet, in addition to the normal information.
log (to <interface>)
Send logs to the specified pflog(4) interface instead of pflog0.
quick If a packet matches a rule which has the quick option set, this
rule is considered the last matching rule, and evaluation of sub-
sequent rules is skipped.
on <interface>
This rule applies only to packets coming in on, or going out
through, this particular interface or interface group. For more
information on interface groups, see the group keyword in
ifconfig(8).
<af> This rule applies only to packets of this address family. Sup-
ported values are inet and inet6.
proto <protocol>
This rule applies only to packets of this protocol. Common pro-
tocols are ICMP, ICMP6, TCP, and UDP. For a list of all the pro-
tocol name to number mappings used by pfctl(8), see the file
/etc/protocols.
from <source> port <source> os <source> to <dest> port <dest>
This rule applies only to packets with the specified source and
destination addresses and ports.
Addresses can be specified in CIDR notation (matching netblocks),
as symbolic host names, interface names or interface group names,
or as any of the following keywords:
any Any address.
no-route Any address which is not currently routable.
route <label> Any address matching the given route(8) label.
<table> Any address matching the given table.
urpf-failed Any source address that fails a unicast reverse
path forwarding (URPF) check, i.e. packets coming
in on an interface other than that which holds
the route back to the packet's source address.
Ranges of addresses are specified using the `-' operator. For
instance: ``10.1.1.10 - 10.1.1.12'' means all addresses from
10.1.1.10 to 10.1.1.12, hence addresses 10.1.1.10, 10.1.1.11, and
10.1.1.12.
Interface names and interface group names can have modifiers ap-
pended:
:0 Do not include interface aliases.
:broadcast Translates to the interface's broadcast ad-
dress(es).
:network Translates to the network(s) attached to the inter-
face.
:peer Translates to the point-to-point interface's peer
address(es).
Host names may also have the :0 option appended to restrict the
name resolution to the first of each v4 and v6 address found.
Host name resolution and interface to address translation are
done at ruleset load-time. When the address of an interface (or
host name) changes (under DHCP or PPP, for instance), the ruleset
must be reloaded for the change to be reflected in the kernel.
Surrounding the interface name (and optional modifiers) in paren-
theses changes this behaviour. When the interface name is sur-
rounded by parentheses, the rule is automatically updated whenev-
er the interface changes its address. The ruleset does not need
to be reloaded. This is especially useful with nat.
Ports can be specified either by number or by name. For example,
port 80 can be specified as www. For a list of all port name to
number mappings used by pfctl(8), see the file /etc/services.
Ports and ranges of ports are specified using these operators:
= (equal)
!= (unequal)
< (less than)
<= (less than or equal)
> (greater than)
>= (greater than or equal)
: (range including boundaries)
>< (range excluding boundaries)
<> (except range)
`><', `<>' and `:' are binary operators (they take two argu-
ments). For instance:
port 2000:2004
means `all ports >= 2000 and <= 2004', hence ports 2000,
2001, 2002, 2003, and 2004.
port 2000 >< 2004
means `all ports > 2000 and < 2004', hence ports 2001,
2002, and 2003.
port 2000 <> 2004
means `all ports < 2000 or > 2004', hence ports 1-1999
and 2005-65535.
The operating system of the source host can be specified in the
case of TCP rules with the os modifier. See the OPERATING SYSTEM
FINGERPRINTING section for more information.
The host, port, and OS specifications are optional, as in the
following examples:
pass in all
pass in from any to any
pass in proto tcp from any port <= 1024 to any
pass in proto tcp from any to any port 25
pass in proto tcp from 10.0.0.0/8 port > 1024 \
to ! 10.1.2.3 port != ssh
pass in proto tcp from any os "OpenBSD"
pass in proto tcp from route "DTAG"
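A few of the parameters above are combined in the following sketch. The
macros $ext_if and $int_if are hypothetical names for an external and an
internal interface, and the addresses are illustrative.

```
# quick makes this the last matching rule, so evaluation stops here.
block in quick on $ext_if from urpf-failed to any
# Log the packet that creates the state. The parentheses make the rule
# follow address changes on $ext_if without a ruleset reload.
pass in log on $ext_if inet proto tcp from any to ($ext_if) \
    port 2000:2004
# An address range: 192.168.1.10, 192.168.1.11, and 192.168.1.12.
pass in on $int_if from 192.168.1.10 - 192.168.1.12 to any
# The :network modifier expands to the network(s) attached to $ext_if.
pass out on $ext_if inet proto udp from ($ext_if:network) to any port 53
```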
The following additional parameters can be used in the filter:
all This is equivalent to "from any to any".
allow-opts
By default, IPv4 packets with IP options or IPv6 packets with
routing extension headers are blocked. When allow-opts is speci-
fied for a pass rule, packets that pass the filter based on that
rule (last matching) do so even if they contain IP options or
routing extension headers. For packets that match state, the
rule that initially created the state is used. The implicit pass
rule that is used when a packet does not match any rules does not
allow IP options.
divert-reply
Used to receive replies for sockets that are bound to addresses
which are not local to the machine. See setsockopt(2) for infor-
mation on how to bind these sockets.
divert-to <host> port <port>
Used to redirect packets to a local socket bound to host and
port. The packets will not be modified, so getsockname(2) on the
socket will return the original destination address of the pack-
et.
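A sketch of transparent interception, assuming a proxy process that
listens on 127.0.0.1 port 3128 (both values are illustrative):
# address and port of the local listener are assumptions
pass in proto tcp from any to any port www \
divert-to 127.0.0.1 port 3128
Because the packets are not rewritten, the listener can call
getsockname(2) to learn which address the client was trying to reach.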
flags <a>/<b> | any
This rule only applies to TCP packets that have the flags <a> set
out of set <b>. Flags not specified in <b> are ignored. For
stateful connections, the default is flags S/SA. To indicate
that flags should not be checked at all, specify flags any. The
flags are: (F)IN, (S)YN, (R)ST, (P)USH, (A)CK, (U)RG, (E)CE, and
C(W)R.
flags S/S Flag SYN is set. The other flags are ignored.
flags S/SA This is the default setting for stateful connections.
Out of SYN and ACK, exactly SYN may be set. SYN,
SYN+PSH, and SYN+RST match, but SYN+ACK, ACK, and
ACK+RST do not. This is more restrictive than the
previous example.
flags /SFRA
If the first set is not specified, it defaults to
none. All of SYN, FIN, RST, and ACK must be unset.
Because flags S/SA is applied by default (unless no state is
specified), only the initial SYN packet of a TCP handshake will
create a state for a TCP connection. It is possible to be less
restrictive, and allow state creation from intermediate (non-SYN)
packets, by specifying flags any. This will cause pf(4) to syn-
chronize to existing connections, for instance if one flushes the
state table. However, states created from such intermediate
packets may be missing connection details such as the TCP window
scaling factor. States which modify the packet flow, such as
those affected by nat, binat, or rdr rules, modulate or synproxy
state options, or scrubbed with reassemble tcp, will also not be
recoverable from intermediate packets. Such connections will
stall and time out.
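The following sketch contrasts the forms discussed above. The
destination ports are arbitrary:
# state is created only by the initial SYN (the default)
pass in proto tcp from any to any port ssh flags S/SA
# state may also be created from intermediate packets
pass in proto tcp from any to any port www flags any
# matches TCP packets with none of SYN, FIN, RST, or ACK set
block in proto tcp all flags /SFRA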
group <group>
Similar to user, this rule only applies to packets of sockets
owned by the specified group.
icmp-type <type> code <code>
icmp6-type <type> code <code>
This rule only applies to ICMP or ICMP6 packets with the speci-
fied type and code. Text names for ICMP types and codes are
listed in icmp(4) and icmp6(4). The protocol and the ICMP type
indicator (icmp-type or icmp6-type) must match.
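For example, assuming the type and code names echoreq, unreach, and
needfrag as listed in icmp(4) and icmp6(4):
pass in inet proto icmp all icmp-type echoreq
pass in inet proto icmp all icmp-type unreach code needfrag
pass in inet6 proto icmp6 all icmp6-type echoreq
In each rule the protocol and the type keyword agree: icmp is paired
with icmp-type, and icmp6 with icmp6-type.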
label <string>
Adds a label to the rule, which can be used to identify the rule.
For instance, ``pfctl -s labels'' shows per-rule statistics for
rules that have labels.
The following macros can be used in labels:
$dstaddr The destination IP address.
$dstport The destination port specification.
$if The interface.
$nr The rule number.
$proto The protocol name.
$srcaddr The source IP address.
$srcport The source port specification.
For example:
ips = "{ 1.2.3.4, 1.2.3.5 }"
pass in proto tcp from any to $ips \
port > 1023 label "$dstaddr:$dstport"
Expands to:
pass in inet proto tcp from any to 1.2.3.4 \
port > 1023 label "1.2.3.4:>1023"
pass in inet proto tcp from any to 1.2.3.5 \
port > 1023 label "1.2.3.5:>1023"
The macro expansion for the label directive occurs only at con-
figuration file parse time, not during runtime.
probability <number>
A probability attribute can be attached to a rule, with a value
set between 0 and 1, bounds not included. A packet that otherwise
matches is then matched by the rule only with the given
probability. The value may also be written as a percentage. For
example, the following rule will drop 20% of incoming ICMP packets:
block in proto icmp probability 20%
queue <queue> | (<queue>, <queue>)
Packets matching this rule will be assigned to the specified
queue. If two queues are given, packets which have a TOS of
lowdelay and TCP ACKs with no data payload will be assigned to
the second one. See QUEUEING for setup details.
For example:
pass in proto tcp to port 25 queue mail
pass in proto tcp to port 22 queue(ssh_bulk, ssh_prio)
rtable <number>
Used to select an alternate routing table for the routing lookup.
This is only effective before the route lookup happens, i.e. when
filtering inbound.
tag <string>
Packets matching this rule will be tagged with the specified
string. The tag acts as an internal marker that can be used to
identify these packets later on. This can be used, for example,
to provide trust between interfaces and to determine if packets
have been processed by translation rules. Tags are "sticky",
meaning that the packet will be tagged even if the rule is not
the last matching rule. Further matching rules can replace the
tag with a new one but will not remove a previously applied tag.
A packet is only ever assigned one tag at a time. Packet tagging
can be done during nat, rdr, or binat rules in addition to filter
rules. Tags take the same macros as labels (see above).
tagged <string>
Used with filter or translation rules to specify that packets
must already be tagged with the given tag in order to match the
rule. Inverse tag matching can also be done by specifying the !
operator before the tagged keyword.
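A minimal sketch of using tag and tagged together to express trust
between interfaces. The interface names and the tag name INTNET are
illustrative assumptions:
int_if = "fxp0"
ext_if = "kue0"
# mark everything that entered through the internal interface
pass in on $int_if all tag INTNET
# on the way out, only let marked packets through
block out on $ext_if all
pass out on $ext_if all tagged INTNET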
tos <string> | <number>
This rule applies to packets with the specified TOS bits set.
string may be one of lowdelay, throughput, or reliability; number
may be either a hex or decimal number.
For example, the following rules are identical:
pass all tos lowdelay
pass all tos 0x10
pass all tos 16
user <user>
This rule only applies to packets of sockets owned by the speci-
fied user. For outgoing connections initiated from the firewall,
this is the user that opened the connection. For incoming con-
nections to the firewall itself, this is the user that listens on
the destination port. For forwarded connections, where the fire-
wall is not a connection endpoint, the user and group are
unknown.
All packets, both outgoing and incoming, of one connection are
associated with the same user and group. Only TCP and UDP pack-
ets can be associated with users; for other protocols these pa-
rameters are ignored.
User and group refer to the effective (as opposed to the real)
IDs, in case the socket is created by a setuid/setgid process.
User and group IDs are stored when a socket is created; when a
process creates a listening socket as root (for instance, by
binding to a privileged port) and subsequently changes to another
user ID (to drop privileges), the credentials will remain root.
User and group IDs can be specified as either numbers or names.
The syntax is similar to the one for ports. The value unknown
matches packets of forwarded connections. unknown can only be
used with the operators = and !=. Other constructs like user >=
unknown are invalid. Forwarded packets with unknown user and
group ID match only rules that explicitly compare unknown with
the operators = or !=. For instance user >= 0 does not match
forwarded packets. The following example allows only selected
users to open outgoing connections:
block out proto { tcp, udp } all
pass out proto { tcp, udp } all user { < 1000, dhartmei }
Routing
If a packet matches a rule with one of the following route options set,
the packet filter will route the packet according to the type of route
option. When such a rule creates state, the route option is also applied
to all packets matching the same connection.
dup-to
The dup-to option creates a duplicate of the packet and routes it
like route-to. The original packet gets routed as it normally
would.
fastroute
The fastroute option does a normal route lookup to find the next
hop for the packet.
reply-to
The reply-to option is similar to route-to, but routes packets that
pass in the opposite direction (replies) to the specified inter-
face. Opposite direction is only defined in the context of a state
entry, and reply-to is useful only in rules that create state. It
can be used on systems with multiple external connections to route
all outgoing packets of a connection through the interface the in-
coming connection arrived through (symmetric routing enforcement).
route-to
The route-to option routes the packet to the specified interface
with an optional address for the next hop. When a route-to rule
creates state, only packets that pass in the same direction as the
filter rule specifies will be routed in this way. Packets passing
in the opposite direction (replies) are not affected and are routed
normally.
For the dup-to, reply-to, and route-to route options, when a single
redirection address is given with a subnet mask smaller than 32 for
IPv4 or 128 for IPv6 (that is, it covers more than one IP address),
the methods random, round-robin, and source-hash, as described above
in TRANSLATION, can be used.
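A hedged sketch of policy routing with a second uplink. The interface
names and the gateway address are hypothetical, and placing the route
option directly after the interface specification is an assumption
about the grammar of this version:
int_if = "fxp0"
ext_if2 = "fxp2"
ext_gw2 = "198.51.100.1"
# send traffic from one internal network out through the second uplink
pass in on $int_if route-to ($ext_if2 $ext_gw2) \
from 10.0.0.0/24 to any
# replies to connections that arrived on the second uplink leave there
pass in on $ext_if2 reply-to ($ext_if2 $ext_gw2) \
proto tcp from any to any port www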
TABLES
Tables are named structures which can hold a collection of addresses and
networks. Lookups against tables in pf(4) are relatively fast, making a
single rule with tables much more efficient, in terms of processor usage
and memory consumption, than a large number of rules which differ only in
IP address (either created explicitly or automatically by rule expan-
sion).
Tables can be used as the source or destination of filter or translation
rules. They can also be used for the redirect address of nat and rdr
rules and in the routing options of filter rules, but only for round-
robin pools.
Tables can be defined with any of the following pfctl(8) mechanisms. As
with macros, reserved words may not be used as table names.
manually Persistent tables can be manually created with the add or
replace option of pfctl(8), before or after the ruleset has
been loaded.
pf.conf Table definitions can be placed directly in this file and load-
ed at the same time as other rules are loaded, atomically.
Table definitions inside pf.conf use the table statement, and
are especially useful to define non-persistent tables. The
contents of a pre-existing table defined without a list of
addresses to initialize it are not altered when pf.conf is loaded.
A table initialized with the empty list, { }, will be cleared
on load.
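The difference between the two forms can be sketched as follows. The
table names are illustrative:
# no address list: existing contents are left untouched on reload
table <keep> persist
# empty list: contents are cleared every time pf.conf is loaded
table <reset> persist { }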
Tables may be defined with the following attributes:
const The const flag prevents the user from altering the contents of
the table once it has been created. Without that flag, pfctl(8)
can be used to add or remove addresses from the table at any
time, even when running with securelevel(7) = 2.
counters
The counters flag enables per-address packet and byte counters,
which can be displayed with pfctl(8).
persist The persist flag forces the kernel to keep the table even when
no rules refer to it. If the flag is not set, the kernel will
automatically remove the table when the last rule referring to
it is flushed.
This example creates a table called private, to hold RFC 1918 private
network blocks, and a table called badhosts, which is initially empty. A
filter rule is set up to block all traffic coming from addresses listed
in either table:
table <private> const { 10/8, 172.16/12, 192.168/16 }
table <badhosts> persist
block on fxp0 from { <private>, <badhosts> } to any
The private table cannot have its contents changed and the badhosts table
will exist even when no active filter rules reference it. Addresses may
later be added to the badhosts table, so that traffic from these hosts
can be blocked by using the following:
# pfctl -t badhosts -Tadd 204.92.77.111
A table can also be initialized with an address list specified in one or
more external files, using the following syntax:
table <spam> persist file "/etc/spammers" file "/etc/openrelays"
block on fxp0 from <spam> to any
The files /etc/spammers and /etc/openrelays list IP addresses, one per
line. Any lines beginning with a `#' are treated as comments and ig-
nored. In addition to being specified by IP address, hosts may also be
specified by their hostname. When the resolver is called to add a host-
name to a table, all resulting IPv4 and IPv6 addresses are placed into
the table. IP addresses can also be entered in a table by specifying a
valid interface name, a valid interface group, or the self keyword, in
which case all addresses assigned to the interface(s) will be added to
the table.
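As a sketch, a table holding every address assigned to the firewall
itself can be built with the self keyword. The table name is an
illustrative assumption:
table <myself> const { self }
pass in proto tcp from any to <myself> port ssh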
ANCHORS
Besides the main ruleset, pf.conf can specify anchor attachment points.
An anchor is a container that can hold rules, address tables, and other
anchors. When evaluation of the main ruleset reaches an anchor rule,
pf(4) will proceed to evaluate all rules specified in that anchor.
The following example blocks all packets on the external interface by de-
fault, then evaluates all rules in the anchor named "spam", and finally
passes all outgoing connections and incoming connections to port 25:
ext_if = "kue0"
block on $ext_if all
anchor spam
pass out on $ext_if all
pass in on $ext_if proto tcp from any to $ext_if port smtp
Anchors can be manipulated through pfctl(8) without reloading the main
ruleset or other anchors. This loads a single rule into the anchor,
which blocks all packets from a specific address:
# echo "block in quick from 1.2.3.4 to any" | pfctl -a spam -f -
The anchor can also be populated by adding a load anchor rule after the
anchor rule. When pfctl(8) loads pf.conf, it will also load all the
rules from the file /etc/pf-spam.conf into the anchor.
anchor spam
load anchor spam from "/etc/pf-spam.conf"
Filter rule anchors can also be loaded inline in the ruleset within a
brace-delimited block. Brace delimited blocks may contain rules or other
brace-delimited blocks. When anchors are loaded this way the anchor name
becomes optional. Since the parser specification for anchor names is a
string, double quote characters (`"') should be placed around the anchor
name.
anchor "external" on egress {
block
anchor out {
pass proto tcp from any to port { 25, 80, 443 }
}
pass in proto tcp to any port 22
}
Anchor rules can also specify packet filtering parameters using the same
syntax as filter rules. When parameters are used, the anchor rule is on-
ly evaluated for matching packets. This allows conditional evaluation of
anchors, like:
block on $ext_if all
anchor spam proto tcp from any to any port smtp
pass out on $ext_if all
pass in on $ext_if proto tcp from any to $ext_if port smtp
The rules inside anchor "spam" are only evaluated for TCP packets with
destination port 25. Hence, the following will only block connections
from 1.2.3.4 to port 25:
# echo "block in quick from 1.2.3.4 to any" | pfctl -a spam -f -
Matching filter and translation rules marked with the quick option are
final and abort the evaluation of the rules in other anchors and the main
ruleset. If the anchor itself is marked with the quick option, ruleset
evaluation will terminate when the anchor is exited if the packet is
matched by any rule within the anchor.
An anchor references other anchor attachment points using the following
syntax:
anchor <name>
Evaluates the filter rules in the specified anchor.
binat-anchor <name>
Evaluates the binat rules in the specified anchor.
nat-anchor <name>
Evaluates the nat rules in the specified anchor.
rdr-anchor <name>
Evaluates the rdr rules in the specified anchor.
An anchor has a name which specifies the path where pfctl(8) can be used
to access the anchor to perform operations on it, such as attaching child
anchors to it or loading rules into it. Anchors may be nested, with com-
ponents separated by `/' characters, similar to how file system hierar-
chies are laid out. The main ruleset is actually the default anchor, so
filter and translation rules, for example, may also be contained in any
anchor.
Anchor rules are evaluated relative to the anchor in which they are con-
tained. For example, all anchor rules specified in the main ruleset will
reference anchor attachment points underneath the main ruleset, and an-
chor rules specified in a file loaded from a load anchor rule will be at-
tached under that anchor point.
Anchors may end with the asterisk (`*') character, which signifies that
all anchors attached at that point should be evaluated in the alphabeti-
cal ordering of their anchor name. For example, the following will eval-
uate each rule in each anchor attached to the "spam" anchor:
anchor "spam/*"
Note that it will only evaluate anchors that are directly attached to the
"spam" anchor, and will not descend to evaluate anchors recursively.
Since anchors are evaluated relative to the anchor in which they are con-
tained, there is a mechanism for accessing the parent and ancestor an-
chors of a given anchor. Similar to file system path name resolution, if
the sequence `..' appears as an anchor path component, the parent anchor
of the current anchor in the path evaluation at that point will become
the new current anchor. As an example, consider the following:
# printf 'anchor "spam/allowed"\n' | pfctl -f -
# printf 'anchor "../banned"\npass\n' | pfctl -a spam/allowed -f -
Evaluation of the main ruleset will lead into the spam/allowed anchor,
which will evaluate the rules in the spam/banned anchor, if any, before
finally evaluating the pass rule.
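As a further sketch of the anchor rule types and the `*' wildcard
described above, a ruleset that delegates rules to a helper daemon
might declare one attachment point per rule type. The anchor name
follows the convention used by ftp-proxy(8) and should be treated as
an example:
nat-anchor "ftp-proxy/*"
rdr-anchor "ftp-proxy/*"
anchor "ftp-proxy/*"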
STATEFUL FILTERING
pf(4) filters packets statefully, which has several advantages. For TCP
connections, comparing a packet to a state involves checking its sequence
numbers, as well as TCP timestamps if a rule using the reassemble tcp pa-
rameter applies to the connection. If these values are outside the nar-
row windows of expected values, the packet is dropped. This prevents
spoofing attacks, such as when an attacker sends packets with a fake
source address/port but does not know the connection's sequence numbers.
Similarly, pf(4) knows how to match ICMP replies to states. For example,
to allow echo requests (such as those created by ping(8)) out statefully
and match incoming echo replies correctly to states:
pass out inet proto icmp all icmp-type echoreq
Also, looking up states is usually faster than evaluating rules. If
there are 50 rules, all of them are evaluated sequentially in O(n). Even
with 50000 states, only 16 comparisons are needed to match a state
(log2 50000 is roughly 15.6), since states are stored in a binary
search tree that allows searches in O(log2 n).
Furthermore, correct handling of ICMP error messages is critical to many
protocols, particularly TCP. pf(4) matches ICMP error messages to the
correct connection, checks them against connection parameters, and passes
them if appropriate. For example if an ICMP source quench message refer-
ring to a stateful TCP connection arrives, it will be matched to the
state and get passed.
Finally, state tracking is required for binat, nat, and rdr rules, in or-
der to track address and port translations and reverse the translation on
returning packets.
pf(4) will also create state for other protocols which are effectively
stateless by nature. UDP packets are matched to states using only host
addresses and ports, and other protocols are matched to states using only
the host addresses.
If stateless filtering of individual packets is desired, the no state
keyword can be used to specify that state will not be created if this is
the last matching rule. A number of parameters can also be set to affect
how pf(4) handles state tracking, as detailed below.
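A sketch of stateless filtering for DNS over UDP. Since no state is
kept, the replies are not matched automatically and need a rule of
their own:
# queries towards port 53
pass proto udp from any to any port domain no state
# replies coming back from port 53
pass proto udp from any port domain to any no state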
State Modulation
Much of the security derived from TCP is attributable to how well the
initial sequence numbers (ISNs) are chosen. Some popular stack implemen-
tations choose very poor ISNs and thus are normally susceptible to ISN
prediction exploits. By applying a modulate state rule to a TCP connec-
tion, pf(4) will create a high quality random sequence number for each
connection endpoint.
The modulate state directive implicitly keeps state on the rule and is
only applicable to TCP connections.
For instance:
block all
pass out proto tcp from any to any modulate state
pass in proto tcp from any to any port 25 flags S/SFRA \
modulate state
Note that modulated connections will not recover when the state table is
lost (firewall reboot, flushing the state table, etc.). pf(4) will not
be able to infer a connection again after the state table flushes the
connection's modulator. When the state is lost, the connection may be
left dangling until the respective endpoints time out the connection. It
is possible on a fast local network for the endpoints to start an ACK
storm while trying to resynchronize after the loss of the modulator. The
default flags settings (or a more strict equivalent) should be used on
modulate state rules to prevent ACK storms.
Note that alternative methods are available to prevent loss of the state
table and allow for firewall failover. See carp(4) and pfsync(4) for
further information.
SYN Proxy
By default, pf(4) passes packets that are part of a TCP handshake between
the endpoints. The synproxy state option can be used to cause pf(4) it-
self to complete the handshake with the active endpoint, perform a hand-
shake with the passive endpoint, and then forward packets between the
endpoints.
No packets are sent to the passive endpoint before the active endpoint
has completed the handshake, hence so-called SYN floods with spoofed
source addresses will not reach the passive endpoint, as the sender can't
complete the handshake.
The proxy is transparent to both endpoints; they each see a single con-
nection from/to the other endpoint. pf(4) chooses random initial se-
quence numbers for both handshakes. Once the handshakes are completed,
the sequence number modulators (see previous section) are used to trans-
late further packets of the connection. synproxy state includes modulate
state.
Rules with synproxy will not work if pf(4) operates on a bridge(4).
Example:
pass in proto tcp from any to any port www synproxy state
Stateful Tracking Options
A number of options related to stateful tracking can be applied on a per-
rule basis. keep state, modulate state, and synproxy state support these
options, and keep state must be specified explicitly to apply options to
a rule.
max <number>
Limits the number of concurrent states the rule may create. When
this limit is reached, further packets that would create state will
not match this rule until existing states time out.
no-sync
Prevent state changes for states created by this rule from appear-
ing on the pfsync(4) interface.
pflow
States created by this rule are exported on the pflow(4) interface.
sloppy
Uses a sloppy TCP connection tracker that does not check sequence
numbers at all, which makes insertion and ICMP teardown attacks much
easier. This is intended to be used in situations where one does
not see all packets of a connection, e.g. in asymmetric routing
situations. It cannot be used with modulate or synproxy state.
<timeout> <seconds>
Changes the timeout values used for states created by this rule.
For a list of all valid timeout names, see OPTIONS above.
Multiple options can be specified, separated by commas:
pass in proto tcp from any to any \
port www keep state \
(max 100, source-track rule, max-src-nodes 75, \
max-src-states 3, tcp.established 60, tcp.closing 5)
When the source-track keyword is specified, the number of states per
source IP is tracked.
source-track global
The number of states created by all rules that use this option is
limited. Each rule can specify different max-src-nodes and max-
src-states options, however state entries created by any partici-
pating rule count towards each individual rule's limits.
source-track rule
The maximum number of states created by this rule is limited by the
rule's max-src-nodes and max-src-states options. Only state en-
tries created by this particular rule count toward the rule's lim-
its.
The following limits can be set:
max-src-nodes <number>
Limits the maximum number of source addresses which can simultane-
ously have state table entries.
max-src-states <number>
Limits the maximum number of simultaneous state entries that a sin-
gle source address can create with this rule.
For stateful TCP connections, limits on established connections (connec-
tions which have completed the TCP 3-way handshake) can also be enforced
per source IP.
max-src-conn <number>
Limits the maximum number of simultaneous TCP connections which
have completed the 3-way handshake that a single host can make.
max-src-conn-rate <number> / <seconds>
Limit the rate of new connections over a time interval. The con-
nection rate is an approximation calculated as a moving average.
Because the 3-way handshake ensures that the source address is not being
spoofed, more aggressive action can be taken based on these limits. With
the overload <table> state option, source IP addresses which hit either
of the limits on established connections will be added to the named
table. This table can be used in the ruleset to block further activity
from the offending host, redirect it to a tarpit process, or restrict its
bandwidth.
The optional flush keyword kills all states created by the matching rule
which originate from the host which exceeds these limits. The global
modifier to the flush command kills all states originating from the of-
fending host, regardless of which rule created the state.
For example, the following rules will protect the webserver against hosts
making more than 100 connections in 10 seconds. Any host which connects
faster than this rate will have its address added to the <bad_hosts>
table and have all states originating from it flushed. Any new packets
arriving from this host will be dropped unconditionally by the block
rule.
block quick from <bad_hosts>
pass in on $ext_if proto tcp to $webserver port www keep state \
(max-src-conn-rate 100/10, overload <bad_hosts> flush global)
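A variant of the same technique, sketched for a service prone to
password guessing. The table name and the numeric limits are
illustrative assumptions. Here flush is used without global, so only
states created by this rule are killed:
table <bruteforce> persist
block quick from <bruteforce>
pass in proto tcp to port ssh keep state \
(max-src-conn 10, max-src-conn-rate 5/30, \
overload <bruteforce> flush)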
TRAFFIC NORMALISATION
Traffic normalisation is a broad umbrella term for aspects of the packet
filter which deal with verifying packets, packet fragments, spoofed
traffic, and other irregularities.
Scrub
Scrub involves sanitising packet content in such a way that there are no
ambiguities in packet interpretation on the receiving side. It is in-
voked with the scrub option, added to regular rules.
Parameters are specified enclosed in parentheses. At least one of the
following parameters must be specified:
max-mss <number>
Enforces a maximum MSS for matching TCP packets.
min-ttl <number>
Enforces a minimum TTL for matching IP packets.
no-df
Clears the dont-fragment bit from a matching IP packet. Some oper-
ating systems are known to generate fragmented packets with the
dont-fragment bit set. This is particularly true with NFS. pf(4)
will drop such fragmented dont-fragment packets unless no-df is
specified.
Unfortunately some operating systems also generate their dont-
fragment packets with a zero IP identification field. Clearing the
dont-fragment bit on packets with a zero IP ID may cause deleteri-
ous results if an upstream router later fragments the packet. Us-
ing random-id is recommended in combination with no-df to ensure
unique IP identifiers.
random-id
Replaces the IP identification field with random values to compen-
sate for predictable values generated by many hosts. This option
only applies to packets that are not fragmented after the optional
fragment reassembly.
reassemble tcp
Statefully normalises TCP connections. reassemble tcp performs the
following normalisations:
TTL
Neither side of the connection is allowed to reduce their IP TTL.
An attacker may send a packet such that it reaches the firewall,
affects the firewall state, and expires before reaching the desti-
nation host. reassemble tcp will raise the TTL of all packets back
up to the highest value seen on the connection.
Timestamp Modulation
Modern TCP stacks will send a timestamp on every TCP packet and
echo the other endpoint's timestamp back to them. Many operating
systems will merely start the timestamp at zero when first booted,
and increment it several times a second. The uptime of the host
can be deduced by reading the timestamp and multiplying by a
constant. Observing several different timestamps can also be used
to count hosts behind a NAT device, and spoofing TCP packets into a
connection requires knowing or guessing valid timestamps.
Timestamps merely need to be monotonically increasing and not
derived from a guessable base time. reassemble tcp will cause scrub
to modulate the TCP timestamps with a random number.
Extended PAWS Checks
There is a problem with TCP on long fat pipes, in that a packet
might get delayed for longer than it takes the connection to wrap
its 32-bit sequence space. In such an occurrence, the old packet
would be indistinguishable from a new packet and would be accepted
as such. The solution to this is called PAWS: Protection Against
Wrapped Sequence numbers. PAWS guards against this problem by making
sure the timestamp on each packet does not go backwards. reassemble tcp
also makes sure the timestamp on the packet does not go forward
more than the RFC allows. By doing this, pf(4) artificially ex-
tends the security of TCP sequence numbers by 10 to 18 bits when
the host uses appropriately randomized timestamps, since a blind
attacker would have to guess the timestamp as well.
set-tos <string> | <number>
Enforces a TOS for matching IP packets. string may be one of
lowdelay, throughput, or reliability; number may be either a hex or
decimal number.
For example:
match in all scrub (no-df max-mss 1440)
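A few further sketches, following the parameter descriptions above.
The choice of traffic each rule applies to is illustrative:
# clear dont-fragment and, as recommended, randomise the IP ID too
match in all scrub (no-df random-id)
# statefully normalise TCP connections
match in proto tcp all scrub (reassemble tcp)
# mark outgoing ssh traffic as low delay
match out proto tcp to port ssh scrub (set-tos lowdelay)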
Fragment Handling
The size of IP datagrams (packets) can be significantly larger than the
maximum transmission unit (MTU) of the network. In cases when it is nec-
essary or more efficient to send such large packets, the large packet
will be fragmented into many smaller packets that will each fit onto the
wire. Unfortunately for a firewalling device, only the first logical
fragment will contain the necessary header information for the subproto-
col that allows pf(4) to filter on things such as TCP ports or to perform
NAT.
One alternative is to filter individual fragments with filter rules. If
packet reassembly is turned off, each fragment is passed to the filter
individually. Filter
rules with matching IP header parameters decide whether the fragment is
passed or blocked, in the same way as complete packets are filtered.
Without reassembly, fragments can only be filtered based on IP header
fields (source/destination address, protocol), since subprotocol header
fields are not available (TCP/UDP port numbers, ICMP code/type). The
fragment option can be used to restrict filter rules to apply only to
fragments, but not complete packets. Filter rules without the fragment
option still apply to fragments, if they only specify IP header fields.
For instance:
pass in proto tcp from any to any port 80
The rule above never applies to a fragment, even if the fragment is part
of a TCP packet with destination port 80, because without reassembly this
information is not available for each fragment. This also means that
fragments cannot create new or match existing state table entries, which
makes stateful filtering and address translation (NAT, redirection) for
fragments impossible.
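By contrast, a rule that names only IP header fields and carries the
fragment option applies to fragments alone. A minimal sketch, which
only has an effect when reassembly is turned off:
# matches IPv4 fragments, never complete packets
block in inet all fragment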
In most cases, the benefits of reassembly outweigh the additional memory
cost, so reassembly is on by default.
The memory allocated for fragment caching can be limited using pfctl(8).
Once this limit is reached, fragments that would have to be cached are
dropped until other entries time out. The timeout value can also be ad-
justed.
Currently, only IPv4 fragments are supported and IPv6 fragments are
blocked unconditionally.
Blocking Spoofed Traffic
Spoofing is the faking of IP addresses, typically for malicious purposes.
The antispoof directive expands to a set of filter rules which will block
all traffic with a source IP from the network(s) directly connected to
the specified interface(s) from entering the system through any other in-
terface.
For example:
antispoof for lo0
Expands to:
block drop in on ! lo0 inet from 127.0.0.1/8 to any
block drop in on ! lo0 inet6 from ::1 to any
For non-loopback interfaces, there are additional rules to block incoming
packets with a source IP address identical to the interface's IP(s). For
example, assuming the interface wi0 had an IP address of 10.0.0.1 and a
netmask of 255.255.255.0:
antispoof for wi0 inet
Expands to:
block drop in on ! wi0 inet from 10.0.0.0/24 to any
block drop in inet from 10.0.0.1 to any
Caveat: Rules created by the antispoof directive interfere with packets
sent over loopback interfaces to local addresses. One should pass these
explicitly.
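One way to do so, sketched here with the interface names used above,
is a quick pass rule for the loopback interface placed before the
antispoof directive:
# let local traffic over loopback through before antispoof applies
pass quick on lo0 all
antispoof for wi0 inet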
OPERATING SYSTEM FINGERPRINTING
Passive OS fingerprinting is a mechanism to inspect nuances of a TCP
connection's initial SYN packet and guess at the host's operating system.
Unfortunately these nuances are easily spoofed by an attacker, so the
fingerprint is not useful in making security decisions. It is, however,
typically accurate enough to base policy decisions on.
The fingerprints may be specified by operating system class, by version,
or by subtype/patchlevel. The class of an operating system is typically
the vendor or genre and would be OpenBSD for the pf(4) firewall itself.
The version of the oldest available OpenBSD release on the main FTP site
would be 2.6 and the fingerprint would be written as:
"OpenBSD 2.6"
The subtype of an operating system is typically used to describe the
patchlevel if that patch led to changes in the TCP stack behavior. In
the case of OpenBSD, the only subtype is for a fingerprint that was
normalised by the no-df scrub option and would be specified as:
"OpenBSD 3.3 no-df"
Fingerprints for most popular operating systems are provided by pf.os(5).
Once pf(4) is running, a complete list of known operating system
fingerprints can be displayed by running:
# pfctl -so
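The database consulted for these fingerprints can be named explicitly
with the fingerprints option listed in GRAMMAR. A minimal sketch,
assuming the default location given in FILES:

```pf
# Load OS fingerprints from the default database file.
set fingerprints "/etc/pf.os"
```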
Filter rules can enforce policy at any level of operating system
specification, assuming a fingerprint is present. Policy could limit
traffic to approved operating systems or even ban traffic from hosts
that aren't at the latest service pack.
The unknown class can also be used as the fingerprint; it matches
packets for which no operating system fingerprint is known.
Examples:
pass out proto tcp from any os OpenBSD
block out proto tcp from any os Doors
block out proto tcp from any os "Doors PT"
block out proto tcp from any os "Doors PT SP3"
block out from any os "unknown"
pass on lo0 proto tcp from any os "OpenBSD 3.3 lo0"
Operating system fingerprinting is limited to the TCP SYN packet. This
means that it will not work on other protocols and will not match a
currently established connection.
Caveat: operating system fingerprints are occasionally wrong. There are
three problems: an attacker can trivially craft packets to appear as any
chosen operating system; an operating system patch could change the
stack behavior, and no fingerprints will match it until the database is
updated; and multiple operating systems may have the same fingerprint.
TRANSLATION EXAMPLES
This example maps incoming requests on port 80 to port 8080, on which a
daemon is running (because, for example, it is not run as root, and
therefore lacks permission to bind to port 80).
rdr on $ext_if proto tcp from any to any port 80 -> 127.0.0.1 \
port 8080
If the pass modifier is given, packets matching the translation rule are
passed without inspecting the filter rules.
rdr pass on $ext_if proto tcp from any to any port 80 -> 127.0.0.1 \
port 8080
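Without the pass modifier, the redirected packets still have to be
admitted by a filter rule. Because translation happens before filtering,
the filter rule sees the rewritten destination. A sketch of such a rule
for the first rdr example above, assuming a default block policy is in
effect:

```pf
# Matches the packet after rdr has rewritten it to 127.0.0.1 port 8080.
pass in on $ext_if inet proto tcp from any to 127.0.0.1 port 8080
```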
In the example below, vlan12 is configured as 192.168.168.1; the machine
translates all packets coming from 192.168.168.0/24 to 204.92.77.111 when
they are going out any interface except vlan12. This has the net effect
of making traffic from the 192.168.168.0/24 network appear as though it
comes from the Internet routable address 204.92.77.111 to nodes behind
any interface on the router except for the nodes on vlan12. Thus,
192.168.168.1 can talk to the 192.168.168.0/24 nodes.
nat on ! vlan12 from 192.168.168.0/24 to any -> 204.92.77.111
In the example below, the machine sits between a fake internal
144.19.74.* network, and a routable external IP of 204.92.77.100. The no
nat rule excludes protocol AH from being translated.
no nat on $ext_if proto ah from 144.19.74.0/24 to any
nat on $ext_if from 144.19.74.0/24 to any -> 204.92.77.100
In the example below, packets bound for one specific server, as well as
those generated by the sysadmins, are not proxied; all other connections
are.
no rdr on $int_if proto { tcp, udp } from any to $server port 80
no rdr on $int_if proto { tcp, udp } from $sysadmins to any port 80
rdr on $int_if proto { tcp, udp } from any to any port 80 \
-> 127.0.0.1 port 80
This example maps outgoing packets' source port to an assigned proxy port
instead of an arbitrary port. In this case, outgoing isakmp traffic is
proxied with port 500 on the gateway.
nat on $ext_if inet proto udp from any port isakmp to any \
-> ($ext_if) port 500
Two more examples. The first uses binat to translate source and
destination addresses (bidirectional). The second uses rdr to redirect a
TCP port and a UDP port to an internal machine.
binat on $ext_if from 10.1.2.150 to any -> $ext_if
rdr on $ext_if inet proto tcp from any to ($ext_if) port 8080 \
-> 10.1.2.151 port 22
rdr on $ext_if inet proto udp from any to ($ext_if) port 8080 \
-> 10.1.2.151 port 53
In this example, a NAT gateway is set up to translate internal addresses
using a pool of public addresses (192.0.2.16/28). A given source address
is always translated to the same pool address by using the source-hash
keyword. The gateway also translates incoming web server connections to
a group of web servers on the internal network.
nat on $ext_if inet from any to any -> 192.0.2.16/28 source-hash
rdr on $ext_if proto tcp from any to any port 80 \
-> { 10.1.2.155, 10.1.2.160, 10.1.2.161 } round-robin
FILTER EXAMPLES
In this example, the external interface is kue0. We use a macro for the
interface name, so it can be changed easily. All incoming traffic is
"normalised", and everything is blocked and logged by default.
ext_if = "kue0"
match in all scrub (no-df max-mss 1440)
block return log on $ext_if all
Here we specifically block packets we don't want: anything coming from a
source we have no route back to; packets whose ingress interface does
not match the one in the route back to their source address; anything
that does not have our address (157.161.48.183) as source; broadcasts
(cable modem noise); and anything from reserved address space or invalid
addresses.
block in from no-route to any
block in from urpf-failed to any
block out log quick on $ext_if from ! 157.161.48.183 to any
block in quick on $ext_if from any to 255.255.255.255
block in log quick on $ext_if from { 10.0.0.0/8, 172.16.0.0/12, \
192.168.0.0/16, 255.255.255.255/32 } to any
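The list of reserved and invalid sources can also be kept in a table,
which the overview describes as the method for dealing with large numbers
of addresses. A sketch of the last rule rewritten that way; the table
name <reserved> is a hypothetical choice:

```pf
# Hypothetical table name; const keeps the contents fixed once created.
table <reserved> const { 10.0.0.0/8, 172.16.0.0/12, \
    192.168.0.0/16, 255.255.255.255/32 }
block in log quick on $ext_if from <reserved> to any
```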
For ICMP, pass out/in ping queries. State matching is done on host
addresses and ICMP ID (not type/code), so replies (like 0/0 for 8/0) will
match queries. ICMP error messages (which always refer to a TCP/UDP
packet) are handled by the TCP/UDP states.
pass on $ext_if inet proto icmp all icmp-type 8 code 0
For UDP, pass out all UDP connections. DNS connections are passed in.
pass out on $ext_if proto udp all
pass in on $ext_if proto udp from any to any port domain
For TCP, pass out all TCP connections and modulate state. SSH, SMTP,
DNS, and IDENT connections are passed in. We do not allow Windows 9x
SMTP connections since they typically originate from a viral worm.
pass out on $ext_if proto tcp all modulate state
pass in on $ext_if proto tcp from any to any \
port { ssh, smtp, domain, auth }
block in on $ext_if proto tcp from any \
os { "Windows 95", "Windows 98" } to any port smtp
Here we pass in/out all IPv6 traffic: note that we have to enable this in
two different ways, as native IPv6 (inet6) on our tunnel and as
encapsulated IPv6 (proto ipv6) on our physical interface.
pass quick on gif0 inet6
pass quick on $ext_if proto ipv6
This example illustrates packet tagging. There are three interfaces:
$int_if, $ext_if, and $wifi_if (wireless). NAT is being done on $ext_if
for all outgoing packets. Packets in on $int_if are tagged and passed
out on $ext_if. All other outgoing packets (i.e. packets from the
wireless network) are only permitted to access port 80.
pass in on $int_if from any to any tag INTNET
pass in on $wifi_if from any to any
block out on $ext_if from any to any
pass out quick on $ext_if tagged INTNET
pass out on $ext_if proto tcp from any to any port 80
In this example, we tag incoming packets as they are redirected to
spamd(8). The tag is used to pass those packets through the packet
filter.
rdr on $ext_if inet proto tcp from <spammers> to port smtp \
tag SPAMD -> 127.0.0.1 port spamd
block in on $ext_if
pass in on $ext_if inet proto tcp tagged SPAMD
GRAMMAR
Syntax for pf.conf in BNF:
line = ( option | pf-rule | nat-rule | binat-rule | rdr-rule |
antispoof-rule | altq-rule | queue-rule | trans-anchors |
anchor-rule | anchor-close | load-anchor | table-rule |
include )
option = "set" ( [ "timeout" ( timeout | "{" timeout-list "}" ) ] |
[ "ruleset-optimization" [ "none" | "basic" |
"profile" ] ] |
[ "optimization" [ "default" | "normal" | "high-latency" |
"satellite" | "aggressive" | "conservative" ] ] |
[ "limit" ( limit-item | "{" limit-list "}" ) ] |
[ "loginterface" ( interface-name | "none" ) ] |
[ "block-policy" ( "drop" | "return" ) ] |
[ "state-policy" ( "if-bound" | "floating" ) ] |
[ "state-defaults" state-opts ] |
[ "require-order" ( "yes" | "no" ) ] |
[ "fingerprints" filename ] |
[ "skip on" ifspec ] |
[ "debug" ( "none" | "urgent" | "misc" | "loud" ) ] |
[ "reassemble" ( "yes" | "no" ) [ "no-df" ] ] )
pf-rule = action [ ( "in" | "out" ) ]
[ "log" [ "(" logopts ")"] ] [ "quick" ]
[ "on" ifspec ] [ "fastroute" | route ] [ af ]
[ protospec ] hosts [ filteropts ]
logopts = logopt [ [ "," ] logopts ]
logopt = "all" | "user" | "to" interface-name
filteropts = filteropt [ [ "," ] filteropts ]
filteropt = user | group | flags | icmp-type | icmp6-type |
"tos" tos |
( "no" | "keep" | "modulate" | "synproxy" ) "state"
[ "(" state-opts ")" ] | "scrub" "(" scrubopts ")" |
"fragment" | "allow-opts" |
"label" string | "tag" string | [ "!" ] "tagged" string |
"queue" ( string | "(" string [ [ "," ] string ] ")" ) |
"rtable" number | "probability" number"%"
scrubopts = scrubopt [ [ "," ] scrubopts ]
scrubopt = "no-df" | "min-ttl" number | "max-mss" number |
"set-tos" tos | "reassemble tcp" | "random-id"
nat-rule = [ "no" ] "nat" [ "pass" [ "log" [ "(" logopts ")" ] ] ]
[ "on" ifspec ] [ af ]
[ protospec ] hosts [ "tag" string ] [ "tagged" string ]
[ "->" ( redirhost | "{" redirhost-list "}" )
[ portspec ] [ pooltype ] [ "static-port" ] ]
binat-rule = [ "no" ] "binat" [ "pass" [ "log" [ "(" logopts ")" ] ] ]
[ "on" interface-name ] [ af ]
[ "proto" ( proto-name | proto-number ) ]
"from" address [ "/" mask-bits ] "to" ipspec
[ "tag" string ] [ "tagged" string ]
[ "->" address [ "/" mask-bits ] ]
rdr-rule = [ "no" ] "rdr" [ "pass" [ "log" [ "(" logopts ")" ] ] ]
[ "on" ifspec ] [ af ]
[ protospec ] hosts [ "tag" string ] [ "tagged" string ]
[ "->" ( redirhost | "{" redirhost-list "}" )
[ portspec ] [ pooltype ] ]
antispoof-rule = "antispoof" [ "log" ] [ "quick" ]
"for" ifspec [ af ] [ "label" string ]
table-rule = "table" "<" string ">" [ tableopts ]
tableopts = tableopt [ tableopts ]
tableopt = "persist" | "const" | "counters" | "file" string |
"{" [ tableaddrs ] "}"
tableaddrs = tableaddr-spec [ [ "," ] tableaddrs ]
tableaddr-spec = [ "!" ] tableaddr [ "/" mask-bits ]
tableaddr = hostname | ifspec | "self" |
ipv4-dotted-quad | ipv6-coloned-hex
altq-rule = "altq on" interface-name queueopts-list
"queue" subqueue
queue-rule = "queue" string [ "on" interface-name ] queueopts-list
subqueue
anchor-rule = "anchor" [ string ] [ ( "in" | "out" ) ] [ "on" ifspec ]
[ af ] [ protospec ] [ hosts ] [ filteropts ] [ "{" ]
anchor-close = "}"
trans-anchors = ( "nat-anchor" | "rdr-anchor" | "binat-anchor" ) string
[ "on" ifspec ] [ af ] [ protospec ] [ hosts ]
load-anchor = "load anchor" string "from" filename
queueopts-list = queueopts-list queueopts | queueopts
queueopts = [ "bandwidth" bandwidth-spec ] |
[ "qlimit" number ] | [ "tbrsize" number ] |
[ "priority" number ] | [ schedulers ]
schedulers = ( cbq-def | priq-def | hfsc-def )
bandwidth-spec = number ( "b" | "Kb" | "Mb" | "Gb" | "%" )
action = "pass" | "match" | "block" [ return ]
return = "drop" | "return" |
"return-rst" [ "(" "ttl" number ")" ] |
"return-icmp" [ "(" icmpcode [ [ "," ] icmp6code ] ")" ] |
"return-icmp6" [ "(" icmp6code ")" ]
icmpcode = ( icmp-code-name | icmp-code-number )
icmp6code = ( icmp6-code-name | icmp6-code-number )
ifspec = ( [ "!" ] ( interface-name | interface-group ) ) |
"{" interface-list "}"
interface-list = [ "!" ] ( interface-name | interface-group )
[ [ "," ] interface-list ]
route = ( "route-to" | "reply-to" | "dup-to" )
( routehost | "{" routehost-list "}" )
[ pooltype ]
af = "inet" | "inet6"
protospec = "proto" ( proto-name | proto-number |
"{" proto-list "}" )
proto-list = ( proto-name | proto-number ) [ [ "," ] proto-list ]
hosts = "all" |
"from" ( "any" | "no-route" | "urpf-failed" | "self" |
host | "{" host-list "}" | "route" string ) [ port ]
[ os ]
"to" ( "any" | "no-route" | "self" | host |
"{" host-list "}" | "route" string ) [ port ]
ipspec = "any" | host | "{" host-list "}"
host = [ "!" ] ( address [ "/" mask-bits ] | "<" string ">" )
redirhost = address [ "/" mask-bits ]
routehost = "(" interface-name [ address [ "/" mask-bits ] ] ")"
address = ( interface-name | interface-group |
"(" ( interface-name | interface-group ) ")" |
hostname | ipv4-dotted-quad | ipv6-coloned-hex )
host-list = host [ [ "," ] host-list ]
redirhost-list = redirhost [ [ "," ] redirhost-list ]
routehost-list = routehost [ [ "," ] routehost-list ]
port = "port" ( unary-op | binary-op | "{" op-list "}" )
portspec = "port" ( number | name ) [ ":" ( "*" | number | name ) ]
os = "os" ( os-name | "{" os-list "}" )
user = "user" ( unary-op | binary-op | "{" op-list "}" )
group = "group" ( unary-op | binary-op | "{" op-list "}" )
unary-op = [ "=" | "!=" | "<" | "<=" | ">" | ">=" ]
( name | number )
binary-op = number ( "<>" | "><" | ":" ) number
op-list = ( unary-op | binary-op ) [ [ "," ] op-list ]
os-name = operating-system-name
os-list = os-name [ [ "," ] os-list ]
flags = "flags" ( [ flag-set ] "/" flag-set | "any" )
flag-set = [ "F" ] [ "S" ] [ "R" ] [ "P" ] [ "A" ] [ "U" ] [ "E" ]
[ "W" ]
icmp-type = "icmp-type" ( icmp-type-code | "{" icmp-list "}" )
icmp6-type = "icmp6-type" ( icmp-type-code | "{" icmp-list "}" )
icmp-type-code = ( icmp-type-name | icmp-type-number )
[ "code" ( icmp-code-name | icmp-code-number ) ]
icmp-list = icmp-type-code [ [ "," ] icmp-list ]
tos = ( "lowdelay" | "throughput" | "reliability" |
[ "0x" ] number )
state-opts = state-opt [ [ "," ] state-opts ]
state-opt = ( "max" number | "no-sync" | timeout | "sloppy" |
"pflow" | "source-track" [ ( "rule" | "global" ) ] |
"max-src-nodes" number | "max-src-states" number |
"max-src-conn" number |
"max-src-conn-rate" number "/" number |
"overload" "<" string ">" [ "flush" ] |
"if-bound" | "floating" )
timeout-list = timeout [ [ "," ] timeout-list ]
timeout = ( "tcp.first" | "tcp.opening" | "tcp.established" |
"tcp.closing" | "tcp.finwait" | "tcp.closed" |
"udp.first" | "udp.single" | "udp.multiple" |
"icmp.first" | "icmp.error" |
"other.first" | "other.single" | "other.multiple" |
"frag" | "interval" | "src.track" |
"adaptive.start" | "adaptive.end" ) number
limit-list = limit-item [ [ "," ] limit-list ]
limit-item = ( "states" | "frags" | "src-nodes" ) number
pooltype = ( "bitmask" | "random" |
"source-hash" [ ( hex-key | string-key ) ] |
"round-robin" ) [ "sticky-address" ]
subqueue = string | "{" queue-list "}"
queue-list = string [ [ "," ] string ]
cbq-def = "cbq" [ "(" cbq-opt [ [ "," ] cbq-opt ] ")" ]
priq-def = "priq" [ "(" priq-opt [ [ "," ] priq-opt ] ")" ]
hfsc-def = "hfsc" [ "(" hfsc-opt [ [ "," ] hfsc-opt ] ")" ]
cbq-opt = ( "default" | "borrow" | "red" | "ecn" | "rio" )
priq-opt = ( "default" | "red" | "ecn" | "rio" )
hfsc-opt = ( "default" | "red" | "ecn" | "rio" |
linkshare-sc | realtime-sc | upperlimit-sc )
linkshare-sc = "linkshare" sc-spec
realtime-sc = "realtime" sc-spec
upperlimit-sc = "upperlimit" sc-spec
sc-spec = ( bandwidth-spec |
"(" bandwidth-spec number bandwidth-spec ")" )
include = "include" filename
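The unary-op, binary-op, and op-list productions above are easiest to
read with concrete port specifications. The rules below are an
illustrative sketch only; the port numbers are arbitrary assumptions:

```pf
# unary-op: a single port by name, then all ports below 1024
pass in proto tcp from any to any port ssh
block in proto tcp from any to any port < 1024
# binary-op ":" is a range including its boundaries (6000 through 6010)
pass in proto tcp from any to any port 6000:6010
# binary-op "><" is a range excluding its boundaries (2001 through 2003)
pass in proto tcp from any to any port 2000 >< 2004
# binary-op "<>" matches outside the range (below 2000 or above 2004)
block in proto udp from any to any port 2000 <> 2004
# op-list: several alternatives inside braces
pass in proto tcp from any to any port { smtp, domain, 8080 }
```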
FILES
/etc/hosts Host name database.
/etc/pf.conf Default location of the ruleset file.
/etc/pf.os Default location of OS fingerprints.
/etc/protocols Protocol name database.
/etc/services Service name database.
SEE ALSO
pf(4), pflow(4), pfsync(4), pf.os(5), pfctl(8), pflogd(8)
HISTORY
The pf.conf file format first appeared in OpenBSD 3.0.
OpenBSD 4.6 May 30, 2009