The ACCOUNT target is designed to be queried for data every second, or at least every ten seconds. It is written as a kernel module to handle high bandwidths without packet loss.
The largest possible subnet size is 24 bits, for example a 10.0.0.0/8 network. ACCOUNT uses fixed internal data structures, which speeds up the processing of each packet. Furthermore, accounting data for one complete 192.168.1.X/24 network takes 4 KB of memory. Memory for 16 or 24 bit networks is only allocated when needed.
To optimize the kernel<->userspace data transfer a bit more, the kernel module only transfers information about IPs whose src/dst packet counter is not 0. This saves precious kernel time.
There is no /proc interface as it would be too slow for continuous access. The read-and-flush query operation is the fastest, as no internal data snapshot needs to be created and copied for all data. Use the "read" operation without flush only for debugging purposes!
Usage:
ACCOUNT takes two mandatory parameters:
The subnet 0.0.0.0/0 is a special case: all data are then stored in the src_bytes and src_packets structure of slot "0". This is useful if you want to account the overall traffic to/from your internet provider.
The data can be queried using the userspace libxt_ACCOUNT_cl library, or with the iptaccount(8) tool, the reference implementation that demonstrates usage of this library. iptaccount features the following options:
[-u] show kernel handle usage
[-h] free all kernel handles (experts only!)
[-a] list all table names
[-l name] show data in table name
[-f] flush data after showing
[-c] loop every second (abort with CTRL+C)
Here is an example of use:
iptables -A FORWARD -j ACCOUNT --addr 0.0.0.0/0 --tname all_outgoing
iptables -A FORWARD -j ACCOUNT --addr 192.168.1.0/24 --tname sales
This creates two tables called "all_outgoing" and "sales" which can be queried using the userspace library/iptaccount tool.
Note that this target is non-terminating --- the packet destined to it will continue traversing the chain in which it has been used.
Also note that once a table has been defined for a specific CIDR address/netmask block, it can be referenced multiple times using -j ACCOUNT, provided that both the original table name and address/netmask block are specified.
For more information go to http://www.intra2net.com/en/developer/ipt_ACCOUNT/
The randomness factor of not replying vs. replying can be set when the xt_CHAOS module is loaded, or at runtime in /sys/module/xt_CHAOS/parameters.
See http://jengelh.medozas.de/projects/chaostables/ for more information about CHAOS, DELUDE and lscan.
EXAMPLE, replacing all addresses that carry one of VMware's assigned vendor IDs (00:50:56) with something else:
iptables -t mangle -A FORWARD -p udp --dport 67 -m physdev --physdev-in vmnet1 -m dhcpmac --mac 00:50:56:00:00:00/24 -j DHCPMAC --set-mac ab:cd:ef:00:00:00/24
iptables -t mangle -A FORWARD -p udp --dport 68 -m physdev --physdev-out vmnet1 -m dhcpmac --mac ab:cd:ef:00:00:00/24 -j DHCPMAC --set-mac 00:50:56:00:00:00/24
(This assumes there is a bridge interface that has vmnet1 as a port. You will also need to add appropriate ebtables rules to change the MAC address of the Ethernet headers.)
This target is to be used inside the mangle table.
The order of IP address bytes is reversed to meet "human order of bytes": 192.168.0.1 is 0xc0a80001. At first the "AND" operation is performed, then "OR".
Examples:
We create a queue for each user, where the queue number corresponds to the IP address of the user, e.g.: all packets going to/from 192.168.5.2 are directed to the 1:0502 queue, 192.168.5.12 -> 1:050c etc.
We have one classifier rule:
Earlier we had many rules just like below:
Using IPMARK target we can replace all the mangle/mark rules with only one:
On routers with hundreds of users, there should be a significant decrease in load (e.g. by a factor of two).
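The mark arithmetic behind this example can be checked by hand. The sketch below reproduces it in POSIX shell: it converts a dotted-quad address to an integer, applies the AND mask, then the OR mask. The mask values 0xffff and 0x10000 are assumptions chosen so that the results line up with the 1:0502 queue numbering used in the example; the function names are hypothetical and not part of IPMARK.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer
# (192.168.0.1 -> 0xc0a80001, most significant byte first).
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hypothetical helper: mark = (address AND and_mask) OR or_mask.
ipmark() {
    addr=$(ip_to_int "$1")
    printf '0x%x\n' $(( (addr & $2) | $3 ))
}

ipmark 192.168.5.2  0xffff 0x10000   # prints 0x10502
ipmark 192.168.5.12 0xffff 0x10000   # prints 0x1050c
```

The two results, 0x10502 and 0x1050c, line up with the queue numbers 1:0502 and 1:050c from the example.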
(IPv6 example) If the source address is of the form 2001:db8:45:1d:20d:93ff:fe9b:e443 and the resulting mark should be 0x93ff, then a right-shift of 16 is needed first:
See the RAWSNAT help entry for examples and constraints.
The RAWSNAT target will rewrite the source address in the IP header, much like the NETMAP target. RAWSNAT (and RAWDNAT) may only be used in the raw or rawpost tables, but can be used in all chains, which makes it possible to change the source address either when the packet enters the machine or when it leaves it. The reason for this table constraint is that RAWNAT must happen outside of connection tracking.
As an example, changing the destination for packets forwarded from an internal LAN to the internet:
Note that changing addresses may influence the route selection! Specifically, RAWNAT statically NATs packets, not connections as the normal DNAT/SNAT targets do. Also note that it can transform already-NATed connections --- as said, it is completely external to Netfilter's connection tracking/NAT.
If the machine itself generates packets that are to be rawnat'ed, you need a rule in the OUTPUT chain instead, just like you would with the stateful NAT targets.
In doing so, you may also need an extra RAWSNAT rule to override the automatic source address selection that the routing code does before passing packets to iptables. If the connecting socket has not been explicitly bound to an address, as is the common mode of operation, the address that will be chosen is the primary address of the device through which the packet would be routed with its initial destination address - the address as seen before any RAWNAT takes place.
The xt_SYSRQ implementation uses a salted hash and a sequence number to prevent network sniffers from either guessing the password or replaying earlier requests. The initial sequence number comes from the time of day, so you will have a small window of vulnerability should time go backwards at a reboot. However, the file /sys/module/xt_SYSRQ/seqno can be used to both query and update the current sequence number. Also, you should limit who can issue commands using -s and/or -m mac, and check that the destination is correct using -d (to protect against potential broadcast packets), bearing in mind that this still does not protect against MAC/IP spoofing:
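A rule along those lines might look like the following sketch. It only prints the rule, so that it can be reviewed before being passed to iptables. The function name, the IP addresses and the MAC address are placeholders, and UDP port 9 is an assumption that has to match the port the client sends its request to.

```shell
#!/bin/sh
# Hypothetical helper: print (not apply) an INPUT rule that hands packets to
# SYSRQ only for one source IP/MAC pair and one destination address.
sysrq_rule() {
    src_ip=$1
    src_mac=$2
    dst_ip=$3
    printf '%s\n' "-A INPUT -s $src_ip -m mac --mac-source $src_mac -d $dst_ip -p udp --dport 9 -j SYSRQ"
}

# Placeholder values; substitute the real management host and target machine.
sysrq_rule 10.10.25.1 aa:bb:cc:dd:ee:ff 10.10.25.7
```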
You should also limit the rate at which connections can be received to limit the CPU time taken by illegal requests, for example:
This extension does not take any options. The -p udp options are required.
The SYSRQ password can be changed through /sys/module/xt_SYSRQ/parameters/password, for example:
Alternatively, the password may be specified at modprobe time, but this is insecure as people can possibly see it through ps(1). You can use an option line in e.g. /etc/modprobe.d/xt_sysrq if it is properly guarded, that is, only readable by root.
The hash algorithm can also be specified as a module option, for example, to use SHA-256 instead of the default SHA-1:
The xt_SYSRQ module is normally silent unless a successful request is received, but the debug module parameter can be used to find exactly why a seemingly correct request is not being processed.
To trigger SYSRQ from a remote host, just use netcat or socat:
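sysrq_key="s" # the SysRq key(s)
password="password"
seqno="$(date +%s)" # sequence number, taken from the current time
salt="$(dd bs=12 count=1 if=/dev/urandom 2>/dev/null |
openssl enc -base64)" # 12 random bytes, base64-encoded
req="$sysrq_key,$seqno,$salt"
# append the SHA-1 hex digest of "request,password"; printf avoids the non-portable echo -n
req="$req,$(printf '%s' "$req,$password" | sha1sum | cut -c1-40)"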
echo "$req" | socat stdin udp-sendto:10.10.25.7:9
# or
echo "$req" | netcat -uw1 10.10.25.7 9
See the Linux docs for possible sysrq keys. Important ones are: re(b)oot, power(o)ff, (s)ync filesystems, (u)mount and remount readonly. More than one sysrq key can be used at once, but bear in mind that, for example, a sync may not complete before a subsequent reboot or poweroff.
The hashing scheme should be enough to prevent mis-use of SYSRQ in many environments, but it is not perfect: take reasonable precautions to protect your machines. Most importantly ensure that each machine has a different password; there is scant protection for a SYSRQ packet being applied to a machine that happens to have the same password.
This offers similar functionality to LaBrea <http://www.hackbusters.net/LaBrea/> but does not require dedicated hardware or IPs. Any TCP port that you would normally DROP or REJECT can instead become a tarpit.
To tarpit connections to TCP port 80 destined for the current machine:
To significantly slow down Code Red/Nimda-style scans of unused address space, forward unused IP addresses to a Linux box not acting as a router (e.g. "ip route 10.0.0.0 255.0.0.0 ip.of.linux.box" on a Cisco), enable IP forwarding on the Linux box, and add:
NOTE: If you use the conntrack module while you are using TARPIT, you should also use the NOTRACK target, or the kernel will unnecessarily allocate resources for each TARPITted connection. To TARPIT incoming connections to the standard IRC port while using conntrack, you could:
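A sketch of that combination follows. It prints the two rules rather than applying them; the function name is hypothetical, and 6667 is assumed to be the IRC port in question.

```shell
#!/bin/sh
# Hypothetical helper: print rules that exempt a TCP port from connection
# tracking (raw table) and then tarpit connections to it (filter table).
tarpit_untracked() {
    port=${1:-6667}    # assumption: 6667 is the IRC port to protect
    cat <<EOF
-t raw -A PREROUTING -p tcp --dport $port -j NOTRACK
-A INPUT -p tcp --dport $port -j TARPIT
EOF
}

tarpit_untracked
```

The NOTRACK rule has to sit in the raw table so that it is evaluated before connection tracking sees the packet.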
To forward all incoming traffic on eth0 to a Network Layer logging box:
-t mangle -A PREROUTING -i eth0 -j TEE --gateway 2001:db8::1
The extra files you will need are the binary database files. They are generated from a country-subnet database with the geoip_build_db.pl tool that is shipped with the source package, and which should be available in compiled packages in /usr/lib(exec)/xtables-addons/. The first command retrieves CSV files from MaxMind, while the other two build packed bisectable range files:
cd /tmp; $path/to/geoip_download.sh;
$path/to/geoip_build_db.pl -D /usr/share/xt_geoip/LE
$path/to/geoip_build_db.pl -D /usr/share/xt_geoip/BE -b
The shared library is hardcoded to look in these paths, so use them.
Following that, one can select the interface properties to check for:
Use it together with -p tcp or -p udp to search only that protocol, or without the -p switch to search packets of both protocols.
IPP2P provides the following options, of which one or more may be specified on the command line:
Note that ipp2p may not (and often, does not) identify all packets that are exchanged as a result of running filesharing programs.
There is more information on http://ipp2p.org/ , but it has not been updated since September 2006, and the syntax there is different from the ipp2p.c provided in Xtables-addons; most importantly, the --ipp2p flag was removed due to its ambiguity to match "all known" protocols.
Known symbol names (and their number):
1 --- nop
2 --- security --- RFC 1108
3 --- lsrr --- Loose Source Routing, RFC 791
4 --- timestamp --- RFC 781, 791
7 --- record-route --- RFC 791
9 --- ssrr --- Strict Source Routing, RFC 791
11 --- mtu-probe --- RFC 1063
12 --- mtu-reply --- RFC 1063
18 --- traceroute --- RFC 1393
20 --- router-alert --- RFC 2113
Examples:
Match packets that have both Timestamp and NOP: -m ipv4options --flags nop,timestamp
~ that have either of Timestamp or NOP, or both: --flags nop,timestamp --any
~ that have Timestamp and no NOP: --flags '!nop,timestamp'
~ that have either no NOP or a timestamp (or both conditions): --flags '!nop,timestamp' --any
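The four cases above differ only in how the per-flag tests are combined. The following shell sketch models that logic outside the kernel: each flag is tested for presence (or for absence when prefixed with '!'), and the results are combined with AND by default, or with OR when "any" is given. The function and its arguments are hypothetical and only illustrate the semantics described here.

```shell
#!/bin/sh
# match_flags "<options present in the packet>" "<flag list>" [any]
# Returns 0 (match) or 1 (no match).
match_flags() {
    present=" $1 "
    mode=${3:-all}
    hits=0
    total=0
    old_ifs=$IFS
    IFS=,
    for flag in $2; do
        total=$((total + 1))
        case $flag in
            '!'*) want=absent;  name=${flag#?} ;;   # negated flag
            *)    want=present; name=$flag ;;
        esac
        case $present in
            *" $name "*) state=present ;;
            *)           state=absent ;;
        esac
        if [ "$state" = "$want" ]; then
            hits=$((hits + 1))
        fi
    done
    IFS=$old_ifs
    if [ "$mode" = any ]; then
        [ "$hits" -gt 0 ]           # at least one condition holds
    else
        [ "$hits" -eq "$total" ]    # every condition holds
    fi
}

match_flags "nop timestamp" "nop,timestamp"      && echo "both present: match"
match_flags "timestamp"     "!nop,timestamp"     && echo "timestamp without nop: match"
match_flags "nop"           "!nop,timestamp" any || echo "nop only: no match"
```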
If no --layer* option is given, --layer3 is assumed by default. Note that using --layer5 may not match a packet if it is not one of the recognized types (currently TCP, UDP, UDPLite, ICMP, AH and ESP) or if it has no 5th layer.
NOTE: Some clients (Windows XP for example) may do what looks like a SYN scan, so be advised to carefully use xt_lscan in conjunction with blocking rules, as it may lock out your very own internal network.
When counting down from the initial quota, the counter will stop at 0 and the match will return false, just like the original "quota" match. In growing (upcounting) mode, it will always return true.
Because counters in quota2 can be shared, you can combine them for various purposes, for example, a bytebucket filter that only lets as much traffic go out as has come in:
-A INPUT -p tcp --dport 6881 -m quota2 --name bt --grow
-A OUTPUT -p tcp --sport 6881 -m quota2 --name bt
Example prerequisites:
Example 1 (TCP mode, manual closing of opened port not possible):
The rule will allow TCP port 22 for the attempting IP address after the successful reception of TCP SYN packets to ports 4002, 4001 and 4004, in this order (a.k.a. port-knocking). Port numbers in the connect sequence must follow the exact specification; no other ports may be "knocked" in between. The rule is named 'SSH' --- a file of the same name for tracking port knocking states will be created in /proc/net/xt_pknock . Successive port knocks must occur with a delay of at most 10 seconds. Access to port 22 (from the example) will be automatically closed 60 minutes after it was allowed.
Example 2 (UDP mode --- non-replayable and non-spoofable, manual closing of opened port possible, secure, also called "SPA" = Secure Port Authorization):
The first rule will create an "ALLOWED" record in /proc/net/xt_pknock/FTP after the successful reception of a UDP packet to port 4000. The packet payload must be constructed as an HMAC256 using "foo" as a key. The HMAC content is the particular client's IP address as a 32-bit network byteorder quantity, plus the number of minutes since the Unix epoch, also as a 32-bit value. (This is known as Simple Packet Authorization, also called "SPA".) In that case, any subsequent attempt to connect to port 21 from the client's IP address will cause such packets to be accepted in the second rule.
Similarly, upon reception of a UDP packet constructed the same way, but with the key "bar", the first rule will remove a previously installed "ALLOWED" state record from /proc/net/xt_pknock/FTP, which means that the second rule will stop matching for subsequent connection attempts to port 21. If no close-secret packet is received within 4 hours, the first rule will remove the "ALLOWED" record from /proc/net/xt_pknock/FTP itself.
Things worth noting:
General:
Specifying --autoclose 0 means that no automatic close will be performed at all.
xt_pknock is capable of sending information about successful matches via a netlink socket to userspace, should you need to implement your own way of receiving and handling portknock notifications. Be sure to read the documentation in the doc/pknock/ directory, or visit the original site --- http://portknocko.berlios.de/ .
TCP mode:
This mode is not immune against eavesdropping, spoofing and replaying of the port knock sequence by someone else (but its use may still be sufficient for scenarios where these factors are not that important, such as bare shielding of the SSH port from brute-force attacks). If you need protection against these attacks, you should use UDP mode.
It is always wise to specify three or more ports that are not monotonically increasing or decreasing with a small stepsize (e.g. 1024,1025,1026) to avoid accidentally triggering the rule by a portscan.
Specifying the inter-knock timeout with --time is mandatory in TCP mode. This avoids a permanent denial of service caused by clogging up the peer knock-state tracking table that xt_pknock keeps internally, should there be a DDoS on the first-in-row knock port from more hostile IP addresses than the table can hold (its size defaults to 16 and can be changed via the "peer_hasht_ents" module parameter). For this very reason, it is also wise to use as short a time as possible (1 second) for --time. You may also consider increasing the size of the peer knock-state tracking table. Using --strict also helps, as it requires the knock sequence to be exact. This means that if the hostile client sends more knocks to the same port, xt_pknock will mark such an attempt as a failed knock sequence and will forget it immediately. To completely thwart this kind of DDoS, knock-ports would need to have an additional rate-limit protection. Or you may consider using UDP mode.
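The earlier advice on choosing knock ports (three or more, not stepping in one direction by small increments) can be turned into a quick sanity check. The sketch below is a hypothetical helper, not part of xt_pknock: it rejects sequences with fewer than three ports and sequences that only ever move in one direction by small steps. The threshold of 16 for a "small" step is an arbitrary assumption.

```shell
#!/bin/sh
# knock_ports_ok PORT...
# Returns 0 if the sequence is unlikely to be hit by a sequential port scan,
# 1 otherwise.
knock_ports_ok() {
    max_step=16                   # assumption: steps up to 16 count as "small"
    [ "$#" -ge 3 ] || return 1    # fewer than three ports
    prev=$1
    shift
    up=0
    down=0
    big=0
    for port in "$@"; do
        step=$((port - prev))
        if [ "$step" -gt 0 ]; then up=1; fi
        if [ "$step" -lt 0 ]; then down=1; fi
        if [ "$step" -gt "$max_step" ] || [ "$step" -lt "-$max_step" ]; then
            big=1
        fi
        prev=$port
    done
    # Acceptable if the direction changes at least once ...
    if [ "$up" -eq 1 ] && [ "$down" -eq 1 ]; then
        return 0
    fi
    # ... or if at least one step is larger than the threshold.
    [ "$big" -eq 1 ]
}

knock_ports_ok 4002 4001 4004 && echo "4002,4001,4004: ok"
knock_ports_ok 1024 1025 1026 || echo "1024,1025,1026: too easy to hit with a port scan"
```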
UDP mode:
This mode is immune against eavesdropping, replaying and spoofing attacks. It is also immune against a DDoS attack on the knock port.
For this mode to work, the clock difference on the client and on the server must be below 1 minute. Synchronizing time on both ends by means of NTP or rdate is strongly suggested.
There is a rate limiter built into xt_pknock which blocks any subsequent open attempt in UDP mode should the request arrive less than one minute after the first successful open. This is intentional; it thwarts possible spoofing attacks.
Because the payload value of a UDP knock packet depends on the client's IP address, UDP mode cannot be used across NAT.
For sending UDP "SPA" packets, you may use either knock.sh or knock-orig.sh. These may be found in doc/pknock/util.
For developers, the book "Writing Netfilter modules" at http://jengelh.medozas.de/documents/Netfilter_Modules.pdf provides detailed information on how to write such modules/extensions.