[lxc-users] lxc-security: iptables audit with nflog not working with default settings (insecure)

Wed Mar 11 12:59:58 UTC 2015

> From: lxc-users [mailto:lxc-users-bounces at lists.linuxcontainers.org] On
> behalf of
>
> On Wed, Mar 11, 2015 at 7:02 PM, Fiedler Roman <Roman.Fiedler at ait.ac.at>
> wrote:
> > This should be exactly the configuration I have tested so far. But that did
> > not yet solve my problem ...
> >
> > * If some process in the guest registers for the same NFLOG queue, it can
> > "steal" the messages from the host queue, thus removing traces of its
> > activity from host logging. SECURITY ASPECT: apart from log corruption,
> > the guest can learn about any other connection to/from other containers
> > and the host, and since the messages include sequence numbers, it may be
> > able to inject spoofed data into any other unencrypted TCP connection, or
> > at least interrupt the connection using another helper machine.
>
> No. What makes you believe that?

Well, for the NFLOG message leak to the guest I have a running demo on a test
machine. It is still possible (and likely) that I made some kind of
configuration error that causes the logging misbehaviour. Given the current
behaviour, TCP connection manipulation using packet spoofing can be left as an
exercise for the experienced reader (or hacker).

> Host and containers do not share iptables rules. Their entire
> network stack is separated through network namespaces. There's no such
> thing as "stealing the message".

How could we sort that out? On my test setup the stealing rate is 11.8% for
~1400 NFLOG messages, so there is definitely something wrong. With some
exploit code that optimizes CPU scheduling, or by registering more hooks, a
malicious guest might be able to steal far more.
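
For reference, a minimal sketch of how such a rate can be derived, assuming one simply counts how many NFLOG messages were expected and how many actually reached the host log. The function name and the counts in the usage line are illustrative assumptions, not the script used for the measurement above:

```python
def steal_rate(expected: int, logged_on_host: int) -> float:
    """Return the percentage of expected NFLOG messages missing on the host."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if not 0 <= logged_on_host <= expected:
        raise ValueError("logged_on_host must be between 0 and expected")
    missing = expected - logged_on_host
    return 100.0 * missing / expected


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(f"{steal_rate(1000, 900):.1f}% of messages missing from the host log")
```

At a rate of 11.8% over ~1400 messages, roughly 165 messages would be missing from the host log.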

> A test would probably prove my statement faster. Try this on your
> container, while keeping the same rules on the host side:
>
> iptables -I INPUT 1 -d 192.168.124.173 -j NFLOG --nflog-group 0 --nflog-prefix lxc-v
> iptables -I OUTPUT 1 -s 192.168.124.173 -j NFLOG --nflog-group 0 --nflog-prefix lxc-v

No luck with that: I've switched both guest and host to log group 0. After
that change, no messages from the guest are logged, but host messages (which
are now clearly distinguishable) continue to be logged within the guest.

Did you try installing ulogd2 on both host and guest and binding it to the
same NFLOG group?

What about kernel versions? I'm using "Linux version 3.13.0-46-generic 
(buildd at tipua) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #77-Ubuntu SMP 
Mon Mar 2 18:23:39 UTC 2015". Could it be that network namespace isolation was
not yet complete in that version?

Could it be that the CPU scheduler interferes with NFLOG queuing/dequeuing, so
that with a certain number of CPU cores or other scheduling influences the
probability of stealing is nearly zero, while on my setup (vbox with a single
CPU) the rate is nearly 12%?

> Note that on the container the chain has to be INPUT or OUTPUT instead
> of FORWARD. Then test it. The container will log this
>
> ... while at the same time, the host also logs it
> [..snip ..]
>
> (note that my host and container use different timezones, so the times
> look 7 hours apart)

I see, but I still fail to reproduce that here.

> > * If a guest should also be protected with iptables, e.g. to prevent Apache
> > or Tomcat from connecting to the local SSH port, those error logs - which
> > are useful to detect ongoing guest intrusions - do not make it to any
> > log file.
>
> No.
>
> Whatever the container does inside does not matter to the host. The
> host simply captures the packets passing through the bridge (with
> bridge-nf-call-iptables on), and doesn't really care what the
> container does with them (block, drop, accept, whatever).
>
> On the host side, nflog does NOT stop iptables processing, so the
> chains further down (e.g. the ones doing ACCEPT, or REJECT if you
> want) will still see the packets. You can verify it with "iptables -nL
> -v", and see that the hit counters for both the nflog rule AND
> ACCEPT/REJECT rule increase.

Right. That is exactly how I understood it from the documentation, but my test
system does not behave that way.

> >> You might only be missing the "bridge-nf-call-iptables" part. Note
> >> that you shouldn't need it IF you use a custom lxc network setup which
> >> doesn't use bridges:
> >> https://www.mail-archive.com/lxc-users at lists.linuxcontainers.org/msg02587.html
> >
> > I've also tried various network configurations. I fear that effort here is
> > quite futile since I do not yet understand the core kernel namespace
> > concepts,
>
> That will be your biggest problem.
>
> Short version: when using containers installed using templates
> (preferably the download template), you need to treat the container's
> networking stack as separate from the one on the host (e.g. as if it
> were on a different "server").

I do that. I used the same firewall/logging setup scripts on host and guest
and hoped that they would be separate. They are, in all respects (connection
tracking, iptables rules, connection lists, ...), EXCEPT for NFLOG when using
the same logging group.
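
One way to double-check the separation itself is to compare the network namespace identifiers of a host process and a container process. A minimal sketch, assuming a kernel that exposes namespaces as symlinks under /proc/<pid>/ns/ (the helper names are hypothetical, and reading another process's link normally requires root on the host):

```python
import os
import re


def parse_ns_id(link_target: str) -> int:
    """Extract the inode number from a target such as 'net:[4026531956]'."""
    match = re.fullmatch(r"net:\[(\d+)\]", link_target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link_target!r}")
    return int(match.group(1))


def netns_id(pid: str = "self") -> int:
    """Return the network namespace inode of the given process."""
    return parse_ns_id(os.readlink(f"/proc/{pid}/ns/net"))


def same_netns(pid_a: str, pid_b: str) -> bool:
    """True if both processes live in the same network namespace."""
    return netns_id(pid_a) == netns_id(pid_b)
```

Called on the host with PID 1 and the PID of the container's init process (for example as reported by lxc-info), same_netns() should return False if the container really has its own network namespace. That would only confirm the namespace split, not explain the NFLOG behaviour.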

> The only "link" the host has to the container is the veth pair
> (vethXXXX by default on host side, usually eth0 on the container
> side). Think of lxcbr0 (or whatever bridge you use) as a "switch",
> with vethXXXX as the switch port that is connected to the container.

I started with that. At the moment, for testing, I use an internal (host-only)
bridge and route external traffic from the bridge via the host. I would hope
that this difference does not matter.

# Network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxc-lo-br0
lxc.network.veth.pair = lxc-lo-br0-255
lxc.network.hwaddr = "xxxxx"
# lxc.network.name = eth0
lxc.network.ipv4 = xxxxx/8
lxc.network.ipv4.gateway = xxxx.1

> > e.g. how nflink and user namespaces work together. Hence everything I
> > could do is (inefficient) trial and error instead of controlled
> > engineering. And in the end I cannot be sure that there are no
> > reliability/security-relevant holes left open with trial and error.
> >
> > Without any other clues from the mailing list, I'll still try out this
> > procedure also and see if it would change the nflog behavior.
>
>
> Then spend some time to learn.
>
> Googling "network namespace" is a good start. The lwn article should
> be very helpful.

Of course I understand network namespaces on that level. What I meant is that
I have not yet looked into the kernel source code, e.g. for the
NFLOG/namespace interaction, and I do not know all the parameters that
influence it.

Perhaps you could point me to something that could cause only NFLOG to break
while all other namespace features seem to work without problems.