User namespaces – available to play!

Posted on May 10, 2012 by s3hh

Over the past few months, Eric Biederman has been working on completing the user namespace. Briefly, an unprivileged user can create a user namespace, where he can pretend to be root and start new namespaces (e.g. network and pid) which he will own. (Note: creating namespaces in child user namespaces isn’t yet allowed, but will be.) With respect to anything he owns, for instance new network interfaces which he creates in his own network namespace, he should have privilege. But he should not be able to escape his existing privileges in the parent user namespace. This should finally allow an unprivileged user to create a new filesystem tree and chroot into it, without the risk of him maliciously confusing setuid applications on the host (for instance by bind mounting his own /etc/passwd).

Eric’s new design is based on a 1-1 uid mapping (by ranges) from uids in the container to uids on the host. For instance, uid 0 in the namespace may really be uid 999990 on the host. Users can be pre-allocated their own private ranges to use however they please. For instance each user may get 10,000 uids, with the first user’s range starting at 100,000. The uid and gid mappings are exposed and manipulated through /proc/pid/uid_map and /proc/pid/gid_map, which contain lines of the form:

namespace_first_uid host_first_uid number_of_uids

For instance, if it contains "0 100000 1000", then uids 0 through 999 in the namespace will map to uids 100000 through 100999 on the host, respectively. To write to the uid map, you must be privileged in your namespace, and your namespace must have the source ids mapped. (The mappings can be nested in the obvious way.) In userspace, we expect to have a small setuid-root program which unprivileged users can call to map uids. That program will consult a root-owned file which lists the permitted mappings. Right now we are using /etc/id_permission/uids and /etc/id_permission/gids. If /etc/id_permission/uids has

1000:100000:9999
1001:110000:9999

then uid 1000 (user hallyn) will be allowed to map the uids 100000 through 109999, and 1001 (user jschmoe) will be allowed to map uids 110000 through 119999.
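The check such a setuid-root helper would need to make can be sketched roughly as follows. This is only an illustration, not the actual helper: the function names are made up, and the reading of the third field is an assumption taken from the example above (100000 plus 9999 gives the last permitted uid, 109999); the real tool may interpret the file differently.

```python
def parse_permissions(text):
    """Parse lines of the form 'owner_uid:first_host_uid:span'.

    Assumption: the last permitted uid is first_host_uid + span,
    which matches the ranges described in the text.
    """
    ranges = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        owner, first, span = (int(field) for field in line.split(":"))
        ranges.setdefault(owner, []).append((first, first + span))
    return ranges


def may_map(ranges, owner_uid, first, count):
    """Return True if owner_uid may map `count` host uids starting at `first`."""
    if count <= 0:
        return False
    last = first + count - 1
    return any(lo <= first and last <= hi
               for lo, hi in ranges.get(owner_uid, []))


if __name__ == "__main__":
    permitted = parse_permissions("1000:100000:9999\n1001:110000:9999\n")
    print(may_map(permitted, 1000, 100000, 10))   # inside hallyn's range
    print(may_map(permitted, 1000, 110000, 10))   # jschmoe's range, refused
```

A request is accepted only when the whole requested range falls inside one of the caller's permitted ranges, so a user cannot straddle into a neighbour's allocation.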

Eric’s git tree is here. His patchset applied to the ubuntu quantal kernel tree is here, and the resulting kernel is built and available at ppa:serge-hallyn/userns-natty.

So you can try it out! Like so:

Start an Amazon EC2 instance of precise. Find an AMI to use (ami=`ubuntu-cloudimg-query precise`) and start it up (ec2-run-instances -k myid $ami). Log in and update /etc/apt/sources.list to look as follows:

deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ quantal main universe
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ quantal main universe

Then update (sudo apt-get update && sudo apt-get -y dist-upgrade). Add my userns-natty PPA (sudo add-apt-repository ppa:serge-hallyn/userns-natty) and update again (sudo apt-get update && sudo apt-get -y dist-upgrade), then reboot into the new kernel.

As I’ve said, the uid mapping is in /proc/self/uid_map. On the host that looks like

0 0 4294967295
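If you would rather inspect the map from a script than by eye, a minimal sketch along these lines should do. The names MapEntry, parse_id_map and read_id_map are just illustrative, and the only thing assumed about the file is the three-column format described earlier, with columns separated by any amount of whitespace.

```python
from typing import List, NamedTuple


class MapEntry(NamedTuple):
    ns_first: int    # first id inside the namespace
    host_first: int  # first id it maps to outside
    count: int       # number of consecutive ids covered


def parse_id_map(text: str) -> List[MapEntry]:
    """Parse the contents of a uid_map or gid_map file."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError("unexpected map line: %r" % line)
        entries.append(MapEntry(*(int(f) for f in fields)))
    return entries


def read_id_map(path: str = "/proc/self/uid_map") -> List[MapEntry]:
    """Read and parse a map file (needs a kernel with user namespaces)."""
    with open(path) as f:
        return parse_id_map(f.read())


if __name__ == "__main__":
    # The host mapping shown above: every uid maps to itself.
    print(parse_id_map("0 0 4294967295"))
```

An empty file simply yields an empty list, which is what a freshly created user namespace looks like before any mapping is written.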

Grab nsexec from my ppa to create new namespaces (sudo apt-get install nsexec) and run

sudo nsexec -cU /bin/bash

Inside the new namespace, /proc/self/uid_map is empty. So we need to add some mappings. From a root terminal on the host (not in the new namespace), do

echo "0 555550 10" > /proc/$pid/uid_map
echo "0 555550 10" > /proc/$pid/gid_map

Where $pid is the process id of the shell in the namespace. The nsexec package includes a utility called uidmap which will do this for you, so you can just do

sudo uidmap $pid 555550 10

(This utility will soon support being run setuid-root and consulting the above-mentioned /etc/id_permission/ files.)

Now back in the nsexec shell, switch to the new namespaced root userid with newuidexec (from the nsexec package):

newuidexec 0

Now you can do:

#id
uid=0(root) gid=0(root) groups=0(root)
#touch /tmp/zzz
#ls -l /root
ls: cannot open directory /root: Permission denied
#ls -l /tmp/zzz
-rw-r--r-- 1 root root 0 May 9 16:45 zzz

while back in your host root shell, you see:

#ls -l /tmp
-rw-r--r-- 1 555550 555550 0 May 9 16:45 zzz

The same thing will happen in all cases where a uid crosses the user->kernel API. For instance if you send credentials over a unix socket to a task in another user namespace, the uid will be converted to a valid mapping in the other user namespace, or, if none exists, to the overflowuid.
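To make the conversion concrete, here is a rough userspace model of it, using the "0 555550 10" mapping from the walkthrough. It is a sketch of the arithmetic only, not the kernel's code: the function names are invented for illustration, and 65534 is assumed as the overflowuid because that is the usual default of /proc/sys/kernel/overflowuid.

```python
# Assumed default of /proc/sys/kernel/overflowuid.
OVERFLOW_UID = 65534


def ns_to_host(entries, ns_uid):
    """Map a uid seen inside a namespace to the host uid, or None if unmapped."""
    for ns_first, host_first, count in entries:
        if ns_first <= ns_uid < ns_first + count:
            return host_first + (ns_uid - ns_first)
    return None


def host_to_ns(entries, host_uid):
    """Map a host uid to the uid seen inside a namespace, or the overflowuid."""
    for ns_first, host_first, count in entries:
        if host_first <= host_uid < host_first + count:
            return ns_first + (host_uid - host_first)
    return OVERFLOW_UID


def translate(src_entries, dst_entries, uid):
    """Convert a uid from one namespace's view to another's, via the host uid."""
    host_uid = ns_to_host(src_entries, uid)
    if host_uid is None:
        return OVERFLOW_UID
    return host_to_ns(dst_entries, host_uid)


if __name__ == "__main__":
    container = [(0, 555550, 10)]    # "0 555550 10"
    host = [(0, 0, 4294967295)]      # "0 0 4294967295"
    print(translate(container, host, 0))     # namespaced root as the host sees it
    print(translate(host, container, 1000))  # host uid with no mapping inside
```

The first call gives the owner that the host shell sees on /tmp/zzz; the second shows a host uid with no mapping in the container falling back to the overflowuid.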

So, after many years, user namespaces are real! Perhaps the biggest remaining obstacle to using user namespaces for a real distro container is converting more capable() calls to ns_capable(). Soon.
