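Category: Uncategorized

  • LXD Source Code Walkthrough

    The project I am busy with at work deals with LXD a lot, so I spent some spare time reading the LXD source code to deepen my understanding of it and to pick up some Go techniques along the way.

    About LXD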

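    LXD can be seen as the second generation of LXC: it is built on top of LXC and, like Docker, is a tool for managing Linux containers. A Linux container provides an environment similar to a Linux virtual machine; the difference is that it gives up some isolation in exchange for lower runtime overhead.

    Compared with LXC, LXD adds a management layer on top. With LXC, configuration files are written entirely by hand, the storage backend is typically just dir (a directory on the host's existing file system), and network management is very limited, so containers can only be used effectively through manual work combined with external tools. LXD, which is what is officially called 2.0, improves on this in many respects.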

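    1. A brand-new client/server architecture. The client is a tool named lxc (note that this is not the same thing as the original LXC), and the server is named lxd. The two communicate through a RESTful API over either a Unix socket or HTTPS. This design lets users manage multiple LXD servers remotely with the lxc tool and, more importantly, lets third-party programs manage LXD directly through the API without depending on the lxc tool at all. This is far more flexible than the way LXC is managed.
    2. More convenient configuration. LXD introduces a number of new concepts that make container configuration better organized. For example, LXD introduces image stores, which can be local or remote; images can be transferred between them, exported and imported, and a stopped container can be packaged as an image. LXD also introduces profiles: profiles are specified when a container is created, and containers based on the same profile start out with the same configuration.
    3. Richer storage backends. Besides the original dir type, LXD supports modern, efficient storage backends such as btrfs, zfs and ceph; you only need to install the matching tools before enabling them.
    4. Native network configuration. A single container can only do so much, so connecting multiple containers is crucial. LXD integrates configuration for several kinds of networking, such as creating bridges, OVS and veth devices, as well as overlay-type interfaces such as GRE, VXLAN and Ubuntu fan.
    5. More convenient device management. For physical resources on the host, LXD provides device configuration directly, so users can attach many types of devices to a container as needed, for example GPUs, physical network interfaces, disks, InfiniBand devices, and other character and block devices.
    6. Built-in clustering. Multiple servers running LXD can be combined into a cluster and managed together, with data consistency guaranteed by Raft, which improves the stability of LXD.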

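    In addition, LXD supports container migration, and while offering all of these features it preserves the original LXC configuration parameters. It is fair to say that LXD greatly increases the flexibility with which users can manage LXC containers.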
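
    Because the API is plain HTTP carried over a Unix socket, any HTTP client can act as a third-party manager. The following is a minimal sketch in Go using only the standard library. To stay runnable without a daemon it starts its own stand-in server on a temporary socket; the handler and its JSON body are placeholders, not LXD's actual response.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// newUnixClient returns an HTTP client that sends every request over the
// Unix socket at socketPath instead of over TCP.
func newUnixClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// demo starts a stand-in server on a temporary socket and queries it.
// The handler and its JSON body are placeholders, not LXD's real response.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "sock")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	socketPath := filepath.Join(dir, "unix.socket")
	ln, err := net.Listen("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer ln.Close()

	mux := http.NewServeMux()
	mux.HandleFunc("/1.0", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"api_version":"1.0"}`)
	})
	go http.Serve(ln, mux)

	// The host part of the URL is ignored; the dialer always uses the socket.
	resp, err := newUnixClient(socketPath).Get("http://unix/1.0")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "Error:", err)
		os.Exit(1)
	}
	fmt.Println(body)
}
```

    Pointing newUnixClient at the socket of a real daemon would be the only change needed to query an actual server.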

    lxc

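    Here lxc refers specifically to the LXD client, which lives in the lxc directory of the LXD source tree. As an aside, with LXD (that is, LXC 2.0) the tool normally used is lxc, and the commands look like lxc start, lxc stop and so on; with LXC (that is, LXC 1.0) the tools are named lxc-***, and the commands look like lxc-start and lxc-stop. Because the name lxc shows up so often, take care not to mix them up.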

    System   Service name   Command to start a container
    lxc      lxc            lxc-start
    lxd      lxd            lxc start

    A Go program normally starts from the main function of its main package, so we begin with the lxc/main.go file.

    func main() {
    	// Locate the config file and read preset run parameters from it
    	err := execIfAliases()
    	if err != nil {
    		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
    		os.Exit(1)
    	}
    
    	// Set up the command-line parser
    	app := &cobra.Command{}
    	app.Use = "lxc"
    	app.Short = i18n.G("Command line client for LXD")
    	app.Long = cli.FormatSection(i18n.G("Description"), i18n.G(
    		`Command line client for LXD
    
    All of LXD's features can be driven through the various commands below.
    For help with any of those, simply call them with --help.`))
    	app.SilenceUsage = true
    	app.SilenceErrors = true
    
    	// Global flags
    	globalCmd := cmdGlobal{cmd: app}
    	// Register the global flags on the root command
    	app.PersistentFlags().BoolVar(&globalCmd.flagVersion, "version", false, i18n.G("Print version number"))
    	app.PersistentFlags().BoolVarP(&globalCmd.flagHelp, "help", "h", false, i18n.G("Print help"))
    	app.PersistentFlags().BoolVar(&globalCmd.flagForceLocal, "force-local", false, i18n.G("Force using the local unix socket"))
    	app.PersistentFlags().StringVar(&globalCmd.flagProject, "project", "", i18n.G("Override the source project"))
    	app.PersistentFlags().BoolVar(&globalCmd.flagLogDebug, "debug", false, i18n.G("Show all debug messages"))
    	app.PersistentFlags().BoolVarP(&globalCmd.flagLogVerbose, "verbose", "v", false, i18n.G("Show all information messages"))
    	app.PersistentFlags().BoolVarP(&globalCmd.flagQuiet, "quiet", "q", false, i18n.G("Don't show progress information"))
    
    	// Wrappers
    	// Set the hooks that run before and after each command
    	app.PersistentPreRunE = globalCmd.PreRun
    	app.PersistentPostRunE = globalCmd.PostRun
    
    	// Version handling
    	app.SetVersionTemplate("{{.Version}}\n")
    	app.Version = version.Version
    
    	// alias sub-command
    	aliasCmd := cmdAlias{global: &globalCmd}
    	app.AddCommand(aliasCmd.Command())
    
    	// cluster sub-command
    	clusterCmd := cmdCluster{global: &globalCmd}
    	app.AddCommand(clusterCmd.Command())
    
    	// ... the part elided here does the same as for alias and cluster: it registers the entry point of each sub-command
        
    	// version sub-command
    	versionCmd := cmdVersion{global: &globalCmd}
    	app.AddCommand(versionCmd.Command())
    
    	// Get help command
    	app.InitDefaultHelpCmd()
    	var help *cobra.Command
    	for _, cmd := range app.Commands() {
    		if cmd.Name() == "help" {
    			help = cmd
    			break
    		}
    	}
    
    	// Help flags
    	app.Flags().BoolVar(&globalCmd.flagHelpAll, "all", false, i18n.G("Show less common commands"))
    	help.Flags().BoolVar(&globalCmd.flagHelpAll, "all", false, i18n.G("Show less common commands"))
    
    	// Deal with --all flag
    	err = app.ParseFlags(os.Args[1:])
    	if err == nil {
    		if globalCmd.flagHelpAll {
    			// Show all commands
    			for _, cmd := range app.Commands() {
    				cmd.Hidden = false
    			}
    		}
    	}
    
    	// Run the main command and handle errors
    	err = app.Execute()
    	if err != nil {
    		// Handle non-Linux systems
    		if err == config.ErrNotLinux {
    			fmt.Fprintf(os.Stderr, i18n.G(`This client hasn't been configured to use a remote LXD server yet.
    As your platform can't run native Linux containers, you must connect to a remote LXD server.
    
    If you already added a remote server, make it the default with "lxc remote switch NAME".
    To easily setup a local LXD server in a virtual machine, consider using: https://multipass.run`)+"\n")
    			os.Exit(1)
    		}
    
    		if err == cobra.ErrSubCommandRequired {
    			os.Exit(0)
    		}
    
    		// Default error handling
    		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
    		os.Exit(1)
    	}
    
    	if globalCmd.ret != 0 {
    		os.Exit(globalCmd.ret)
    	}
    }
    
    

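    In my experience, lxc is the only client program whose command line rivals a network switch's terminal. Many programs with lots of options are impossible to use without reading the manual, whereas with lxc, at any sub-command level, you can get help simply by leaving off the remaining arguments, and the help text is very detailed; you only need the manual when, say, the unit of a particular parameter is unclear. This piece of source code reveals the secret: lxc uses a library called cobra, which handles command-line arguments very elegantly. The only thing in this logic that deserves special attention is execIfAliases at the very beginning. This function locates the config.yml file used by the client and reads the configuration from it (as its name suggests, so that command aliases can be applied); the configuration ends up in the conf field of the globally shared cmdGlobal struct.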

    type cmdGlobal struct {
    	conf     *config.Config
    	confPath string
    	cmd      *cobra.Command
    	ret      int
    
    	flagForceLocal bool
    	flagHelp       bool
    	flagHelpAll    bool
    	flagLogDebug   bool
    	flagLogVerbose bool
    	flagProject    string
    	flagQuiet      bool
    	flagVersion    bool
    }
    

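    The logic for parsing the configuration file lives in the lxc/config directory; the entry point is the LoadConfig function in file.go. It is worth mentioning that the lxc tool also supports command shorthands: from the source we can see that this only requires maintaining a key-value mapping of aliases, managed with the lxc alias command.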
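
    The idea of an alias as a key-value mapping can be sketched as follows. This is a simplified illustration, not the client's code: expandAlias and the sample alias are hypothetical, and the real implementation is more involved.

```go
package main

import (
	"fmt"
	"strings"
)

// expandAlias replaces the first argument with its alias target, if any.
// Simplified sketch: only single-word alias names are handled.
func expandAlias(aliases map[string]string, args []string) []string {
	if len(args) == 0 {
		return args
	}
	target, ok := aliases[args[0]]
	if !ok {
		return args
	}
	expanded := strings.Fields(target)
	return append(expanded, args[1:]...)
}

func main() {
	aliases := map[string]string{"ls": "list -c ns"} // hypothetical alias
	fmt.Println(strings.Join(expandAlias(aliases, []string{"ls", "web"}), " "))
}
```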

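    Let us look more closely at how lxc passes the config around as global state. Taking the alias command mentioned above as an example, main.go contains the following code where the sub-command is added.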

    	// alias sub-command
    	aliasCmd := cmdAlias{global: &globalCmd}
    	app.AddCommand(aliasCmd.Command())
    

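    cmdAlias is a struct defined in alias.go. It contains a field named global that points to the cmdGlobal object, and it has a Command method. When the sub-command is added, the program first passes globalCmd into cmdAlias and then registers the command returned by Command as the alias sub-command, which neatly hands the global parameters to the alias sub-command.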
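
    The reason a pointer is passed can be shown with a small sketch that leaves cobra out entirely. The types globalOpts and aliasCmd below are hypothetical stand-ins for cmdGlobal and cmdAlias.

```go
package main

import "fmt"

// globalOpts stands in for cmdGlobal: state shared by every sub-command.
type globalOpts struct {
	quiet bool
}

// aliasCmd stands in for cmdAlias: it keeps a pointer to the shared state.
type aliasCmd struct {
	global *globalOpts
}

// Run reads the shared state at execution time, so it sees values that were
// set after the sub-command was registered (for example by flag parsing).
func (c *aliasCmd) Run() string {
	if c.global.quiet {
		return ""
	}
	return "listing aliases"
}

func main() {
	global := globalOpts{}
	cmd := aliasCmd{global: &global} // registration happens first

	global.quiet = true // flags are parsed later
	fmt.Printf("%q\n", cmd.Run())
}
```

    Had the struct been copied by value at registration time, the sub-command would never observe the flags parsed afterwards.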

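    Once argument parsing is complete, execution moves on to the run function registered by each sub-command's Command method. Here we pick the most common command, launch, and follow the creation of a container from start to finish.

    First, assume the command being run is lxc launch ubuntu:16.04 u1 and see how launch.go handles it. The first thing we notice is that the cmdLaunch struct differs somewhat from those in the other files.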

    type cmdLaunch struct {
    	global *cmdGlobal
    	init   *cmdInit
    }
    

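    See that? There is an extra cmdInit field. It is easy to see that this is the type defined in init.go. Given what launch does, we can guess that launch.go performs the launch in two steps: first lxc init ubuntu:16.04 u1, then lxc start u1.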

    
    func (c *cmdLaunch) Command() *cobra.Command {
    	cmd := c.init.Command()
    	cmd.Use = i18n.G("launch [<remote>:]<image> [<remote>:][<name>]")
    	cmd.Short = i18n.G("Create and start containers from images")
    	cmd.Long = cli.FormatSection(i18n.G("Description"), i18n.G(
    		`Create and start containers from images`))
    	cmd.Example = cli.FormatSection("", i18n.G(
    		`lxc launch ubuntu:16.04 u1
    
    lxc launch ubuntu:16.04 u1 < config.yaml
        Create and start the container with configuration from config.yaml`))
    	cmd.Hidden = false
    
    	cmd.RunE = c.Run
    
    	return cmd
    }
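
    The reuse visible in Command above can be reduced to the following sketch. The command struct is a hypothetical stand-in for cobra.Command, and the strings and flag names are placeholders chosen for illustration.

```go
package main

import "fmt"

// command is a hypothetical stand-in for cobra.Command.
type command struct {
	Use   string
	Short string
	Flags []string
	Run   func() string
}

type initCmd struct{}

// Command defines the flags and the default behaviour of "init".
func (c *initCmd) Command() *command {
	return &command{
		Use:   "init",
		Short: "Create containers from images",
		Flags: []string{"--profile", "--network", "--storage"},
		Run:   func() string { return "created" },
	}
}

type launchCmd struct {
	init *initCmd
}

// Command starts from init's command, so every flag is inherited, and then
// overrides only the name, the description and the run function.
func (c *launchCmd) Command() *command {
	cmd := c.init.Command()
	cmd.Use = "launch"
	cmd.Short = "Create and start containers from images"
	cmd.Run = func() string { return "created and started" }
	return cmd
}

func main() {
	cmd := (&launchCmd{init: &initCmd{}}).Command()
	fmt.Println(cmd.Use, cmd.Flags, cmd.Run())
}
```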
    
    

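    The source confirms this: the registration in launch.go reuses init.go directly and only overrides the description and the run function, so the init logic can be reused as is. Next we turn to the run function, Run.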

    func (c *cmdLaunch) Run(cmd *cobra.Command, args []string) error {
    	conf := c.global.conf
    
    	// Sanity checks
    	exit, err := c.global.CheckArgs(cmd, args, 1, 2)
    	if exit {
    		return err
    	}
    
    	// Call the matching code from init
    	d, name, err := c.init.create(conf, args)
    	if err != nil {
    		return err
    	}
    
    	// Get the remote
    	var remote string
    	if len(args) == 2 {
    		remote, _, err = conf.ParseRemote(args[1])
    		if err != nil {
    			return err
    		}
    	} else {
    		remote, _, err = conf.ParseRemote("")
    		if err != nil {
    			return err
    		}
    	}
    
    	// Start the container
    	if !c.global.flagQuiet {
    		fmt.Printf(i18n.G("Starting %s")+"\n", name)
    	}
    
    	req := api.InstanceStatePut{
    		Action:  "start",
    		Timeout: -1,
    	}
    
    	op, err := d.UpdateInstanceState(name, req, "")
    	if err != nil {
    		return err
    	}
    
    	progress := utils.ProgressRenderer{
    		Quiet: c.global.flagQuiet,
    	}
    	_, err = op.AddHandler(progress.UpdateOp)
    	if err != nil {
    		progress.Done("")
    		return err
    	}
    
    	// Wait for operation to finish
    	err = utils.CancelableWait(op, &progress)
    	if err != nil {
    		progress.Done("")
    		prettyName := name
    		if remote != "" {
    			prettyName = fmt.Sprintf("%s:%s", remote, name)
    		}
    
    		return fmt.Errorf("%s\n"+i18n.G("Try `lxc info --show-log %s` for more info"), err, prettyName)
    	}
    
    	progress.Done("")
    	return nil
    }
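
    The remote lookup in Run depends on splitting an argument of the form remote:name. A simplified sketch of that split is shown below; parseRemote is a hypothetical helper, and the real ParseRemote on the config object does more, for example consulting the configured remotes.

```go
package main

import (
	"fmt"
	"strings"
)

// parseRemote splits "remote:name" into its parts and falls back to
// defaultRemote when no remote prefix is present.
// Simplified: it does not check whether the remote is actually configured.
func parseRemote(defaultRemote, raw string) (remote, name string) {
	fields := strings.SplitN(raw, ":", 2)
	if len(fields) == 2 {
		return fields[0], fields[1]
	}
	return defaultRemote, raw
}

func main() {
	fmt.Println(parseRemote("local", "ubuntu:16.04"))
	fmt.Println(parseRemote("local", "u1"))
}
```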
    

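    Apart from the output handling, this logic first checks the number of arguments, then calls the create function from the init logic, then parses the remote, then builds a request, registers a callback for it, and finally waits synchronously for the operation to finish. So the first thing to look at is the create function in init.go.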

    func (c *cmdInit) create(conf *config.Config, args []string) (lxd.InstanceServer, string, error) {
    	var name string
    	var image string
    	var remote string
    	var iremote string
    	var err error
    	var stdinData api.InstancePut
    	var devicesMap map[string]map[string]string
    	var configMap map[string]string
    
    	// If stdin isn't a terminal, read text from it
    	// ...
    
    	if len(args) > 0 {
    		// ... the argument is given as remote:container; parse out the remote
    	}
    
    	if c.flagEmpty {
    		if len(args) > 1 {
    			return nil, "", fmt.Errorf(i18n.G("--empty cannot be combined with an image name"))
    		}
    
    		if len(args) == 0 {
    			remote, name, err = conf.ParseRemote("")
    			if err != nil {
    				return nil, "", err
    			}
    		} else if len(args) == 1 {
    			// Switch image / container names
    			name = image
    			remote = iremote
    			image = ""
    			iremote = ""
    		}
    	}
    
    	d, err := conf.GetInstanceServer(remote)
    	if err != nil {
    		return nil, "", err
    	}
    
    	if c.flagTarget != "" {
    		d = d.UseTarget(c.flagTarget)
    	}
    
    	profiles := []string{}
    	for _, p := range c.flagProfile {
    		profiles = append(profiles, p)
    	}
    
    	// Print a message announcing that creation is starting
    
    	if len(stdinData.Devices) > 0 {
    		devicesMap = stdinData.Devices
    	} else {
    		devicesMap = map[string]map[string]string{}
    	}
    
    	if c.flagNetwork != "" {
    		network, _, err := d.GetNetwork(c.flagNetwork)
    		if err != nil {
    			return nil, "", err
    		}
    
    		if network.Type == "bridge" {
    			devicesMap[c.flagNetwork] = map[string]string{"type": "nic", "nictype": "bridged", "parent": c.flagNetwork}
    		} else {
    			devicesMap[c.flagNetwork] = map[string]string{"type": "nic", "nictype": "macvlan", "parent": c.flagNetwork}
    		}
    	}
    
    	if len(stdinData.Config) > 0 {
    		configMap = stdinData.Config
    	} else {
    		configMap = map[string]string{}
    	}
    	for _, entry := range c.flagConfig {
    		if !strings.Contains(entry, "=") {
    			return nil, "", fmt.Errorf(i18n.G("Bad key=value pair: %s"), entry)
    		}
    
    		fields := strings.SplitN(entry, "=", 2)
    		configMap[fields[0]] = fields[1]
    	}
    
    	// Check if the specified storage pool exists.
    	if c.flagStorage != "" {
    		_, _, err := d.GetStoragePool(c.flagStorage)
    		if err != nil {
    			return nil, "", err
    		}
    
    		devicesMap["root"] = map[string]string{
    			"type": "disk",
    			"path": "/",
    			"pool": c.flagStorage,
    		}
    	}
    
    	// Decide whether we are creating a container or a virtual machine.
    	instanceDBType := api.InstanceTypeContainer
    	if c.flagVM {
    		instanceDBType = api.InstanceTypeVM
    	}
    
    	// Setup instance creation request
    	req := api.InstancesPost{
    		Name:         name,
    		InstanceType: c.flagType,
    		Type:         instanceDBType,
    	}
    	req.Config = configMap
    	req.Devices = devicesMap
    
    	if !c.flagNoProfiles && len(profiles) == 0 {
    		if len(stdinData.Profiles) > 0 {
    			req.Profiles = stdinData.Profiles
    		} else {
    			req.Profiles = nil
    		}
    	} else {
    		req.Profiles = profiles
    	}
    	req.Ephemeral = c.flagEphemeral
    
    	var opInfo api.Operation
    	if !c.flagEmpty {
    		// Get the image server and image info
    		iremote, image = c.guessImage(conf, d, remote, iremote, image)
    		var imgRemote lxd.ImageServer
    		var imgInfo *api.Image
    
    		// Connect to the image server
    		if iremote == remote {
    			imgRemote = d
    		} else {
    			imgRemote, err = conf.GetImageServer(iremote)
    			if err != nil {
    				return nil, "", err
    			}
    		}
    
    		// Deal with the default image
    		if image == "" {
    			image = "default"
    		}
    
    		// Optimisation for simplestreams
    		if conf.Remotes[iremote].Protocol == "simplestreams" {
    			imgInfo = &api.Image{}
    			imgInfo.Fingerprint = image
    			imgInfo.Public = true
    			req.Source.Alias = image
    		} else {
    			// Attempt to resolve an image alias
    			alias, _, err := imgRemote.GetImageAlias(image)
    			if err == nil {
    				req.Source.Alias = image
    				image = alias.Target
    			}
    
    			// Get the image info
    			imgInfo, _, err = imgRemote.GetImage(image)
    			if err != nil {
    				return nil, "", err
    			}
    		}
    
    		// Create the instance
    		op, err := d.CreateInstanceFromImage(imgRemote, *imgInfo, req)
    		if err != nil {
    			return nil, "", err
    		}
    
    		// Watch the background operation
    		progress := utils.ProgressRenderer{
    			Format: i18n.G("Retrieving image: %s"),
    			Quiet:  c.global.flagQuiet,
    		}
    
    		_, err = op.AddHandler(progress.UpdateOp)
    		if err != nil {
    			progress.Done("")
    			return nil, "", err
    		}
    
    		err = utils.CancelableWait(op, &progress)
    		if err != nil {
    			progress.Done("")
    			return nil, "", err
    		}
    		progress.Done("")
    
    		// Extract the container name
    		info, err := op.GetTarget()
    		if err != nil {
    			return nil, "", err
    		}
    
    		opInfo = *info
    	} else {
    		req.Source.Type = "none"
    
    		op, err := d.CreateInstance(req)
    		if err != nil {
    			return nil, "", err
    		}
    
    		err = op.Wait()
    		if err != nil {
    			return nil, "", err
    		}
    
    		opInfo = op.Get()
    	}
    
    	instances, ok := opInfo.Resources["instances"]
    	if !ok || len(instances) == 0 {
    		// Try using the older "containers" field
    		instances, ok = opInfo.Resources["containers"]
    		if !ok || len(instances) == 0 {
    			return nil, "", fmt.Errorf(i18n.G("Didn't get any affected image, instance or snapshot from server"))
    		}
    	}
    
    	if len(instances) == 1 && name == "" {
    		fields := strings.Split(instances[0], "/")
    		name = fields[len(fields)-1]
    		fmt.Printf(i18n.G("Instance name is: %s")+"\n", name)
    	}
    
    	// Validate the network setup
    	c.checkNetwork(d, name)
    
    	return d, name, nil
    }
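
    The way create layers --config entries over the configuration read from stdin can be isolated into a small function. The sketch below is an illustration rather than the client's code: mergeConfig is a hypothetical name, and unlike the original it copies the base map instead of modifying it in place.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeConfig copies base and applies key=value entries on top of it, so
// entries given on the command line override those read from stdin.
func mergeConfig(base map[string]string, entries []string) (map[string]string, error) {
	merged := make(map[string]string, len(base)+len(entries))
	for k, v := range base {
		merged[k] = v
	}
	for _, entry := range entries {
		// Split on the first "=" only, so values may contain "=" themselves.
		fields := strings.SplitN(entry, "=", 2)
		if len(fields) != 2 {
			return nil, fmt.Errorf("bad key=value pair: %s", entry)
		}
		merged[fields[0]] = fields[1]
	}
	return merged, nil
}

func main() {
	base := map[string]string{"limits.cpu": "1"}
	merged, err := mergeConfig(base, []string{"limits.cpu=2", "user.note=a=b"})
	fmt.Println(merged, err)
}
```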
    
    

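    As we can see, although this code is long, most of it builds the container's config and decides which defaults to use when the user did not specify something, such as the image, network, storage pool and profiles. The core of it is that the GetInstanceServer method returns an InstanceServer object; parameters are also looked up through this object while the request is built, and finally the object's CreateInstanceFromImage (or CreateInstance when there is no image) creates the container. So next we look at GetInstanceServer in remote.go.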

    // GetInstanceServer returns a InstanceServer struct for the remote
    func (c *Config) GetInstanceServer(name string) (lxd.InstanceServer, error) {
    	// Handle "local" on non-Linux
    	if name == "local" && runtime.GOOS != "linux" {
    		return nil, ErrNotLinux
    	}
    
    	// Get the remote
    	remote, ok := c.Remotes[name]
    	if !ok {
    		return nil, fmt.Errorf("The remote \"%s\" doesn't exist", name)
    	}
    
    	// Sanity checks
    	if remote.Public || remote.Protocol == "simplestreams" {
    		return nil, fmt.Errorf("The remote isn't a private LXD server")
    	}
    
    	// Get connection arguments
    	args, err := c.getConnectionArgs(name)
    	if err != nil {
    		return nil, err
    	}
    
    	// Unix socket
    	if strings.HasPrefix(remote.Addr, "unix:") {
    		d, err := lxd.ConnectLXDUnix(strings.TrimPrefix(strings.TrimPrefix(remote.Addr, "unix:"), "//"), args)
    		if err != nil {
    			return nil, err
    		}
    
    		if remote.Project != "" && remote.Project != "default" {
    			d = d.UseProject(remote.Project)
    		}
    
    		if c.ProjectOverride != "" {
    			d = d.UseProject(c.ProjectOverride)
    		}
    
    		return d, nil
    	}
    
    	// HTTPs
    	if remote.AuthType != "candid" && (args.TLSClientCert == "" || args.TLSClientKey == "") {
    		return nil, fmt.Errorf("Missing TLS client certificate and key")
    	}
    
    	d, err := lxd.ConnectLXD(remote.Addr, args)
    	if err != nil {
    		return nil, err
    	}
    
    	if remote.Project != "" && remote.Project != "default" {
    		d = d.UseProject(remote.Project)
    	}
    
    	if c.ProjectOverride != "" {
    		d = d.UseProject(c.ProjectOverride)
    	}
    
    	return d, nil
    }
    

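    For the two connection modes, the function connects with ConnectLXDUnix and ConnectLXD from the lxd package respectively. Note that lxd here does not refer to the lxd directory: on closer inspection, this lxd package actually lives in the client directory, while the package in the lxd directory is actually named main. Personally I think LXD's naming has many points of confusion: besides the command-line tool being called lxc, which is easily mistaken for LXC 1.0, the package names here are also easy to get wrong. In any case, we now know that the code in the client directory is what builds the requests that the client sends to the server.

    In client/connect.go we find the ConnectLXD function.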

    // ConnectLXD lets you connect to a remote LXD daemon over HTTPs.
    //
    // A client certificate (TLSClientCert) and key (TLSClientKey) must be provided.
    //
    // If connecting to a LXD daemon running in PKI mode, the PKI CA (TLSCA) must also be provided.
    //
    // Unless the remote server is trusted by the system CA, the remote certificate must be provided (TLSServerCert).
    func ConnectLXD(url string, args *ConnectionArgs) (InstanceServer, error) {
    	logger.Debugf("Connecting to a remote LXD over HTTPs")
    
    	// Cleanup URL
    	url = strings.TrimSuffix(url, "/")
    
    	return httpsLXD(url, args)
    }
    
    // Internal function called by ConnectLXD and ConnectPublicLXD
    func httpsLXD(url string, args *ConnectionArgs) (InstanceServer, error) {
    	// Use empty args if not specified
    	if args == nil {
    		args = &ConnectionArgs{}
    	}
    
    	// Initialize the client struct
    	server := ProtocolLXD{
    		httpCertificate:  args.TLSServerCert,
    		httpHost:         url,
    		httpProtocol:     "https",
    		httpUserAgent:    args.UserAgent,
    		bakeryInteractor: args.AuthInteractor,
    		chConnected:      make(chan struct{}, 1),
    	}
    
    	if args.AuthType == "candid" {
    		server.RequireAuthenticated(true)
    	}
    
    	// Setup the HTTP client
    	httpClient, err := tlsHTTPClient(args.HTTPClient, args.TLSClientCert, args.TLSClientKey, args.TLSCA, args.TLSServerCert, args.InsecureSkipVerify, args.Proxy)
    	if err != nil {
    		return nil, err
    	}
    
    	if args.CookieJar != nil {
    		httpClient.Jar = args.CookieJar
    	}
    
    	server.http = httpClient
    	if args.AuthType == "candid" {
    		server.setupBakeryClient()
    	}
    
    	// Test the connection and seed the server information
    	if !args.SkipGetServer {
    		_, _, err := server.GetServer()
    		if err != nil {
    			return nil, err
    		}
    	}
    	return &server, nil
    }
    
    

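    At this point we are close to the low-level code that sends HTTPS requests, so we will not dig any deeper. The one thing worth noting is that LXD uses tlsHTTPClient, located in util.go and largely implemented in-house, rather than a third-party HTTPS request library as I had expected. The Unix socket path is similar to the HTTPS one.

    After obtaining the InstanceServer, the program calls its CreateInstanceFromImage method to create the container. Looking up InstanceServer in interfaces.go, we find that it is an interface, implemented by ProtocolLXD in lxd.go. Let us look at CreateInstanceFromImage, which is implemented in lxd_instances.go.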

    
    // CreateInstanceFromImage is a convenience function to make it easier to create a instance from an existing image.
    func (r *ProtocolLXD) CreateInstanceFromImage(source ImageServer, image api.Image, req api.InstancesPost) (RemoteOperation, error) {
    	// Set the minimal source fields
    	req.Source.Type = "image"
    
    	// Optimization for the local image case
    	if r == source {
    		// Always use fingerprints for local case
    		req.Source.Fingerprint = image.Fingerprint
    		req.Source.Alias = ""
    
    		op, err := r.CreateInstance(req)
    		if err != nil {
    			return nil, err
    		}
    
    		rop := remoteOperation{
    			targetOp: op,
    			chDone:   make(chan bool),
    		}
    
    		// Forward targetOp to remote op
    		go func() {
    			rop.err = rop.targetOp.Wait()
    			close(rop.chDone)
    		}()
    
    		return &rop, nil
    	}
    
    	// Minimal source fields for remote image
    	req.Source.Mode = "pull"
    
    	// If we have an alias and the image is public, use that
    	if req.Source.Alias != "" && image.Public {
    		req.Source.Fingerprint = ""
    	} else {
    		req.Source.Fingerprint = image.Fingerprint
    		req.Source.Alias = ""
    	}
    
    	// Get source server connection information
    	info, err := source.GetConnectionInfo()
    	if err != nil {
    		return nil, err
    	}
    
    	req.Source.Protocol = info.Protocol
    	req.Source.Certificate = info.Certificate
    
    	// Generate secret token if needed
    	if !image.Public {
    		secret, err := source.GetImageSecret(image.Fingerprint)
    		if err != nil {
    			return nil, err
    		}
    
    		req.Source.Secret = secret
    	}
    
    	return r.tryCreateInstance(req, info.Addresses)
    }
    

    After some simple handling of the image source, this calls tryCreateInstance.

    
    func (r *ProtocolLXD) tryCreateInstance(req api.InstancesPost, urls []string) (RemoteOperation, error) {
    	if len(urls) == 0 {
    		return nil, fmt.Errorf("The source server isn't listening on the network")
    	}
    
    	rop := remoteOperation{
    		chDone: make(chan bool),
    	}
    
    	operation := req.Source.Operation
    
    	// Forward targetOp to remote op
    	go func() {
    		success := false
    		errors := map[string]error{}
    		for _, serverURL := range urls {
    			if operation == "" {
    				req.Source.Server = serverURL
    			} else {
    				req.Source.Operation = fmt.Sprintf("%s/1.0/operations/%s", serverURL, url.PathEscape(operation))
    			}
    
    			op, err := r.CreateInstance(req)
    			if err != nil {
    				errors[serverURL] = err
    				continue
    			}
    
    			rop.targetOp = op
    
    			for _, handler := range rop.handlers {
    				rop.targetOp.AddHandler(handler)
    			}
    
    			err = rop.targetOp.Wait()
    			if err != nil {
    				errors[serverURL] = err
    				continue
    			}
    
    			success = true
    			break
    		}
    
    		if !success {
    			rop.err = remoteOperationError("Failed instance creation", errors)
    		}
    
    		close(rop.chDone)
    	}()
    
    	return &rop, nil
    }
    

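    This uses Go's concurrency construct go func() {}() to issue the request asynchronously and to attach the callbacks registered on rop, which run once the request has been answered.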
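
    The pattern of returning at once and finishing the work in a goroutine, with a channel that is closed on completion, can be sketched independently of LXD. The names asyncOp and tryEach and the example URLs below are hypothetical; the sketch mirrors only the shape of tryCreateInstance, not its details.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefused is a sample failure used by the demonstration.
var errRefused = errors.New("connection refused")

// asyncOp mirrors the shape of remoteOperation: its fields become valid
// once done is closed.
type asyncOp struct {
	done   chan struct{}
	server string
	err    error
}

// Wait blocks until the background goroutine has finished.
func (op *asyncOp) Wait() error {
	<-op.done
	return op.err
}

// tryEach returns immediately and tries each URL in order in the background,
// stopping at the first one for which attempt succeeds.
func tryEach(urls []string, attempt func(url string) error) *asyncOp {
	op := &asyncOp{done: make(chan struct{})}

	go func() {
		defer close(op.done)
		var errs []error
		for _, u := range urls {
			err := attempt(u)
			if err == nil {
				op.server = u
				return
			}
			errs = append(errs, fmt.Errorf("%s: %w", u, err))
		}
		op.err = fmt.Errorf("all servers failed: %v", errs)
	}()

	return op
}

func main() {
	// Hypothetical URLs; only the second attempt succeeds.
	urls := []string{"https://a.example:8443", "https://b.example:8443"}
	op := tryEach(urls, func(url string) error {
		if url == urls[0] {
			return errRefused
		}
		return nil
	})
	fmt.Println(op.Wait(), op.server)
}
```

    Closing the channel rather than sending on it lets any number of callers wait on the same operation.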

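  • Analysis of the org.onosproject.fwd Application

    ONOS Layer 2 Forwarding Application

    org.onosproject.fwd is arguably the most central application in ONOS: to get layer 2 connectivity in a virtual network created with Mininet, this official application has to be activated. From it we can therefore learn how ONOS abstracts the network and how layer 2 forwarding is implemented. At the time of writing, the latest ONOS version is 1.13.0-SNAPSHOT, so the source shown here is from the latest development version. As usual, we start with the application's pom.xml file.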

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <parent>
            <groupId>org.onosproject</groupId>
            <artifactId>onos-apps</artifactId>
            <version>1.13.0-SNAPSHOT</version>
        </parent>
    
        <artifactId>onos-app-fwd</artifactId>
        <packaging>bundle</packaging>
    
        <description>Reactive forwarding application using flow subsystem</description>
    
        <properties>
            <onos.app.name>org.onosproject.fwd</onos.app.name>
            <onos.app.title>Reactive Forwarding App</onos.app.title>
            <onos.app.category>Traffic Steering</onos.app.category>
            <onos.app.url>http://onosproject.org</onos.app.url>
            <onos.app.readme>Reactive forwarding application using flow subsystem.</onos.app.readme>
        </properties>
    
        <dependencies>
            ...
        </dependencies>
    
    </project>
    
    

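    From the description we learn that layer 2 forwarding here is implemented with the flow subsystem, more specifically FlowObjectiveService. A more detailed description can be found in the BUCK file (it is worth mentioning that ONOS already uses BUCK as its default build system, although Maven still works).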

    Provisions traffic between end-stations using hop-by-hop flow programming by intercepting packets for which there are currently no matching flow objectives on the data plane. 
    The paths paved in this manner are short-lived, i.e. they expire a few seconds after the flow on whose behalf they were programmed stops. 
    The application relies on the ONOS path service to compute the shortest paths. 
    In the event of negative topology events (link loss, device disconnect, etc.), the application will proactively invalidate any paths that it had programmed to lead through the resources that are no longer available.
    

    Application structure

    org.onosproject.fwd contains the following files:

    MacAddressCompleter.java
    ReactiveForwarding.java
    ReactiveForwardingCommand.java
    ReactiveForwardMetrics.java
    

    MacAddressCompleter

    public class MacAddressCompleter implements Completer {
        @Override
        public int complete(String buffer, int cursor, List<String> candidates) {
            // Delegate string completer
            StringsCompleter delegate = new StringsCompleter();
            EventuallyConsistentMap<MacAddress, ReactiveForwardMetrics> macAddress;
            // Fetch our service and feed it's offerings to the string completer
            ReactiveForwarding reactiveForwardingService = AbstractShellCommand.get(ReactiveForwarding.class);
            macAddress = reactiveForwardingService.getMacAddress();
            SortedSet<String> strings = delegate.getStrings();
            for (MacAddress key : macAddress.keySet()) {
                strings.add(key.toString());
            }
            // Now let the completer do the work for figuring out what to offer.
            return delegate.complete(buffer, cursor, candidates);
        }
    }
    

    It is easy to see that this class completes MAC addresses in the CLI. The resources/OSGI-INF/blueprint/shell-config.xml file shows how the command is defined.

    <blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
        <command-bundle xmlns="http://karaf.apache.org/xmlns/shell/v1.1.0">
            <command>
                <action class="org.onosproject.fwd.ReactiveForwardingCommand"/>
                <completers>
                <ref component-id="MacAddressCompleter"/>
                </completers>
            </command>
        </command-bundle>
        <bean id="MacAddressCompleter" class="org.onosproject.fwd.MacAddressCompleter"/>
    </blueprint>
    

    The completer is declared as a bean and referenced by the org.onosproject.fwd.ReactiveForwardingCommand command. Next we look at the implementation of the command.

    ReactiveForwardingCommand

    @Command(scope = "onos", name = "reactive-fwd-metrics",
            description = "List all the metrics of reactive fwd app based on mac address")
    public class ReactiveForwardingCommand extends AbstractShellCommand {
        @Argument(index = 0, name = "mac", description = "One Mac Address",
                required = false, multiValued = false)
        String mac = null;
        @Override
        protected void execute() {
            ReactiveForwarding reactiveForwardingService = AbstractShellCommand.get(ReactiveForwarding.class);
            MacAddress macAddress = null;
            if (mac != null) {
                macAddress = MacAddress.valueOf(mac);
            }
            reactiveForwardingService.printMetric(macAddress);
        }
    }
    

    As we can see, annotations are used to create a CLI command named onos:reactive-fwd-metrics; followed by a MAC address, it prints the metrics of the corresponding host.

    onos> onos:reactive-fwd-metrics aa:0e:a8:c8:c9:a8
    -----------------------------------------------------------------------------------------
     MACADDRESS 						 Metrics
     AA:0E:A8:C8:C9:A8 			 null
    

    ReactiveForwardMetrics

    public class ReactiveForwardMetrics {
        private Long replyPacket = null;
        private Long inPacket = null;
        private Long droppedPacket = null;
        private Long forwardedPacket = null;
        private MacAddress macAddress;
    }
    

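    As we can see, the ReactiveForwardMetrics class keeps statistics on how packets are handled; the code above omits the methods that update the counters as well as the toString method.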

    ReactiveForwarding

    This class is the core of the fwd application and implements the actual forwarding logic.

  • DPDK Pktgen and Testpmd Verification Experiment

    Ref: Version: DPDK 19.08 / Pktgen 3.7.2

    +--------+---------------+               +-------------------+---------------+
    |        | socket file 1 |   <------->   | vhost-user port 1 |               |
    |        +---------------+               +-------------------+     Docker    |
    | host   |     pktgen    |               |      testpmd      |   container   |
    |        +---------------+               +-------------------+               |
    |        | socket file 0 |   <------->   | vhost-user port 0 |               |
    +--------+---------------+               +-------------------+---------------+
    

    Compile DPDK and Pktgen

    DPDK

    export RTE_SDK=~/dpdk/dpdk-19.08
    export RTE_TARGET=x86_64-native-linuxapp-gcc
    sed -ri 's,(CONFIG_RTE_LIBRTE_VHOST=).*,\1y,' config/common_base
    make config T=$RTE_TARGET
    sed -ri 's,(PMD_PCAP=).*,\1y,' build/.config
    make
    

    Pktgen

    export RTE_SDK=~/dpdk/dpdk-19.08
    export RTE_TARGET=build
    make
    

    Build a Docker image

    Create a Dockerfile in the directory that contains the DPDK SDK.

    FROM ubuntu:16.04
    WORKDIR /root/dpdk
    COPY dpdk-19.08 /root/dpdk/.
    ENV PATH "$PATH:/root/dpdk/build/app/"
    RUN sed -i 's/archive.ubuntu.com/mirrors.tuna.tsinghua.edu.cn/g' /etc/apt/sources.list && \
        apt update && apt install -y libnuma-dev libpcap-dev
    ENTRYPOINT ["/bin/bash"]
    

    Allocate HugePage

    Modify /etc/default/grub.

    GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=8"
    

    Regenerate the GRUB configuration and reboot for the change to take effect, then mount hugetlbfs.

    sudo update-grub
    reboot
    mkdir -p /dev/hugepages
    sudo mount -t hugetlbfs none /dev/hugepages
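
    After the reboot it is worth checking that the pages were really reserved before starting testpmd. The following sketch reads the HugePages_Total and Hugepagesize fields from /proc/meminfo; hugepageInfo is a hypothetical helper written for this note. With the GRUB line above one would expect 8 pages of 1048576 kB, that is 8192 MB, while the testpmd invocation later in this post asks for 1024 MB on each of two sockets.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// hugepageInfo extracts the number of hugepages and their size in kB from
// text in /proc/meminfo format. Missing fields are reported as zero.
func hugepageInfo(meminfo string) (total, sizeKB int, err error) {
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		switch fields[0] {
		case "HugePages_Total:":
			total, err = strconv.Atoi(fields[1])
		case "Hugepagesize:":
			sizeKB, err = strconv.Atoi(fields[1])
		}
		if err != nil {
			return 0, 0, err
		}
	}
	return total, sizeKB, sc.Err()
}

func main() {
	data, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, "Error:", err)
		os.Exit(1)
	}
	total, sizeKB, err := hugepageInfo(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, "Error:", err)
		os.Exit(1)
	}
	fmt.Printf("%d hugepages of %d kB (%d MB in total)\n", total, sizeKB, total*sizeKB/1024)
}
```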
    

    Run the Testpmd container

    sudo docker run -ti --rm --name=test \
    -v /dev/hugepages:/dev/hugepages \
    -v /tmp/virtio/:/tmp/virtio/ \
    --privileged dpdk
    

    Run the following command inside the container shell:

    testpmd -l 0-1 -n 1 --socket-mem 1024,1024 \
    --vdev 'eth_vhost0,iface=/tmp/virtio/sock0' --vdev 'eth_vhost1,iface=/tmp/virtio/sock1' \
    --file-prefix=test --no-pci \
    -- -i --forward-mode=io --auto-start
    

    Some useful runtime commands

    show port stats all
    

    Generate the packets

    sudo pktgen -l 2,3,4 -n 2 --vdev=virtio_user0,path=/tmp/virtio/sock0 --vdev=virtio_user1,path=/tmp/virtio/sock1 -- -P -m "3.0,4.1"
    

    Some useful runtime commands

    set all rate 10 # set the sending rate of all ports to 10%
    set 0 count 100 # have port 0 send 100 packets in total
    str # start transmitting on all ports
    
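  • Cisco IOS Command Reference

    ip subnet-zero

    show ip route

    <C-a> move to the beginning of the line
    <C-e> move to the end of the line
    <C-z> leave configuration mode and return to privileged EXEC mode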

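    • Set the hostname: hostname Router

    • Set banners: banner motd is shown at login; banner exec is shown when a vty connection is created; banner login is shown after the MOTD banner

    • Set passwords

    1. Passwords for entering enable mode: enable password sets the enable password; enable secret sets the encrypted enable secret (which takes precedence over the enable password)
    2. Passwords for user mode: line console 0 is the user-mode password of the console port; line aux 0 is the auxiliary port password; line vty 0 15 is the password for Telnet connections to the router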
    Router(config)#line console 0
    Router(config-line)#password console
    Router(config-line)#login
    

    exec-timeout <minutes> <seconds> sets the session timeout; logging synchronous keeps console output from interrupting what you are typing

    • Set the domain name: ip domain-name xxx.com

    • Configure SSH login

    Router(config)#hostname r1           
    r1(config)#ip domain-name barrygates.cn
    r1(config)#crypto key generate rsa
    The name for the keys will be: r1.barrygates.cn
    Choose the size of the key modulus in the range of 360 to 4096 for your
      General Purpose Keys. Choosing a key modulus greater than 512 may take
      a few minutes.
    
    How many bits in the modulus [512]:
    % Generating 512 bit RSA keys, keys will be non-exportable...
    [OK] (elapsed time was 0 seconds)
    
    r1(config)#
    *Feb 14 11:42:15.394:  RSA key size needs to be atleast 768 bits for ssh version 2
    r1(config)#
    *Feb 14 11:42:15.402: %SSH-5-ENABLED: SSH 1.5 has been enabled
    r1(config)#ip ssh version 2
    Please create RSA keys to enable SSH (and of atleast 768 bits for SSH v2).
    r1(config)#line vty 0 15
    r1(config-line)#transport input ssh
    r1(config-line)#
    
    
    • Encrypt passwords: by default only the enable secret is encrypted; to encrypt all passwords, use service password-encryption

    • Interface description

    r1(config)#int fastEthernet 0/0
    r1(config-if)#ip address 172.16.0.1 255.255.0.0
    r1(config-if)#description for test
    r1(config-if)#exit
    r1(config)#do show interfaces description
    Interface                      Status         Protocol Description
    Fa0/0                          admin down     down     for test
    r1(config)#
    
    

    Secondary IP address

    r1(config-if)#ip address 172.16.1.1 255.255.0.0 secondary
    
    • Pipes
    r1#sh run | ?
      append    Append redirected output to URL (URLs supporting append operation
                only)
      begin     Begin with the line that matches
      count     Count number of lines which match regexp
      exclude   Exclude lines that match
      format    Format the output using the specified spec file
      include   Include lines that match
      redirect  Redirect output to URL
      section   Filter a section of output
      tee       Copy output to URL
    
    
    • Save the configuration
    copy running-config startup-config
    
    • Erase the configuration
    erase startup-config
    
    • Reset interface counters
    clear counters e0/0
    

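    show protocols shows the layer 1 and layer 2 status and the IP address of each interface; show controllers shows the state of the physical interface

    DHCP configuration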

    IOU1(config)#ip dhcp excluded-address 192.168.10.1 192.168.10.10
    IOU1(config)#ip dhcp pool MyNetwork
    IOU1(dhcp-config)#network 192.168.10.0 255.255.255.0
    IOU1(dhcp-config)#default-router 192.168.10.1
    IOU1(dhcp-config)#dns-server 8.8.8.8
    IOU1(dhcp-config)#lease 3 12 15
    

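    The above creates an address pool for 192.168.10.0/24 with DNS server 8.8.8.8 and default gateway 192.168.10.1, excludes the addresses 192.168.10.1 through 192.168.10.10 from assignment, and sets the lease time to 3 days, 12 hours and 15 minutes.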
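
    The numbers in this pool can be checked with a little arithmetic. The sketch below is only a calculation aid with hypothetical helper names; it assumes that every excluded address is a host address inside the subnet, as is the case here.

```go
package main

import "fmt"

// leaseSeconds converts the days/hours/minutes arguments of the IOS
// "lease" command into seconds.
func leaseSeconds(days, hours, minutes int) int {
	return days*86400 + hours*3600 + minutes*60
}

// assignable returns how many host addresses of an IPv4 subnet with the
// given prefix length remain after removing the network address, the
// broadcast address and the excluded addresses.
func assignable(prefixLen, excluded int) int {
	hosts := (1 << (32 - prefixLen)) - 2
	return hosts - excluded
}

func main() {
	fmt.Println(leaseSeconds(3, 12, 15)) // 303300
	fmt.Println(assignable(24, 10))      // 244
}
```

    So the pool hands out at most 244 addresses, each for a little over three and a half days.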

    DHCP relay

    Without this configuration, a router drops DHCP broadcasts by default.

    IOU1(config)#int f0/0
    IOU1(config-if)#ip helper-address 10.10.10.254
    

    This forwards DHCP broadcasts to 10.10.10.254.

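    Verifying DHCP information

    show ip dhcp binding shows the state of the assigned IP addresses

    show ip dhcp pool [poolname] shows the state of the address pool

    show ip dhcp server statistics shows DHCP statistics

    show ip dhcp conflict shows address conflicts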

    NTP

    IOU1(config)#ntp server 172.16.10.1 version 4
    IOU1(config)#ntp master
    IOU1#show ntp status
    IOU1#show ntp associations
    

    CDP

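    show cdp displays the CDP timer and how long CDP information is held in the table

    • cdp holdtime
    • cdp timer
    • no cdp run disables CDP
    • show cdp neighbors shows information about directly connected devices; CDP does not pass through Cisco switches. Detailed information is available with show cdp entry * and show cdp neighbors detail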