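Author: Sa1ka

Learning Suricata

Suricata is a free, open-source network threat detection engine. It can run as an IDS (Intrusion Detection System), an IPS (Intrusion Prevention System), or an NSM (Network Security Monitoring) tool, and it can also analyze pcap files offline. Suricata inspects traffic using rules written in a dedicated rule language, can use Lua scripts for more precise analysis, and emits output in formats such as YAML or JSON that are easy to store in a database. The project is owned by the OISF, a non-profit organization.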

Installation

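Like most Linux software, Suricata can be installed in two ways: from a prebuilt distribution package or by compiling from source.

Installing from the PPA

The following steps were performed on Ubuntu 16.04; for other distributions, see the official wiki.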

sudo add-apt-repository ppa:oisf/suricata-stable
sudo apt-get update 
sudo apt-get install suricata

Building from source

First install the build dependencies:

sudo apt-get -y install libpcre3 libpcre3-dbg libpcre3-dev \
build-essential autoconf automake libtool libpcap-dev libnet1-dev \
libyaml-0-2 libyaml-dev zlib1g zlib1g-dev libcap-ng-dev libcap-ng0 \
make libmagic-dev libjansson-dev libjansson4 pkg-config

Download the source:

VER=3.1
wget "http://www.openinfosecfoundation.org/download/suricata-$VER.tar.gz" 
tar -xvzf "suricata-$VER.tar.gz" 
cd "suricata-$VER"

Configure and install:

./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make
sudo make install
sudo ldconfig

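Suricata also provides several make targets that automate setup:

make install-conf   creates and installs the configuration files automatically
make install-rules  downloads the latest rule set from Emerging Threats
make install-full   does both of the above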

Setup

Next we deploy Suricata. All of the commands below must be run with administrator privileges.

mkdir /var/log/suricata # log files
mkdir /etc/suricata # configuration files
cp classification.config /etc/suricata
cp reference.config /etc/suricata
cp suricata.yaml /etc/suricata

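Set the variables in /etc/suricata/suricata.yaml correctly. HOME_NET should be the IP range of the local network, and the recommended value for EXTERNAL_NET is !$HOME_NET, so that any traffic not from a local IP is treated as external. Setting it to any also works, but it produces some false alerts. The server variables that follow default to $HOME_NET, while AIM_SERVERS is set to any.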

Run

Running Suricata is straightforward: choose the network interface to listen on and run a command like the following.

sudo suricata -c /etc/suricata/suricata.yaml -i wlan0

Log files are written to /var/log/suricata. A command such as tail -f http.log stats.log can be used to watch what the program is doing.

Rules

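The most important part of Suricata is its rules. With a particular rule set, specific traffic can be analyzed and handled, and in IPS mode the packets themselves can be acted on directly. Rule sets are usually downloaded from the internet, typically from Emerging Threats (Pro) and Sourcefire's VRT. Managing them by hand is tedious, so a tool called Oinkmaster can be used to automate downloading and managing rules. A rule generally has three parts: Action, Header, and Rule options. For example: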

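alert tcp $EXTERNAL_NET any -> $HOME_NET 8888 (msg: "meow"; content: "meow"; sid: 1000001; rev: 1;)

alert is the action: an alert is raised when the rule matches.
tcp means the rule applies to TCP packets; ip, udp, icmp, and some common application-layer protocols can also be used.
$EXTERNAL_NET refers to the external address variable defined earlier; forms such as !1.1.1.1, ![1.1.1.1, 1.1.1.2], and [10.0.0.0/24, !10.0.0.5] are also accepted.
any is the port; forms such as [79,80:82,83] are also accepted.
-> is the direction, which can be -> or <>.
(msg: "meow"; content: "meow"; sid: 1000001; rev: 1;) holds the rule options, separated by semicolons. Options include meta-information, headers, payloads, flows, and so on, described below. Suricata requires every rule to carry a sid; the value used here is an arbitrary example.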
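To make the three-part structure concrete, here is a minimal Python sketch that splits a rule string into its action, header fields, and options. It is only an illustration: parse_rule is a hypothetical helper, not part of Suricata, and the naive split on ";" ignores escaped semicolons inside quoted values.

```python
def parse_rule(rule):
    """Split a rule into action, header fields and an ordered list of options."""
    head, _, tail = rule.partition("(")
    # The header always has seven whitespace-separated fields.
    action, proto, src, sport, direction, dst, dport = head.split()
    body = tail.rstrip().rstrip(")")
    options = []
    for item in body.split(";"):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition(":")
        # Keywords without a value (for example "nocase") are stored with None.
        options.append((key.strip(), value.strip().strip('"') if sep else None))
    return {
        "action": action,
        "header": {"proto": proto, "src": src, "sport": sport,
                   "direction": direction, "dst": dst, "dport": dport},
        "options": options,
    }


RULE = 'alert tcp $EXTERNAL_NET any -> $HOME_NET 8888 (msg: "meow"; content: "meow"; )'
parsed = parse_rule(RULE)
print(parsed["action"], parsed["header"]["dport"], parsed["options"])
```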

Meta-settings

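Meta-settings do not affect detection; they only serve auxiliary purposes such as logging.

msg: "some description"; shown in the logs
sid: 123; the rule's ID
rev: 123; the rule's revision number
gid: 1; group ID
classtype: trojan-activity; the rule's classification
reference: bugtraq, 123; http://www.securityfocus.com/bid; where to find reference material for the rule
priority:1; rule priority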
metadata: ...;
target: [src_ip|dest_ip];

Header Keywords

ttl: 10;
ipopts: lsrr; IP options
sameip; source and destination IP are identical
ip_proto: TCP;
id: 1;
geoip: src, RU;
fragbits:[*+!]<[MDR]>;
fragoffset:[!|<|>]<number>;
seq:0;
ack:1;
window:[!]<number>;
itype:min<>max;
itype:[<|>]<number>;
icode:min<>max;
icode:[<|>]<number>;
icmp_id:<number>;
icmp_seq:<number>;

Payload Keywords

content:"a|0D|bc";
content:"|61 0D 62 63|";
content:"a|0D|b|63|";
nocase;
depth:12;
offset:3;
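The three content lines above describe the same byte sequence, because text between a pair of | characters is read as hexadecimal bytes. The Python sketch below shows that decoding under simplifying assumptions: decode_content is a hypothetical helper, and it does not handle escaped characters such as \| or \;.

```python
def decode_content(pattern):
    """Turn a content pattern into raw bytes; parts between | ... | are hex."""
    out = bytearray()
    # Splitting on "|" alternates between literal text (even index)
    # and hex-encoded bytes (odd index).
    for index, part in enumerate(pattern.split("|")):
        if index % 2 == 0:
            out += part.encode("ascii")
        else:
            out += bytes.fromhex(part)  # spaces between hex bytes are allowed
    return bytes(out)


for text in ("a|0D|bc", "|61 0D 62 63|", "a|0D|b|63|"):
    print(text, "->", decode_content(text))
```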

Flowbits

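Flowbits store flags inside Suricata so that several packets of the same flow can be correlated.

flowbits: set, name                set the condition called name
flowbits: isset, name              check whether the condition called name is set
flowbits: toggle, name             flip the current state of the condition called name
flowbits: unset, name              clear the condition called name
flowbits: isnotset, name           check whether the condition called name is not set
flowbits: noalert                  do not generate an alert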
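The bookkeeping behind these keywords can be sketched as a set of names per flow. The Python below is a toy model, not Suricata's implementation; the FlowBits class, the flow identifier, and the bit name logged_in are all made up for illustration.

```python
from collections import defaultdict


class FlowBits:
    """Toy model: each flow owns a set of named flags."""

    def __init__(self):
        self._bits = defaultdict(set)

    def set(self, flow, name):
        self._bits[flow].add(name)

    def unset(self, flow, name):
        self._bits[flow].discard(name)

    def toggle(self, flow, name):
        if name in self._bits[flow]:
            self._bits[flow].discard(name)
        else:
            self._bits[flow].add(name)

    def isset(self, flow, name):
        return name in self._bits[flow]

    def isnotset(self, flow, name):
        return not self.isset(flow, name)


bits = FlowBits()
flow = ("10.0.0.5", 40000, "10.0.0.9", 21)

# A first rule sees the login packet and only records state (like noalert).
bits.set(flow, "logged_in")
# A second rule fires on a later packet only if the first one matched earlier.
should_alert = bits.isset(flow, "logged_in")
print(should_alert)
```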

Flow

Matches on the direction of the flow, whether the connection is established, and so on.

flow:to_client, established
flow:to_server, established, only_stream
flow:to_server, not_established, no_frag

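How It Works

Suricata is built from a few key components: threads, thread modules, and queues. Suricata runs multithreaded, and the thread modules correspond to its packet acquisition, decoding, detection, and output stages. A packet moves through Suricata like an item on an assembly line, handed from one thread module to the next, and the conveyor belt in this picture is the queue. A thread can contain several thread modules, and how modules are arranged into threads is what a runmode defines. Running suricata --list-runmodes shows the runmodes currently available.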

------------------------------------- Runmodes ------------------------------------------
| RunMode Type      | Custom Mode       | Description 
|----------------------------------------------------------------------------------------
| PCAP_DEV          | single            | Single threaded pcap live mode 
|                   ---------------------------------------------------------------------
|                   | autofp            | Multi threaded pcap live mode.  Packets from each flow are assigned to a single detect thread, unlike "pcap_live_auto" where packets from the same flow can be processed by any detect thread 
|                   ---------------------------------------------------------------------
|                   | workers           | Workers pcap live mode, each thread does all tasks from acquisition to logging 
|----------------------------------------------------------------------------------------
| PCAP_FILE         | single            | Single threaded pcap file mode 
|                   ---------------------------------------------------------------------
|                   | autofp            | Multi threaded pcap file mode.  Packets from each flow are assigned to a single detect thread, unlike "pcap-file-auto" where packets from the same flow can be processed by any detect thread 
|----------------------------------------------------------------------------------------
| PFRING(DISABLED)  | autofp            | Multi threaded pfring mode.  Packets from each flow are assigned to a single detect thread, unlike "pfring_auto" where packets from the same flow can be processed by any detect thread 
|                   ---------------------------------------------------------------------
|                   | single            | Single threaded pfring mode 
|                   ---------------------------------------------------------------------
|                   | workers           | Workers pfring mode, each thread does all tasks from acquisition to logging 
|----------------------------------------------------------------------------------------
| NFQ               | autofp            | Multi threaded NFQ IPS mode with respect to flow 
|                   ---------------------------------------------------------------------
|                   | workers           | Multi queue NFQ IPS mode with one thread per queue 
|----------------------------------------------------------------------------------------
| NFLOG             | autofp            | Multi threaded nflog mode   
|                   ---------------------------------------------------------------------
|                   | single            | Single threaded nflog mode  
|                   ---------------------------------------------------------------------
|                   | workers           | Workers nflog mode          
|----------------------------------------------------------------------------------------
| IPFW              | autofp            | Multi threaded IPFW IPS mode with respect to flow 
|                   ---------------------------------------------------------------------
|                   | workers           | Multi queue IPFW IPS mode with one thread per queue 
|----------------------------------------------------------------------------------------
| ERF_FILE          | single            | Single threaded ERF file mode 
|                   ---------------------------------------------------------------------
|                   | autofp            | Multi threaded ERF file mode.  Packets from each flow are assigned to a single detect thread 
|----------------------------------------------------------------------------------------
| ERF_DAG           | autofp            | Multi threaded DAG mode.  Packets from each flow are assigned to a single detect thread, unlike "dag_auto" where packets from the same flow can be processed by any detect thread 
|                   ---------------------------------------------------------------------
|                   | single            | Singled threaded DAG mode   
|                   ---------------------------------------------------------------------
|                   | workers           | Workers DAG mode, each thread does all  tasks from acquisition to logging 
|----------------------------------------------------------------------------------------
| AF_PACKET_DEV     | single            | Single threaded af-packet mode 
|                   ---------------------------------------------------------------------
|                   | workers           | Workers af-packet mode, each thread does all tasks from acquisition to logging 
|                   ---------------------------------------------------------------------
|                   | autofp            | Multi socket AF_PACKET mode.  Packets from each flow are assigned to a single detect thread. 
|----------------------------------------------------------------------------------------
| NETMAP(DISABLED)  | single            | Single threaded netmap mode 
|                   ---------------------------------------------------------------------
|                   | workers           | Workers netmap mode, each thread does all tasks from acquisition to logging 
|                   ---------------------------------------------------------------------
|                   | autofp            | Multi threaded netmap mode.  Packets from each flow are assigned to a single detect thread. 
|----------------------------------------------------------------------------------------
| UNIX_SOCKET       | single            | Unix socket mode            
|                   ---------------------------------------------------------------------
|                   | autofp            | Unix socket mode            
|----------------------------------------------------------------------------------------

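As shown, Suricata has three custom modes: single, workers, and autofp. The descriptions on the right explain how each mode behaves. In workers mode, every thread contains the complete packet-processing pipeline, from acquisition to logging. In autofp mode, captured packets are handed off to separate packet-processing threads, and Suricata keeps all traffic that belongs to one flow on the same thread to avoid problems.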
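The idea of keeping one flow on one thread can be sketched with a direction-independent hash of the flow's endpoints. This Python example only illustrates the principle; flow_key and pick_detect_thread are hypothetical names, and Suricata's real flow hashing is more involved.

```python
import zlib


def flow_key(proto, src, sport, dst, dport):
    """Build a key that is the same for both directions of a connection."""
    first, second = sorted([(src, sport), (dst, dport)])
    return f"{proto}|{first[0]}:{first[1]}|{second[0]}:{second[1]}"


def pick_detect_thread(key, n_threads):
    """Map a flow key to one of n_threads detect threads."""
    return zlib.crc32(key.encode("ascii")) % n_threads


request = ("tcp", "10.0.0.5", 40000, "93.184.216.34", 80)
reply = ("tcp", "93.184.216.34", 80, "10.0.0.5", 40000)

# Both directions of the connection land on the same detect thread.
print(pick_detect_thread(flow_key(*request), 4) ==
      pick_detect_thread(flow_key(*reply), 4))
```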

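Supporting Software

Oinkmaster

oinkmaster.pl -C /etc/oinkmaster.conf -o /etc/suricata/rules -i

The outputs > unified2-alert section of the Suricata configuration file suricata.yaml can be set up to dump information about the suspicious packet whenever an alert is generated. The advantages of this format are:

Easy to archive and manage.
Fast to generate.

Barnyard2

Barnyard2 is something like Syslog: it takes unified2-format input from Snort/Suricata and produces output in other formats, for example for the Prelude Hybrid IDS system, Syslog, or MySQL.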

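June 16, 2021

Learning the OpenFlow Protocol

OpenFlow 1.0 Protocol Summary

An OpenFlow switch contains one or more flow tables, and each flow table contains multiple flow entries. For each packet it receives, the switch selects the best-matching flow entry from the flow table and processes the packet according to that entry.

Each flow entry consists of three basic elements: Header Fields, Counters, and Actions. The Header Fields are the match rule, the Counters record how many times the entry has matched, and the Actions are the operations to apply after a match.

Flow Entries

Header Fields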

Ingress Port
Ethernet source address
Ethernet destination address
Ethernet type
VLAN id
VLAN Priority(802.1q PCP)
IP source address
IP destination address
ToS
Transport source port/ICMP type
Transport destination port/ICMP code

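In version 1.0, STP (802.1d) packets are not matched against the flow tables. After STP processing is complete, the header fields are parsed, and the packet is then matched against the flow table based on the parsed fields. When a packet matches several flow entries in a table, the entry with the highest priority wins. A switch may have multiple flow tables, but matching always starts from table 0. If table 0 holds no matching entry while another table does, matching moves on to the next table. The case where a packet matches no flow entry in the switch at all is called a table-miss; the packet is then either forwarded to the controller in a Packet-In message or dropped, depending on configuration.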
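The matching rule described above (highest priority wins, no match means table-miss) can be sketched for a single table as follows. This is a simplified Python model with made-up field names; a real switch matches on the header fields listed earlier and uses wildcard bits rather than missing dictionary keys.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    match: dict          # field name -> required value; absent fields are wildcards
    priority: int
    actions: list
    packets: int = 0     # per-flow counter


def lookup(table, packet):
    """Return the highest-priority matching entry, or None on a table-miss."""
    candidates = [entry for entry in table
                  if all(packet.get(k) == v for k, v in entry.match.items())]
    if not candidates:
        return None
    best = max(candidates, key=lambda entry: entry.priority)
    best.packets += 1
    return best


table = [
    FlowEntry({"ip_dst": "10.0.0.2"}, priority=10, actions=["output:2"]),
    FlowEntry({"ip_dst": "10.0.0.2", "tp_dst": 80}, priority=20, actions=["drop"]),
]

hit = lookup(table, {"in_port": 1, "ip_dst": "10.0.0.2", "tp_dst": 80})
miss = lookup(table, {"in_port": 1, "ip_dst": "10.0.0.9", "tp_dst": 22})
print(hit.actions, miss)
```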

Counter

There are four kinds of counters: Per Table, Per Port, Per Flow, and Per Queue.

Per Table: Active Entries, Packet Lookups, Packet Matches

Per Flow: Received Packets/Bytes, Duration(msec/usec)

Per Queue: Transmit Packets/Bytes, Transmit Overrun Errors

Per Port: Received Packets/Bytes/Drops/Errors, Transmit Packets/Bytes/Drops/Errors, Receive Frame Alignment Errors, Received Overrun Errors, Received CRC Errors, Collisions

Action

Forward, Drop, Enqueue(Optional), Modify-Field(Optional)

Forward

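Name	Description	Virtual port
(port)	Forward to the specified port	port
ALL	All ports except the ingress port	0xfffc
CONTROLLER	The controller	0xfffd
LOCAL	Send to the local network stack	0xfffe
TABLE	Perform the actions in the flow table (Packet-Out only)	0xfff9
IN_PORT	Send out through the ingress port	0xfff8
NORMAL (optional)	Process as a traditional L2 or L3 switch	0xfffa
FLOOD (optional)	Flood along the STP spanning tree	0xfffb

Drop

Not explicitly described in the specification.

Enqueue

Appends the packet to the end of a queue.

Modify-Field

Set the VLAN ID
Set the VLAN priority
Strip the VLAN header
Modify the source MAC
Modify the destination MAC
Modify the source IP
Modify the destination IP
Modify the ToS
Modify the source port
Modify the destination port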

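The Secure Channel Between Controller and Switch

The secure channel is established over the control-plane network and is not affected by the flow entries in the switch. The specification says the secure channel should be implemented with TLS, but starting with OpenFlow 1.1 a plaintext TCP mode was added. The default TCP port is 6653 (the IANA-assigned port; earlier implementations commonly used 6633).

struct openflow_header {
    uint8_t version; // protocol version; OpenFlow 1.0-1.3 map to 0x01-0x04
    uint8_t type; // message type
    uint16_t length; // total message length in bytes, including this header
    uint32_t xid; // transaction ID
};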
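As a quick check of the header layout, the 8-byte header can be packed with Python's struct module in network byte order. The sketch below builds an OpenFlow 1.0 HELLO message; pack_header is a hypothetical helper name, while OFPT_HELLO = 0 is the value defined by the specification.

```python
import struct

# version (1 byte), type (1 byte), length (2 bytes), xid (4 bytes), big-endian
OFP_HEADER = struct.Struct("!BBHI")
OFP_VERSION_1_0 = 0x01
OFPT_HELLO = 0


def pack_header(version, msg_type, length, xid):
    return OFP_HEADER.pack(version, msg_type, length, xid)


# A HELLO message consists of the header only, so length is the header size.
hello = pack_header(OFP_VERSION_1_0, OFPT_HELLO, OFP_HEADER.size, xid=1)
print(hello.hex())
```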

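Establishing the Secure Channel

Establish the TCP/TLS connection.
Agree on the OpenFlow version: both sides send a message whose type is OFPT_HELLO, with the highest version each supports in the version field.
Handshake: the messages have type OFPT_FEATURES_REQUEST and OFPT_FEATURES_REPLY.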

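struct features_response {
    uint64_t datapath_id; // the datapath ID uniquely identifies the OpenFlow switch
    uint32_t n_buffers; // maximum number of packets that can be buffered
    uint8_t n_tables; // number of flow tables supported
    uint8_t pad[3]; // 24 bits of padding
    uint32_t capabilities; // bitmap of supported capabilities
    uint32_t actions; // bitmap of supported actions
    struct ofp_phy_port ports[??]; // physical port descriptions
};

struct ofp_phy_port {
    uint16_t port_no; // physical port number
    uint8_t hw_addr[OFP_ETH_ALEN]; // 48-bit Ethernet address
    uint8_t name[OFP_MAX_PORT_NAME_LEN]; // 128-bit (16-byte) port name
    uint32_t config; // bitmap of port configuration flags
    uint32_t state; // bitmap of port state flags
    uint32_t curr; // current features
    uint32_t advertised; // features advertised by the port
    uint32_t supported;  // features supported by the port
    uint32_t peer;  // features advertised by the peer
};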

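Exchange configuration (optional). The message type is OFPT_SET_CONFIG, or OFPT_GET_CONFIG_REQUEST/OFPT_GET_CONFIG_REPLY.

struct set_config_message {
    uint16_t flags; // how IP fragments are handled
    uint16_t miss_send_len; // maximum number of bytes of a table-miss packet sent to the controller
};

enum set_config_flag {
    OFPC_FRAG_NORMAL = 0, // handle fragments according to the flow table
    OFPC_FRAG_DROP, // drop fragments
    OFPC_FRAG_REASM, // reassemble fragments
    OFPC_FRAG_MASK // bitmask covering the fragment-handling flags (value 3)
};

Other messages that may be exchanged: STATS, QUEUE_GET_CONFIG, Vendor, …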

Flow-Mod

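A message that modifies the flow table; it can add, delete, and modify entries. The commands are:

OFPFC_ADD
OFPFC_MODIFY
OFPFC_MODIFY_STRICT
OFPFC_DELETE
OFPFC_DELETE_STRICT

The STRICT suffix means the entry must match exactly. If an error occurs, an error message is sent back. If an identical entry already exists in the flow table, its counters are reset to zero.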

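struct flow_mod_message {
    struct ofp_match match; // fields the packet is matched against
    uint64_t cookie; // opaque 64-bit identifier chosen by the controller
    uint8_t action[??]; // remaining fields, see flow_mod_action below
};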

struct ofp_match {
    uint32_t wildcards; // wildcard bits: which fields to ignore
    uint16_t in_port; // ingress physical port
    uint8_t dl_src[OFP_ETH_ALEN]; // source MAC address
    uint8_t dl_dst[OFP_ETH_ALEN]; // destination MAC address
    uint16_t dl_vlan; // VLAN ID
    uint8_t dl_vlan_pcp;  // VLAN priority
    uint8_t pad1; 
    uint16_t dl_type; // Ethernet frame type
    uint8_t nw_tos; // ToS field
    uint8_t nw_proto; // IP protocol number
    uint16_t pad2;
    uint32_t nw_src; // source IPv4 address
    uint32_t nw_dst; // destination IPv4 address
    uint16_t tp_src; // source port or ICMP type
    uint16_t tp_dst; // destination port or ICMP code
};
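The match structure is 40 bytes on the wire. The Python sketch below packs a match that looks only at the ingress port and wildcards everything else; pack_match_in_port is a hypothetical helper, and the two wildcard constants follow the OpenFlow 1.0 definitions (OFPFW_IN_PORT is bit 0, OFPFW_ALL covers the low 22 bits).

```python
import struct

# wildcards, in_port, dl_src, dl_dst, dl_vlan, dl_vlan_pcp, pad1,
# dl_type, nw_tos, nw_proto, pad2 (2 bytes), nw_src, nw_dst, tp_src, tp_dst
OFP_MATCH = struct.Struct("!IH6s6sHBxHBB2xIIHH")

OFPFW_IN_PORT = 1 << 0
OFPFW_ALL = (1 << 22) - 1


def pack_match_in_port(in_port):
    """Match on the ingress port only; every other field is wildcarded."""
    wildcards = OFPFW_ALL & ~OFPFW_IN_PORT
    zero_mac = bytes(6)
    return OFP_MATCH.pack(wildcards, in_port, zero_mac, zero_mac,
                          0, 0, 0, 0, 0, 0, 0, 0, 0)


match_bytes = pack_match_in_port(3)
print(len(match_bytes), match_bytes[:6].hex())
```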

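struct flow_mod_action {
    uint16_t command; // one of the OFPFC_* commands listed above
    uint16_t idle_timeout; // the entry is removed if no packet matches it for idle_timeout seconds
    uint16_t hard_timeout; // the entry is removed hard_timeout seconds after it was installed, whether or not it is still matching packets
    uint16_t priority; // entry priority
    uint32_t buffer_id; // buffer ID
    uint16_t out_port; // output port number
    uint16_t flags;
    uint8_t actions[??]; // TLV structures: a header giving the action type, followed by its data
};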
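The two timeouts can be modeled with a small function that decides whether an entry has expired and why. This Python sketch follows the OpenFlow 1.0 semantics, where a timeout of 0 disables that timer; expiry_reason is a hypothetical helper and times are plain seconds.

```python
def expiry_reason(installed_at, last_matched_at, now, idle_timeout, hard_timeout):
    """Return the Flow-Removed reason for an expired entry, or None if it is still valid."""
    # Hard timeout: counted from installation, regardless of traffic.
    if hard_timeout and now - installed_at >= hard_timeout:
        return "OFPRR_HARD_TIMEOUT"
    # Idle timeout: counted from the last packet that matched the entry.
    if idle_timeout and now - last_matched_at >= idle_timeout:
        return "OFPRR_IDLE_TIMEOUT"
    return None


# Entry installed at t=0 with idle_timeout=10 and hard_timeout=60.
print(expiry_reason(0, 25, 30, 10, 60))   # recently matched: still valid
print(expiry_reason(0, 15, 30, 10, 60))   # idle for 15 seconds
print(expiry_reason(0, 59, 60, 10, 60))   # busy, but the hard limit is reached
```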

Packet-In

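A Packet-In message sends a packet that arrived at the OpenFlow switch to the OpenFlow controller. The value of buffer_id depends on whether the switch buffers the packet: if it does not, buffer_id is set to -1 and the whole packet is sent; if it does, at most miss_send_len bytes of the packet are sent, where miss_send_len is set by the SET_CONFIG message and defaults to 128.

struct packet_in_message {
    uint32_t buffer_id; // buffer ID
    uint16_t total_len; // full length of the frame
    uint16_t in_port; // port on which the frame was received
    uint8_t reason; // why the packet was sent (0 = no match, 1 = explicit flow-table action)
    uint8_t pad;
    uint8_t data[??]; // the packet itself
};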
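A controller has to undo this layout when a Packet-In arrives. The following Python sketch parses a complete OpenFlow 1.0 Packet-In message, header included; parse_packet_in and the returned dictionary keys are illustrative names, and 0xffffffff is the unsigned form of the buffer_id value -1.

```python
import struct

# header (version, type, length, xid) + buffer_id, total_len, in_port, reason, pad
PACKET_IN = struct.Struct("!BBHIIHHBx")
OFPT_PACKET_IN = 10
NO_BUFFER = 0xFFFFFFFF


def parse_packet_in(msg):
    (version, msg_type, length, xid,
     buffer_id, total_len, in_port, reason) = PACKET_IN.unpack_from(msg)
    if msg_type != OFPT_PACKET_IN:
        raise ValueError("not a Packet-In message")
    return {
        "xid": xid,
        "buffered": buffer_id != NO_BUFFER,
        "total_len": total_len,
        "in_port": in_port,
        "reason": reason,
        "data": msg[PACKET_IN.size:length],  # whole frame, or its first bytes if buffered
    }


frame = b"\xaa" * 20
message = PACKET_IN.pack(0x01, OFPT_PACKET_IN, PACKET_IN.size + len(frame),
                         7, NO_BUFFER, len(frame), 3, 0) + frame
info = parse_packet_in(message)
print(info["in_port"], info["buffered"], len(info["data"]))
```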

Packet-Out

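A Packet-Out message is sent from the controller to the switch.

struct packet_out_message {
    uint32_t buffer_id; // buffer ID
    uint16_t in_port; // ingress port; OFPP_NONE means unspecified, OFPP_CONTROLLER means the packet was created by the controller
    uint16_t actions_len;
    uint8_t actions[actions_len]; // similar to the actions in Flow-Mod
    uint8_t data[??]; // packet data, present only when buffer_id is -1 (no switch buffer is referenced)
};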

Port-Status

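When a physical port is added to, removed from, or modified on an OpenFlow switch, the switch sends a Port-Status message to notify the OpenFlow controller.

struct port_status_message {
    uint8_t reason; // OFPPR_ADD(0), OFPPR_DELETE(1), OFPPR_MODIFY(2)
    uint8_t pad[7]; // 56 bits of padding
    struct ofp_phy_port desc;
};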

Flow-Removed

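When a flow entry installed on an OpenFlow switch times out, the switch sends a Flow-Removed message to the controller.

struct flow_remove_message {
    struct ofp_match match;
    uint64_t cookie; // opaque 64-bit identifier chosen by the controller
    uint16_t priority;
    uint8_t reason; // OFPRR_IDLE_TIMEOUT(0), OFPRR_HARD_TIMEOUT(1), OFPRR_DELETE(2)
    uint8_t pad;
    uint32_t duration_sec; // time the entry was alive, in seconds
    uint32_t duration_nsec; // nanoseconds beyond duration_sec
    uint16_t idle_timeout;
    uint8_t pad2[2];
    uint64_t packet_count; // number of packets
    uint64_t byte_count; // total number of bytes
};

Most of the fields can be copied directly from the Flow-Mod message.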

Error

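Sent when an error occurs during processing. Both the controller and the switch can send it; the message type is OFPT_ERROR.

struct error_message {
    uint16_t type;
    uint16_t code;
    uint8_t data[??];
};

enum error_type {
    OFPET_HELLO_FAILED,
    OFPET_BAD_REQUEST,
    OFPET_BAD_ACTION,
    OFPET_FLOW_MOD_FAILED,
    OFPET_PORT_MOD_FAILED,
    OFPET_QUEUE_OP_FAILED
};

For the individual codes, see the OpenFlow specification.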

Barrier

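Used by the two sides to communicate how far processing has progressed, so that operations that depend on ordering do not fail because an earlier one has not finished. For example, after sending three Flow-Mod messages with xid 1, 2, and 3, the sender can issue a Barrier request with xid 4; receiving the peer's Barrier reply with xid 4 means the earlier messages have all been processed.

Echo

Used to test the connection between the two sides, as well as latency and bandwidth.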

LLDP

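Link Layer Discovery Protocol

IEEE 802.1AB

Many OpenFlow controllers use LLDP to discover the network topology. Typically the controller uses a Packet-Out message to tell an OpenFlow switch to send an LLDP frame, then learns topology information from the Packet-In message sent by the switch that receives the frame, and builds up the whole network topology.

How LLDP Works

Every so often (the standard suggests 30 seconds) an LLDP frame is sent to an L2 multicast address, regardless of link state.
The source address is the MAC address of the sending device.
The frame carries the sender's device ID and port ID in one direction only, so the receiver learns which device and port sent it.

LLDP uses three globally scoped multicast group addresses

Name	Value	Description
Nearest Bridge	01-80-C2-00-00-0E	Not forwarded by any bridge or router
Nearest non-TPMR Bridge	01-80-C2-00-00-03	Passes through TPMR bridges
Nearest Customer Bridge	01-80-C2-00-00-00	Passes through TPMR and S-VLAN bridges

In general, a topology made up only of OpenFlow switches uses the Nearest Bridge address; if ordinary non-OpenFlow L2 switches sit in between, the Nearest non-TPMR Bridge address is used; and when OpenFlow switches are reached remotely across a WAN, the Nearest Customer Bridge address is used.

--------------------------------------------------------------------------
| Multicast address | Sender's Ethernet address | 0x88cc (LLDP) | LLDPDU |
--------------------------------------------------------------------------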
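The frame layout above can be assembled byte by byte. The Python sketch below builds a minimal LLDP frame with the three mandatory TLVs (Chassis ID, Port ID, TTL) and the End TLV; each TLV header is 16 bits, with a 7-bit type and a 9-bit length. The helper names, the locally assigned ID subtype (7), the identifiers s1 and 2, and the TTL of 120 seconds are illustrative assumptions, not values taken from any particular controller.

```python
import struct

LLDP_NEAREST_BRIDGE = bytes.fromhex("0180c200000e")
ETH_TYPE_LLDP = 0x88CC
TLV_END, TLV_CHASSIS_ID, TLV_PORT_ID, TLV_TTL = 0, 1, 2, 3
SUBTYPE_LOCALLY_ASSIGNED = 7


def tlv(tlv_type, value):
    """Encode one TLV: 7-bit type and 9-bit length, then the value."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


def build_lldp_frame(src_mac, chassis_id, port_id, ttl=120):
    subtype = bytes([SUBTYPE_LOCALLY_ASSIGNED])
    lldpdu = (tlv(TLV_CHASSIS_ID, subtype + chassis_id)
              + tlv(TLV_PORT_ID, subtype + port_id)
              + tlv(TLV_TTL, struct.pack("!H", ttl))
              + tlv(TLV_END, b""))
    return LLDP_NEAREST_BRIDGE + src_mac + struct.pack("!H", ETH_TYPE_LLDP) + lldpdu


frame = build_lldp_frame(bytes.fromhex("020000000001"), b"s1", b"2")
print(frame.hex())
```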

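OpenFlow 1.1 Updates

Header fields renamed to match fields

MPLS label, MPLS traffic class, and metadata were added to the match fields.

Multiple flow tables

An OpenFlow switch can have multiple flow tables, and a single packet can match flow entries in several tables, but within one table only one entry is selected.

Processing is pipelined, and a new concept, the action set, is introduced: all actions are added to the action set, but they are executed in a fixed order rather than the order in which they were added: copy TTL inwards -> pop -> push -> copy TTL outwards -> decrement TTL -> set -> qos -> group -> output. Only one action of each type can be in the set.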
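The action set behaves like a dictionary keyed by action type whose contents are replayed in the fixed order. The Python sketch below models just that behavior; the ActionSet class and the short type names are made up for illustration and follow the one-action-per-type rule stated above.

```python
# Fixed execution order of the action set, as listed above.
EXECUTION_ORDER = ["copy_ttl_in", "pop", "push", "copy_ttl_out",
                   "dec_ttl", "set", "qos", "group", "output"]


class ActionSet:
    def __init__(self):
        self._actions = {}

    def write(self, kind, argument=None):
        """Add an action; a later action of the same type replaces the earlier one."""
        if kind not in EXECUTION_ORDER:
            raise ValueError(f"unknown action type: {kind}")
        self._actions[kind] = argument

    def execution_order(self):
        """Return the actions in execution order, not insertion order."""
        return [(kind, self._actions[kind])
                for kind in EXECUTION_ORDER if kind in self._actions]


actions = ActionSet()
actions.write("output", 1)
actions.write("set", "vlan_vid=10")
actions.write("pop", "vlan")
actions.write("output", 2)   # replaces the earlier output action
print(actions.execution_order())
```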

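If a table-miss occurs, handling is determined by the table configuration set with OFPT_TABLE_MOD: send to the controller, continue to the next table, or drop.

June 16, 2021

A Walkthrough of the Official ONOS Sample Applications

onos-app-calendar

A RESTful web application that provides intents with link latency and bandwidth constraints. It uses ConnectivityIntent. The logic that implements this: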

    /**
     * Create an Intent for a bidirectional path with constraints.
     *
     * @param key optional intent key
     * @param src the path source (DPID or hostID)
     * @param dst the path destination (DPID or hostID)
     * @param srcPort the source port (-1 if src/dest is a host)
     * @param dstPort the destination port (-1 if src/dest is a host)
     * @param bandwidth the bandwidth (mbps) requirement for the path
     * @param latency the latency (micro sec) requirement for the path
     * @return the appropriate intent
     */
    private Intent createIntent(Key key,
                                String src,
                                String dst,
                                String srcPort,
                                String dstPort,
                                Long bandwidth,
                                Long latency) {

        TrafficSelector selector = buildTrafficSelector();
        TrafficTreatment treatment = builder().build();

        final Constraint constraintBandwidth =
                new BandwidthConstraint(Bandwidth.mbps(bandwidth));
        final Constraint constraintLatency =
                new LatencyConstraint(Duration.of(latency, ChronoUnit.MICROS));
        final List<Constraint> constraints = new LinkedList<>();

        constraints.add(constraintBandwidth);
        constraints.add(constraintLatency);

        if (srcPort.equals("-1")) {
            HostId srcPoint = HostId.hostId(src);
            HostId dstPoint = HostId.hostId(dst);
            return HostToHostIntent.builder()
                    .appId(appId())
                    .key(key)
                    .one(srcPoint)
                    .two(dstPoint)
                    .selector(selector)
                    .treatment(treatment)
                    .constraints(constraints)
                    .build();

        } else {
            ConnectPoint srcPoint = new ConnectPoint(deviceId(src), portNumber(srcPort));
            ConnectPoint dstPoint = new ConnectPoint(deviceId(dst), portNumber(dstPort));
            return TwoWayP2PIntent.builder()
                    .appId(appId())
                    .key(key)
                    .one(srcPoint)
                    .two(dstPoint)
                    .selector(selector)
                    .treatment(treatment)
                    .constraints(constraints)
                    .build();
        }
    }


    /**
     * Synchronously submits an intent to the Intent Service.
     *
     * @param intent intent to submit
     * @return true if operation succeed, false otherwise
     */
    private boolean submitIntent(Intent intent)
            throws InterruptedException {
        IntentService service = get(IntentService.class);

        CountDownLatch latch = new CountDownLatch(1);
        InternalIntentListener listener = new InternalIntentListener(intent, service, latch);
        service.addListener(listener);
        service.submit(intent);
        log.info("Submitted Calendar App intent and waiting: {}", intent);
        if (latch.await(TIMEOUT, TimeUnit.SECONDS) &&
                listener.getState() == INSTALLED) {
            return true;
        }
        return false;
    }

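onos-app-carrierethernet

According to pom.xml, this application provides Carrier Ethernet services. Carrier Ethernet is defined by the Metro Ethernet Forum (MEF), and its own documentation has the details. CE covers five areas: protection, QoS, scalability, service management, and TDM. My understanding is that it aims to attach identifying attributes to Ethernet traffic in an SDN environment, so that devices in a metro network can tailor services based on those attributes. The project is fairly complex and changes frequently, so I am leaving a placeholder here and may read the implementation carefully when I have time.

onos-app-database-perf

An application for benchmarking the data-store performance of an ONOS cluster. Its activate method shows how to use StorageService, including how to create a concurrent map and register a KryoNamespace serializer; it then spawns multiple threads to measure specific performance metrics.

onos-app-ecord-co

An application implementing CORD (Central Office Re-architected as a DataCenter), which aims to provide datacenter-style services on the basic network equipment at the network edge of homes, businesses, and so on. CORD currently comes in three flavors: ECORD, RCORD, and MCORD; see the CORD project for details. This application is one implementation of ECORD. It gives a fairly complete picture of ONOS's subsystem abstractions, as the project structure shows.

CentralOffice.java is the component's main file. It registers the application and creates an object named BigSwitchDeviceProvider, so that is where we look next. BigSwitchDeviceProvider extends DeviceProvider. The two loadConfig calls at the start show that the application supports changing its configuration, through both RPC and REST. I do not yet understand ConfigService well, so I am leaving this as a gap to fill in later. The activate method essentially registers a device with deviceProvider and then discovers the network topology with LLDP. To do the LLDP work, the program uses the ONOSLLDP utility class from onlab-misc.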

    @Activate
    public void activate(ComponentContext context) {
        cfgService.registerProperties(getClass()); // register with ComponentConfigService
        loadRpcConfig(context); 
        loadRestConfig(context); 

        // setup service to, and register with, providers
        try {
            remoteServiceContext = rpcService.get(URI.create(remoteUri));
        } catch (UnsupportedOperationException e) {
            log.warn("Unsupported URI: {}", remoteUri);
        }
        providerId = new ProviderId(schemeProp, idProp);
        executor = newSingleThreadScheduledExecutor(groupedThreads("onos/bigswitch", "discovery-%d"));
        registerToDeviceProvider();
        prepareProbe();
        registerToLinkServices();

        // start listening to config changes
        NetworkConfigListener cfglistener = new InternalConfigListener();
        cfgRegistry.addListener(cfglistener);
        cfgRegistry.registerConfigFactory(xcConfigFactory);
        log.info("Started");
    }

    @Deactivate
    public void deactivate() {
        packetService.removeProcessor(packetProcessor);


        // advertise all Links as vanished
        knownLinks.invalidateAll();

        cfgRegistry.unregisterConfigFactory(xcConfigFactory);
        cfgService.unregisterProperties(getClass(), false);
        unregisterFromLinkServices();
        executor.shutdownNow();
        unregisterFromDeviceProvider();
        // Won't hurt but necessary?
        deviceProviderService = null;
        providerId = null;
        log.info("Stopped");
    }

    @Modified
    public void modified(ComponentContext context) {
        log.info("Reloading config...");
        // Needs re-registration to DeviceProvider
        if (loadRpcConfig(context)) {
            // unregister from Device and Link Providers with old parameters
            unregisterFromLinkServices();
            unregisterFromDeviceProvider();
            // register to Device and Link Providers with new parameters
            try {
                remoteServiceContext = rpcService.get(URI.create(remoteUri));
                providerId = new ProviderId(schemeProp, idProp);
                registerToDeviceProvider();
                registerToLinkServices();
            } catch (UnsupportedOperationException e) {
                log.warn("Unsupported URI: {}", remoteUri);
            }
            log.info("Re-registered with Device and Link Providers");
        }

        // Needs to advertise cross-connect links
        if (loadRestConfig(context)) {
            advertiseCrossConnectLinksOnAllPorts();
        }
    }

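June 16, 2021

Learning Mininet

Command Syntax

$ indicates a Linux shell prompt, where Linux commands are used.
mininet> indicates the Mininet CLI, where Mininet commands are used.
# indicates a Linux shell running with root privileges.

sudo mn -h displays Mininet's help message.

Mininet uses process-based virtualization and network namespaces to create virtual networks, and the networks it creates are usable within the running Linux kernel.

sudo mn starts Mininet

List all nodes: nodes

Show link information: net

Print information about each node: dump

sudo mn --test pingpair runs a connectivity test between the hosts right away

sudo mn --test iperf runs a performance test right after startup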

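Hosts

sudo mn -x: with the -x option, Mininet opens an XTerm on every node after startup, which is handy when several nodes need to be operated separately.

Inside the mn CLI, the xterm command followed by node names opens xterms on specific nodes; for example, xterm s1 h2 opens one on s1 and one on h2.

Disable or enable a link with: link node1 node2 up/down

The --switch and --controller options select which type of switch and controller to use.

The --innamespace option gives every node its own namespace.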

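Summary of startup options

-h, --help print help information

--switch=SWITCH switch type, one of [kernel user ovsk]

--host=HOST emulated host type, one of [process]

--controller=CONTROLLER controller type, one of [nox_dump none ref remote nox_pysw]

--topo=TOPO,arg1,arg2,…argN built-in topology, one of [tree reversed single linear minimal]

-c, --clean clean up the environment

--custom=CUSTOM use a custom topology and node parameters

--test=TEST test to run, one of [cli build pingall pingpair iperf all iperfudp none]

-x, --xterms open an xterm on every node

--mac set MAC addresses to match the DP ID

--arp populate all ARP entries

-v VERBOSITY, --verbosity=VERBOSITY [info warning critical error debug output] log level

--ip=IP IP address of the remote controller

--port=PORT port the remote controller listens on

--innamespace run in separate namespaces

--listenport=LISTENPORT starting port for passive listening

--nolistenport do not use passive listening ports

--pre=PRE CLI script to run before the test

--post=POST CLI script to run after the test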

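Summary of common commands

help lists documentation for all commands by default; followed by a command name, it explains that command

dump print node information

gterm open a gnome-terminal on the given nodes. Note: this may crash Mininet

xterm open an xterm on the given nodes

intfs list all network interfaces

iperf run a simple iperf TCP test between two nodes

iperfudp run a UDP test between two nodes at a specified bandwidth

net show the network links

noecho run an interactive command with echoing turned off

pingpair ping between the first two hosts

source read commands from an external file

dpctl run a dpctl command on all switches; the local address is tcp 127.0.0.1:6634

link disable or enable the link between two nodes

nodes list all nodes

pingall ping between all host nodes

py evaluate a Python expression

sh run an external shell command

quit/exit exit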
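The --custom option listed above loads a Python file that exposes a topos dictionary. As a minimal sketch, the file below defines a topology with two hosts on one switch. The file name pair_topo.py, the LINKS list, the build_pair_topo function, and the topology name pair are all made up for illustration; Topo, addHost, addSwitch, and addLink come from Mininet's Python API. Mininet is imported inside the function so the file can be read or imported on a machine without Mininet installed.

```python
# Hypothetical file name: pair_topo.py
# Each pair is one link; names starting with "h" are hosts, "s" are switches.
LINKS = [("h1", "s1"), ("h2", "s1")]


def build_pair_topo():
    """Build a Mininet topology from LINKS."""
    from mininet.topo import Topo  # requires Mininet to be installed

    topo = Topo()
    for name in sorted({node for link in LINKS for node in link}):
        if name.startswith("h"):
            topo.addHost(name)
        else:
            topo.addSwitch(name)
    for left, right in LINKS:
        topo.addLink(left, right)
    return topo


# mn looks up the name given to --topo in this dictionary.
topos = {"pair": build_pair_topo}
```

With the file saved, it would be started with something like sudo mn --custom pair_topo.py --topo pair --test pingall.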

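June 16, 2021

Reading the LXD Source Code

The projects I work on deal with lxd a lot, so I used some spare time to read its source code, both to understand lxd better and to pick up some techniques for writing Go.

About lxd

lxd is the second generation of lxc. Like docker, it is a management tool built on Linux containers. A Linux container provides an environment similar to a Linux virtual machine; the difference is that it trades away some isolation for lower runtime overhead.

Compared with lxc, lxd adds a layer of management on top. lxc configuration files have to be written entirely by hand, the only supported storage backend is dir (a directory on the host's existing filesystem), and network management is very limited, so containers can only be used effectively with manual work and external tools. lxd, which the project itself calls 2.0, improves on this in many areas.

A new client/server architecture. The client is a tool named lxc (not the same thing as the original lxc), and the server is named lxd. The two communicate through a RESTful API over either a unix socket or https. This design lets a user manage multiple lxd servers remotely with the lxc tool, and more importantly, third-party programs can manage lxd directly through the API without depending on the lxc tool at all. This is far more flexible than lxc.
More convenient configuration. lxd introduces a number of new concepts that make container configuration more organized. For example, it adds image stores, which can be local or remote, with support for moving, exporting, and importing images between them and for packing a stopped container into an image. It also adds profiles: a profile is specified when a container is created, and containers based on the same profile start with the same configuration.
Richer storage backends. Besides the original dir type, lxd supports modern, efficient backends such as btrfs, zfs, and ceph; the matching tools just need to be installed before enabling them.
Native network configuration. A single container can only do so much, so connecting containers together is essential. lxd integrates configuration for many kinds of networking directly, such as creating bridge, ovs, and veth devices, as well as overlay interfaces such as GRE, VXLAN, and Ubuntu fan.
More convenient device management. For physical resources on the host, lxd provides device configuration directly, so users can attach many kinds of devices to a container as needed, such as GPUs, physical network interfaces, disks, InfiniBand devices, and other character and block devices.
Built-in clustering. Multiple servers running lxd can be combined into a cluster and managed together, with data consistency guaranteed by raft, which improves the stability of lxd.

In addition, lxd supports container migration, and while providing all these features it keeps the original lxc configuration parameters available. lxd greatly increases the flexibility with which users can manage lxc containers.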

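lxc

Here lxc refers specifically to the lxd client, located in the lxc directory of the lxd source tree. As a side note, with lxd (that is, lxc 2.0) the tool is lxc and the commands look like lxc start and lxc stop, whereas with lxc (that is, lxc 1.0) the tools are named lxc-*** and the commands look like lxc-start and lxc-stop. Because the name lxc appears so often, take care not to confuse the two.

System	Service name	Command to start a container
lxc	lxc	lxc-start
lxd	lxd	lxc start

A Go program starts from the main function of its main package, so we begin with lxc/main.go.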

func main() {
	// Locate the configuration file and load preset runtime parameters from it
	err := execIfAliases()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
		os.Exit(1)
	}

	// 配置解析器
	app := &cobra.Command{}
	app.Use = "lxc"
	app.Short = i18n.G("Command line client for LXD")
	app.Long = cli.FormatSection(i18n.G("Description"), i18n.G(
		`Command line client for LXD

All of LXD's features can be driven through the various commands below.
For help with any of those, simply call them with --help.`))
	app.SilenceUsage = true
	app.SilenceErrors = true

	// Global flags
	globalCmd := cmdGlobal{cmd: app}
	// Register the handling of the global flags
	app.PersistentFlags().BoolVar(&globalCmd.flagVersion, "version", false, i18n.G("Print version number"))
	app.PersistentFlags().BoolVarP(&globalCmd.flagHelp, "help", "h", false, i18n.G("Print help"))
	app.PersistentFlags().BoolVar(&globalCmd.flagForceLocal, "force-local", false, i18n.G("Force using the local unix socket"))
	app.PersistentFlags().StringVar(&globalCmd.flagProject, "project", "", i18n.G("Override the source project"))
	app.PersistentFlags().BoolVar(&globalCmd.flagLogDebug, "debug", false, i18n.G("Show all debug messages"))
	app.PersistentFlags().BoolVarP(&globalCmd.flagLogVerbose, "verbose", "v", false, i18n.G("Show all information messages"))
	app.PersistentFlags().BoolVarP(&globalCmd.flagQuiet, "quiet", "q", false, i18n.G("Don't show progress information"))

	// Wrappers
	// Set the hooks that run before and after each command
	app.PersistentPreRunE = globalCmd.PreRun
	app.PersistentPostRunE = globalCmd.PostRun

	// Version handling
	app.SetVersionTemplate("{{.Version}}\n")
	app.Version = version.Version

	// alias sub-command
	aliasCmd := cmdAlias{global: &globalCmd}
	app.AddCommand(aliasCmd.Command())

	// cluster sub-command
	clusterCmd := cmdCluster{global: &globalCmd}
	app.AddCommand(clusterCmd.Command())

	// ... the commands in between are bound the same way as alias and cluster

	// version sub-command
	versionCmd := cmdVersion{global: &globalCmd}
	app.AddCommand(versionCmd.Command())

	// Get help command
	app.InitDefaultHelpCmd()
	var help *cobra.Command
	for _, cmd := range app.Commands() {
		if cmd.Name() == "help" {
			help = cmd
			break
		}
	}

	// Help flags
	app.Flags().BoolVar(&globalCmd.flagHelpAll, "all", false, i18n.G("Show less common commands"))
	help.Flags().BoolVar(&globalCmd.flagHelpAll, "all", false, i18n.G("Show less common commands"))

	// Deal with --all flag
	err = app.ParseFlags(os.Args[1:])
	if err == nil {
		if globalCmd.flagHelpAll {
			// Show all commands
			for _, cmd := range app.Commands() {
				cmd.Hidden = false
			}
		}
	}

	// Run the main command and handle errors
	err = app.Execute()
	if err != nil {
		// Handle non-Linux systems
		if err == config.ErrNotLinux {
			fmt.Fprintf(os.Stderr, i18n.G(`This client hasn't been configured to use a remote LXD server yet.
As your platform can't run native Linux containers, you must connect to a remote LXD server.

If you already added a remote server, make it the default with "lxc remote switch NAME".
To easily setup a local LXD server in a virtual machine, consider using: https://multipass.run`)+"\n")
			os.Exit(1)
		}

		if err == cobra.ErrSubCommandRequired {
			os.Exit(0)
		}

		// Default error handling
		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
		os.Exit(1)
	}

	if globalCmd.ret != 0 {
		os.Exit(globalCmd.ret)
	}
}

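In my experience, the lxc client is the only client program whose usability rivals a switch's terminal. Many programs with lots of command-line options are unusable without reading the manual, whereas lxc prints help at any subcommand level when the remaining arguments are omitted, and the help text is detailed; the manual is only needed when the unit of some specific parameter is unclear. This part of the source reveals how it works: lxc uses a library called cobra, which handles command-line arguments very elegantly. The one thing in this logic worth a closer look is execIfAliases at the very beginning. This function mainly locates the config.yml file used by lxd, reads the configuration from it, and fills it into the globally used cmdGlobal struct.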

type cmdGlobal struct {
	conf     *config.Config
	confPath string
	cmd      *cobra.Command
	ret      int

	flagForceLocal bool
	flagHelp       bool
	flagHelpAll    bool
	flagLogDebug   bool
	flagLogVerbose bool
	flagProject    string
	flagQuiet      bool
	flagVersion    bool
}

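The configuration-file parsing logic lives in the lxc/config directory; the entry point is the LoadConfig function in file.go. It is worth mentioning that the lxc tool also supports command shorthands: the source shows that it only takes managing a key-value alias map with the lxc alias command.

Let us look more closely at how lxc passes the config around as a global. Taking the alias command mentioned above as an example, main.go contains the following code where the subcommand is added.

	// alias sub-command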
	aliasCmd := cmdAlias{global: &globalCmd}
	app.AddCommand(aliasCmd.Command())

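cmdAlias is a struct in alias.go. It holds a cmdGlobal pointer named global and has a Command method. When the subcommand is added, the program first passes globalCmd into cmdAlias and then registers the result of Command as the alias subcommand, which neatly carries the global parameters into the alias subcommand.

Once argument parsing is done, execution moves into the function each subcommand registered through Command. We pick the most common one, launch, and follow the creation of a container from start to finish.

Assume the command is lxc launch ubuntu:16.04 u1 and see how launch.go handles it. The first thing to notice is that the cmdLaunch struct differs somewhat from those in the other files.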

type cmdLaunch struct {
	global *cmdGlobal
	init   *cmdInit
}

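Did you spot it? There is an extra cmdInit object. It is easy to confirm that this is the struct from init.go. Given what launch does, we can guess that launch.go runs the launch in two steps: first lxc init ubuntu:16.04 u1, then lxc start u1.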


func (c *cmdLaunch) Command() *cobra.Command {
	cmd := c.init.Command()
	cmd.Use = i18n.G("launch [<remote>:]<image> [<remote>:][<name>]")
	cmd.Short = i18n.G("Create and start containers from images")
	cmd.Long = cli.FormatSection(i18n.G("Description"), i18n.G(
		`Create and start containers from images`))
	cmd.Example = cli.FormatSection("", i18n.G(
		`lxc launch ubuntu:16.04 u1

lxc launch ubuntu:16.04 u1 < config.yaml
    Create and start the container with configuration from config.yaml`))
	cmd.Hidden = false

	cmd.RunE = c.Run

	return cmd
}
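The pattern used by Command above, taking the command built by init and overriding a few fields, can be reduced to the sketch below. The command struct is a hypothetical stand-in for cobra.Command with only the fields needed here, and initCmd and launchCmd are illustrative names. It is meant to show the reuse, not to mirror the real types.

```go
package main

import "fmt"

// command is a hypothetical stand-in for cobra.Command.
type command struct {
	Use   string
	Short string
	Flags []string
	Run   func(args []string) error
}

type initCmd struct{}

// Command builds the full definition of "init", including its flags.
func (c *initCmd) Command() *command {
	return &command{
		Use:   "init <image> [<name>]",
		Short: "Create instances from images",
		Flags: []string{"--profile", "--network", "--storage"},
		Run: func(args []string) error {
			fmt.Println("init", args)
			return nil
		},
	}
}

type launchCmd struct {
	init *initCmd
}

// Command starts from init's definition and overrides only what differs.
func (c *launchCmd) Command() *command {
	cmd := c.init.Command()
	cmd.Use = "launch <image> [<name>]"
	cmd.Short = "Create and start instances from images"
	cmd.Run = func(args []string) error {
		fmt.Println("init, then start", args)
		return nil
	}
	return cmd
}

func main() {
	launch := launchCmd{init: &initCmd{}}
	cmd := launch.Command()
	fmt.Println(cmd.Use, cmd.Flags)
	_ = cmd.Run([]string{"ubuntu:16.04", "u1"})
}
```

Because the flag list comes from init, any flag added to init later is available to launch without touching launch's code.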

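The source confirms the guess: launch.go registers itself by reusing init.go directly and only overrides the description and the run function, so part of the init logic is reused as is. We now turn to the Run function.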

func (c *cmdLaunch) Run(cmd *cobra.Command, args []string) error {
	conf := c.global.conf

	// Sanity checks
	exit, err := c.global.CheckArgs(cmd, args, 1, 2)
	if exit {
		return err
	}

	// Call the matching code from init
	d, name, err := c.init.create(conf, args)
	if err != nil {
		return err
	}

	// Get the remote
	var remote string
	if len(args) == 2 {
		remote, _, err = conf.ParseRemote(args[1])
		if err != nil {
			return err
		}
	} else {
		remote, _, err = conf.ParseRemote("")
		if err != nil {
			return err
		}
	}

	// Start the container
	if !c.global.flagQuiet {
		fmt.Printf(i18n.G("Starting %s")+"\n", name)
	}

	req := api.InstanceStatePut{
		Action:  "start",
		Timeout: -1,
	}

	op, err := d.UpdateInstanceState(name, req, "")
	if err != nil {
		return err
	}

	progress := utils.ProgressRenderer{
		Quiet: c.global.flagQuiet,
	}
	_, err = op.AddHandler(progress.UpdateOp)
	if err != nil {
		progress.Done("")
		return err
	}

	// Wait for operation to finish
	err = utils.CancelableWait(op, &progress)
	if err != nil {
		progress.Done("")
		prettyName := name
		if remote != "" {
			prettyName = fmt.Sprintf("%s:%s", remote, name)
		}

		return fmt.Errorf("%s\n"+i18n.G("Try `lxc info --show-log %s` for more info"), err, prettyName)
	}

	progress.Done("")
	return nil
}

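Apart from the output handling, this logic first validates the argument count, then calls create from the init logic, then parses the remote, then builds a state-change request, registers a progress handler on the resulting operation, and finally waits for the operation to finish. The first thing to look at is therefore the create function in init.go.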

func (c *cmdInit) create(conf *config.Config, args []string) (lxd.InstanceServer, string, error) {
	var name string
	var image string
	var remote string
	var iremote string
	var err error
	var stdinData api.InstancePut
	var devicesMap map[string]map[string]string
	var configMap map[string]string

	// If stdin isn't a terminal, read text from it
	// ...

	if len(args) > 0 {
		// ... the argument has the form remote:container; parse out the remote
	}

	if c.flagEmpty {
		if len(args) > 1 {
			return nil, "", fmt.Errorf(i18n.G("--empty cannot be combined with an image name"))
		}

		if len(args) == 0 {
			remote, name, err = conf.ParseRemote("")
			if err != nil {
				return nil, "", err
			}
		} else if len(args) == 1 {
			// Switch image / container names
			name = image
			remote = iremote
			image = ""
			iremote = ""
		}
	}

	d, err := conf.GetInstanceServer(remote)
	if err != nil {
		return nil, "", err
	}

	if c.flagTarget != "" {
		d = d.UseTarget(c.flagTarget)
	}

	profiles := []string{}
	for _, p := range c.flagProfile {
		profiles = append(profiles, p)
	}

	// Print the "creating" message

	if len(stdinData.Devices) > 0 {
		devicesMap = stdinData.Devices
	} else {
		devicesMap = map[string]map[string]string{}
	}

	if c.flagNetwork != "" {
		network, _, err := d.GetNetwork(c.flagNetwork)
		if err != nil {
			return nil, "", err
		}

		if network.Type == "bridge" {
			devicesMap[c.flagNetwork] = map[string]string{"type": "nic", "nictype": "bridged", "parent": c.flagNetwork}
		} else {
			devicesMap[c.flagNetwork] = map[string]string{"type": "nic", "nictype": "macvlan", "parent": c.flagNetwork}
		}
	}

	if len(stdinData.Config) > 0 {
		configMap = stdinData.Config
	} else {
		configMap = map[string]string{}
	}
	for _, entry := range c.flagConfig {
		if !strings.Contains(entry, "=") {
			return nil, "", fmt.Errorf(i18n.G("Bad key=value pair: %s"), entry)
		}

		fields := strings.SplitN(entry, "=", 2)
		configMap[fields[0]] = fields[1]
	}

	// Check if the specified storage pool exists.
	if c.flagStorage != "" {
		_, _, err := d.GetStoragePool(c.flagStorage)
		if err != nil {
			return nil, "", err
		}

		devicesMap["root"] = map[string]string{
			"type": "disk",
			"path": "/",
			"pool": c.flagStorage,
		}
	}

	// Decide whether we are creating a container or a virtual machine.
	instanceDBType := api.InstanceTypeContainer
	if c.flagVM {
		instanceDBType = api.InstanceTypeVM
	}

	// Setup instance creation request
	req := api.InstancesPost{
		Name:         name,
		InstanceType: c.flagType,
		Type:         instanceDBType,
	}
	req.Config = configMap
	req.Devices = devicesMap

	if !c.flagNoProfiles && len(profiles) == 0 {
		if len(stdinData.Profiles) > 0 {
			req.Profiles = stdinData.Profiles
		} else {
			req.Profiles = nil
		}
	} else {
		req.Profiles = profiles
	}
	req.Ephemeral = c.flagEphemeral

	var opInfo api.Operation
	if !c.flagEmpty {
		// Get the image server and image info
		iremote, image = c.guessImage(conf, d, remote, iremote, image)
		var imgRemote lxd.ImageServer
		var imgInfo *api.Image

		// Connect to the image server
		if iremote == remote {
			imgRemote = d
		} else {
			imgRemote, err = conf.GetImageServer(iremote)
			if err != nil {
				return nil, "", err
			}
		}

		// Deal with the default image
		if image == "" {
			image = "default"
		}

		// Optimisation for simplestreams
		if conf.Remotes[iremote].Protocol == "simplestreams" {
			imgInfo = &api.Image{}
			imgInfo.Fingerprint = image
			imgInfo.Public = true
			req.Source.Alias = image
		} else {
			// Attempt to resolve an image alias
			alias, _, err := imgRemote.GetImageAlias(image)
			if err == nil {
				req.Source.Alias = image
				image = alias.Target
			}

			// Get the image info
			imgInfo, _, err = imgRemote.GetImage(image)
			if err != nil {
				return nil, "", err
			}
		}

		// Create the instance
		op, err := d.CreateInstanceFromImage(imgRemote, *imgInfo, req)
		if err != nil {
			return nil, "", err
		}

		// Watch the background operation
		progress := utils.ProgressRenderer{
			Format: i18n.G("Retrieving image: %s"),
			Quiet:  c.global.flagQuiet,
		}

		_, err = op.AddHandler(progress.UpdateOp)
		if err != nil {
			progress.Done("")
			return nil, "", err
		}

		err = utils.CancelableWait(op, &progress)
		if err != nil {
			progress.Done("")
			return nil, "", err
		}
		progress.Done("")

		// Extract the container name
		info, err := op.GetTarget()
		if err != nil {
			return nil, "", err
		}

		opInfo = *info
	} else {
		req.Source.Type = "none"

		op, err := d.CreateInstance(req)
		if err != nil {
			return nil, "", err
		}

		err = op.Wait()
		if err != nil {
			return nil, "", err
		}

		opInfo = op.Get()
	}

	instances, ok := opInfo.Resources["instances"]
	if !ok || len(instances) == 0 {
		// Try using the older "containers" field
		instances, ok = opInfo.Resources["containers"]
		if !ok || len(instances) == 0 {
			return nil, "", fmt.Errorf(i18n.G("Didn't get any affected image, instance or snapshot from server"))
		}
	}

	if len(instances) == 1 && name == "" {
		fields := strings.Split(instances[0], "/")
		name = fields[len(fields)-1]
		fmt.Printf(i18n.G("Instance name is: %s")+"\n", name)
	}

	// Validate the network setup
	c.checkNetwork(d, name)

	return d, name, nil
}
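One detail of create that is worth isolating is how key=value entries given on the command line are layered on top of the config read from stdin. The sketch below is an illustration under assumptions: mergeConfig is a hypothetical helper, and unlike the code above it copies the stdin map instead of reusing it.

```go
package main

import (
	"fmt"
	"strings"
)

// mergeConfig starts from the config read on stdin (may be nil) and applies
// key=value entries on top, so command-line entries win over stdin values.
func mergeConfig(stdin map[string]string, entries []string) (map[string]string, error) {
	merged := map[string]string{}
	for k, v := range stdin {
		merged[k] = v
	}

	for _, entry := range entries {
		// SplitN with n=2 keeps any further "=" inside the value.
		fields := strings.SplitN(entry, "=", 2)
		if len(fields) != 2 {
			return nil, fmt.Errorf("bad key=value pair: %s", entry)
		}
		merged[fields[0]] = fields[1]
	}

	return merged, nil
}

func main() {
	stdin := map[string]string{"limits.cpu": "1"}
	cfg, err := mergeConfig(stdin, []string{"limits.cpu=2", "user.note=a=b"})
	fmt.Println(cfg, err)

	_, err = mergeConfig(nil, []string{"broken"})
	fmt.Println(err)
}
```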

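As we can see, create is long, but most of it only builds the container config and decides which defaults to use when the user did not give a value, for example for the image, network, storage pool and profiles. The core is the call to GetInstanceServer, which returns an InstanceServer object. That object is queried while the parameters are being built, and in the end its CreateInstanceFromImage method (or CreateInstance when no image is used) creates the container. So let us look at GetInstanceServer in remote.go.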

// GetInstanceServer returns a InstanceServer struct for the remote
func (c *Config) GetInstanceServer(name string) (lxd.InstanceServer, error) {
	// Handle "local" on non-Linux
	if name == "local" && runtime.GOOS != "linux" {
		return nil, ErrNotLinux
	}

	// Get the remote
	remote, ok := c.Remotes[name]
	if !ok {
		return nil, fmt.Errorf("The remote \"%s\" doesn't exist", name)
	}

	// Sanity checks
	if remote.Public || remote.Protocol == "simplestreams" {
		return nil, fmt.Errorf("The remote isn't a private LXD server")
	}

	// Get connection arguments
	args, err := c.getConnectionArgs(name)
	if err != nil {
		return nil, err
	}

	// Unix socket
	if strings.HasPrefix(remote.Addr, "unix:") {
		d, err := lxd.ConnectLXDUnix(strings.TrimPrefix(strings.TrimPrefix(remote.Addr, "unix:"), "//"), args)
		if err != nil {
			return nil, err
		}

		if remote.Project != "" && remote.Project != "default" {
			d = d.UseProject(remote.Project)
		}

		if c.ProjectOverride != "" {
			d = d.UseProject(c.ProjectOverride)
		}

		return d, nil
	}

	// HTTPs
	if remote.AuthType != "candid" && (args.TLSClientCert == "" || args.TLSClientKey == "") {
		return nil, fmt.Errorf("Missing TLS client certificate and key")
	}

	d, err := lxd.ConnectLXD(remote.Addr, args)
	if err != nil {
		return nil, err
	}

	if remote.Project != "" && remote.Project != "default" {
		d = d.UseProject(remote.Project)
	}

	if c.ProjectOverride != "" {
		d = d.UseProject(c.ProjectOverride)
	}

	return d, nil
}

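For the two connection modes, GetInstanceServer uses two functions from the lxd package, ConnectLXDUnix and ConnectLXD. Note that the lxd package here is not the lxd directory. On closer inspection it lives in the client directory, while the package inside the lxd directory is actually named main. In my opinion LXD's naming leaves plenty of room for confusion: the LXD command-line tool is called lxc, which is easily mistaken for LXC 1.0, and the package names here are just as easy to get wrong. Either way, we now know that the code under client is what builds the requests the client sends to the server.

In client/connect.go we find the ConnectLXD function.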

// ConnectLXD lets you connect to a remote LXD daemon over HTTPs.
//
// A client certificate (TLSClientCert) and key (TLSClientKey) must be provided.
//
// If connecting to a LXD daemon running in PKI mode, the PKI CA (TLSCA) must also be provided.
//
// Unless the remote server is trusted by the system CA, the remote certificate must be provided (TLSServerCert).
func ConnectLXD(url string, args *ConnectionArgs) (InstanceServer, error) {
	logger.Debugf("Connecting to a remote LXD over HTTPs")

	// Cleanup URL
	url = strings.TrimSuffix(url, "/")

	return httpsLXD(url, args)
}

// Internal function called by ConnectLXD and ConnectPublicLXD
func httpsLXD(url string, args *ConnectionArgs) (InstanceServer, error) {
	// Use empty args if not specified
	if args == nil {
		args = &ConnectionArgs{}
	}

	// Initialize the client struct
	server := ProtocolLXD{
		httpCertificate:  args.TLSServerCert,
		httpHost:         url,
		httpProtocol:     "https",
		httpUserAgent:    args.UserAgent,
		bakeryInteractor: args.AuthInteractor,
		chConnected:      make(chan struct{}, 1),
	}

	if args.AuthType == "candid" {
		server.RequireAuthenticated(true)
	}

	// Setup the HTTP client
	httpClient, err := tlsHTTPClient(args.HTTPClient, args.TLSClientCert, args.TLSClientKey, args.TLSCA, args.TLSServerCert, args.InsecureSkipVerify, args.Proxy)
	if err != nil {
		return nil, err
	}

	if args.CookieJar != nil {
		httpClient.Jar = args.CookieJar
	}

	server.http = httpClient
	if args.AuthType == "candid" {
		server.setupBakeryClient()
	}

	// Test the connection and seed the server information
	if !args.SkipGetServer {
		_, _, err := server.GetServer()
		if err != nil {
			return nil, err
		}
	}
	return &server, nil
}

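At this point we are close to the low-level code that sends the HTTPS request, so we will not go any deeper. The one thing to note is that lxd uses tlsHTTPClient from util.go, which is largely its own implementation, rather than a third-party HTTPS request library as I had expected. The unix socket path works much like the HTTPS one.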
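For orientation, an HTTPS client that trusts one specific server certificate can be assembled from the Go standard library alone. The sketch below is an assumption about the general shape of such a helper and not a copy of tlsHTTPClient. The name newTLSClient is made up for illustration, and the real function also handles client certificates, proxies and the other arguments visible in the call above.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"net/http"
)

// newTLSClient returns an HTTP client that trusts only the given PEM-encoded
// server certificate, or the system roots when serverCertPEM is empty.
func newTLSClient(serverCertPEM []byte) (*http.Client, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}

	if len(serverCertPEM) > 0 {
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(serverCertPEM) {
			return nil, errors.New("no certificate found in PEM data")
		}
		cfg.RootCAs = pool
	}

	return &http.Client{
		Transport: &http.Transport{TLSClientConfig: cfg},
	}, nil
}

func main() {
	client, err := newTLSClient(nil)
	fmt.Println(client != nil, err)

	_, err = newTLSClient([]byte("not a certificate"))
	fmt.Println(err)
}
```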

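With the InstanceServer in hand, the program calls its CreateInstanceFromImage method to create the container. Looking up InstanceServer in interfaces.go shows that it is an interface, implemented by ProtocolLXD in lxd.go. Here is CreateInstanceFromImage, which is implemented in lxd_instances.go.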


// CreateInstanceFromImage is a convenience function to make it easier to create a instance from an existing image.
func (r *ProtocolLXD) CreateInstanceFromImage(source ImageServer, image api.Image, req api.InstancesPost) (RemoteOperation, error) {
	// Set the minimal source fields
	req.Source.Type = "image"

	// Optimization for the local image case
	if r == source {
		// Always use fingerprints for local case
		req.Source.Fingerprint = image.Fingerprint
		req.Source.Alias = ""

		op, err := r.CreateInstance(req)
		if err != nil {
			return nil, err
		}

		rop := remoteOperation{
			targetOp: op,
			chDone:   make(chan bool),
		}

		// Forward targetOp to remote op
		go func() {
			rop.err = rop.targetOp.Wait()
			close(rop.chDone)
		}()

		return &rop, nil
	}

	// Minimal source fields for remote image
	req.Source.Mode = "pull"

	// If we have an alias and the image is public, use that
	if req.Source.Alias != "" && image.Public {
		req.Source.Fingerprint = ""
	} else {
		req.Source.Fingerprint = image.Fingerprint
		req.Source.Alias = ""
	}

	// Get source server connection information
	info, err := source.GetConnectionInfo()
	if err != nil {
		return nil, err
	}

	req.Source.Protocol = info.Protocol
	req.Source.Certificate = info.Certificate

	// Generate secret token if needed
	if !image.Public {
		secret, err := source.GetImageSecret(image.Fingerprint)
		if err != nil {
			return nil, err
		}

		req.Source.Secret = secret
	}

	return r.tryCreateInstance(req, info.Addresses)
}

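After filling in the image source fields, the function calls CreateInstance directly when the image lives on the same server, and otherwise calls tryCreateInstance.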


func (r *ProtocolLXD) tryCreateInstance(req api.InstancesPost, urls []string) (RemoteOperation, error) {
	if len(urls) == 0 {
		return nil, fmt.Errorf("The source server isn't listening on the network")
	}

	rop := remoteOperation{
		chDone: make(chan bool),
	}

	operation := req.Source.Operation

	// Forward targetOp to remote op
	go func() {
		success := false
		errors := map[string]error{}
		for _, serverURL := range urls {
			if operation == "" {
				req.Source.Server = serverURL
			} else {
				req.Source.Operation = fmt.Sprintf("%s/1.0/operations/%s", serverURL, url.PathEscape(operation))
			}

			op, err := r.CreateInstance(req)
			if err != nil {
				errors[serverURL] = err
				continue
			}

			rop.targetOp = op

			for _, handler := range rop.handlers {
				rop.targetOp.AddHandler(handler)
			}

			err = rop.targetOp.Wait()
			if err != nil {
				errors[serverURL] = err
				continue
			}

			success = true
			break
		}

		if !success {
			rop.err = remoteOperationError("Failed instance creation", errors)
		}

		close(rop.chDone)
	}()

	return &rop, nil
}

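This code uses Go's concurrency construct go func() {}() to issue the request asynchronously. The handlers registered on rop are attached to the underlying operation, and chDone is closed once the attempts have finished.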

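June 16, 2021

An Analysis of the org.onosproject.fwd Application

The ONOS Layer 2 Forwarding Application

org.onosproject.fwd is arguably the most central application in ONOS. For a virtual network created with Mininet to have layer 2 connectivity, this official application has to be activated, so it is a good place to learn how ONOS abstracts the network and how layer 2 forwarding is implemented. At the time of writing, the latest ONOS version is 1.13.0-SNAPSHOT, and the source shown here is taken from that development version. We start with the application's pom.xml file.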

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.onosproject</groupId>
        <artifactId>onos-apps</artifactId>
        <version>1.13.0-SNAPSHOT</version>
    </parent>

    <artifactId>onos-app-fwd</artifactId>
    <packaging>bundle</packaging>

    <description>Reactive forwarding application using flow subsystem</description>

    <properties>
        <onos.app.name>org.onosproject.fwd</onos.app.name>
        <onos.app.title>Reactive Forwarding App</onos.app.title>
        <onos.app.category>Traffic Steering</onos.app.category>
        <onos.app.url>http://onosproject.org</onos.app.url>
        <onos.app.readme>Reactive forwarding application using flow subsystem.</onos.app.readme>
    </properties>

    <dependencies>
        ...
    </dependencies>

</project>

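According to the description, layer 2 forwarding here is implemented with the flow subsystem, or more specifically with FlowObjectiveService. A more detailed description can be found in the BUCK file (it is worth mentioning that ONOS already uses BUCK as its default build system, although Maven still works).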

Provisions traffic between end-stations using hop-by-hop flow programming by intercepting packets for which there are currently no matching flow objectives on the data plane. 
The paths paved in this manner are short-lived, i.e. they expire a few seconds after the flow on whose behalf they were programmed stops. 
The application relies on the ONOS path service to compute the shortest paths. 
In the event of negative topology events (link loss, device disconnect, etc.), the application will proactively invalidate any paths that it had programmed to lead through the resources that are no longer available.

Application structure

The org.onosproject.fwd package contains the following files:

MacAddressCompleter.java
ReactiveForwarding.java
ReactiveForwardingCommand.java
ReactiveForwardMetrics.java

MacAddressCompleter

public class MacAddressCompleter implements Completer {
    @Override
    public int complete(String buffer, int cursor, List<String> candidates) {
        // Delegate string completer
        StringsCompleter delegate = new StringsCompleter();
        EventuallyConsistentMap<MacAddress, ReactiveForwardMetrics> macAddress;
        // Fetch our service and feed it's offerings to the string completer
        ReactiveForwarding reactiveForwardingService = AbstractShellCommand.get(ReactiveForwarding.class);
        macAddress = reactiveForwardingService.getMacAddress();
        SortedSet<String> strings = delegate.getStrings();
        for (MacAddress key : macAddress.keySet()) {
            strings.add(key.toString());
        }
        // Now let the completer do the work for figuring out what to offer.
        return delegate.complete(buffer, cursor, candidates);
    }
}

It is easy to see that this class completes MAC addresses in the CLI. The file resources/OSGI-INF/blueprint/shell-config.xml shows how the command is defined.

<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
    <command-bundle xmlns="http://karaf.apache.org/xmlns/shell/v1.1.0">
        <command>
            <action class="org.onosproject.fwd.ReactiveForwardingCommand"/>
            <completers>
            <ref component-id="MacAddressCompleter"/>
            </completers>
        </command>
    </command-bundle>
    <bean id="MacAddressCompleter" class="org.onosproject.fwd.MacAddressCompleter"/>
</blueprint>

The Completer is declared as a bean and referenced from the command whose action class is org.onosproject.fwd.ReactiveForwardingCommand. Next we look at how the command is implemented.

ReactiveForwardingCommand

@Command(scope = "onos", name = "reactive-fwd-metrics",
        description = "List all the metrics of reactive fwd app based on mac address")
public class ReactiveForwardingCommand extends AbstractShellCommand {
    @Argument(index = 0, name = "mac", description = "One Mac Address",
            required = false, multiValued = false)
    String mac = null;
    @Override
    protected void execute() {
        ReactiveForwarding reactiveForwardingService = AbstractShellCommand.get(ReactiveForwarding.class);
        MacAddress macAddress = null;
        if (mac != null) {
            macAddress = MacAddress.valueOf(mac);
        }
        reactiveForwardingService.printMetric(macAddress);
    }
}

As we can see, annotations create a CLI command named onos:reactive-fwd-metrics. Followed by a MAC address, it prints the metrics of the corresponding host.

onos> onos:reactive-fwd-metrics aa:0e:a8:c8:c9:a8
-----------------------------------------------------------------------------------------
 MACADDRESS 						 Metrics
 AA:0E:A8:C8:C9:A8 			 null

ReactiveForwardMetrics

public class ReactiveForwardMetrics {
    private Long replyPacket = null;
    private Long inPacket = null;
    private Long droppedPacket = null;
    private Long forwardedPacket = null;
    private MacAddress macAddress;
}

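As we can see, the ReactiveForwardMetrics class keeps statistics on how packets were handled. The code above omits the methods that update the counters as well as the toString method.

ReactiveForwarding

This class is the core of the fwd application and implements the actual forwarding logic.

June 16, 2021

A Verification Experiment with DPDK Pktgen and Testpmd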

Ref: Version: DPDK 19.08 / Pktgen 3.7.2

+--------+---------------+               +-------------------+---------------+
|        | socket file 1 |   <------->   | vhost-user port 1 |               |
|        +---------------+               +-------------------+     Docker    |
| host   |     pktgen    |               |      testpmd      |   container   |
|        +---------------+               +-------------------+               |
|        | socket file 0 |   <------->   | vhost-user port 0 |               |
+--------+---------------+               +-------------------+---------------+

Compile DPDK and Pktgen

DPDK

export RTE_SDK=~/dpdk/dpdk-19.08
export RTE_TARGET=x86_64-native-linuxapp-gcc
sed -ri 's,(CONFIG_RTE_LIBRTE_VHOST=).*,\1y,' config/common_base
make config T=$RTE_TARGET
sed -ri 's,(PMD_PCAP=).*,\1y,' build/.config
make

Pktgen

export RTE_SDK=~/dpdk/dpdk-19.08
export RTE_TARGET=build
make

Build a Docker image

Create a Dockerfile in the directory that contains the DPDK SDK.

FROM ubuntu:16.04
WORKDIR /root/dpdk
COPY dpdk-19.08 /root/dpdk/.
ENV RTE_TARGET build
ENV PATH "$PATH:/root/dpdk/$RTE_TARGET/app/"
RUN sed -i 's/archive.ubuntu.com/mirrors.tuna.tsinghua.edu.cn/g' /etc/apt/sources.list && \
    apt update && apt install -y libnuma-dev libpcap-dev
ENTRYPOINT ["/bin/bash"]

Allocate HugePage

Modify /etc/default/grub.

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=8"

Update the grub file and reboot to take effect.

sudo update-grub
sudo reboot
# after the reboot, mount hugetlbfs
sudo mkdir -p /dev/hugepages
sudo mount -t hugetlbfs none /dev/hugepages

Run the Testpmd container

sudo docker run -ti --rm --name=test \
-v /dev/hugepages:/dev/hugepages \
-v /tmp/virtio/:/tmp/virtio/ \
--privileged dpdk

Type the commands below inside the container shell

testpmd -l 0-1 -n 1 --socket-mem 1024,1024 \
--vdev 'eth_vhost0,iface=/tmp/virtio/sock0' --vdev 'eth_vhost1,iface=/tmp/virtio/sock1' \
--file-prefix=test --no-pci \
-- -i --forward-mode=io --auto-start

Some useful runtime functions

show port stats all

Generate the packets

sudo pktgen -l 2,3,4 -n 2 --vdev=virtio_user0,path=/tmp/virtio/sock0 --vdev=virtio_user1,path=/tmp/virtio/sock1 -- -P -m "3.0,4.1"

Some useful runtime functions

set all rate 10 # set the sending rate of all ports to 10%
set 0 count 100 # have port 0 send 100 packets in total
str # start sending

June 16, 2021

Cisco IOS Command Reference

ip subnet-zero

show ip route

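<C-a> move to the beginning of the line
<C-e> move to the end of the line
<C-z> leave configuration mode and return to privileged EXEC mode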

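Set the hostname: hostname Router
Set a banner:
banner motd shown at login
banner exec shown when a vty connection is created
banner login shown after the motd banner
Set passwords

Passwords for enable:
enable password sets the enable password
enable secret sets the encrypted enable secret (takes precedence over the enable password)
Passwords for user mode:
line console 0 user-mode password on the console port
line aux 0 password on the auxiliary port
line vty 0 15 password for Telnet connections to the router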

Router(config)#line console 0
Router(config-line)#password console
Router(config-line)#login

exec-timeout <minutes> <seconds> sets the session timeout
logging synchronous keeps log output from interrupting typed input

Set the domain name: ip domain-name xxx.com
Configure SSH login

Router(config)#hostname r1           
r1(config)#ip domain-name barrygates.cn
r1(config)#crypto key generate rsa
The name for the keys will be: r1.barrygates.cn
Choose the size of the key modulus in the range of 360 to 4096 for your
  General Purpose Keys. Choosing a key modulus greater than 512 may take
  a few minutes.

How many bits in the modulus [512]:
% Generating 512 bit RSA keys, keys will be non-exportable...
[OK] (elapsed time was 0 seconds)

r1(config)#
*Feb 14 11:42:15.394:  RSA key size needs to be atleast 768 bits for ssh version 2
r1(config)#
*Feb 14 11:42:15.402: %SSH-5-ENABLED: SSH 1.5 has been enabled
r1(config)#ip ssh version 2
Please create RSA keys to enable SSH (and of atleast 768 bits for SSH v2).
r1(config)#line vty 0 15
r1(config-line)#transport input ssh
r1(config-line)#

Encrypting passwords: by default only the enable secret is stored encrypted. To have all passwords encrypted, use service password-encryption.
Interface description

r1(config)#int fastEthernet 0/0
r1(config-if)#ip address 172.16.0.1 255.255.0.0
r1(config-if)#description for test
r1(config-if)#exit
r1(config)#do show interfaces description
Interface                      Status         Protocol Description
Fa0/0                          admin down     down     for test
r1(config)#

Secondary IP address

r1(config-if)#ip address 172.16.1.1 255.255.0.0 secondary

Pipes

r1#sh run | ?
  append    Append redirected output to URL (URLs supporting append operation
            only)
  begin     Begin with the line that matches
  count     Count number of lines which match regexp
  exclude   Exclude lines that match
  format    Format the output using the specified spec file
  include   Include lines that match
  redirect  Redirect output to URL
  section   Filter a section of output
  tee       Copy output to URL

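Save the configuration

copy running-config startup-config

Erase the configuration

erase startup-config

Reset interface counters

clear counters e0/0

show protocols layer 1 and layer 2 status of each interface, plus IP addresses
show controllers physical interface information

DHCP configuration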

IOU1(config)#ip dhcp excluded-address 192.168.10.1 192.168.10.10
IOU1(config)#ip dhcp pool MyNetwork
IOU1(dhcp-config)#network 192.168.10.0 255.255.255.0
IOU1(dhcp-config)#default-router 192.168.10.1
IOU1(dhcp-config)#dns-server 8.8.8.8
IOU1(dhcp-config)#lease 3 12 15

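The commands above create an address pool for 192.168.10.0/24 with 8.8.8.8 as the DNS server and 192.168.10.1 as the default gateway. The range 192.168.10.1 through 192.168.10.10 is excluded from assignment, and the lease time is 3 days, 12 hours and 15 minutes.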

DHCP relay

Without this configuration, a router drops DHCP broadcasts by default.

IOU1(config)#int f0/0
IOU1(config-if)#ip helper-address 10.10.10.254

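This forwards DHCP broadcasts to 10.10.10.254.

Verifying DHCP information

show ip dhcp binding status of the assigned IP addresses

show ip dhcp pool [poolname] state of the address pool

show ip dhcp server statistics DHCP statistics

show ip dhcp conflict address conflicts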

NTP

IOU1(config)#ntp server 172.16.10.1 version 4
IOU1(config)#ntp master
IOU1#show ntp status
IOU1#show ntp associations

CDP

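show cdp displays the CDP timer and the holdtime, that is, how long CDP information is kept in the table.

cdp holdtime
cdp timer
no cdp run disables CDP
show cdp neighbors displays information about directly connected devices; CDP does not pass through Cisco switches. Detailed information is available from show cdp entry * and show cdp neighbors detail.

June 16, 2021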