Working through the problem

Some rough notes on the process.

On the storage server, create a ZFS volume (a regular file can also serve as the iSCSI backstore)

$ sudo zfs create -V 64G rpool/raspberry-pi-4
$ ls /dev/zvol/rpool/raspberry-pi-4 -l
lrwxrwxrwx 1 root root 10  5月  7 14:55 /dev/zvol/rpool/raspberry-pi-4 -> ../../zd16
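
For the regular-file alternative mentioned above, a minimal sketch might look like this once targetcli-fb (installed in the next step) is available. The backstore name and image path are made up for illustration, and the TARGETCLI variable exists only so the command can be dry-run with echo. name=, file_or_dev= and size= are the fileio create parameters as I understand them, so check `targetcli /backstores/fileio help create` on your version.

```shell
#!/bin/sh
# Command used to invoke targetcli; set TARGETCLI=echo to print instead of run.
TARGETCLI=${TARGETCLI:-sudo targetcli}

# create_file_backstore NAME IMAGE SIZE
# Registers a file-backed backstore; NAME and IMAGE are illustrative values.
create_file_backstore() {
    $TARGETCLI /backstores/fileio create "name=$1" "file_or_dev=$2" "size=$3"
}
```

Usage would be something like `create_file_backstore pi4 /srv/iscsi/pi4.img 64G`. The LUN created later would then point at /backstores/fileio/pi4 instead of /backstores/block/pi4.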

Install targetcli-fb (the management tool for the iSCSI target server; the iSCSI target itself is implemented in the kernel)

$ sudo apt install targetcli-fb

Create the iSCSI target. (This can also be done in its TUI; here I use a few separate commands.) First, register the backstore and name it pi4

$ sudo targetcli /backstores/block create name=pi4 dev=/dev/zvol/rpool/raspberry-pi-4

Create the portal and target

$ sudo targetcli iscsi/ create iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395

Associate the backstore with the target

$ sudo targetcli iscsi/iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395/tpg1/luns create /backstores/block/pi4
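
The three targetcli steps above can be collected into one small function. This is only a sketch: the function name and the TARGETCLI dry-run variable are my own additions, not part of targetcli.

```shell
#!/bin/sh
# Command used to invoke targetcli; set TARGETCLI=echo for a dry run.
TARGETCLI=${TARGETCLI:-sudo targetcli}

# setup_target NAME DEVICE TARGET_IQN
setup_target() {
    name=$1 dev=$2 iqn=$3
    # 1. Register the block device as a backstore.
    $TARGETCLI /backstores/block create "name=$name" "dev=$dev" || return 1
    # 2. Create the target; targetcli also creates tpg1 and a default portal.
    $TARGETCLI /iscsi create "$iqn" || return 1
    # 3. Export the backstore as a LUN of that target.
    $TARGETCLI "/iscsi/$iqn/tpg1/luns" create "/backstores/block/$name"
}
```

Running it with TARGETCLI=echo first prints the three commands without touching the configuration.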

We don't have the initiator's IQN yet, so skip the ACL for now (the initiator IQN is what identifies the client when it connects).

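One thing to watch out for: the targetcli-fb and python3-rtslib-fb packages have long had small bugs that keep the iSCSI target configuration from loading properly after a reboot.

For example, with the version I originally had installed, targetctl restore failed to load the saved target. That led me to believe the symlinks for the zfs volume were being created too late for it to find the device node.

Upgrading to 2.1.69-3 from sid fixed that problem, but the command in the systemd .service file doesn't use an absolute path, so the service doesn't start... Fortunately the latter is easy to fix by hand.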
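
One way to make that kind of hand fix survive package upgrades is a systemd drop-in instead of editing the shipped unit. The sketch below is built on assumptions: the unit name rtslib-fb-targetctl.service, the path /usr/bin/targetctl, and the idea that ExecStart is the affected line are all guesses of mine, so compare against `systemctl cat` on your own system first.

```shell
#!/bin/sh
# write_targetctl_override [UNIT] [TARGETCTL_PATH] [SYSTEMD_DIR]
# Writes a drop-in that replaces ExecStart with an absolute-path command.
write_targetctl_override() {
    unit=${1:-rtslib-fb-targetctl.service}   # assumed unit name
    bin=${2:-/usr/bin/targetctl}             # assumed install path
    dir=${3:-/etc/systemd/system}/$unit.d
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<EOF
[Service]
# An empty ExecStart= drops the packaged command before the new one is set.
ExecStart=
ExecStart=$bin restore
EOF
}
```

Run as root, this would be followed by `systemctl daemon-reload`. If ExecStop has the same problem, it would need the same treatment.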

Next, on the Pi 4, install open-iscsi (the iSCSI client management tool)

$ sudo apt install open-iscsi

Discover the iSCSI target

$ sudo iscsiadm --mode discovery --portal SERVER-IP --type sendtargets

This saves the discovered target information, and iscsid can then be started

$ sudo systemctl restart iscsid.service

Once iscsid is running, the initiator IQN is available

$ sudo cat /etc/iscsi/initiatorname.iscsi 
## DO NOT EDIT OR REMOVE THIS FILE!
## If you remove this file, the iSCSI daemon will not start.
## If you change the InitiatorName, existing access control lists
## may reject this initiator.  The InitiatorName must be unique
## for each iSCSI initiator.  Do NOT duplicate iSCSI InitiatorNames.
InitiatorName=iqn.1993-08.org.debian:01:7cea74f92ecf
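
Since the IQN has to be copied to the storage server for the ACL, it can help to pull just the value out of that file. A minimal sketch (the function name is my own):

```shell
#!/bin/sh
# initiator_iqn [FILE]
# Prints the value of the InitiatorName= line, skipping the comment lines.
initiator_iqn() {
    sed -n 's/^InitiatorName=//p' "${1:-/etc/iscsi/initiatorname.iscsi}"
}
```

The real file is readable only by root here (hence the sudo cat above), so the function would need to run under sudo or as root.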

Back on the storage server, add an ACL so that the initiator (client) can log in

$ sudo targetcli iscsi/iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395/tpg1/acls create iqn.1993-08.org.debian:01:7cea74f92ecf

The final iSCSI target configuration looks like this

$ sudo targetcli ls
o- / ......................................................................................................................... [...]
  o- backstores .............................................................................................................. [...]
  | o- block .................................................................................................. [Storage Objects: 1]
  | | o- pi4 ....................................................... [/dev/zvol/rpool/raspberry-pi-4 (64.0GiB) write-thru activated]
  | |   o- alua ................................................................................................... [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ....................................................................... [ALUA state: Active/optimized]
  | o- fileio ................................................................................................. [Storage Objects: 0]
  | o- pscsi .................................................................................................. [Storage Objects: 0]
  | o- ramdisk ................................................................................................ [Storage Objects: 0]
  o- iscsi ............................................................................................................ [Targets: 1]
  | o- iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 .............................................................. [TPGs: 1]
  |   o- tpg1 ............................................................................................... [no-gen-acls, no-auth]
  |     o- acls .......................................................................................................... [ACLs: 1]
  |     | o- iqn.1993-08.org.debian:01:7cea74f92ecf ............................................................... [Mapped LUNs: 1]
  |     |   o- mapped_lun0 ................................................................................... [lun0 block/pi4 (rw)]
  |     o- luns .......................................................................................................... [LUNs: 1]
  |     | o- lun0 .................................................. [block/pi4 (/dev/zvol/rpool/raspberry-pi-4) (default_tg_pt_gp)]
  |     o- portals .................................................................................................... [Portals: 1]
  |       o- 0.0.0.0:3260 ..................................................................................................... [OK]
  o- loopback ......................................................................................................... [Targets: 0]
  o- vhost ............................................................................................................ [Targets: 0]
  o- xen-pvscsi ....................................................................................................... [Targets: 0]

Back on the Pi 4, log in to the iSCSI target

$ sudo iscsiadm -m node -L all

The new disk (sda) should now show up. At this point the iSCSI target (server) and initiator (client) can talk to each other.

$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    0   64G  0 disk 
mmcblk0     179:0    0 29.8G  0 disk 
├─mmcblk0p1 179:1    0  256M  0 part /boot/firmware
└─mmcblk0p2 179:2    0 29.6G  0 part

Partition the disk, format it as ext4, mount it on /mnt, then sync the root file system onto it

$ sudo parted /dev/sda mklabel gpt mkpart primary 1 -1
$ sudo mkfs.ext4 /dev/sda1
$ sudo mount /dev/sda1 /mnt
$ cd /
$ sudo rsync . /mnt -av --xattrs --acls --one-file-system
$ sudo blkid | grep sda1
/dev/sda1: UUID="3742ae18-12cf-4582-8bbc-d00798975677" TYPE="ext4" PARTLABEL="primary" PARTUUID="89555973-6283-4b8d-9cae-9e8848022f59"

Edit fstab (the UUID on the / line should be the one blkid reports for the new partition)

$ cat /etc/fstab 
UUID=5fcce78f-2de3-4805-8ffa-d0f11247d5bb	/	 ext4	defaults,noatime,relatime,_netdev,x-systemd.requires=iscsid.service,discard	0 0
LABEL=system-boot       /boot/firmware  vfat    defaults        0       1
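
As a sketch of how the root entry could be generated from a UUID: the function name is hypothetical, and the option list follows the fstab above minus relatime, which overlaps with noatime. Dropping it is my own choice.

```shell
#!/bin/sh
# fstab_root_entry UUID
# Prints a tab-separated fstab line for an ext4 root on an iSCSI disk.
# _netdev marks the file system as network-backed, and x-systemd.requires
# orders the mount after iscsid.service.
fstab_root_entry() {
    opts=defaults,noatime,_netdev,x-systemd.requires=iscsid.service,discard
    printf 'UUID=%s\t/\text4\t%s\t0 0\n' "$1" "$opts"
}
```

The UUID itself can be read with `sudo blkid -s UUID -o value /dev/sda1`.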

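There isn't much information out there on using iSCSI as the rootfs; I mainly followed the Arch article on it. Because an initramfs is involved, though, every distro handles this differently. Debian doesn't seem to have clear official documentation, but after digging through the script that open-iscsi installs for initramfs-tools (at /usr/share/initramfs-tools/scripts/local-top/iscsi), it became reasonably clear what to do: add a few iSCSI-related parameters to the bootloader configuration so that the script inside the initramfs knows how to log in to the iSCSI target.

On a Pi 4 running Ubuntu 20.04, that means editing /boot/firmware/cmdline.txt: add the iSCSI parameters and change root= to point at the new file system. The other parameters mostly stay as they are. Remember to back the file up before editing it.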

... iscsi_initiator=iqn.1993-08.org.debian:01:7cea74f92ecf iscsi_target_name=iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 iscsi_target_ip=10.9.1.1 root=UUID=3742ae18-12cf-4582-8bbc-d00798975677 ...
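
Editing that one long line by hand is error-prone, so here is a sketch of a helper that rewrites an existing command line. The function is hypothetical. It assumes the parameter names read by the open-iscsi initramfs script are iscsi_initiator, iscsi_target_name and iscsi_target_ip, and it only prints the result, leaving cmdline.txt untouched.

```shell
#!/bin/sh
# rewrite_cmdline OLD_CMDLINE INITIATOR_IQN TARGET_IQN TARGET_IP ROOT_UUID
# Drops any existing root= and iscsi_* tokens, keeps everything else in
# order, then appends the new iSCSI parameters and root=UUID=...
rewrite_cmdline() {
    old=$1 initiator=$2 target=$3 ip=$4 uuid=$5
    new=""
    # Unquoted on purpose: relies on word splitting to walk the tokens.
    # Kernel command lines normally contain no glob characters.
    for tok in $old; do
        case $tok in
            root=*|iscsi_initiator=*|iscsi_target_name=*|iscsi_target_ip=*) ;;
            *) new="$new $tok" ;;
        esac
    done
    printf '%s\n' "${new# } iscsi_initiator=$initiator iscsi_target_name=$target iscsi_target_ip=$ip root=UUID=$uuid"
}
```

A possible use is `rewrite_cmdline "$(cat /boot/firmware/cmdline.txt)" ... > cmdline.new`, followed by reviewing cmdline.new before copying it over the original.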

With cmdline.txt changed, reboot. The UART output looks roughly like this

Connecting to /dev/ttyUSB0, speed 115200
 Escape character: Ctrl-\ (ASCII 28, FS): enabled
Type the escape character followed by C to get back,
or followed by ? to see other options.
----------------------------------------------------
MMC:   [email protected]: 0, [email protected]: 1
Loading Environment from FAT... *** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   No ethernet found.
starting USB...
No working controllers found
## Info: input data size = 6 = 0x6
Hit any key to stop autoboot:  0 
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found U-Boot script /boot.scr
2603 bytes read in 38 ms (66.4 KiB/s)
## Executing script at 02400000
8377273 bytes read in 633 ms (12.6 MiB/s)
Total of 1 halfword(s) were the same
Decompressing kernel...
Uncompressed size: 25905664 = 0x18B4A00
24826115 bytes read in 1871 ms (12.7 MiB/s)
Booting Ubuntu (with booti) from mmc 0:...
## Flattened Device Tree blob at 02600000
   Booting using the fdt blob at 0x2600000
   Using Device Tree in place at 0000000002600000, end 000000000260e4d5

Starting kernel ...

[    0.886218] bcm2708_fb soc:fb: Unable to determine number of FB's. Assuming 1
[    0.887198] raspberrypi-firmware soc:firmware: Request 0x00048003 returned status 0x80000001
[    0.887222] bcm2708_fb soc:fb: Failed to allocate GPU framebuffer (-22)
[    0.887270] bcm2708_fb soc:fb: probe failed, err -22
[    1.291859] spi-bcm2835 fe204000.spi: could not get clk: -517
IP-Config: eth0 hardware address dc:a6:32:10:b2:68 mtu 1500 DHCP
IP-Config: eth0 complete (dhcp from 10.9.1.1):
 address: 10.9.1.105       broadcast: 10.9.1.255       netmask: 255.255.255.0   
 gateway: 10.9.1.1         dns0     : 10.9.1.1         dns1   : 0.0.0.0         
 domain : co-op.space                                                     
 rootserver: 10.9.1.1 rootpath: 
 filename  : 
iscsistart: Logging into iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 10.9.1.1:3260,1
iscsistart: version 2.0-874
iscsistart: Connection1:0 to [target: iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395, portal: 10.9.1.1,3260] through [iface: default] is operational now
iscsistart: Logging into iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 10.9.1.1:3260,1
iscsistart: can not connect to iSCSI daemon (111)!
iscsistart: version 2.0-874
iscsistart: initiator reported error (15 - session exists)
ext4
...

Target auto login

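Above, the target was discovered and logged into by the initramfs during boot. For a target that is not the rootfs, the default setting is not to log in automatically on reboot. The following commands switch it to automatic login (taken from this page)

$ sudo iscsiadm --mode node -T iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 -p 10.9.1.1 -o update -n node.startup -v automatic
$ sudo iscsiadm --mode node -T iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 -p 10.9.1.1 -o update -n node.conn[0].startup -v automatic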
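
Both settings take the same arguments apart from the key name, so they can be wrapped in a small loop. This is a sketch only: the function name and the ISCSIADM dry-run variable are my own.

```shell
#!/bin/sh
# Command used to invoke iscsiadm; set ISCSIADM=echo for a dry run.
ISCSIADM=${ISCSIADM:-sudo iscsiadm}

# set_auto_login TARGET_IQN PORTAL_IP
# Sets node.startup and node.conn[0].startup to "automatic" for one node.
set_auto_login() {
    for key in node.startup 'node.conn[0].startup'; do
        $ISCSIADM --mode node -T "$1" -p "$2" -o update -n "$key" -v automatic || return 1
    done
}
```

For the target in these notes that would be `set_auto_login iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395 10.9.1.1`.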

Alternatively, edit the iSCSI configuration by hand

# /etc/iscsi/iscsid.conf

#*****************
# Startup settings
#*****************

# To request that the iscsi initd scripts startup a session set to "automatic".
# node.startup = automatic
#
# To manually startup the session set to "manual". The default is manual.
node.startup = manual

# For "automatic" startup nodes, setting this to "Yes" will try logins on each
# available iface until one succeeds, and then stop.  The default "No" will try
# logins on all available ifaces simultaneously.
node.leading_login = No

and the settings for the specific node

# /etc/iscsi/nodes/iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395/10.9.1.1,3260,1/default

# BEGIN RECORD 2.0-874
node.name = iqn.2003-01.org.linux-iscsi.sh.x8664:sn.4c630c28f395
node.tpgt = 1
node.startup = automatic
node.leading_login = No
...
node.conn[0].startup = automatic

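This article is also worth a look

Takeaways after a few days of use

  • If the firewall configuration changes dynamically, iSCSI stability suffers badly
  • Be careful with any firewall change
  • The file system on iSCSI has to be robust: a network outage can easily leave the machine unable to even shut down, and a forced reboot is the only way out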