Friday, May 8, 2020

Linux the darkside migration method | a.k.a. The dd over ssh method

This method is the most difficult, but it is one that almost always works. It uses dd over ssh to dump the entire partition exactly as it is, and it is also very useful when you want to migrate just a single partition or disk. When you use it to migrate the root volume, the final step of booting the VM is uncertain: you will need to observe the errors on the console and fix them before the VM will boot.

This is a very advanced migration method and I have written about far easier ways to migrate. So unless you know exactly what you need to do, consider choosing one of the methods below:



In this example I will be migrating a source VM from AWS EC2 to VirtualBox on my home computer; however, by modifying the steps appropriately you can use it to migrate to any other environment.

Prerequisites

1. A support Linux VM at home, or a Linux OS on your workstation
2. VMware Workstation
3. VirtualBox

Gather critical system information from the source VM

cat /etc/*release*
uname -a

# Check if it has a package manager.
# It may be a customised distro and not have a package manager
apt --version || yum --version  # run them and see which one responds

# See what is inside the /boot directory
ls -la /boot

cat /proc/cpuinfo #Check CPU 
cat /proc/meminfo #Check Memory

#Check mounts and partitions
df -h 
cat /etc/fstab
cat /etc/mtab
blkid

# Check running processes
ps auxwff
netstat -tulnp


Procedure

1. DD the disks from the original VM
$ ssh root@10.226.65.213 'dd if=/dev/xvda bs=1M | gzip' | gunzip | dd of=xvda.raw
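
If your workstation cannot reach the source over SSH but the source can reach your workstation, you can reverse the direction and push the image instead. A minimal sketch, run as root on the source; the user and address (tomy@192.168.1.10) are placeholders for your own workstation, which must run an SSH server:
$ dd if=/dev/xvda bs=1M | gzip | ssh tomy@192.168.1.10 'gunzip | dd of=xvda.raw'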

2. Convert the raw disk into a VirtualBox VDI disk

.\VBoxManage convertfromraw "F:\VM\mysourcevm\xvda.raw" "F:\VM-VMware\webbuild-linux\xvda.vdi" --format VDI
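
If you would rather run the clone under VMware Workstation (listed in the prerequisites), the same tool can produce a VMDK instead; a sketch, and Workstation may still want you to adapt the resulting disk:
.\VBoxManage convertfromraw "F:\VM\mysourcevm\xvda.raw" "F:\VM-VMware\webbuild-linux\xvda.vmdk" --format VMDK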

3. Build a shell VM clone in VirtualBox  
Create a blank VM in VirtualBox with the same structure as the original and attach the converted disk.

4. Use dark magic to make it work
This step is exactly what it sounds like. Once you power on the cloned VM you will need to analyse the errors on the console (if any) to see what could be preventing it from booting. You may also need to reinstall the bootloader or alter the GRUB parameters. From here on you are on your own to make it work. In my experience I have succeeded at this step in about 90% of cases.
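
One recurring culprit is /etc/fstab. A sketch of the fix from a rescue/live CD booted inside the VM, with the cloned disk mounted at /mnt (the device names are assumptions): the old fstab often references the EC2 device /dev/xvda1, while VirtualBox exposes the disk as /dev/sda1, so pointing fstab at the filesystem UUID usually gets past the failed mount.
# note the UUID of the root filesystem
blkid /dev/sda1
# replace the /dev/xvda1 entry with UUID=<value from blkid>
vi /mnt/etc/fstab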

Linux migration from anywhere to anywhere | a.k.a. The online rsync method

This tutorial describes how to migrate Linux machines from anywhere to anywhere (in this case to AWS EC2). It was shown to me by my friend Sergiu Badan.

1. Check the source operating system distribution and architecture. 

Launch an AMI in EC2 of the same distribution and architecture. Install rsync and screen on both source and destination:
$ yum -y install rsync screen
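
On a Debian or Ubuntu based source the equivalent would be:
$ apt-get -y install rsync screen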


2. Check disk requirements

If the source system's total disk usage is larger than the EC2 instance's storage, you will need to attach additional EBS volumes to the EC2 machine.

Example:
The source system has the following disk space distribution:
/home - 39 GB
/usr - 16 GB
/var - 11 GB
/opt - 430 GB
rest - 5 GB

If the destination has a root partition of 8 GB, you will have to attach EBS volumes for /home (40 GB), /usr (20 GB), /var (15 GB) and /opt (440 GB).

Format the partitions. See the example below for the /home partition, which will be on device /dev/xvdh.

Do this for all devices:
$ fdisk -l #(see how they are labeled, like /dev/xvdh or /dev/sdh, or whatever).
mkfs.ext4 /dev/xvdh #(put the proper device name here)

# mount to temporary location
$ mount /dev/xvdh /media

# sync all files from mountpoint to the new device
$ rsync -avz /home/ /media/

# edit /etc/fstab and put the new device name and the mount point:
$ cat /etc/mtab | grep xvdh #or whatever the device is

# append the line to /etc/fstab, but replace /media with the real mount point. E.g.:
/dev/xvdh   /home      ext4    defaults        0 0

# remount the partition:
$ mount -o remount,rw /home

3. Create keypairs on the source system. 

Copy the public keys to both source and destination /root/.ssh/authorized_keys.

# on the source:
$ ssh-keygen -t rsa # accept the defaults and set no passphrase
$ cat /root/.ssh/id_rsa.pub
# append the content to /root/.ssh/authorized_keys on both source and destination server.

# Also copy your company public key (the one you use to access the destination server) to /root/.ssh/authorized_keys on both source and destination.
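
# If ssh-copy-id is available on the source, it can append the key to the destination for you; a shortcut sketch:
$ ssh-copy-id root@$destination_ip_address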

4. Create a list of files to exclude, on the source, in /excluded_files

$ vi /excluded_files
# add the following lines
/etc/fstab
/etc/mtab
/etc/sysconfig/network-scripts
/proc
/boot
/sys
/dev
/etc/resolv.conf
/etc/hosts
/etc/conf.d/net
/etc/network/interfaces
/lib/modules
/etc/sysconfig/network
/etc/nsswitch.conf
/etc/lvm
/var/run

5. Execute the first sync

On the source:
$ screen -R rsync
$ ssh root@$destination_ip_address # answer yes to trust the destination key

$ exit # exit from the remote computer

# rsync <options> <excluded files> <source> <destination>
# Copy everything from SOURCE:/ to DESTINATION:/, excluding the files listed in /excluded_files
# The --delete option will delete any files in the destination directory if 
#  they don't exist in the source directory.
$ rsync -avz --progress --delete --exclude-from=/excluded_files / root@$destination_ip_address:/ 

# to exit the screen press ctrl + a d
# to enter the screen again, type screen -x
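
# Optional: before the real transfer, a dry run (-n) verifies the exclude list without copying anything; a sketch:
$ rsync -avzn --progress --delete --exclude-from=/excluded_files / root@$destination_ip_address:/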

6. After first sync finishes, do the second sync

$ screen -x rsync
$ cat /dev/null >/root/.ssh/known_hosts  # wipe the known_hosts file, as on the destination the key has changed

$ ssh root@$destination_ip_address   # answer yes to trust the destination key
$ exit # exit from the remote server

# stop all important services on the source, like mysql, oracle, apache, nginx, sendmail, postfix, after which do the second rsync:
$ rsync -avz --progress --delete --exclude-from=/excluded_files / root@$destination_ip_address:/
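
# As a reference for the service-stop step above, on a systemd-based source it might look like this (the service names are illustrative; use your own):
$ systemctl stop mysql apache2 nginx postfix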


7. Finalise

After the rsync finishes, reboot the remote server and check that it starts successfully. If there are problems, troubleshoot: check the startup logs from the EC2 console and, if necessary, detach the /var volume and mount it on another server to check the logs.

Linux EC2 to On-Premise migration | a.k.a. The offline rsync method

Setup

This guide assumes that we have one EC2 instance with only the root volume, i.e. a single mount. If you have multiple mounts you will need to adjust the steps accordingly and rsync each of them. We will be migrating the instance from AWS to a VMware cluster; however, by adjusting the steps you can use this method to migrate to any other custom environment, even VirtualBox.

1. Preparation

  1. Source EC2 to be cloned (I name this SOURCE-VM)
  2. Create helper VM in VMware (I name this HELPER-VM)
  3. Create a helper EC2 VM in AWS (I name this HELPER-EC2)
  4. Create a blank VM at the target VMware site with a similar structure to the original (I will name this CLONE-VM). Make sure that the disk is about the same size and thin provisioned.
  5. Take note of the disk location of the CLONE-VM.
    Example:
    [WorkloadDatastore] 2f6ba65a-6af4-6f09-2666-124516e7c87c/debian.vmdk

Gather SOURCE-VM critical system information

cat /etc/*release*
uname -a

# Check if it has a package manager.
# It may be a customized distro and not have a package manager
apt --version || yum --version  # run them and see which one responds

# See what is inside the /boot directory
ls -la /boot

cat /proc/cpuinfo #Check CPU 
cat /proc/meminfo #Check Memory

#Check mounts and partitions
df -h 
cat /etc/fstab
cat /etc/mtab
blkid

# Check running processes
ps auxwff
netstat -tulnp


2. In AWS

  1. Take snapshot of the root volume of the SOURCE-VM to be cloned
  2. Attach the disk from the SOURCE-VM snapshot to the helper HELPER-EC2
  3. Mount the disk into /mnt (see the sketch after this list)
  4. Verify that the contents of the attached disk match the contents of your SOURCE-VM. They need to contain the entire disk layout.
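
A minimal sketch of steps 2-4, assuming the snapshot volume is attached to HELPER-EC2 as /dev/xvdf with its root filesystem on the first partition:
# confirm the device name of the newly attached volume
lsblk
mount /dev/xvdf1 /mnt
# should list the SOURCE-VM root filesystem
ls /mnt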

3. In VMware

Attach the empty disk from the CLONE-VM (the one whose location you noted earlier) to the HELPER-VM as an additional disk; it will be formatted and mounted at /mnt below.

NOTE: You don't need to detach the disk from the CLONE-VM in order to attach it to another VM. In VMware the disks can be attached to multiple VMs at once.

On the HELPER-VM

Format the new drive attached from the CLONE-VM
# Identify the new disk
ll /dev/sd*

## /dev/sdd in my case

## This creates the partition and prepares it for ext4 fs
parted
select /dev/sdd
mklabel msdos
mkpart primary ext4 0% 100%
set 1 boot on
quit

## This formats the partition. Be very careful not to use the wrong partition here, or it will wipe out everything.
mkfs.ext4 /dev/sdd1

## Mount the partition in /mnt
mount /dev/sdd1 /mnt/

4. Initiate data transfer

From HELPER-EC2, initiate a data transfer of the contents of the root partition of the SOURCE-VM to the CLONE-VM disk that is attached to the HELPER-VM.

Example in my case:
rsync -avxHAWX --numeric-ids --info=progress2 -e ssh /mnt/ root@192.168.1.57:/mnt
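
For reference, the options mean: -a archive mode (permissions, timestamps, ownership), -v verbose, -x stay on one filesystem, -H preserve hard links, -A ACLs, -W copy whole files, -X extended attributes, --numeric-ids keep the original UID/GID numbers, and --info=progress2 show overall progress.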

Once the rsync is finished, shut down the HELPER-VM and detach the additional hard disk from it.

5. Re-Install the bootloader

We now have almost everything we need for our CLONE-VM to start working, except that it will not boot without a bootloader.

To do that, we need to do a couple of things:

  • Remove any existing bootloader packages that came from AWS;
  • Remove any folders that contain additional boot options. These options are tailored for an EC2 instance and may cause problems with our VM;
  • Reinstall the bootloader (GRUB);
  • Check and adjust /etc/fstab to match our new system layout;
  • As an additional cosmetic step, remove cloud-init.

To do all of this we need a temporary environment to work in. The system in this case was Debian, so I will boot the CLONE-VM from a bootable Ubuntu CD into recovery mode and then chroot into our new disk.
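
A sketch of getting into the chroot from the live CD's recovery shell, assuming the cloned disk shows up there as /dev/sda1:
mount /dev/sda1 /mnt
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt /bin/bash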

The following is an example of the commands used to identify the currently installed bootloader, remove it, remove any existing folders with additional boot options, and finally install the new bootloader. If you are doing this on CentOS/Red Hat the commands are slightly different; use Google to find them.

## Find any existing GRUB software and remove all of it.
$ apt list --installed | grep grub

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

grub-common/now 2.02~beta2-36ubuntu3.15 amd64 [installed,upgradable to: 2.02~beta2-36ubuntu3.17]
grub-gfxpayload-lists/xenial,now 0.7 amd64 [installed,automatic]
grub-legacy-ec2/now 17.1-46-g7acc9e68-0ubuntu1~16.04.1 all [installed,upgradable to: 17.2-35-gf576b2a2-0ubuntu1~16.04.2]
grub-pc/now 2.02~beta2-36ubuntu3.15 amd64 [installed,upgradable to: 2.02~beta2-36ubuntu3.17]
grub-pc-bin/now 2.02~beta2-36ubuntu3.15 amd64 [installed,upgradable to: 2.02~beta2-36ubuntu3.17]
grub2-common/now 2.02~beta2-36ubuntu3.15 amd64 [installed,upgradable to: 2.02~beta2-36ubuntu3.17]

$ apt-get remove grub-pc grub-common grub-legacy-ec2 && apt-get purge grub-pc grub-common grub-legacy-ec2 # the 2nd command removes any leftover GRUB configuration

$ rm -f /etc/default/grub
$ rm -rf /etc/default/grub.d ## This folder has AWS-specific configuration, so it will cause problems later on and must be removed.

## Reinstall GRUB
$ apt update
$ apt install grub2 os-prober

At this point GRUB will ask you which drive the boot sector should be installed on. Choose the whole drive /dev/sda and not the partition /dev/sda1.
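
If you are not prompted, or you need to re-run the installation manually inside the chroot, the usual commands are:
$ grub-install /dev/sda
$ update-grub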

After this the system should boot normally.

[end]

Wednesday, July 3, 2019

Scalable Jenkins with Kubernetes cluster

I wrote a short tutorial on GitHub about creating a scalable Jenkins on a Kubernetes cluster. The main pod is the master, which creates slave pods on demand and deletes them when they are no longer needed.
You can find the tutorial at the same GitHub repo here: https://github.com/spirit986/skjenkins

Friday, March 11, 2016

VSphere | Advanced snapshot troubleshooting - Part 3 - Example: Invalid snapshot configuration

Unable to consolidate because of invalid snapshot configuration

This is Part 3 of the short tutorial series:
vSphere | Advanced snapshot troubleshooting

This is much more advanced than our previous example. You will see one of these two errors if you attempt to consolidate, clone or migrate the VM:

Detected an invalid snapshot configuration

or

... vmdk was not found

VSphere | Advanced snapshot troubleshooting - Part 2 - Example: Unable to consolidate because the file is locked

This is Part 2 of the short tutorial series:
vSphere | Advanced snapshot troubleshooting

Virtual machine consolidation attempts fail with the error below:
Unable to access file since it is locked

VSphere | Advanced snapshot troubleshooting - Part 1 - The ESXi shell is your friend

I am writing this article as an extension to my previous post, A VM is showing disk size of 0B, where I attempt to give a general explanation of how to troubleshoot and successfully solve snapshot and consolidation problems in vSphere, with a number of examples. Because the general article was so big, I decided to split it into three parts. This is Part 1, where I describe the commands used to solve most snapshot problems; in the next two parts I will describe the troubleshooting process through examples.