Tuesday, 24 May 2016

Sqoop & SQL Server #4 - Setting up FTP between Windows host and Cloudera VM


By default the Cloudera VM is not configured to allow easy transfer of files between the Windows host and the VM. But this is easily fixed.

You can amend the Oracle VirtualBox settings to allow drag and drop from the host to the VM, but that isn't going to help if you're connecting to a VM on your network or in the cloud. I prefer the more flexible approach of using an FTP client.

First you'll need to amend the Oracle VirtualBox VM settings. Follow these steps:
  1. Shut down the virtual machine
  2. Open Oracle VirtualBox Manager
  3. Select the Virtual Machine you intend to work with from the list
  4. Click Settings
  5. Navigate to the Network option
  6. Select the tab named Adapter 2
  7. For the option Attached to select Host-only Adapter
  8. For the option Name select VirtualBox Host-Only Ethernet Adapter
  9. Click OK to save the config
  10. Start the Virtual Machine
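
If you'd rather script this than click through the dialog, the same setting can probably be applied with VirtualBox's VBoxManage command line tool. The following is only a sketch: it assumes VBoxManage is on your PATH and that the host-only adapter has the default Windows name used in the steps above. The VM name is a hypothetical placeholder, so substitute the name shown in VirtualBox Manager.

```python
import subprocess


def hostonly_nic_command(vm_name, adapter_name="VirtualBox Host-Only Ethernet Adapter", nic=2):
    """Build the VBoxManage call that attaches adapter `nic` to a host-only network."""
    return [
        "VBoxManage", "modifyvm", vm_name,
        f"--nic{nic}", "hostonly",
        f"--hostonlyadapter{nic}", adapter_name,
    ]


def apply_hostonly_nic(vm_name):
    """Run the command. The VM must be powered off first, as in step 1."""
    subprocess.run(hostonly_nic_command(vm_name), check=True)


if __name__ == "__main__":
    # "cloudera-quickstart-vm" is a made-up name used for illustration only.
    print(" ".join(hostonly_nic_command("cloudera-quickstart-vm")))
```

Running the script only prints the command; call apply_hostonly_nic yourself once you're happy with what it is going to do.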

Next we need to identify the IP address the Virtual Machine has been given. To do so follow these steps:
  1. When the VM has started, open a terminal session
  2. Type su and press Enter
  3. Then enter the password 'cloudera'. You now have root privileges
  4. Type ifconfig
  5. Maximise the window if you haven't already. You'll see the network details listed.
  6. One of the blocks of text is prefixed eth2, for Ethernet Adapter 2. Make a note of the inet address within this text block. It ought to remain the same even after restarting the virtual machine.
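
If you need to pick the address out in a script rather than by eye, a small parser along these lines might do. It's a sketch, not something from the original setup: the sample output and its addresses are invented for illustration, and the interface name eth2 is simply the one described in the steps above. It handles both the older "inet addr:" layout and the newer "inet" layout of ifconfig.

```python
import re


def inet_address(ifconfig_output, interface):
    """Return the IPv4 address listed for `interface`, or None if it isn't found."""
    # ifconfig separates the block for each interface with a blank line.
    for block in re.split(r"\n\s*\n", ifconfig_output.strip()):
        if not re.match(rf"{re.escape(interface)}\b", block):
            continue
        # Older net-tools print "inet addr:1.2.3.4"; newer versions print "inet 1.2.3.4".
        match = re.search(r"inet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})", block)
        return match.group(1) if match else None
    return None


# Invented sample output; your addresses will differ.
SAMPLE = """\
eth1      Link encap:Ethernet  HWaddr 08:00:27:AA:BB:CC
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0

eth2      Link encap:Ethernet  HWaddr 08:00:27:DD:EE:FF
          inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""

if __name__ == "__main__":
    print(inet_address(SAMPLE, "eth2"))
```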

The following video may help if my explanation isn't clear.

Finally we'll set up an FTP client to connect between the host and VM. I prefer to use WinSCP because it not only functions well as an FTP client for simple tasks, it's also well suited to more complex tasks and has command line options which can be used as part of a wider process. I've used it in conjunction with SSIS in the past. For more details see this post.

Follow these instructions to create a connection and send a file:
  1. Open WinSCP. It will display a menu of pre-existing connections. If this is the first time you're using it there will be only one item in the list called New Site.
  2. Click on New Site. This opens up a dialog box in which we will configure the connection.
  3. For File Protocol select SFTP
  4. In the Host name field, type the IP address you noted down in the previous section.
  5. The Port number should be set to 22, as that is the SFTP default. If it isn't, change it to 22.
  6. For the username type 'cloudera'
  7. And for the password type 'cloudera'
  8. Nothing needs to be changed in the Advanced tab so Save the connection and give it a meaningful name.
  9. The named connection now appears in the list.
  10. Ensure your VM is started and double click on the connection.
  11. It may take a few moments to establish a connection, then you'll see a file explorer window with two panes: Windows on the left and the Cloudera VM on the right.
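
Once the connection works in the GUI, the same transfer can be driven from WinSCP's command line, which is what makes it useful as part of a wider process. The sketch below only generates the text of a WinSCP script; it doesn't run anything. The IP address and file paths are hypothetical examples, and it assumes the VM's host key has already been accepted and cached by connecting once through the GUI as described above (otherwise see the -hostkey switch of the open command in the WinSCP documentation).

```python
from urllib.parse import quote


def winscp_upload_script(host, user, password, local_path, remote_dir, port=22):
    """Return the text of a WinSCP script that uploads one file over SFTP."""
    # Percent-encode the credentials so special characters survive in the session URL.
    url = f"sftp://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/"
    return "\n".join([
        "option batch abort",   # stop on the first error instead of prompting
        "option confirm off",   # overwrite existing files without asking
        f"open {url}",
        f'put "{local_path}" "{remote_dir}"',
        "exit",
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical address and paths, for illustration only.
    print(winscp_upload_script(
        "192.168.56.101", "cloudera", "cloudera",
        r"C:\data\orders.csv", "/home/cloudera/",
    ))
```

Save the output to a text file, say upload.txt, and run it with winscp.com /script=upload.txt. Bear in mind the password is stored in plain text in that file, which is tolerable for a local test VM but not much else.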


