A few days ago I deployed Oracle 11gR2 RAC on VMs for a customer pre-production prototype at Logicworks, where I am the Sr. Engineer Oracle, but this time, instead of VirtualBox built-in networking, I tried out Nicira OpenvSwitch. My previous post here described this. Please allow me to thank Oracle Ace Director Tim Hall and his Oracle-Base and OakTable for the initial boost of this blog post.
This has now grown into a valid SDN deployment with Oracle 11gR2 RAC, OpenvSwitch, the Floodlight OpenFlow network controller, and (just added to the post today) the Beacon OpenFlow controller, so there are choices here too. The SDN idea is to separate the control plane from the data plane in networking and to virtualize networking the same way we have virtualized servers. This is an idea whose time has come, because VMs and vMotion have created all kinds of problems: how to tell networking equipment that a VM has moved, how to preserve connectivity to VPN endpoints, how to preserve IPs, and so on. There are some good videos on YouTube that explain what SDN is, what it means, and what it will do for networking.
The “Big-Box” this-gives-us-credibility one is this one.
OpenFlow@Google-Urs Hoelzle, Google
Urs’s presentation was amazing for its courage to bravely go where no one had gone before at the NCC 1701 level, but I must say that the part at the end, where he discussed the security concerns around having an entire worldwide network of interconnected datacenters under the control of a small set of (granted, redundant) SDN OpenFlow controllers, was notable. Incidentally, the slides that go with Urs’s presentation are not so easy to find. They are here.
Simon Horman did a video that I personally found to be excellent:
So it turns out you can build such a system very easily, and it’s well worth your time. At the end you will have 2 Oracle 64-bit 11gR2 RAC nodes, 1 or more Nicira OpenvSwitches, and a BigSwitch Floodlight (or alternatively, Beacon) SDN controller for managing your flows. Once your testbed is built, the sky is the limit. Brent has posts here on how to start learning to push flows to your OpenvSwitch using curl. The amazing thing about this revolution is that what we learn on our VM playground is going to be pretty much exactly what we will do when SDN fully takes hold, and it is hailed by some of the largest enterprises as a paradigm whose time has come. There is good info on this here, here and here…
This build is on Ubuntu 12.04 64-bit Desktop. The install and configure steps for OpenvSwitch below are based on Brent’s post here. That post uses packages from the Ubuntu repository, which is quick and works great, and we only need some of its steps, as shown below. Note that Brent has a similar post for installing openvswitch from source, but for the first run I recommend just using the version from the Ubuntu repository. The repo version is 1.4, which is very close to the current LTS version (1.4.3 as of Sept 2012, shown here) and not far behind the latest version, 1.7.1. The repo version will also start on its own automagically at boot without any further config, whereas if you install from source you’ll have to set it up to start at boot, which means extra steps at this stage of the work. But if you want, you can go with the install from source as detailed by Brent here.
For what we are doing here (as root), we only need some of the steps at Brent’s post. If you need to login as root from your default ubuntu user account, use:
sudo su -
or if you prefer you can set a password for root using (as your regular Ubuntu user) run
which will set a root passwd after you give your regular username password. Now at the terminal as root you can execute the following to install OpenvSwitch:
$ apt-get install openvswitch-datapath-source bridge-utils
$ module-assistant auto-install openvswitch-datapath
$ apt-get install openvswitch-brcompat openvswitch-common openvswitch-controller
$ ovs-vsctl show
Processes should look something like this
$ ps -ea | grep ov
26464 ? 00:00:00 ovsdb-server
26465 ? 00:00:00 ovsdb-server
26473 ? 00:00:00 ovs-vswitchd
26474 ? 00:00:00 ovs-vswitchd
26637 ? 00:00:00 ovs-controller
Edit /etc/default/openvswitch-switch (with vi, for example): change brcompat from no to yes and uncomment the line by removing the #. Then restart the service:
$ /etc/init.d/openvswitch-switch restart
root@U-Aspire-X3400:~# /etc/init.d/openvswitch-switch restart
* ovs-brcompatd is not running
* Killing ovs-vswitchd (29710)
* Killing ovsdb-server (29700)
* Inserting brcompat module
* Starting ovsdb-server
* Configuring Open vSwitch system IDs
* Starting ovs-vswitchd
* Starting ovs-brcompatd
* iptables already has a rule for gre, not explicitly enabling
I like to call my switches “swX”, such as sw1, sw2, etc. They are often called “brX”, harkening back to the Linux bridging code, but OpenvSwitch is much more than just Linux bridging. Visit http://www.openvswitch.org to learn about this fully-featured multi-protocol software switch. Nicira started as a $50 million USD venture capital enterprise about 4-5 years ago and was bought this year by VMware for about $1.2 billion USD, which shows how big the promise of SDN is. OpenvSwitch is one of Nicira’s flagship (open source) products.
Now that you have openvswitch up and running you might want to reboot and just make sure it is working ok after a reboot. You can test it with
$ ovs-vsctl show
Now let’s build our single virtual switch. I did mine connected as root, so I recommend you do the same, although it may not be necessary. Run this code at the Ubuntu command line as root to build the switch. Note that the “sw1” at the end of the tap names isn’t really necessary here because we are only using one switch, but in a multi-switch topology it becomes very important for keeping the wiring straight when you are hooking things up in highly-available redundant configurations. You can use whatever names you want for your switch, switch ports, interfaces, etc., but my scheme is:
nX for “node X” , pX for “port X” , swX for “switch X”
You can paste everything below, up to the “ovs-vsctl show” command, into a terminal as root. You should also save it (as root) to a file called “sw1.sh” and make sw1.sh executable so you can rerun it whenever you want. That way you can try all kinds of tests, removing and adding back ports on the vswitch to see how the RAC responds to disconnected ports. In a more sophisticated switch topology you can build highly redundant bondX interfaces in your RAC from multiple ethX interfaces, then create and destroy ports to see if the HA works as it should, try different BONDING_OPTS in your ifcfg-ethX files, etc.
For this kind of testing, I build little scripts that let me remove and add back a port at the switch level. For example, create two small files, each containing one of the commands below, and chmod 755 them. You can use these files to “turn on” and “turn off” a switch port.
ovs-vsctl add-port sw1 n1a1sw1 tag=10
ovs-vsctl del-port n1a1sw1
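The two one-line files can also be folded into a single helper. This is only a sketch: the function name toggle_port and the DRY_RUN variable are made up for illustration, while the ovs-vsctl add-port and del-port commands are the same ones shown above.

```shell
# toggle_port (hypothetical helper): "plug" or "unplug" a tap on sw1.
# Set DRY_RUN=1 to print the command instead of running it.
toggle_port() {
    # $1 = on|off, $2 = tap name, $3 = VLAN tag (used for "on" only, default 10)
    action=$1; tap=$2; tag=${3:-10}
    case "$action" in
        on)  cmd="ovs-vsctl add-port sw1 $tap tag=$tag" ;;
        off) cmd="ovs-vsctl del-port $tap" ;;
        *)   echo "usage: toggle_port on|off TAP [VLAN]" >&2; return 1 ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"          # show what would be run
    else
        $cmd                 # run it (needs root)
    fi
}
```

For example, as root, "toggle_port off n1a1sw1" unplugs node 1's public adapter and "toggle_port on n1a1sw1 10" plugs it back into VLAN 10.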
In what follows, note this: The openvswitch itself will persist across reboots of your Ubuntu host because it is stored in a configuration database by openvswitch. However, the “taps” will not persist across host reboots, so they will have to be recreated one way or another if you reboot your Ubuntu host. You can tell if they are there or not by running:
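One possible check, offered as an assumption rather than as the only way to do it, is to look for each tap under /sys/class/net, where Linux lists every network interface. The function names here are hypothetical; the tap names are the ones used in this post.

```shell
# Return success if a network interface with the given name exists on this host.
tap_exists() {
    [ -e "/sys/class/net/$1" ]
}

# Report the state of the four taps used by the two RAC nodes.
check_taps() {
    for tap in n1a1sw1 n1a2sw1 n2a1sw1 n2a2sw1; do
        if tap_exists "$tap"; then
            echo "$tap present"
        else
            echo "$tap MISSING"
        fi
    done
}
```

Running check_taps after a host reboot should report all four as MISSING until the taps have been recreated.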
So now we create the OpenvSwitch.
# Create an OpenvSwitch 4-port virtual switch
ovs-vsctl --if-exists del-br sw1
ovs-vsctl add-br sw1
ip tuntap del mode tap n1a1sw1
ip tuntap del mode tap n1a2sw1
ip tuntap del mode tap n2a1sw1
ip tuntap del mode tap n2a2sw1
ip tuntap add mode tap n1a1sw1
ip tuntap add mode tap n1a2sw1
ip tuntap add mode tap n2a1sw1
ip tuntap add mode tap n2a2sw1
ip link set n1a1sw1 up
ip link set n1a2sw1 up
ip link set n2a1sw1 up
ip link set n2a2sw1 up
# The “tag” sets a VLAN. This gets more important in more complex switching topologies.
# Our VirtualBox Adapter 1’s will be on VLAN 10 and Adapter 2’s on VLAN 20.
ovs-vsctl add-port sw1 n1a1sw1 tag=10
ovs-vsctl add-port sw1 n1a2sw1 tag=20
ovs-vsctl add-port sw1 n2a1sw1 tag=10
ovs-vsctl add-port sw1 n2a2sw1 tag=20
ifconfig n1a1sw1 mtu 9000
ifconfig n1a2sw1 mtu 9000
ifconfig n2a1sw1 mtu 9000
ifconfig n2a2sw1 mtu 9000
Here’s the output for the correctly built switch:
root@U-Aspire-X3400:~# ovs-vsctl show
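Because the tap names follow the nX/aX/swX scheme, the same build can be generated with two loops. The sketch below only prints the commands; the function name build_switch_cmds is hypothetical, and the per-tap ordering differs slightly from the hand-written script, although the commands themselves are the same ones used there.

```shell
# Print the commands that build a 4-port switch for 2 nodes x 2 adapters.
# Nothing is executed here; pipe the output to "sh" as root to apply it.
build_switch_cmds() {
    sw=${1:-sw1}
    echo "ovs-vsctl --if-exists del-br $sw"
    echo "ovs-vsctl add-br $sw"
    for node in 1 2; do
        for adapter in 1 2; do
            tap="n${node}a${adapter}${sw}"
            vlan=$((adapter * 10))   # Adapter 1 -> VLAN 10, Adapter 2 -> VLAN 20
            echo "ip tuntap del mode tap $tap"
            echo "ip tuntap add mode tap $tap"
            echo "ip link set $tap up"
            echo "ovs-vsctl add-port $sw $tap tag=$vlan"
            echo "ifconfig $tap mtu 9000"
        done
    done
}
```

"build_switch_cmds sw1" lets you review the commands first, and "build_switch_cmds sw1 | sh" runs them. The same function makes it easy to stamp out sw2, sw3 and so on for a multi-switch topology.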
So now we can follow Tim’s excellent instructions here for building a 2-node RAC, with the only change being that we use OpenvSwitch instead of the VirtualBox Host-Only or Internal network.
That change is in the networking setup of the VMs. Where the Oracle-Base guide configures the network adapters, we want both adapters to be bridged (don’t use internal) to our OpenvSwitch taps, so that our RAC network traffic runs over OpenvSwitch instead of over the VirtualBox-provided internal or host-only networks. These are the steps in the Oracle-Base guide that we will alter slightly. For the build, just make sure both adapters are set to bridged and select the correct OpenvSwitch tap for each adapter as explained below.
In addition, I recommend you add a 3rd network adapter (Adapter 3) which is set to “NAT” on each VM. We’ll take the easy route and have this so that we have off-box internet if we need or want it which is an alternative to trying to introduce off-box internet via the OpenvSwitch itself which gets into gateways and other side issues. It’s an interesting problem to return to though later.
On VM node 1
“Adapter 1” is enabled, set to “Bridged Adapter” and “n1a1sw1”,
“Adapter 2” is enabled, set to “Bridged Adapter” and “n1a2sw1”.
When you later clone the VM as explained on the Oracle-Base page, you will need, in addition to all of the Oracle-Base steps, to update the node 2 network adapters to use the openvswitch taps for node 2 (see below).
In addition to those changes, expand the “Advanced” features of the network adapters and set all the adapters that are going to be attached to openvswitch (i.e. Adapter 1 and 2) to “Promiscuous Mode: Allow All” on both VM RAC nodes. I believe the reason is that, because packet flows are managed by these flow controllers, the packets that arrive at the VM may appear to have originated not from a RAC node but from an OVS switch or an OpenFlow controller, so we have to tell the virtual network cards to allow such flows. Also, I recommend you use “Adapter Type: Intel PRO/1000 MT Server (82545EM)” for Adapter 1 and 2 on both VMs. This is what I used, and I’m not sure the PCNet adapters will support MTU 9000 and all the other necessary features.
The picture here comes directly from the Oracle-Base build page. Where it shows “Name: eth0” below “Bridged Adapter”, change the drop-down to “n1a1sw1”, and so on for each adapter.
On VM node 2 make sure:
“Adapter 1” is enabled, set to “Bridged Adapter” and “n2a1sw1”,
“Adapter 2” is enabled, set to “Bridged Adapter” and “n2a2sw1”.
Be sure to go into Advanced here too and set “Promiscuous Mode: Allow All” again for all adapters that will be attached to OpenvSwitch.
With these changes, your RAC will run on OpenvSwitch. You are already halfway to a valid Software Defined Network, complete with a virtual switch and an Open Networking Foundation (ONF) OpenFlow controller (Floodlight or, alternatively, Beacon). Note that OpenvSwitch comes with its own very robust OpenFlow controller called openvswitch-controller, so this RAC setup will work even without Floodlight or Beacon. But because the SDN idea is to separate the control plane from the data plane, we really need Floodlight or one of the other options to have a true ONF OpenFlow architecture, since openvswitch-controller in this deployment runs on the same server as openvswitch itself. It should be easy enough to install openvswitch-controller on another box and run the control plane from there, which would be more like what SDN is about, but I haven’t explored that so I can’t say yet whether it’s possible, and I don’t know whether openvswitch-controller has the API features present in the programmable Java- and Python-based flow controllers.
What makes Floodlight and its friends so useful is that they have an API and a way to push flows down to the virtual switch, which is a big part of what SDN is all about: centralized programmatic control of standardized switching, routing, load balancing and other network devices built to a common API standard and programmed from a management plane, rather than a forest of proprietary devices each running a different proprietary management interface locally (which is becoming a management nightmare). Hence the dawn of the SDN era, although there are other compelling reasons too, largely around VMs and vMotion needs.
Once your RAC is built, do ping tests. The switch is set to use MTU 9000 jumbo frames, so if you set the interfaces on your VMs to use jumbo frames, you should be able to pass jumbo frame ping tests. Use the following format for the ping tests:
ping -s 16384 -I 192.168.0.1 192.168.0.2 (example for a typical interconnect)
This says: ping with a packet of size 16384 and use 192.168.0.1 as the sending interface. The 16384 tests the MTU 9000 configuration on the ethX interfaces, the virtual switch and the virtual switch ports. If the ping fails, take out the -s 16384 and try again. If it works then, you have probably forgotten an MTU 9000 setting somewhere and need to track it down. Note also that in VirtualBox only bridged adapters will transmit jumbo frames; none of the other types will.
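A complementary check is to send the largest packet that fits in a single 9000-byte frame with fragmentation forbidden. For IPv4 the ICMP payload is the MTU minus 20 bytes of IP header and 8 bytes of ICMP header, so 8972 bytes for MTU 9000. The sketch below builds that command; it assumes the Linux iputils ping, where "-M do" sets the don't-fragment bit, and the function names are hypothetical.

```shell
# Largest ICMP payload that fits in one IPv4 frame of the given MTU:
# MTU - 20 (IPv4 header) - 8 (ICMP header).
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Print a non-fragmenting jumbo ping command.
# $1 = source IP, $2 = destination IP, $3 = MTU (default 9000)
jumbo_ping_cmd() {
    echo "ping -c 3 -M do -s $(icmp_payload_for_mtu "${3:-9000}") -I $1 $2"
}
```

"jumbo_ping_cmd 192.168.0.1 192.168.0.2" prints the command to run on the first RAC node. If that ping fails while a plain ping works, some hop between the two nodes is probably still at a smaller MTU.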
To set your VM interfaces to MTU 9000 go to the OS of the VM RAC nodes and edit:
and add at the end of the file:
and then reboot or do a “service network restart”.
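As a sketch, assuming a Red Hat-style guest where each interface has an ifcfg-ethX file under /etc/sysconfig/network-scripts and the setting is an MTU=9000 line (both are assumptions about your guest OS, so check them against your own VM), a small helper can make the change safe to rerun. The function name set_ifcfg_mtu is hypothetical.

```shell
# Set MTU=<value> in an ifcfg-style file: update the line if one exists,
# otherwise append it at the end of the file.
# $1 = path to the ifcfg file, $2 = MTU (default 9000)
set_ifcfg_mtu() {
    file=$1; mtu=${2:-9000}
    if grep -q '^MTU=' "$file"; then
        sed -i "s/^MTU=.*/MTU=$mtu/" "$file"   # GNU sed in-place edit
    else
        echo "MTU=$mtu" >> "$file"
    fi
}
```

For example, "set_ifcfg_mtu /etc/sysconfig/network-scripts/ifcfg-eth1" run as root on each RAC node, followed by the network restart mentioned above.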
Note that the vswitch can be finicky about things sometimes. You might have to do a “service network restart” at your VM node level, and possibly a
at the Ubuntu level. You get a feel for it after a while. It gives you an idea why Nicira took the $1.2 billion USD. It’s gonna take $$$ to get this software running perfectly, and because it’s networking, to be taken seriously it’s going to have to be six sigma 99.9999% reliable software. It may not be there yet: I find that sometimes I have to restart networking when I “unplug” things programmatically. It’s not quite as robust as plugging and unplugging a real network cable; sometimes you have to restart the stack when you do things like that.
OK, now you have RAC and you have openvswitch carrying your traffic. Now you need Floodlight. Start with Floodlight because it’s the easiest to set up and configure. Also, the GUI is cool; I like the Topology diagram best. Try out Floodlight, use it for a day or two, and then try the Beacon deployment (later in this post). I personally found the Beacon deployment much more understandable once I had been using Floodlight for a couple of days. Getting back to Floodlight: to install it, we use some steps from Brent’s post here.
This is the part at Brent’s post that we need:
Install dependencies, apt-get for UB and yum for RH:
$apt-get install build-essential default-jdk ant python-dev eclipse git
Clone the GitHub project and build the jar (we start the controller a little further down):
$git clone git://github.com/floodlight/floodlight.git
$cd floodlight
$ant
Note: before you start floodlight, check what is running already on port 6633 (Floodlight default port)
netstat -an | grep 6633
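A plain grep for 6633 also matches other ports that contain those digits (16633, for example) and connections whose remote end is on 6633. A slightly stricter filter, sketched here with a hypothetical function name, keeps only TCP lines in the LISTEN state whose local address ends in the port. It assumes the usual Linux "netstat -ant" column layout (local address in column 4, state in column 6).

```shell
# Read "netstat -ant" output on stdin and print only the lines where
# something is LISTENing on the given local port.
# $1 = port number
listeners_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")      # local address, e.g. 0.0.0.0:6633 or :::6633
            if (parts[n] == port) print
        }'
}
```

Usage: "netstat -ant | listeners_on_port 6633". No output means the port is free.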
The openvswitch-controller runs on port 6633 also. We can either stop it before we run Floodlight below, or we can change the default port of Floodlight (preferred option) so here’s how we change the Floodlight default port.
The floodlight.properties on my system is installed at /home/gstanden/floodlight/src/main/resources/floodlight.properties
You can edit the line with port 6633 in it and change it to for example port 6636. Now you can have both the openvswitch-controller and the floodlight controller running at the same time.
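If you would rather script the edit, here is a minimal sketch. It does not assume any particular property name: it rewrites any uncommented "key = value" line whose value is exactly the old port, and it keeps a backup copy first. The function name change_port is hypothetical.

```shell
# Replace old_port with new_port on "key = value" lines of a properties file.
# $1 = properties file, $2 = old port, $3 = new port
change_port() {
    file=$1; old=$2; new=$3
    cp "$file" "$file.bak"     # keep a backup next to the original
    # Match lines that are not comments and whose value is exactly the old port.
    sed -i "s/^\([^#=]*=[[:space:]]*\)$old[[:space:]]*\$/\1$new/" "$file"
}
```

For example: "change_port /home/gstanden/floodlight/src/main/resources/floodlight.properties 6633 6636". If you have already built the jar, rebuilding after the edit is the safe choice, since I have not checked whether a built jar picks up the changed file at run time.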
$java -jar target/floodlight.jar
Leave this window open. The output from Floodlight running will look something like this:
“Listening for switch connections on 0.0.0.0/0.0.0.0:6636” (or whatever port you are using)
and as time goes by it will give this message at intervals
“Sending LLDP out on all ports.”
Later, when we hook it up to an OpenvSwitch, it will display a running stream of debug output related to your network traffic being processed by the Floodlight OpenFlow network controller. Once the service is running you can go to the web GUI and view topologies, flows, etc., or use curl statements to add or remove datapaths.
View the floodlight GUI in a web browser http://localhost:8080/ui/index.html
The GUI has a neat topology tool which will show your switch and what is sending traffic through it. You can view flows, hosts, and switches in the GUI as well. It gives very interesting insights into how RAC actually transmits its traffic.
If you want to switch back and forth between the openvswitch-controller running on port 6633 and the Floodlight controller on port 6636 (or whatever port you chose), you can do that with the commands below (type them at the command line using “sudo”, or connect as root using “sudo su -”).
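The same information is available from the command line through Floodlight's REST interface on the GUI port. The sketch below builds the URL for the endpoint that, as far as I recall from the Floodlight REST API, lists the switches connected to the controller; treat the path as an assumption and check it against the documentation for your Floodlight version. The function name is hypothetical.

```shell
# Build the URL of the Floodlight REST call that lists connected switches.
# $1 = controller host (default localhost), $2 = REST port (default 8080)
floodlight_switches_url() {
    host=${1:-localhost}; port=${2:-8080}
    echo "http://$host:$port/wm/core/controller/switches/json"
}

# Example (run while Floodlight is up):
#   curl -s "$(floodlight_switches_url)"
```

Once sw1 is pointed at Floodlight, the JSON that comes back should include an entry for it.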
So here are typical commands which connect the OpenvSwitch to an OpenFlow controller. Here is how to connect to the default openvswitch-controller running on the same box as the OpenvSwitch:
$ sudo ovs-vsctl set-controller sw1 tcp:127.0.0.1:6633
To connect the OpenvSwitch to Floodlight, use a command like this:
$ sudo ovs-vsctl set-controller sw1 tcp:127.0.0.1:6636
This is the case where Floodlight is running on the same box as OpenvSwitch. If you have Floodlight on another physical box on your network, or in a separate Ubuntu VM, use the IP of that box in place of 127.0.0.1 with everything else the same (make sure you have the right port for Floodlight). Note that if the controller is on a different box, such as a separate VM, it has to be networked to the RAC VMs using normal VirtualBox networking (or any other method you wish to use), and it has to be able to communicate with your OpenvSwitch.
So you can eventually have several controllers running on different VMs and as long as they are all reachable, you can switch your traffic with similar commands to above to use whatever controller you want.
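When you switch between several controllers often, a tiny wrapper saves typing. This sketch only prints the ovs-vsctl command for a given bridge, controller IP and port; the function names are hypothetical, while set-controller and the tcp:IP:PORT target format are the ones used above.

```shell
# Format an OpenFlow controller target for ovs-vsctl.
# $1 = controller IP, $2 = controller port
controller_target() {
    echo "tcp:$1:$2"
}

# Print the command that points a bridge at a controller.
# $1 = bridge, $2 = controller IP, $3 = controller port
switch_controller_cmd() {
    echo "ovs-vsctl set-controller $1 $(controller_target "$2" "$3")"
}
```

For example, "sudo $(switch_controller_cmd sw1 127.0.0.1 6636)" points sw1 at Floodlight on the local box. Afterwards, "sudo ovs-vsctl get-controller sw1" shows which controller the switch is currently set to use.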
After working with Floodlight and OpenvSwitch for a while, you can move on to try Beacon, which is also a Java project. The connection between Floodlight and Beacon is described here.
What’s the connection between Floodlight and Beacon?
Beacon was created by David Erickson of Stanford University as a cross platform, Java-based OpenFlow controller. It is currently licensed under GPL V2. Prior to assigning this license, Beacon was forked to create Floodlight, which carries on with an Apache license. Floodlight has been redesigned without the OSGI framework so it can be built, run, and modified without OSGI experience. Additionally, Floodlight’s community currently includes a number of developers at Big Switch Networks who are actively testing and fixing bugs and building additional tools, plugins, and features.
To set up Beacon, you can either put it on the same computer you used to create the above virtualized system, or you can take things to the next step and put it on another computer on the same network (or on its own VM, for example) to see how OpenFlow controllers can be (and usually are) remote from the networks they manage. Here’s how to set up Beacon. First watch this tutorial on YouTube:
Conveniently for us, this is also done on Ubuntu 12.04, but as the presenter states, it can be done on Windows or OS X as well. Once you have watched the video through, go ahead and install Beacon. You can also refer to this guide at the Beacon website. I used the “Develop Using Eclipse” option here.
Note that, as the guide mentions but does not emphasize, you MUST use Oracle (Sun) Java JDK 6 because 7 will not work. I hit all kinds of errors, but once I had JDK 6 all was good. There is an excellent tutorial for installing Oracle JDK 6 here.
If you decide to keep both 6 and 7 on your machine using java alternatives, be sure to select JDK 6 in the Eclipse GUI when you go to build Beacon.
Beacon has its own Web GUI at http://localhost:8080 . It’s very different from the Floodlight GUI and has its good points and bad points. You can compare the two.
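With two JDKs installed it is easy to lose track of which one is active on the command line. The sketch below parses the output of "java -version" and prints the major version, so that 1.6.0_35 gives 6. The function name java_major is hypothetical, and this only checks the java on your PATH, not the JDK selected inside Eclipse.

```shell
# Read "java -version" output on stdin and print the major Java version.
# Handles the "1.x" numbering used by Java 6 and 7 (1.6.0_35 -> 6).
java_major() {
    sed -n 's/.*version "\([0-9]*\)\.\([0-9]*\).*/\1 \2/p' | head -n 1 | {
        read first second
        if [ "$first" = "1" ]; then
            echo "$second"
        else
            echo "$first"
        fi
    }
}

# Example (java -version writes to stderr, hence the redirect):
#   java -version 2>&1 | java_major
```

If this prints 7, "sudo update-alternatives --config java" on Ubuntu lets you pick the JDK 6 entry instead.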
Now you have two development platforms for creating and developing OpenFlow controller code. You can push flows down to your virtual switches etc. and as they say, develop the next great application for SDN.
There are other openflow controllers out there, notably in Python as well.
I have a slightly more complex OpenvSwitch configuration here, to give an idea of where one can go on the OpenvSwitch front as far as multi-switch build-out.
There is also a very good tutorial here (they also offer a VirtualBox VM for open download with mininet and other OpenFlow goodies pre-built on it) in which you can use “mininet” to quickly set up “mock” network topologies of various types. For those of us who have waited quite a long time for a way to put our virtual machines together with equally virtualized network components, it seems our ship has come in!
Gil Standen, Logicworks, NYC, September 22, 2012
PS: VirtualBox 4.2 was released this week. It has great new features, including support for 32 NICs per VM, VLAN tagging, and a new way to group VMs logically in the management console (very handy). I found, however, that the guest additions that came with 4.2 had some issues with the mouse; I reverted to the 4.1 guest additions and the mouse problems were gone. I notified the VirtualBox developers, and they replied that they were aware of mouse problems with the guest additions in the RC, so this may have squeaked into the first 4.2 release. Be sure to take snapshots of your VMs before upgrading their guest additions to 4.2. Hopefully another release will come shortly with a fix for this guest additions mouse issue.