I’ve been running IOS simulations in Dynamips for a number of years now, more recently through Dynagen and the ever-expanding GNS3. Along the way I’ve found a few guidelines that stay useful in any situation for maximizing the number of routers you can simulate on a given host machine.
How to Approach Simulation
Using dynamips for a while has taught me to simulate as little of the network as possible, because CPU is a valuable commodity and simulating routers eats up CPU pretty quickly. So wherever possible, I’ll figure out which routers need to be realistic in the simulation (i.e. receive dynamic routes, process them, have similar configs to production), and which parts of the network I can simulate with a single router that just injects routes.
IP addressing should match production wherever possible (you don’t want to have to convert IPs from lab configs to production, as that usually introduces errors). Similarly, take a moment to match things like OSPF process IDs, BGP ASNs, and so on. Doing so makes it much easier to migrate configurations from lab to production via your change script, so it’s worth the extra effort up front.
Think about what you are actually testing. For example, you may have a firewall in a simulation, running OSPF. If you’re not testing the firewall itself, and it’s just a transit device, why not simulate it with a router instead and keep your simulation simpler? You may have a redundant pair of edge routers, but if that redundancy isn’t an important part of your simulation, do you need both routers? Would a single router suffice?
If your network change involves manipulating routing, you need the routing tables to replicate reality as accurately as possible. You can’t easily simulate the entirety of the rest of the network, so instead I’ll grab a copy of the routing table from the production device and reverse engineer it so that I can inject the same routes with, as best I can, the same metrics. If you can’t script (I’m a perl addict) this can be tricky. If you can, you quickly build up a library of little scripts to process configurations and generate commands to inject routes.
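As an illustration of the kind of little script described above, here is a minimal sketch in Python (the author works in Perl; the language here is simply a substitution). It assumes classic IOS `show ip route` output in which each route line carries an explicit prefix length and an `[distance/metric]` pair. Lines without those, such as connected routes or subnets listed under a fixed-mask header, are skipped. The function name and the sample output are invented for the example.

```python
import ipaddress
import re

# Matches route lines such as:
#   O E2    10.1.2.0/24 [110/20] via 10.0.0.1, 00:00:12, FastEthernet0/0
# Assumption: the prefix length appears on the line itself.
ROUTE_RE = re.compile(
    r"^(?P<code>\S+(?: (?:IA|E1|E2|N1|N2|EX))?)\s+"
    r"(?P<prefix>\d+\.\d+\.\d+\.\d+/\d+)\s+"
    r"\[(?P<distance>\d+)/(?P<metric>\d+)\]"
)


def parse_routes(show_ip_route_text):
    """Return one dict per route line that carries a prefix and a metric."""
    routes = []
    for line in show_ip_route_text.splitlines():
        match = ROUTE_RE.match(line)
        if not match:
            continue  # headers, connected routes, anything unrecognised
        network = ipaddress.ip_network(match.group("prefix"), strict=False)
        routes.append({
            "code": match.group("code"),
            "network": str(network.network_address),
            "mask": str(network.netmask),
            "metric": int(match.group("metric")),
        })
    return routes


# Invented sample output, for illustration only.
SAMPLE = """\
     10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
O E2    10.1.2.0/24 [110/20] via 10.0.0.1, 00:00:12, FastEthernet0/0
O IA    10.1.3.0/24 [110/3] via 10.0.0.1, 00:00:12, FastEthernet0/0
C       10.0.0.0/30 is directly connected, FastEthernet0/0
"""

if __name__ == "__main__":
    for route in parse_routes(SAMPLE):
        print(route)
```

The parsed network, mask and metric can then feed whatever command generator suits the injection method you choose.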
If I’m injecting BGP routes my approach is usually to configure static routes to null0 on the router that will act as the route source. I use route tags to mark similar routes, then redistribute static into BGP via a route-map that can set appropriate parameters on the routes based on the tags. In a recent simulation I had to simulate the MPLS service for two service providers. Rather than using two routers, I used route tags to allow me to distinguish routes that were unique to CarrierA, unique to CarrierB, and those carried by both. I then used a single router and appropriate route-maps to peer with two simulated CE routers that would then redistribute the routes into OSPF so they were injected into the network. The fact that I didn’t have two different ASNs for the simulated MPLS didn’t matter because the network I was testing only cared about the OSPF. But it was important that I could manipulate the BGP to OSPF redistribution and see the result – again, it’s about knowing what you are testing. You can also inject BGP routes from external sources; Jeremy Gaddis has an excellent post on getting BGP routes into dynamips using perl, which is great especially if you need huge numbers of routes.
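The tagged-static approach can be sketched as a small generator. This is an illustration under assumptions, not the configuration actually used in that simulation. The tag numbers, group names, route-map name, ASN and prefixes are all hypothetical, and `set metric` stands in for whichever parameters you would really set per tag.

```python
import ipaddress

# Hypothetical tag values; any unused integers would do.
TAGS = {"carrier_a": 100, "carrier_b": 200, "both": 300}


def static_routes(prefixes_by_group, tags=TAGS):
    """Return one 'ip route ... Null0 tag N' line per prefix."""
    lines = []
    for group, prefixes in prefixes_by_group.items():
        for prefix in prefixes:
            net = ipaddress.ip_network(prefix)
            lines.append(
                f"ip route {net.network_address} {net.netmask} "
                f"Null0 tag {tags[group]}"
            )
    return lines


def redistribution_config(asn, map_name, tag_to_metric):
    """Return a route-map with one clause per tag, plus the BGP redistribute line."""
    lines = []
    for index, (tag, metric) in enumerate(sorted(tag_to_metric.items()), start=1):
        lines += [
            f"route-map {map_name} permit {index * 10}",
            f" match tag {tag}",
            f" set metric {metric}",  # example parameter only
        ]
    lines += [
        f"router bgp {asn}",
        f" redistribute static route-map {map_name}",
    ]
    return lines


if __name__ == "__main__":
    # Invented prefixes, for illustration only.
    groups = {
        "carrier_a": ["10.1.0.0/16"],
        "carrier_b": ["10.2.0.0/16"],
        "both": ["10.3.0.0/16"],
    }
    config = static_routes(groups) + redistribution_config(
        65000, "STATIC-TO-BGP", {100: 50, 200: 60, 300: 70}
    )
    print("\n".join(config))
```

Keeping the prefixes grouped by tag in one data structure means a route can be moved from one carrier to the other by editing a single entry and regenerating the config.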
Simulating End Points
There are two easy ways to simulate end points for ‘traffic testing’. The first involves using loopbacks on routers. This is a little bit of a cheat, but it can suffice for basic testing. If you want to move to a true end point connected to an interface, check out the VPCS (Virtual PC Simulator) capabilities now in GNS3. It’s incredibly unintuitive to use, but once you get to grips with adding hosts for VPCS, you can ping and traceroute to your heart’s content from real LAN-connected endpoints. VPCS is pretty memory and CPU efficient so it’s a good solution to the problem of simulating users, but the trade-off is that it’s pretty limited in what it can do. Quite often that’s a reasonable compromise though.
Pick Your Routers and Images Carefully
Simple tip, more based on feeling than fact, but I try to run simulations on the slowest old dog routers available (Cisco 3640) where possible, rather than running a more powerful, newer model. I do this because the 3640 has the slowest CPU of all the models simulated, and since dynamips has to emulate each CPU cycle, the fewer cycles that need to be emulated, the less work dynamips has to do. Sounds convincing, right? I don’t know if it’s true, but I’ve done pretty well building large simulations on my laptop with this method.
I also typically run Ethernet interfaces rather than serial interfaces, as the routing is usually more important to me than the medium over which the packets are sent, and Ethernet interfaces are easy. If I need more than 4 interfaces, I’ll have to choose something bigger than a 3640, but generally the 3640 does the job.
Depending on what features you need, choosing an IOS image to run can be a tough call. In an ideal world you want to run the exact same image as you do in production, right? But if you can’t (and quite often you’re not simulating the same hardware in any case), I like to choose an image that maximizes features against memory and flash. For me that means avoiding “T” trains where possible, for example. On the 3640 I run a 12.x service provider image that needs only 64MB of RAM, and yet is capable of running 99% of all my routing simulations.
Can you achieve, with a VRF, what you would have achieved with another router? VRF lite can sometimes reduce your simulated router needs by effectively acting as two isolated routers, without doubling the resource overhead.
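To make the VRF lite idea concrete, here is a minimal sketch that generates the configuration for one simulated router standing in for two isolated ones. It assumes the chosen IOS image supports the `ip vrf` commands. The VRF names, route distinguishers, interfaces and addresses are all hypothetical.

```python
def vrf_lite_config(vrfs):
    """Return IOS config lines defining each VRF and binding one interface to it.

    Each item in vrfs is a dict with the keys:
    name, rd, interface, address, mask.
    """
    lines = []
    # Define every VRF first, each with its own route distinguisher.
    for vrf in vrfs:
        lines += [f"ip vrf {vrf['name']}", f" rd {vrf['rd']}"]
    # Then bind interfaces. 'ip vrf forwarding' goes before 'ip address'
    # because applying it removes any address already on the interface.
    for vrf in vrfs:
        lines += [
            f"interface {vrf['interface']}",
            f" ip vrf forwarding {vrf['name']}",
            f" ip address {vrf['address']} {vrf['mask']}",
            " no shutdown",
        ]
    return lines


if __name__ == "__main__":
    # Invented values, for illustration only.
    example = [
        {"name": "SITE_A", "rd": "65000:1", "interface": "Ethernet0/0",
         "address": "10.0.1.1", "mask": "255.255.255.0"},
        {"name": "SITE_B", "rd": "65000:2", "interface": "Ethernet0/1",
         "address": "10.0.2.1", "mask": "255.255.255.0"},
    ]
    print("\n".join(vrf_lite_config(example)))
```

Each VRF keeps its own routing table, so the two interfaces behave as if they belonged to separate routers while sharing one emulated CPU.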
Set an IdlePC value. Don’t be afraid to re-generate the IdlePC value if you change modules on the router. We found in the past that even changing the underlying host (e.g. to a PC with a different processor) was enough to make a previously effective IdlePC value ineffective. It’s so easy to do in GNS3, there’s no reason not to try it whenever you notice CPU is rising a bit.
What else should I be doing to get the most out of my simulated environment?
Good tips. I too do a lot of GNS3 simulations and have blogged about it in the recent past. CPU is always a concern so I do heed your tips about simulating the bare minimum and sticking to the “spirit” of the test rather than the emulation of the physical environment.
I’ve used VPCS in the past to get the end host feel a little better than loopbacks on routers. I’ve also connected my simulation to a “cloud” icon that maps to the loopback of my host PC so I can get things like traps and syslogs to a collector (Perl script – thought you’d like that) on my physical host. Sometimes, that just isn’t enough.
I’ve used Qemu quite successfully with GNS3 and “custom built” a TinyCore linux image with additional useful packages beyond the standard. Things like Apache, HAProxy, ISC-DHCP … That allows me to test IPv4 and IPv6 for things like web server, proxy services, load balancing, 6-to-4 address family translation on the load balancer, DHCPv6 … It even works as a multicast receiver with an ‘ip’ command MAC add and a ‘tcpdump’ running.
We need to be creative when simulating in the virtual world!
That TinyCore image sounds great! Sounds like you have some fabulous simulation capability there – much more than I do! Thanks for sharing details.