
Hello, Habr! My name is Dmitry, and I'm a developer of DCImanager, an equipment management panel from ISPsystem. I spent quite a long time on the team that develops software for managing switches. Together we went through ups and downs: from writing services for managing hardware to bringing down the office network and spending hours in the server room in the hope of not losing our loved ones.

And now it's time for testing. We were able to cover some of the handlers with ready-made testing solutions, but it didn't work out with Juniper. The approach and its implementation became the idea for this article. If you're interested, welcome under the cut!

DCImanager works with different types of equipment: switches, power distribution units, servers. DCImanager currently supports four switch handlers: two using SNMP (Cisco Catalyst and generic SNMP) and two using NETCONF (Juniper with and without ELS).

We cover all work with equipment thoroughly with tests. Real equipment cannot be used for automated testing: tests are run on every push and run in parallel. Therefore, we try to use emulators.

We were able to cover the handlers that use SNMP with the SNMP Agent Simulator library (snmpsim). With Juniper, however, there were problems. After searching for ready-made solutions, we picked a couple of libraries, but one of them did not start, and the other did not do what we needed; I spent most of the time just trying to revive this miracle.

The question arose: how do we emulate the operation of Juniper switches? Juniper is managed over the NETCONF protocol, which in turn runs on top of SSH. The idea came to me to write a small service that works on top of SSH and emulates the switch. Accordingly, we need the service itself, as well as a Juniper snapshot to supply the data.

In snmpsim, a snapshot is a complete copy of the state of a switch, with all of its supported OIDs and their current values. With Juniper, everything is a little more complicated: such a snapshot cannot be taken. Here, by snapshot we mean a set of request-response templates.
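
As a rough illustration of what a request-response snapshot could look like, here is a minimal sketch. The `SNAPSHOT` dictionary, the `lookup_reply` helper, and the canned replies are hypothetical and are not taken from DCImanager; the operation names are only examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot: operation name -> canned reply body (sample data only)
SNAPSHOT = {
    "get-interface-information": (
        "<interface-information><physical-interface>"
        "<name>ge-0/0/0</name>"
        "</physical-interface></interface-information>"
    ),
    "get-vlan-information": "<vlan-information/>",
}


def lookup_reply(rpc_xml):
    """Return the canned reply for the first operation inside <rpc>, or None."""
    root = ET.fromstring(rpc_xml)
    if len(root) == 0:
        return None  # an empty <rpc/> carries no operation
    operation = root[0].tag.split("}")[-1]  # drop the XML namespace, if any
    return SNAPSHOT.get(operation)
```

A real snapshot would probably also need to match on request parameters, which this sketch ignores.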

Part One: The planting architecture

We are now actively expanding the "zoo" of handlers for working with switches. New handlers will appear soon, and we will not be able to cover all of them with ready-made testing solutions. However, we can try to design a general architecture for a service that simulates various devices over different protocols.

In the simplest case, this is a factory that, depending on the protocol and the handler (some switches can work over several protocols), returns a switch object in which all the logic of its behavior is already implemented. In the case of Juniper, this is a small request parser: depending on the incoming RPC request and its parameters, it performs the necessary actions.
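
To make the factory idea more concrete, here is a minimal sketch under my own assumptions. The mock classes, the registry, and the handler names are hypothetical; only the general shape (handler plus protocol in, switch object out) comes from the description above.

```python
class JuniperMock:
    """Hypothetical stand-in for a switch object that speaks NETCONF."""
    protocol = "netconf"


class CiscoCatalystMock:
    """Hypothetical stand-in for a switch object that speaks SNMP."""
    protocol = "snmp"


class SwitchFactory:
    # (handler, protocol) -> class implementing the device behavior
    _registry = {
        ("juniper", "netconf"): JuniperMock,
        ("cisco_catalyst", "snmp"): CiscoCatalystMock,
    }

    def get(self, handler, protocol=None):
        """Return a switch object; protocol may be omitted when unambiguous."""
        matches = [
            cls for (name, proto), cls in self._registry.items()
            if name == handler and protocol in (None, proto)
        ]
        if len(matches) != 1:
            raise KeyError(f"no unique switch mock for {handler!r} / {protocol!r}")
        return matches[0]()
```

Keying the registry on both values leaves room for a switch that is reachable over more than one protocol, while a caller who only knows the handler name can still write `SwitchFactory().get("juniper")`.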

Important limitation: we will not be able to fully simulate the operation of the switch. Describing all of its logic would take a long time, and whenever we add new functionality to the real handler, we will have to edit the switch mock as well.

Part Two: Choosing the soil for planting

My eye fell on the paramiko library, which provides a convenient interface for working with the SSH protocol. To begin with, I didn't want to build out the whole architecture, only to check basic things such as connectivity and a simple request. So that is what we do. For now, we don't bother with proper authorization: a simple ServerInterface together with a socket server gives us something resembling a working option:

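    import socket

    import paramiko

    # Placeholder credentials for the mock server
    SSH_USER_NAME = "admin"
    SSH_USER_PASSWORD = "admin"


    class SshServer(paramiko.ServerInterface):
        def check_auth_password(self, user, password):
            if user == SSH_USER_NAME and password == SSH_USER_PASSWORD:
                return paramiko.AUTH_SUCCESSFUL
            return paramiko.AUTH_FAILED


    # Named "sock" so that it does not shadow the socket module
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", 8300))
    sock.listen(10)
    client, address = sock.accept()

    session = paramiko.Transport(client)
    # paramiko needs a host key before it can act as a server
    session.add_server_key(paramiko.RSAKey.generate(2048))
    server = SshServer()
    session.start_server(server=server)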

An approximate implementation of what I would like to see, but it looks scary

When a client connects to the server, the server should respond with a list of its capabilities. For example, like this:

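    reply = """
    <hello>
      <capabilities>
        <capability>urn:ietf:params:xml:ns:netconf:base:1.0</capability>
        <capability></capability>
        <capability></capability>
      </capabilities>
      <session-id>1</session-id>
    </hello>
    ]]>]]>
    """
    # Assumption: the client has already opened a channel on this transport;
    # the reply goes over that SSH channel, not over the raw socket
    channel = session.accept()
    channel.send(reply.encode())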

Yes, this is XML ]]>]]>
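
The `]]>]]>` at the end of the message is the end-of-message marker that NETCONF 1.0 uses over SSH. A minimal sketch of how incoming bytes could be cut into messages on that marker is shown below; the `split_messages` helper is my own illustration and is not part of paramiko or of the original service.

```python
DELIMITER = b"]]>]]>"  # NETCONF 1.0 end-of-message marker


def split_messages(buffer):
    """Split raw bytes into complete messages and the unfinished remainder."""
    *messages, rest = buffer.split(DELIMITER)
    # Drop whitespace-only fragments, such as a newline between two messages
    return [m.strip() for m in messages if m.strip()], rest
```

The caller is expected to keep the remainder and prepend it to the next chunk of data, so a message or a marker that arrives split across two reads is still recognized.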

Be warned: the code is unstable. In this implementation, there is a problem with closing the socket. I found a couple of issues reported against paramiko about this problem. I put it aside for a while and decided to check the remaining option.

Part Three: Planting

The ace up our sleeve is Twisted, a framework for developing network applications with support for a large number of protocols. It has extensive documentation and a great module called Cred, which will help us.

Cred is a pluggable authentication mechanism that allows any number of network protocols to connect and authenticate to the system, depending on your requirements.

To organize all the logic, a Realm is used: the part of the application that is responsible for business logic and access to its objects. But first things first.

The core of the login process is the Portal. If we want to write an add-on over a network protocol, we use a standard Portal. It already has these methods:

  • login (gives the client access to the subsystem)
  • registerChecker (registers a checker that verifies credentials).

To bind business logic to the authentication system, a Realm object is used. Since the client is already authenticated at this point, the logic of our SSH add-on begins here. The interface has only one method, requestAvatar, which is called upon successful authentication in the Portal and returns the main object, SwitchProtocolAvatar:

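    from twisted.cred import portal
    from zope.interface import implementer


    @implementer(portal.IRealm)
    class SwitchRealm(object):
        def __init__(self, switch_obj):
            self.switch_obj = switch_obj

        def requestAvatar(self, avatarId, mind, *interfaces):
            # IRealm expects a tuple: (interface, avatar, logout callable)
            switch_avatar = SwitchProtocolAvatar(avatarId, switch_core=self.switch_obj)
            return interfaces[0], switch_avatar, lambda: None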

The simplest Realm object implementation that returns the necessary Avatar

Special objects called Avatars are responsible for the business logic. In our case, this is where the add-on for the SSH protocol begins. When a request arrives, the data goes to SwitchProtocolAvatar, which checks the requested subsystem and updates the configuration:

    from twisted.conch import avatar
    from twisted.conch.ssh import session


    class SwitchProtocolAvatar(avatar.ConchUser):
        def __init__(self, username, switch_core):
            avatar.ConchUser.__init__(self)
            self.username = username
            self.channelLookup.update({b'session': session.SSHSession})
            netconf_protocol = switch_core.get_netconf_protocol()
            if netconf_protocol:
                # Serve the "netconf" SSH subsystem only for NETCONF handlers
                self.subsystemLookup.update({b'netconf': netconf_protocol})

We check the subsystem and update the configuration, provided that this switch handler runs on NETCONF

Speaking of protocols: let's not forget that we are working with NETCONF, and proceed to its implementation. A Protocol is used to write add-ons over existing protocols and implement their logic. The interface of this class is simple:

  • dataReceived - called whenever data is received;
  • makeConnection - binds the protocol to a transport to establish the connection;
  • connectionMade - called once the connection is established. Here you can define some logic before the client begins to send requests. In our case, we need to send a list of our capabilities.

    from twisted.internet.protocol import Protocol


    class Netconf(Protocol):
        def __init__(self, capabilities=None):
            self.session_count = 0
            self.capabilities = capabilities or []

        def __call__(self, *args, **kwargs):
            # subsystemLookup expects a callable; reuse this instance
            return self

        def connectionMade(self):
            self.session_count += 1
            self.send_capabilities()

        def send_capabilities(self):
            rpc_capabilities_reply = (
                "<hello><capabilities>{capabilities}</capabilities>"
                "<session-id>{session_id}</session-id></hello>]]>]]>"
            )
            rpc_capabilities = "".join(
                f"<capability>{cap}</capability>" for cap in self.capabilities
            )
            reply = rpc_capabilities_reply.format(
                capabilities=rpc_capabilities, session_id=self.session_count
            )
            # Twisted transports accept bytes, not str
            self.transport.write(reply.encode())

        def dataReceived(self, data):
            # Process received data
            pass

The minimum implementation of the wrapper over the protocol. Removed unnecessary logic for clarity
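
The request processing that was cut from dataReceived is not shown in this article. As a rough idea of what could sit behind it, here is a sketch that depends only on the standard library: it accumulates bytes, cuts them on the `]]>]]>` marker, and answers each `<rpc>` with an `<rpc-reply>` carrying the same message-id. The `RpcBuffer` class and the bare `<ok/>` reply are assumptions for illustration, not the real handler.

```python
import xml.etree.ElementTree as ET

DELIMITER = b"]]>]]>"  # NETCONF 1.0 end-of-message marker


class RpcBuffer:
    """Hypothetical helper that a dataReceived method could delegate to."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Accumulate bytes and return one framed reply per complete <rpc>."""
        self._buffer += data
        # Everything before the last delimiter is complete; keep the tail
        *messages, self._buffer = self._buffer.split(DELIMITER)
        replies = []
        for raw in messages:
            raw = raw.strip()
            if not raw:
                continue
            root = ET.fromstring(raw)
            if root.tag.split("}")[-1] != "rpc":
                continue  # for example the client's <hello>, which needs no reply
            message_id = root.get("message-id", "")
            reply = f'<rpc-reply message-id="{message_id}"><ok/></rpc-reply>'
            replies.append(reply.encode() + DELIMITER)
        return replies
```

Inside the protocol, each returned reply would then be passed to `self.transport.write`. A real mock would build the reply body from the request instead of always answering `<ok/>`.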

Now we start putting our nesting doll back together. Since we are building an add-on over SSH, we need to implement the logic of the SSH server. In it, we define the server keys and the handlers for SSH services. The implementation of this class is of little interest to us, since authentication will be by password:

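    from twisted.conch.ssh import connection, factory, keys, userauth
    from twisted.conch.ssh.transport import SSHServerTransport


    class SshServerFactory(factory.SSHFactory):
        protocol = SSHServerTransport
        # SERVER_RSA_PUBLIC and SERVER_RSA_PRIVATE are paths to the key files
        publicKeys = {b'ssh-rsa': keys.Key.fromFile(SERVER_RSA_PUBLIC)}
        privateKeys = {b'ssh-rsa': keys.Key.fromFile(SERVER_RSA_PRIVATE)}
        services = {
            b'ssh-userauth': userauth.SSHUserAuthServer,
            b'ssh-connection': connection.SSHConnection
        }

        def getPrimes(self):
            return PRIMES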

SSH Server Implementation

For the SSH server to work, we need to define the session logic, which works regardless of which protocol the client came in over and which interface is requested:

    from twisted.conch.ssh import session
    from twisted.internet import protocol


    class EchoProtocol(protocol.Protocol):
        def dataReceived(self, data):
            if data == b'\r':
                data = b'\r\n'
            elif data == b'\x03':  # Ctrl+C
                self.transport.loseConnection()
                return
            self.transport.write(data)


    class Session:
        def __init__(self, avatar):
            pass

        def getPty(self, term, windowSize, attrs):
            pass

        def execCommand(self, proto, cmd):
            pass

        def openShell(self, transport):
            echo = EchoProtocol()
            echo.makeConnection(transport)
            transport.makeConnection(session.wrapProtocol(echo))

        def eofReceived(self):
            pass

        def closed(self):
            pass

Logic of sessions for all described interfaces

I almost forgot about the handler itself. After all the checks and authentication, control passes to the object that emulates the operation of the switch. Here you can define the logic for processing requests: getting or editing interfaces, device configuration, and so on.

    class Juniper:
        def __init__(self):
            self.protocol = Netconf(capabilities=self.capabilities())

        def get_netconf_protocol(self):
            return self.protocol

        @staticmethod
        def capabilities():
            return [
                "Candidate1_0urn:ietf:params:xml:ns:netconf:capability:candidate:1.0",
                "urn:ietf:params:xml:ns:netconf:capability:confirmed-commit:1.0",
                "urn:ietf:params:xml:ns:netconf:capability:validate:1.0",
                "urn:ietf:params:xml:ns:netconf:capability:url:1.0?protocol=http,ftp,file",
                "urn:ietf:params:netconf:capability:candidate:1.0",
                "urn:ietf:params:netconf:capability:confirmed-commit:1.0",
                "urn:ietf:params:netconf:capability:validate:1.0",
                "urn:ietf:params:netconf:capability:url:1.0?scheme=http,ftp,file"
            ]

The main logic of the handler. Cut out all the functionality and request processing, leaving only receiving capabilities
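
Since the request processing is cut out above, here is a small sketch of what such logic might look like: an in-memory state plus a dispatcher that maps an operation name to a method. The `MockSwitchState` class, the operation names, and the sample interfaces are all hypothetical; they are not Junos RPCs and not the real DCImanager mock.

```python
class MockSwitchState:
    """Hypothetical in-memory state of a mocked switch."""

    def __init__(self):
        # interface name -> description (sample data for illustration only)
        self.interfaces = {"ge-0/0/0": "uplink", "ge-0/0/1": ""}

    def handle(self, operation, **params):
        """Dispatch an operation name such as "get-interfaces" to a method."""
        handler = getattr(self, "op_" + operation.replace("-", "_"), None)
        if handler is None:
            return "<rpc-error><error-tag>operation-not-supported</error-tag></rpc-error>"
        return handler(**params)

    def op_get_interfaces(self):
        items = "".join(
            f"<interface><name>{name}</name>"
            f"<description>{descr}</description></interface>"
            for name, descr in sorted(self.interfaces.items())
        )
        return f"<interfaces>{items}</interfaces>"

    def op_set_description(self, name, description):
        if name not in self.interfaces:
            return "<rpc-error><error-tag>invalid-value</error-tag></rpc-error>"
        self.interfaces[name] = description
        return "<ok/>"
```

Keeping the state in the object means a test can edit an interface and then read the change back, which canned replies alone would not allow.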

Finally, we put it all together. We register the session adapter (it describes the connection behavior), set up authentication by username and password, configure the Portal, and start our service:

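    from twisted.conch.ssh import session
    from twisted.cred import portal
    from twisted.cred.checkers import InMemoryUsernamePasswordDatabaseDontUse
    from twisted.internet import reactor
    from twisted.python import components

    components.registerAdapter(Session, SwitchProtocolAvatar, session.ISession)

    switch_factory = SwitchFactory()
    switch = switch_factory.get("juniper")

    # SwitchRealm is the Realm defined earlier; the variable is named
    # ssh_portal so that it does not shadow the portal module
    ssh_portal = portal.Portal(SwitchRealm(switch))
    credential_source = InMemoryUsernamePasswordDatabaseDontUse()
    credential_source.addUser(b'admin', b'admin')
    ssh_portal.registerChecker(credential_source)

    SshServerFactory.portal = ssh_portal
    reactor.listenTCP(830, SshServerFactory())
    reactor.run()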

Setting up and starting the server
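
Because the tests run in parallel, a fixed port such as 830 could collide between runs, and binding to ports below 1024 usually requires elevated privileges. One possible way around this, which is my own assumption and not something the original service is known to do, is to ask the operating system for a free port:

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (host, port); the port is the one the OS picked
        return sock.getsockname()[1]
```

There is a small window in which another process can take the port after this function returns. Twisted can avoid that window: passing port 0 to reactor.listenTCP lets the OS choose, and the returned port object reports the chosen number through getHost().port.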

We start the mock server. To check that it works, you can connect using the ncclient library. A simple connection check and a look at the server capabilities are enough:

    from ncclient import manager

    connection = manager.connect(host="",
                                 port=830,
                                 username="admin",
                                 password="admin",
                                 timeout=60,
                                 device_params={'name': 'junos'},
                                 hostkey_verify=False)

    for capability in connection.server_capabilities:
        print(capability)

We connect to the mock server using the NETCONF protocol and display a list of server capabilities

The result of the request is presented below. We successfully established a connection, and the server gave us a list of its capabilities:

    Candidate1_0urn:ietf:params:xml:ns:netconf:capability:candidate:1.0
    urn:ietf:params:xml:ns:netconf:capability:confirmed-commit:1.0
    urn:ietf:params:xml:ns:netconf:capability:validate:1.0
    urn:ietf:params:xml:ns:netconf:capability:url:1.0?protocol=http,ftp,file
    urn:ietf:params:netconf:capability:candidate:1.0
    urn:ietf:params:netconf:capability:confirmed-commit:1.0
    urn:ietf:params:netconf:capability:validate:1.0
    urn:ietf:params:netconf:capability:url:1.0?scheme=http,ftp,file

Capabilities of the server
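
In an automated test, this check would more likely be an assertion than a printout. For unit tests that look at the raw `<hello>` text without a full client, a helper along these lines might be enough; `parse_capabilities` is a hypothetical name and relies only on the standard library.

```python
import xml.etree.ElementTree as ET


def parse_capabilities(hello):
    """Return the capability URIs listed in a NETCONF <hello> message."""
    body = hello.replace("]]>]]>", "").strip()  # drop the end-of-message marker
    root = ET.fromstring(body)
    # Compare local names so the helper works with or without a namespace
    return [el.text for el in root.iter() if el.tag.split("}")[-1] == "capability"]
```

A test can then compare the returned list with the list the mock was configured with.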


This solution has its pros and cons. On the one hand, we spend a lot of time implementing and describing all the request-processing logic. On the other hand, we gain the ability to flexibly configure and emulate behavior. But the main advantage is scalability. The Twisted framework is rich in functionality and supports a large number of protocols, so new handler interfaces are easy to describe. And if everything is well thought out, this architecture can be used not only for switches, but also for other equipment.

I would like to hear readers' opinions. Have you done something similar, and if so, what technologies did you use and how did you build your testing process?