Network Design Scenario #3: Remote Access VPN Design
Updated: May 2, 2020
Information covering Network Design of Remote Access Client based VPNs aka SSL VPN
The "Design Scenario" series is where I showcase different network or security designs. Generally I will list a few requirements, go over the designs and discuss the pros and cons. I will try to steer away from specific configurations as the goal is always to try and be vendor neutral but I will usually provide config guides for reference and support material. I hope this provides you with an alternate way to look at a network engineering or a defense challenge. Feel free to share your thoughts or your own designs as well!
With the recent global activity related to the COVID-19 pandemic of limiting social gatherings, shelter in place orders and businesses closing, there has been a greater need to work remotely from home. I've heard of stories from others about scrambling to update their VPN but I've also seen some forum posts about design help for setting up VPNs, so I thought I'd create this post to help with those working on a VPN design solution. Personally I had the fire drill like some did at work once the orders were announced. Due to an aging remote access setup we had to rush our refresh plan ahead a couple months in order to get the business ready for a 5x increase in remote workers.
This coronavirus crisis will come and go, but the need to work from home will always be there! I hope you and yours are safe and sound, and a big thank you to all the first responders out there on the front lines, including those in IT.
In the last design scenario we looked at DMZs. In this post we will talk about client based Virtual Private Networks (VPNs), usually referred to by their original name of Secure Sockets Layer (SSL) VPN. I say old name because SSL is now a deprecated protocol and has been replaced with Transport Layer Security (TLS). The other type of secure connection in this space is a site to site VPN, where for example two remote firewalls create a connection between offices. We won't be talking about site to site tunnels in this post.
What client based VPNs do is facilitate a means to connect securely into the corporate network from remote locations via the internet like your home or a coffee shop. The business need for this is clear, so we won't dive too deep into that and cover more of the high level technical aspects of configuring and implementing a remote access solution.
I believe the name SSL VPN dates back to the inception of the concept, because users would connect via a web browser using SSL to a web page to log in. Although the web page based VPN method is still used, most vendors have released client applications that install on a computer and facilitate the secure connection.
Although this article is intended for network operators and those who are already familiar with virtual private networks, if you're not familiar then here is a quick article I found to educate you. It explains in some detail the "why" behind a VPN, like encryption and authentication.
Choices and Considerations
One of the first things you need to consider in any design is what problem are you trying to solve? The problem could be you need outside sales reps to connect to access corporate resources which will help with increasing revenue, or perhaps you have 3rd party programmers who need access to a database in the company data center to help with deploying a new application.
In general remote access should be a part of a business continuity plan. That alone could be the reason you justify deployment, which would mean you'd need to ensure there is some language about remote access and how the design works to meet the BC plan needs. For example, if a remote office is hit by a tornado and there are no other offices in the area, you likely wouldn't be flying employees out to another state's office. You would have them remote in via a remote access application to perform their work as they did when in the office.
Take the recent COVID-19 situation where people were ordered to work from home: "we need a secure solution to allow people to work from home for their safety". Now that is definitely a business reason to implement or justify remote access VPNs. Next we need to decide how we will do it and which features will be required.
The first consideration is where will you establish the incoming VPN connections? Does your current edge firewall support remote access VPNs? Or maybe the plan is to purchase a stand alone appliance because you are migrating to a new vendor. How healthy the budget is might determine that. One of the advantages of purchasing a stand alone specialty VPN appliance is that 'typically' it might have more flexibility when it comes to user policy and security settings. Some appliances also might have more robust client applications. For a long time Cisco AnyConnect was the king of VPN apps, but in reality most vendors have caught up to each other and most applications are very similar these days.
You could also use a separate pair of firewalls, which is done a lot with Cisco ASAs to get the AnyConnect client while using a more robust edge firewall device like Palo Alto. However, by choosing your edge firewalls as the VPN appliance you somewhat simplify things, as you don't need to worry about how you will implement and connect a new appliance in the edge/DMZ or plan for 2 sets of policies between devices. Plus you can save on cost by using one set of equipment. Then again, maybe your edge firewall's client application sucks and there is some compliance requirement you have, so in that case you probably would decide to purchase another standalone appliance. There are a lot of options when making a new selection so be sure to research and perform due diligence during the selection process.
This is where cost usually comes into play, (along with the next section redundancy). We won't go into cost here, just remember that in order to justify something there needs to be a business need. Whether that be for security reasons, high availability reasons, or from a technical requirement perspective (e.g. it could be impossible to utilize existing hardware). If you are wanting to switch to a new solution, then examine why the current choice isn't meeting your needs and how the new one will.
Redundancy and State
Is redundancy a requirement? How geographically dispersed and large is the company? Let's say there are 2 HQ offices and 2 data centers that connect to each other in a square. Now the question is: what will you do if HQ1 is hard down and users need to VPN in? You decide to place 1 VPN appliance at each HQ for failover purposes. If HQ1 fails, HQ2 VPN will be available and vice versa.
You could set up the units as 1 high availability (HA) pair across the two HQ sites and use a virtual IP that answers for the public IP (in this example parts of the same public IP block are announced from both locations). So if HQ1 is down then all traffic will flow from the internet to HQ2, which will be answering on the same virtual IP address. This probably sounds like a headache, and I agree, because you'd probably need to stretch layer 2 to accomplish this and have the same VPN IP subnet exist at each location, which requires some routing adjustments. In addition, latency would need to be considered. Nevertheless it has been done and is a possible design selection. Just remember only you know what's best for your environment.
I would only recommend setting up a HA pair when both devices are physically located at the same location or located very close - this would be a choice for say, a single HQ small business. You might even select multiple HA pairs of firewalls at different hub locations strictly for VPN which provides geographic redundancy and hardware high availability - this would be for a larger business.
Devices in an HA cluster share information, or "state". The advantage of sharing state between the devices is that failover happens faster and in theory more seamlessly. This is because the standby gear already has all the information needed to take over, like which users are currently logged in. Another positive aspect of HA is that you only need to configure things once because the configuration will sync to the backup.
Another common design is to have the appliances exist separately at each location using different external and internal IP addresses and subnets. Then within the client application you will have a primary and secondary address or profile to utilize. So in the example of HQ1 failing, the client will attempt to connect a few times until failing and then would connect using the second DNS address to reach the HQ2 VPN. This would take longer to failover though than having the single HA pair option.
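The primary/secondary behavior described above can be sketched in a few lines of Python. This is only an illustration of the retry-then-fail-over logic, not how any particular vendor client is implemented. The function name, the gateway hostnames and the `try_connect` callback are all hypothetical, and the default of 3 attempts is an assumption standing in for "a few times".

```python
from typing import Callable, Iterable, Optional


def connect_with_failover(
    gateways: Iterable[str],
    try_connect: Callable[[str], bool],
    attempts_per_gateway: int = 3,  # assumed retry count, tune to taste
) -> Optional[str]:
    """Try each gateway in order and return the first one that connects.

    `try_connect` is a placeholder for whatever the real client does to
    bring up the tunnel; it returns True on success.
    """
    for gateway in gateways:
        for _ in range(attempts_per_gateway):
            if try_connect(gateway):
                return gateway
    return None  # every gateway failed


if __name__ == "__main__":
    # Simulate HQ1 being hard down: only the HQ2 address answers.
    hq1, hq2 = "vpn-hq1.example.com", "vpn-hq2.example.com"
    print(connect_with_failover([hq1, hq2], lambda gw: gw == hq2))
```

Notice the client burns through every attempt against HQ1 before it even looks at HQ2, which is why this option fails over slower than a stateful HA pair.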
In both of these options (and in general) you'd need to recognize the application flows and ensure your firewall policy is duplicated on either side for the possible failure scenarios. There is also the chance that the company is small or the need to VPN in isn't justified enough for the purchase of multiple appliances. In that case you'd only need to worry about configuration in one place. The more single deployments there are, the more scalability comes into question as configuration steps need to be performed multiple times.
Protocols and Policy
There are technical policy aspects you will need to decide before the implementation and as you configure your remote access solution - regardless if its a firewall or standalone appliance.
I recommend having at least a separate VPN zone and IP address subnet on the firewall to decide the access policy. Typically you will have an IP address subnet specific to the VPN; be sure to inject it into your routing protocol so devices within the network know where to send return traffic. Moreover, if the subnet exists at two locations, make sure you adjust the routing metrics so the primary/secondary paths are adhered to. For the firewall, confirm you have specific allow and deny rules from the VPN zone into the corporate network, and also perhaps from the corporate network into the VPN zone (like for push updates or something similar).
How big the organization is will determine how many VPN groups there will be. I would say the most common are standard end user groups, an IT administrative group, and 3rd party groups. It's important to differentiate the 3rd party users in order to better lock down their access via firewall policy. In addition, it's likely that the administrative group will need more access than the typical end user to better manage the network, which points again to having different firewall zones depending on the VPN group/subnet.
What is the authentication policy? Don't tell me local accounts! Active Directory integration will almost always be the choice, whether via direct LDAP or through a AAA RADIUS server. Frequently you will have multiple groups set up in AD and on your VPN device. So by adding a user into a certain AD group they will fall into the particular VPN group and subsequent firewall zone, which then decides what they have access to or not. Using this methodology the on-boarding process can really be simplified and controlled. Also, newer NGFWs have the ability to create rules based on AD groups and user-id, so there isn't as great an emphasis on IP based firewall rules.
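To make the allow/deny idea concrete, here is a minimal first-match policy sketch in Python. It is vendor neutral and only models the logic. The `Rule` class, the `evaluate` function, the 10.99.0.0/24 VPN subnet and the 10.10.20.0/24 server subnet are all made up for illustration, and I'm assuming a default deny when nothing matches.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src: str     # source network in CIDR notation
    dst: str     # destination network in CIDR notation
    port: int    # destination TCP port, 0 meaning any port
    action: str  # "allow" or "deny"


def evaluate(rules, src_ip, dst_ip, port):
    """Return the action of the first matching rule, or "deny" if none match."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if (src in ipaddress.ip_network(rule.src)
                and dst in ipaddress.ip_network(rule.dst)
                and rule.port in (0, port)):
            return rule.action
    return "deny"  # assumed implicit deny at the end of the policy


# Hypothetical policy for traffic arriving from the VPN zone.
VPN_RULES = [
    Rule("10.99.0.0/24", "10.10.20.0/24", 443, "allow"),  # VPN users to an internal web tier
    Rule("10.99.0.0/24", "10.0.0.0/8", 0, "deny"),        # everything else from the VPN zone
]

if __name__ == "__main__":
    print(evaluate(VPN_RULES, "10.99.0.15", "10.10.20.5", 443))
    print(evaluate(VPN_RULES, "10.99.0.15", "10.10.20.5", 22))
```

Rule order matters in a first-match model, so the specific allow has to sit above the broad deny.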
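The AD group to VPN group to firewall zone chain can be pictured as a simple lookup. The sketch below is only a model of that idea. The AD group names, VPN group names and zone names are invented, and the priority order (first match in the table wins) is my assumption, since real devices each have their own way of resolving a user who sits in several groups.

```python
# Hypothetical mapping: AD group -> (VPN group, firewall zone).
# Order is the assumed priority when a user belongs to several groups.
GROUP_MAP = {
    "VPN-Vendors": ("third-party", "vpn-vendor-zone"),
    "VPN-IT-Admins": ("it-admin", "vpn-admin-zone"),
    "VPN-Users": ("end-user", "vpn-user-zone"),
}


def resolve_vpn_profile(ad_groups):
    """Return (vpn_group, zone) for the first mapped AD group, else None."""
    for ad_group, profile in GROUP_MAP.items():
        if ad_group in ad_groups:
            return profile
    return None  # not in any VPN group, so no remote access


if __name__ == "__main__":
    print(resolve_vpn_profile({"Domain Users", "VPN-Users"}))
    print(resolve_vpn_profile({"Domain Users"}))
```

On-boarding then comes down to adding the user to the right AD group, and the rest follows from the table.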
When looking at how the clients connect, ensure that legacy protocols like SSLv3 and the newly deprecated TLS 1.0 are disabled as they are susceptible to attacks (e.g. POODLE). Disabling TLS 1.1 is generally recommended as well, and TLS 1.2 and 1.3 are the preferred configurations. Use secure cipher suites, such as elliptic curve key exchange, SHA-256 or higher for hashing, and AES-256 for encryption.
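Every VPN product exposes these settings differently, so as a neutral illustration here is what "TLS 1.2 and up with elliptic curve key exchange" looks like using Python's standard `ssl` module. This is not a VPN appliance config. The function name is mine, and the cipher string "ECDHE+AESGCM" is one reasonable choice I'm assuming, not the only valid one.

```python
import ssl


def build_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses anything below TLS 1.2."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # SSLv3, TLS 1.0 and TLS 1.1 are all below this floor.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restrict TLS 1.2 suites to ECDHE key exchange with AES-GCM.
    # TLS 1.3 suites are managed separately by OpenSSL and stay enabled.
    ctx.set_ciphers("ECDHE+AESGCM")
    return ctx


if __name__ == "__main__":
    context = build_server_context()
    for cipher in context.get_ciphers():
        print(cipher["protocol"], cipher["name"])
```

A real listener would also load the certificate and key, which I've left out to keep the sketch focused on protocol and cipher selection.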
You'll want to use a verified PKI Certificate (using RSA 2048 bits or higher) signed by a valid CA like Digicert or Verisign. This is so when users connect to the company VPN URL they do not receive an error that the VPN hub is using an 'untrusted' i.e. self-signed certificate - this is important and often not done.
Another option would be to use a non-standard port to connect (e.g. vpn.company.com:12443). Essentially a lot of the automated attack scanning on the internet is for default ports like HTTPS TCP/443. Look at using another port like 12443 or something unassigned/non-standard in order to be less detected. Security by obscurity!
Typically you can configure these items in the VPN settings of your equipment.
Is more than one connection allowed by a single user at once? What will the idle timeout be? How many failed login attempts before an IP is blocked or an account is locked out? Is access allowed 24/7 for all users? These are some of the policy questions that need to be decided. Obviously you'd likely only want 1 connection per user and <5 failed logins before lockout.
There is also the question of what sort of host checks the head end device will perform. This can be an A/V version, OS version, or domain check prior to allowing the host to connect. However, be aware that this could also cause problems for clients connecting, as it's another component being added to the initial connection process which could generate a few extra trouble tickets here and there. I could see this being less of a concern though if your VPN client, A/V, and firewall are in the same vendor ecosystem.
As I mentioned previously, most of the setups now use a client installed on the user's computer to initiate the secure connection. However there could still be a need to use web portal based connectivity. From what I've heard the web portals are more susceptible to certain attacks like cross site scripting, but I'm not as familiar with that. One reason to configure it could be if 3rd parties need remote desktop access to only a single server: you could create a single web portal with 1 bookmark which links to an RDP connection to the server. So there wouldn't be a need to install the client on their computer, just provide the web link and create the portal/group as a solution. Or it could be a fallback for testing if the client software is having issues.
The last part of the section I want to touch on is connectivity. In the above diagrams I illustrated the stand alone appliances connecting to the DMZ switch and then to the firewall. That is one connectivity choice I would recommend if your stand alone appliance is indeed a firewall, but if your device is not then it's probably best to connect it to the firewall and then perform a NAT to it (if using v4) or bridge it via transparent interfaces. This is so all connections will first travel through the firewall before hitting the VPN; therefore you can better protect the hardware by having an IPS engine inspect the incoming traffic. A lot of devices will have an inside and an outside interface so you could connect each of them to the firewall (or switches and VLAN to firewall) and assign them different firewall zones. Then once the tunnel is established and the user is authenticated, traffic goes back to the firewall where it's handled by the internal zone policies based on VPN group and IP address etc.
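As a rough picture of the failed login policy, here is a small Python sketch of a lockout counter. The class and method names are hypothetical, and the default threshold of 4 is just my reading of "<5 failed logins". I'm also assuming a successful login resets the counter and that a locked account stays locked until someone resets it, which your device may handle differently (timed unlock, per-IP blocking and so on).

```python
class LockoutTracker:
    """Counts consecutive failed logins per account and locks at a threshold."""

    def __init__(self, max_failures: int = 4):
        # 4 is an assumed threshold, picked to stay under 5 failed logins.
        self.max_failures = max_failures
        self._failures = {}

    def is_locked(self, user: str) -> bool:
        return self._failures.get(user, 0) >= self.max_failures

    def record_attempt(self, user: str, success: bool) -> bool:
        """Record a login attempt and return True if the account is locked."""
        if self.is_locked(user):
            return True  # locked accounts stay locked, even with a good password
        if success:
            self._failures[user] = 0  # a good login clears the failure count
            return False
        self._failures[user] = self._failures.get(user, 0) + 1
        return self.is_locked(user)


if __name__ == "__main__":
    tracker = LockoutTracker()
    for _ in range(4):
        locked = tracker.record_attempt("alice", success=False)
    print("alice locked:", locked)
```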
Split Tunnel or Full Tunnel
Filtering malicious traffic when end users are off the corporate network is always a concern and a common requirement. As we know users will often use business hardware for personal reasons, combine that with the possibility of the user clicking a malicious link and there could be a few problems.
You'll probably notice when setting up the SSL VPN groups that you have the option of full tunnel or split tunnel.
With full tunnels all of the user's traffic will go through the VPN even if it's destined for the internet. The pro of using full tunnels is you can apply all the filtering on the corporate firewall without the need for another 3rd party service like Netskope or Umbrella to filter users' traffic off net (i.e. cost savings, policy simplification). However, the downside is that more bandwidth will be used on the corporate internet connection because now all that internet traffic is traveling through the VPN and out. This is a big consideration if all the storage and backups are handled in the cloud, although one could make the argument that the usage would be the same if the users were regularly within the corp network.
With split tunneling you are designating which subnets to send over the corporate VPN tunnel. You can designate certain IP subnets like 172.24.8.0/24 or a supernet like 10.0.0.0/8. The plus side is internet bound traffic will leave out the user's local network, thereby saving bandwidth on the corporate side. Additionally, if the host is a personal computer or belongs to a 3rd party vendor, you won't see any unnecessary traffic which could generate logs that are embarrassing or false positives. However as I stated, you will lose some visibility and also the ability to apply the company internet firewall filtering policy that would typically be applied when the user is on net. Therefore you might need another application client side to facilitate that protection.
One advantage of split tunneling is that it can usually solve a common problem you've probably encountered, which is the problem of 'overlapping subnets' (been there before!). The problem happens when the local network of the user overlaps with a network on the far end that they need to access. This is why it is bad practice to use something like 192.168.1.0/24 inside a business network, because that is one of the most common IP schemes used by default on home routers.
For instance, let's say the user is trying to access a server on the company network at 192.168.1.5 and the local network at the user's home is 192.168.1.0/24. When the PC gets the request to access 192.168.1.5 it will ARP on the local network, thinking the server is directly connected, instead of sending traffic to the far end VPN gateway. Consequently the request fails and the user is unable to access the server. With split tunneling you have the ability to create specific routes, so you could create a specific /32 route for the server. Although that's not too scalable, it is a valid fix when in that situation and is available with split tunneling.
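The split tunnel decision boils down to "is this destination inside one of the designated subnets?". Here is a minimal Python sketch of that check using the two example networks from the paragraph above. The function and variable names are my own, and a real client does this in the OS routing table rather than in application code.

```python
import ipaddress

# The subnets designated for the tunnel, taken from the example above.
SPLIT_TUNNEL_NETWORKS = [
    ipaddress.ip_network("172.24.8.0/24"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def uses_tunnel(destination: str) -> bool:
    """Return True if traffic to `destination` should go over the VPN."""
    addr = ipaddress.ip_address(destination)
    return any(addr in network for network in SPLIT_TUNNEL_NETWORKS)


if __name__ == "__main__":
    for dest in ("10.1.2.3", "172.24.8.10", "8.8.8.8"):
        path = "VPN tunnel" if uses_tunnel(dest) else "local internet"
        print(dest, "->", path)
```

A full tunnel is the same idea with a single 0.0.0.0/0 entry, so everything matches.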
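The reason the /32 trick works is longest prefix match: the most specific route wins. The Python sketch below models a tiny routing table for the scenario above. The next hop labels ("local-lan", "vpn-tunnel", "home-router") and the function name are placeholders I made up, and a real routing table also weighs metrics, which I'm ignoring here.

```python
import ipaddress


def next_hop(routes, destination):
    """Pick the route with the longest matching prefix for `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific prefix (largest prefix length) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical routing table on the user's PC with the /32 pushed by the VPN.
ROUTES = [
    ("192.168.1.0/24", "local-lan"),   # the home network, directly connected
    ("192.168.1.5/32", "vpn-tunnel"),  # specific route for the company server
    ("0.0.0.0/0", "home-router"),      # default route to the internet
]

if __name__ == "__main__":
    print(next_hop(ROUTES, "192.168.1.5"))   # the company server
    print(next_hop(ROUTES, "192.168.1.20"))  # a printer at home
```

The catch is that any device at home that actually uses 192.168.1.5 becomes unreachable while the tunnel is up, which is another reason this fix doesn't scale.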
2 Factor and Licensing
A growing trend in the last few years and a common best practice is to use 2 factor authentication. So when one goes to log in, after entering their username and password there is a prompt to enter a token or to accept a push notification. The token can come from an app on the computer or a cell phone, or from a method like e-mail or SMS messaging. The app method is generally accepted as the more secure choice vs SMS or e-mail, however just having the second factor at a minimum is good. These solutions can generally be deployed on-prem or from the cloud via APIs or similar mechanisms.
Applications from the likes of RSA, Okta, and Duo are some of the most popular picks. This is a choice you need to make if you want the added security. Although it does add a little complexity and likely cost if going to a 3rd party, it will add peace of mind. Once you find a solution you like, you will need to ensure it's compatible with your VPN device. Sometimes it's best to see what the vendor supports before searching, as you might find something you like that isn't supported on your hardware. In addition, there are some companies that have 2FA included in their ecosystem like Fortinet (FortiToken), but it still has a per user cost as far as I know.
Licensing has been taking a few people by surprise these days I'm sure, as there was a sudden need for a large increase in users working from home. Companies had to scramble to check their licensing and possibly order more. Sizing your solution is important from a hardware perspective but also from a licensing perspective, so as you are checking for that max client connection metric also double check if there is a licensing component. Get the cost because it can add up quick if you have a lot of users, but you can also probably receive volume discounts in that situation. Typically it will be a max user or max connection type license.
I always recommend leaving extra breathing room for these unforeseen circumstances, and that also includes having enough licensing to accommodate an unexpected increase. To some vendors' credit they were offering free temporary increases during the current situation, which I think is a good thing.
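If you're curious how the rotating codes in many token apps are produced, a lot of them follow the open HOTP/TOTP standards (RFC 4226 and RFC 6238). I'm assuming here that your chosen product uses TOTP; some vendors use proprietary schemes or push notifications instead. The sketch below is a bare bones Python version of the algorithm for understanding only, not something to run in production.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password as described in RFC 4226."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    return hotp(secret, int(unix_time) // step, digits)


if __name__ == "__main__":
    import time

    shared_secret = b"12345678901234567890"  # the RFC test secret, never use it for real
    print(totp(shared_secret, time.time()))
```

Both sides hold the same secret and roughly the same clock, so the VPN head end (or the 2FA service behind it) can compute the same code and compare.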
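For sizing, the arithmetic is simple enough to sketch. The function below is a hypothetical helper, and the 20% headroom is just an example number I picked, not a rule. The 5x multiplier comes from the increase I mentioned at the top of this post.

```python
def licenses_needed(current_users: int, growth_factor: int, headroom_percent: int = 20) -> int:
    """Estimate concurrent user licenses for a surge, rounded up.

    Uses integer math so the result is not affected by float rounding.
    """
    scaled = current_users * growth_factor * (100 + headroom_percent)
    return -(-scaled // 100)  # ceiling division


if __name__ == "__main__":
    # Example: 100 remote users today, a 5x surge, and 20% breathing room.
    print(licenses_needed(100, 5, 20))
```

So 100 users today with a 5x surge and 20% headroom works out to 600 licenses, and the same number is worth checking against the max connection spec of the hardware.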
Implement and Operate
The last thing I'd like to talk about is the implementation and operation of the VPN solution. The implement and operate phases usually come after the design and planning phases. Pre-planning and testing are always important when moving from one solution to another. Before finalizing your design into production it could be prudent to perform some testing to make sure the client application experience is smooth and that the solution operates as intended (e.g. verify full tunneling, test 2FA). It's easier to troubleshoot during a pilot or pre-deployment vs. production.
How will you deploy the new client and train users? Maybe it can be pushed via group policy to computers, which can reduce time spent. User and IT team training can be something that is overlooked as well before the implementation. Sometimes vendor documentation is adequate, but other times you will need to develop your own step-by-step document for guidance.
As far as a migration strategy, typically you'd create a new URL for the new VPN destination and also assign a new IP address to it. By standing up the new equipment in parallel to the current one you can migrate user groups over in phases, possibly starting with a pilot group, or the IT group. Therefore some of the unforeseen scenarios can be experienced before a large roll out and some kinks can be worked out as well. Hopefully you have extra public IPs to provision for it.
You could also be in the situation where your primary firewall is also your hub for SSL VPN and you need to migrate to a new firewall brand. In this case it's probably best to deploy the new firewall and turn the old one into a stand alone appliance connected to the new firewall.
After the implementation and when you move to the operating phase, keep your appliance and client software up to date as there has been a lot of publicity around unpatched VPN appliances. Sometimes, because you are running out-of-date software, it's possible for an attacker to 'pop' the VPN box, where they can either log in or take control of it and thereby compromise your network. Just food for thought.
Reporting is another item that is important after the design and implementation. Reports like total number of logins and total number of failed logins can be a useful metric to review on a weekly or monthly basis. This is to attempt to find any anomalous or malicious activity. It could also help with trending to see the overall usage and if more licenses are needed. Consider the analytics that a device can report on when making your initial device choice.
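As an example of the kind of report I mean, here is a small Python sketch that counts failed logins per user and flags anyone over a threshold. The event format (a dict with "user" and "success" keys) is made up, since every appliance logs differently, and the threshold is whatever makes sense for your environment.

```python
from collections import Counter


def failed_login_report(events, threshold):
    """Return {user: failed_count} for users at or above the threshold.

    `events` is an iterable of dicts with hypothetical "user" and
    "success" keys, standing in for parsed VPN log entries.
    """
    failures = Counter(event["user"] for event in events if not event["success"])
    return {user: count for user, count in failures.items() if count >= threshold}


if __name__ == "__main__":
    sample_events = [
        {"user": "alice", "success": True},
        {"user": "bob", "success": False},
        {"user": "bob", "success": False},
        {"user": "bob", "success": False},
        {"user": "carol", "success": False},
    ]
    print(failed_login_report(sample_events, threshold=3))
```

Run weekly against the exported logs, something like this makes a brute force attempt against one account stand out quickly.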
To conclude, we talked about some of the different aspects of a remote access SSL VPN solution. We talked about some of the decisions you'll need to make, like where to terminate the secure connections and the types of configurations you might encounter. We also discussed some of the policy choices and how the firewall configuration might look. Evaluate using secure methodologies and also 2 Factor Authentication to better protect your internal resources. The business needs are fairly clear for remote access, but always be certain on what the problem is and how you are solving it. Deciding between the tunneling options can also determine if you can provide the same amount of security to an end user when off network as they would get when on network. Ultimately what is important is that you provide an easy to use and secure connection into the company network. I hope this helped you, thank you for reading.