Service Disruption This Morning


Today between 1:30 AM and 1:45 AM GMT we suffered degraded performance, resulting in many requests to the API going unanswered. This was due to a large DDoS attack against our infrastructure which caused all of our servers to intermittently fault. The traffic we received was equivalent to six hours' worth of queries, but sent over just a few minutes.

Our Anti-DDoS CDN (Content Delivery Network) began mitigating the attack around 12 minutes in, and by the 15th minute normal service was fully restored, even though the attack itself continued for another 15 minutes, until around 2:00 AM GMT.

We apologise for any disruption caused by the attack. We usually keep the Anti-DDoS detection at our provider set quite conservatively, because our users often come under large-scale attacks themselves, which could trigger our protection if they were to make a large number of queries to us in a short time span. This is why it took a few minutes before the protection activated and restored our service availability to normal.

Next week we'll be going through the activation parameters of our DDoS mitigation to see if we can tune it for more immediate activation when an attack begins, without it triggering under normal circumstances.

Thanks for reading and have a wonderful weekend.

Introducing Dark Mode

Today we're pleased to announce the introduction of an often-requested feature: dark mode. We've wanted to deliver dark mode for a long time now, and we've even converted several pages to a dark appearance in the past to see how it would look. Ultimately we decided to move forward with a full commitment to this feature.


We first thought about doing it after the popular PVB plugin for WordPress, which integrates our API, gained support for dark mode. But as our site is very complicated, we knew it would take a lot of time and effort to do correctly. The most difficult part was of course the customer dashboard, which code-wise is a behemoth of complexity, with many unique user interface elements that required special consideration.

And of course we didn't want to make only some of the site dark; it had to be a full transformation, and that meant going through some frankly ancient code and bringing it up to par. Even the screenshots within our API documentation pages and the preview images on our homepage needed to be remade in both light and dark versions, and to change automatically when the user switches between themes.

Choosing the right colour palette for every page was also difficult, and we'll continue to tweak things as we did with our default light theme, so any colours that look a little off will likely be corrected in time as we go through our normal iterative design process.

We know some of you have been using browser plugins to darken websites like ours that didn't offer a native dark mode. We welcome you to disable those and check out what we've been able to do with a tailored approach. Hopefully it won't disappoint the dark mode enthusiasts among you.

So that's today's update. To activate dark mode, simply click the little two-toned circle button found in the top right of every page. And if you're already using the dark mode available within your operating system, our site will detect that and use dark mode by default, unless you override it with the provided button.
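
The resolution order described above (an explicit button choice wins, otherwise the operating system preference applies) boils down to a small pure function. This is just our own sketch of that logic, not the site's actual code:

```python
def resolve_theme(os_prefers_dark, user_override=None):
    """Pick which theme to render.

    user_override is "dark" or "light" once the visitor has clicked
    the theme toggle button, or None if they never have.
    """
    if user_override is not None:
        return user_override  # an explicit choice always wins
    return "dark" if os_prefers_dark else "light"
```

The key design point is that the OS preference is only a default: once the visitor toggles manually, that choice is remembered and takes priority.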

Thanks for reading and we hope everyone is having a wonderful week.

Dashboard UI changes + new custom rule output modifiers


Earlier this month we made some enhancements to the dashboard UI to help you more easily identify whether a burst token is currently in use and by how much it is boosting your daily query allowance. Below is a screenshot showing how this looks when a token is active.

[Screenshot: dashboard showing an active burst token]

This feature is very important to us, as it helps you know whether it's time to upgrade your plan. Making sure you understand how often your tokens are active and how your plan allowance is being affected helps you make an informed decision about your plan, and hopefully saves you money by staving off upgrades until they're truly needed.

The other change we've made is on the paid options tab. We used to have a very large cancellation area on this tab, so big that many customers didn't scroll down to find the other paid options, such as updating your card details when your previous card expires, accessing invoices, and looking at other paid plans you may want to upgrade or downgrade to.

To fix this we've rewritten the section. A lot of the extraneous text has been removed, and things are now clearer and more concise. We always want you to be able to easily cancel a plan on your own without needing to talk to anybody; we strongly believe you should always be able to cancel something with even greater ease than you signed up for it, and we believe we're continuing to offer that, as shown in the screenshot below.

[Screenshot: the redesigned paid options tab]

And finally, we've added two new output modifiers to the custom rules feature. These allow you to forcibly enable or forcibly disable logging for anything that matches the conditions within a rule. This was a requested feature; prior to now we only ever logged positive detections made by our API, or detections altered by your rule to become positive, for instance changing a proxy: no result to a proxy: yes result.

But sometimes you want to log something without setting it as a proxy, VPN or another positive kind of detection. Sometimes you want to allow a visitor, not flag their address as something bad, and still log them using our system instead of your own local database or analytics service.

As we've received requests for this feature and we think it could be useful to most customers, we've added two modifiers which can force logging on or off regardless of the detection type. These entries will appear in your positive log as type: rule entries.

[Screenshot: a custom rule logging wireless connections from Paris]

Above is an example of how to set up such a rule to log connections from Paris that are wireless in nature. You can add these logging outputs to any kind of rule, even if you've already added other output modifiers like changing a connection type or modifying a risk score. This makes the new modifiers very flexible; they can be added to any rule you've previously created.
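
As a purely illustrative sketch (the field names and rule structure below are our own shorthand, not the exact rule syntax), a rule like the one above pairs a set of conditions with output modifiers, one of which can now force logging on or off:

```python
# Hypothetical representation of a custom rule using the new logging modifier.
rule = {
    "conditions": {"city": "Paris", "type": "Wireless"},
    "outputs": {"force_logging": True},  # new modifier: log even negative results
}

def rule_matches(rule, result):
    """True when every condition equals the corresponding API response field."""
    return all(result.get(k) == v for k, v in rule["conditions"].items())

def should_log(rule, result):
    """Apply the forced-logging modifier; otherwise log positives only."""
    if rule_matches(rule, result) and "force_logging" in rule["outputs"]:
        return rule["outputs"]["force_logging"]  # force on, or force off
    return result.get("proxy") == "yes"  # previous behaviour: positives only
```

Setting "force_logging" to False would model the opposite modifier, suppressing logging even for a positive detection.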

That's it for this update and we hope you're all having a great week.

Custom Rule Limit Increases


Today we've increased the active custom rule limits for all paid accounts, and we'd like to tell you why we did this and how it affects you. Firstly, the new rule limits apply to all accounts: you don't need to start a new subscription or make any additional payments to access the higher limits. It's a free upgrade, on us.

There are a few reasons we did this. Firstly, we've seen an uptick in rule use since we added the custom rule library in January; in fact, we're now seeing more rules saved to accounts each week than we did in the first six months after launching the custom rule feature.

Secondly, as custom rules have seen increased usage, we've also had many more requests for rule quantities above our plan limits, which we do offer but haven't really advertised outside of our blog posts due to the employee time needed to fulfil those requests.

Thirdly, our custom rules started at 6 rules for paid accounts and increased by 3 rules for each larger plan size. This incrementing didn't scale well with our queries per day: it didn't make sense that you could be on a 10,000-query-per-day plan with 6 rules while those on our 10.24 million per day plan had only 39 rules.

We could have solved the first and second issues by adding a separate subscription or a paid upsell in the Dashboard, but we felt this would only complicate our product offering. We've always wanted our subscriptions to be easy to understand and as frictionless as possible, which is why we have only a single subscription for each plan size and make it very easy to monitor and cancel your subscription from within the Dashboard.

To solve the third issue we've introduced slightly different scaling. Instead of each plan having 3 more rules than the previous one, the limit now increases by 3 between starter plans, 5 between pro plans, 10 between business plans and 15 between enterprise plans. In addition, we've raised the starting point: our first starter plan now begins at 9 rules instead of 6.

This means our first Pro plan now starts at 20 rules instead of 15, our first Business plan at 40 rules instead of 24, and our first Enterprise plan at 75 rules instead of 33. And for our largest customers, our biggest Enterprise plan now boasts a 105 active rule limit.
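
To make the new scaling concrete, here's a quick sketch that reproduces the limits quoted above. Only the tier starting values and step sizes come from this post; the number of plans in each tier is our assumption for illustration:

```python
# (tier name, first plan's limit, step between plans, assumed plan count)
tiers = [
    ("Starter", 9, 3, 4),
    ("Pro", 20, 5, 4),
    ("Business", 40, 10, 4),
    ("Enterprise", 75, 15, 3),
]

# Each tier's limits grow linearly from its starting value.
limits = {name: [start + step * i for i in range(count)]
          for name, start, step, count in tiers}
```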

As before, you can still create as many rules as you want; the limit only applies when you try to activate more than your rule allowance. This gives you the flexibility to save rules for historical purposes, or to create seldom-used rules and keep them around for later. In addition, we're still offering extra rules for securing your account with a password and a two-factor authenticator, so you can increase all of these rule amounts by two if you've done that.

One last thing to discuss concerns those of you who have already paid us for a custom rule plan that increased your limits beyond your plan type. If the volume of custom rules you have is still higher than your plan allows, the subscription will remain, but we will have already discounted its price to reflect how many extra rules you're receiving above your normal plan size.

If your plan's limit now equals or surpasses your custom rule plan, we will have already adjusted your subscription to remove the added cost you were paying for the extra rules, and will have refunded your most recent payment for the custom rules you purchased.

So that's everything for today. We hope you'll make great use of the extra custom rules. We've been hard at work adding more rules to the library as users have requested interesting and useful ones, so don't forget to check those out within the dashboard.

Thanks for reading and have a wonderful week.

VPN Detection and IP Attack History Improvements


Around three weeks ago we enabled a new VPN detection engine on the backend of our service, which aims to increase our detection rate of paid VPN services through more active scanning and crawling. And as of a few days ago, we've activated its output on the latest v2 API endpoint (the November 2020 release, aka the current release).

The reason we've not issued a new API version is that we only do so when the API response changes substantially enough to cause client implementation breakage or unintended behaviour. As the API response hasn't changed, there's no need to update your client software. To access the new detection, you simply need to have the latest release selected within your Dashboard (which is the default choice), or select the November 2020 release as shown below.

[Screenshot: selecting the November 2020 release in the Dashboard]

So what warranted the new VPN detection, and why now? Firstly, since 2017 we've been hard at work improving our VPN detection, and we've reached many milestones during that time, for instance inferring hosting providers based on their websites, self-maintained descriptors and peering agreements.

But the VPN landscape has changed over time. Now more than ever, VPN companies treat their servers like trade secrets, especially as services like ours make accessing content through anonymising services more difficult, and circumventing that is a marketing point VPN providers use to lure in customers.

We've seen a marked uptick in providers using smaller, lesser-known hosting providers that try to obscure the fact that they're not residential internet providers. We've also seen VPN companies hosting servers in people's homes in countries with uncapped 1Gb symmetrical internet service, even among the biggest, most well-known VPN providers. And finally, we're seeing them use rack space inside residential ISP datacenters to hide behind those ISPs' residential IP ranges.

All of this presents a problem because these addresses are harder to find, change more often and reside in "safe" residential IP space that we have traditionally been reluctant to classify as VPNs because of the potential for false positives.

The main reason we see companies doing this is to keep streaming services working for their VPN customers, specifically services like Netflix, Amazon Video, Apple TV+ and Disney+. These services all block datacenter IP addresses, so moving into residential IP space has become important to VPN companies that want to provide seamless coverage.

This is where our new VPN detection comes in. It's by far the most aggressive detection method for VPNs we have ever created: we're indexing the infrastructure of most VPN services on a constant basis, in an automated way. By user share (essentially popularity), we believe we have 75% of commercial VPN services indexed. And unlike our prior methods (which are still in use), it's much more targeted, able to pluck out just the bad addresses from a sea of surrounding good ones, including when they exist in residential IP space, and most importantly, to do it in a fully automated way that scales.

As we said above, this new detection is live only on the latest API version, so we recommend making sure you're using that (and by default you will be, unless you've selected an older API version). Our previous VPN detection methods are still in effect; this additional, more targeted detection functions in conjunction with our broader approaches to increase overall detection rates.

The other change we've made today concerns IP attack history. You may have seen something like the example below if you've looked at our detailed threat pages before:

[Screenshot: IP attack history on a detailed threat page]

As we said above, we're seeing more and more VPNs enter residential ISP space, and one of the problems with that is that those addresses are often dynamic, meaning they are re-assigned to different customers quite frequently. So our pre-existing IP attack history needed to be tweaked to be more aware of this.

We've added more intelligent weighting to how long history is stored and when it's displayed. Addresses that are consistently seen performing bad actions (automated signups, comment spamming, attacking our honeypots and so on) will always have their full history available, but addresses that are rarely seen doing bad things, or that our system believes are part of a dynamic pool of addresses, will have their history condensed to show more recent attacks, and past bad history will be erased sooner.
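
That weighting might look something like the sketch below. This is purely illustrative (the thresholds, field names and 30-day window are invented), showing how a consistent offender keeps its full history while a rarely-bad or dynamically-assigned address gets a condensed, recent-only view:

```python
from datetime import datetime, timedelta

def visible_history(events, consistent_offender, dynamic_pool,
                    now=None, recent_window_days=30):
    """Filter an address's attack history for display.

    events: list of (timestamp, description) tuples, oldest first.
    Consistently bad addresses keep everything; addresses that are
    rarely bad, or that sit in a dynamic IP pool, only show recent
    attacks so stale history fades sooner.
    """
    if consistent_offender and not dynamic_pool:
        return events  # full history always available
    now = now or datetime.now()
    cutoff = now - timedelta(days=recent_window_days)
    return [(ts, desc) for ts, desc in events if ts >= cutoff]
```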

We've also tweaked the risk scores you'll see, both for addresses with a lot of attack history and for addresses we know are definitely running VPN servers but haven't yet been seen performing bad actions online.

So that's all the updates we have for you today. We know that VPN detection is very important to you all, as it accounts for almost all the contact we receive from customers wanting to tell us about an address that slipped by our detection.

One last thing to mention on that: we really appreciate it when customers contact us about addresses we don't detect but should. We thoroughly investigate every address to determine how it got by us and what automated methods we can use to detect it, and others like it, in the future.

This is why we recently updated the contact page to automatically fill out your email address for you in the web contact form. That small change may not seem like much, but we have some incredibly dedicated customers who use this contact feature very often, and anything we can do to speed up reporting helps them, which helps us. We're aiming to have a more immediate report button present on multiple pages and interfaces of the site sometime in the future, so this is just a quick-to-implement intermediary change until then.

Thanks for reading and we hope everyone is having a wonderful week.

Database Sync + Account Security Improvements


As the service has grown in popularity we've hit up against many different technical limitations. One such issue was our original API code being single-threaded, which limited our server hardware choices to systems containing very high-frequency CPUs. This was a problem because the industry as a whole was moving towards many-core CPUs with lower frequencies.

We overcame this problem by rewriting everything to be fully multithreaded and to take advantage of as much pre-computed data as possible. We were able to reduce average query latency from over 350ms to below 10ms before network overhead, and we further refined these methods with our v2 API, which launched several years ago.
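
In spirit, the rewrite traded expensive per-request computation for cheap lookups into data prepared ahead of time, served by many worker threads at once. A toy sketch of that pattern (not our actual code, and the data here is made up):

```python
from concurrent.futures import ThreadPoolExecutor

# Built at startup: the expensive per-address analysis is done ahead of
# time, so answering a query becomes a near-instant dictionary lookup.
precomputed = {
    "203.0.113.7": {"proxy": "yes"},
    "198.51.100.1": {"proxy": "no"},
}

def handle_query(ip):
    """Answer one API query from the precomputed table."""
    return precomputed.get(ip, {"proxy": "unknown"})

# Many queries are served concurrently across a pool of worker threads,
# letting many-core, lower-frequency CPUs be used effectively.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(handle_query, ["203.0.113.7", "198.51.100.1"]))
```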

One of the other major points of contention has been database synchronisation. We like all our nodes to be in sync as much as possible, and we've gone through many iterations of our database system to achieve performance and latency that exceed the demands of our customers.

But as the service has continued to grow, we've seen database demands increase exponentially. As we add more data to the API, provide better logging for our customers and gain many more customers, the rate of database changes is now in the hundreds of thousands per minute.

This is a lot of data to keep synchronised across so many servers, and we don't want to use a single central database because it would hurt our redundancy, latency and our commitment to customer privacy (data at rest is encrypted, with nodes within specific regions needing to request keys at the time you make a request that requires access to your data within that physical region).

And that brings us to today. We reached a point last week where the changes to the database were simply too frequent to keep up with, and we had a synchronisation failure affecting customer statistics and IP attack history data. To that end, we've been hard at work building a new system that can better handle these changes, and we've deployed it today. We're hopeful it will give us breathing room until next year, when we'll likely have to revisit the situation.

One of the other changes we've implemented is the use of Solid State Drives (SSDs). Last year we decided that all new servers would be required to have either SATA or NVMe based SSDs, and we currently have 5 of our 9 servers running entirely on SSDs. It's our intention to migrate our pre-existing hard-disk-based servers to SSDs as we naturally rotate them out of service; this grants us huge response time improvements for all database operations, including synchronisation.


We also wanted to talk about account security and one of the ways we're increasing the security of your accounts. It has been brought to our attention that all the automated emails you receive from us contain your API Key, and since these keys are important to your accounts, we should not display them so frequently.

So as of today, we will no longer include your API Key in automated mails, except when you sign up or perform an account recovery. That means when you receive a payment notice, a query overage notice or any of those other kinds of mails, your API Key will not be included.

Having the key displayed in these mails was mostly a holdover from a time when we didn't have passwords or account recovery systems and didn't want users to lose their keys; including the key in our mails was a way to guarantee it wouldn't be lost. Times have moved on significantly since then, and we agree with the customers who brought this up that it's time to stop including keys in the majority of our mails to you.

We hope everyone has had a great Easter break, and as always, if you have any thoughts to share, feel free to contact us. We love hearing from you :)

Dashboard Interface Improvements


This month you may have noticed some colour changes to the website. Specifically, the top navigation bar, and the sub-navigation bars on the pages that feature them, have had an overhaul with better gradients that remove the muddy grey hues that appear when traversing the RGB colour space in a straight line between two opposing colours.


We've also enhanced the readability of text on those navigation bars by adding a subtle drop shadow. It's a simple change, but one that was sorely needed. We've also added tooltips to buttons that only use icons, making the dashboard more accessible to new users learning how it works for the first time.


Probably the biggest functionality change, though, has been the addition of automatic data refreshing for our logs, both on the threats homepage and, more importantly, within the stats tab of the customer dashboard.

We added this feature to the dashboard in a very user-centric way, because we know not everyone will want the log to update automatically as they're looking at it. There is a pause button which remembers your choice, so you won't need to pause again on your next visit to the page.

We also automatically pause the log refreshing if you expand any of the data shown, and unpause it when you change pages or collapse all of your detailed views. And of course both of these auto-unpause behaviours respect your chosen play/pause preference.
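
The pause behaviour described above boils down to two pieces of state: a sticky user preference, and a temporary pause while any detail view is expanded. A small sketch of that state machine (the class and method names are ours, purely illustrative):

```python
class LogRefresher:
    """Models the dashboard log's auto-refresh pause logic."""

    def __init__(self, user_paused=False):
        self.user_paused = user_paused  # sticky: remembered between visits
        self.expanded_rows = 0          # temporary: pauses while > 0

    def toggle_user_pause(self):
        self.user_paused = not self.user_paused

    def expand_row(self):
        self.expanded_rows += 1

    def collapse_all(self):
        # Also called when the user navigates to another page.
        self.expanded_rows = 0

    @property
    def refreshing(self):
        # Auto-unpausing never overrides the user's explicit preference.
        return not self.user_paused and self.expanded_rows == 0
```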

We've had a lot of customers requesting automatic data refreshing over the past several years, and while it has taken some time for us to get around to it, we're very satisfied with the implementation.

In addition to these visual and usability changes, we've also been enhancing the API. Just today we launched SOCKS4A and SOCKS5H proxy types in all versions of our API. Although we've always detected these types internally, we've never exposed them in the API until now. We did have requests from security researchers for more detailed categorisation, and this is us fulfilling those requests.

That's it for now. We hope everyone is having a great week so far, and thank you for reading.

Teaching an old Raven new tricks


If you've been reading our blog for the past few years, you may have seen a post we made in December 2019 detailing our inference engine, Raven. This is software we created that runs not only on each of our nodes for real-time inference, but also on separate dedicated hardware tailored specifically for post-processing inference.

Since that post, we've changed where and how Raven runs. We broke up our single dedicated inference server (STYX) into many separate servers and repurposed STYX as a distributor of work rather than a processor for Raven. We had to do this because the service became so popular that we couldn't process the volume of addresses we were receiving in a sensible time frame.

This month we've been hard at work improving Raven and the infrastructure that supports it. We've reached a scale where traditional databases, storage systems and networking no longer scale for our use case. We want to process many more addresses per second, and in a more thorough way, which requires more resources at every link in the chain: the way addresses are collected, transported through our infrastructure, processed and delivered back to our cluster nodes.

To this end, we've completely changed how addresses are collected from our cluster. It's now multithreaded and scales seamlessly with the volume of data waiting to be picked up. We're also now storing addresses in a high-performance in-memory database served by MariaDB. We're seeing very high transaction throughput combined with extremely low CPU utilisation from MariaDB; in fact, this one change from our prior custom solution reduced CPU usage on our work distribution server from 97% to 30%.
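
We don't publish our schema, but the idea of a RAM-backed work queue is easy to demonstrate. The sketch below uses SQLite's in-memory mode purely as a stand-in for MariaDB's in-memory storage; the table and column names are invented for illustration:

```python
import sqlite3

# SQLite's ':memory:' database stands in here for MariaDB's RAM-backed
# storage; the real system uses MariaDB, not SQLite.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE address_queue (
    ip TEXT PRIMARY KEY,
    first_seen INTEGER NOT NULL
)""")

# Cluster nodes enqueue addresses for analysis; re-submitting an address
# that's already queued is a cheap no-op.
db.executemany("INSERT OR IGNORE INTO address_queue VALUES (?, ?)",
               [("203.0.113.7", 1), ("198.51.100.1", 2), ("203.0.113.7", 3)])

# The work distributor drains a batch for Raven to process.
batch = db.execute(
    "SELECT ip FROM address_queue ORDER BY first_seen LIMIT 100").fetchall()
```

Keeping the whole queue in memory avoids disk I/O on every enqueue/dequeue, which is where the CPU and throughput wins come from.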

But that's not all. Raven, for us, is more than just a data analysis tool: it also includes what we call agents, which allow it to be extended with plugins that serve as data collectors and data formatters. Essentially, they're a way to feed Raven auxiliary data through a multitude of means, for instance processing firewall logs from our data partners, or even probing addresses directly to see if they're running proxy servers.

That last agent we mentioned, which probes addresses directly, has become a very important tool for Raven because it provides conclusive evidence which reinforces Raven's prior conclusions and thus helps it make better decisions in the future. Another advantage of this particular agent is its ability to find new proxies where we have no data. This is important because we, like all anti-proxy services, operate a network of scrapers which scour websites that publish proxy and VPN addresses, in an attempt to collect as much data about bad addresses as possible.

The problem, however, is that the data on many of these websites overlaps, so there are not many sites publishing proxies we don't already know about. We spend a lot of time locating new sites, and often, even when they list thousands of addresses as seen within the past several minutes, we already detect 99.9% to 100% of them. So the ability to seek out unique addresses that have never been published publicly is important if we want the full picture, which is certainly our goal.

And indeed we do find many unique proxies on our own; in fact, we find hundreds of unique proxies daily that have never been, and in some cases never will be, listed on publicly accessible proxy indexing websites. Given how important this agent is to our service, we spent the last few days rewriting it to be faster and smarter. We've come up with subnet searching algorithms that increase the chances of finding bad addresses without needing to scan a service provider's entire address range, in addition to some other improvements we're going to keep close to our chest for now due to their trade secret value.
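
While the actual algorithms are a trade secret, the general idea of probing near known-bad addresses rather than sweeping whole provider ranges can be illustrated like this. This is entirely our own sketch, built on the assumption that proxy operators often run several servers within one small allocation:

```python
import ipaddress

def candidate_neighbours(known_bad_ips, prefix=28):
    """Yield unprobed addresses in small subnets around known-bad IPs.

    Probing the /28 surrounding each confirmed proxy is far cheaper
    than scanning a service provider's entire address range, while
    still catching clustered deployments.
    """
    seen = set()
    for ip in known_bad_ips:
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        for addr in net.hosts():
            a = str(addr)
            if a not in known_bad_ips and a not in seen:
                seen.add(a)
                yield a
```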

The last piece of the puzzle has been iterating on Raven's inference models. In the past we would collect a subset of important decisions and their outcomes to train Raven, and each training run took almost a month. But we've been able to improve the training time by breaking the data up into smaller units which can be iterated on across different computers. In addition, we upgraded the main workstation we traditionally compute these models on, which cut the training time in half. We're now able to produce a new model in 8 days, down from the 26 days it took previously, a significant improvement that allows us to tweak Raven more often.

So that's what we wanted to share with you today. If you often monitor our threats page, which is where we post unique proxies we've found that haven't been seen on indexing websites before, you may notice a vast increase in postings over the past 2 days. This will continue to ramp up as we further tweak the new software and find the right balance between detection rate and processing throughput.

Thanks for reading and have a great week!