Welcome Cronus & Metis

If you've been following our blog this year, you're probably surprised to see another new node announcement quite this soon, let alone two at once. But it's true: we're adding two new nodes today, called Cronus & Metis, and they're quite special because they're the first nodes we've activated outside of Europe.

This year has seen us increase our capacity considerably to meet the growing demands of our customers, and while we were intending to add new servers early next year, we've moved up our timetable because we're seeing increased request volumes from outside Europe.

Specifically, a quarter to a third of our traffic (depending on the time of day) now originates from the United States and Canada. Having these requests traverse the Atlantic Ocean to our servers in Europe has meant our North American customers were facing higher than acceptable latency, so today we've added two new server nodes in Canada, just on the border with the United States.

Cronus was the Greek god of time, so it's aptly named: its only job will be to serve the North American market with the aim of reducing access latency to our API. Metis is the personification of prudence or, in more common language, cautiousness, and we're being cautious with our North American rollout by adding two servers for load balancing.

In addition to the new nodes we've spruced up the status page a bit, breaking out the regions where our server nodes are available. At present those are Western Europe, Eastern Europe and now North America. It's our intention to add servers in the Oceania and Asia regions to serve those areas in the same way, and we will likely add such server nodes next year.

Like all our other servers, these new ones are part of our unified cluster architecture, so while all North American traffic will go to Cronus & Metis, it will seamlessly fail over to our European servers if there are any problems. Your data is synchronised between all nodes and protected from downtime without you needing to do anything.

So that's what we have for you today and we hope you enjoy this one last present before Christmas.

Happy holidays everyone!


CORS take two

Today we've released some major updates to the CORS (Cross-Origin Resource Sharing) feature found within your dashboard and we're excited to tell you about them.

Firstly, we've made some under-the-hood changes to how your origins are stored on our servers and processed by our v2 API endpoint. This should reduce retrieval time from our database and lower the latency incurred when answering a CORS-based request, especially for those of you with a large number of origins on your accounts.

Secondly, we've improved the import/export experience within the Dashboard. Exported CORS files are now easier to parse and edit: UUIDs have been dropped from the process, so only domains are present within the exported files.

Thirdly, we've added wildcard support. If you have a lot of subdomains you no longer need to enter them all manually; instead you can use an asterisk to indicate that all subdomains and the main domain should be allowed to use CORS for your account (for example, login.site.com can become *.site.com).
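
To make the wildcard behaviour concrete, here's a minimal sketch in Python (purely illustrative; proxycheck.io's actual matching code isn't published) of how an allowed entry such as *.site.com could be matched against an incoming Origin header:

from urllib.parse import urlparse

def origin_matches(allowed: str, origin: str) -> bool:
    # Treat "*.site.com" as covering site.com itself and any subdomain,
    # mirroring the behaviour described in the post.
    host = urlparse(origin).hostname or origin
    if allowed.startswith("*."):
        base = allowed[2:]
        return host == base or host.endswith("." + base)
    return host == allowed

print(origin_matches("*.site.com", "https://login.site.com"))  # True
print(origin_matches("*.site.com", "https://site.com"))        # True
print(origin_matches("*.site.com", "https://evil.example"))    # False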

Fourthly, we've finally added a Dashboard API endpoint (currently in beta but accessible to all customers) which allows you to list, add, set, remove and clear your origins. Crucially, it allows large batch changes to be performed for both adding and removing origins, which supports usage at scale. You can view all the documentation for this here.
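
As a rough idea of how such an endpoint could be driven from a script, here's a hedged Python sketch. Note that the base URL, paths and parameter names below are placeholders invented for illustration only; the real endpoint details are in the documentation linked above.

import requests

API_KEY = "YOUR_API_KEY"  # placeholder
# NOTE: this base URL and the paths below are assumptions for illustration;
# consult the linked documentation for the real endpoint and parameters.
BASE = "https://proxycheck.io/dashboard/cors"

def list_origins():
    # Fetch every origin currently attached to the account.
    r = requests.get(f"{BASE}/list/", params={"key": API_KEY}, timeout=10)
    r.raise_for_status()
    return r.json()

def add_origins(origins):
    # Batch-add many origins in one request rather than one call per domain.
    r = requests.post(f"{BASE}/add/", params={"key": API_KEY},
                      data={"origins": "\n".join(origins)}, timeout=10)
    r.raise_for_status()
    return r.json()

add_origins(["example.com", "*.example.org"])
print(list_origins())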

So those are the four changes to CORS. We know you'll find them useful, especially the API and the wildcard support, both of which have been frequently requested by customers. One last thing: if you intend to use the CORS API, please report any issues you come across and we'll work to remedy them quickly.

Thanks for reading and have a great weekend!


Introducing Burst Tokens!

Today we've launched a major new feature called Burst Tokens which allow our customers to make even greater use of their plans without needing to lift a finger.

For a long time we've had customers coming to us with a simple problem. Most of the time their usage fits within their plan, but sometimes they have bursts of activity that go beyond it. This is a problem because it doesn't make economic sense to move to a larger plan just for the one or two days a month when you need a few more queries.

This scenario plays out fairly often especially with websites that receive unexpected viral traffic and game servers which are often targeted by DDoS attacks from disgruntled players.

So to solve this problem for our customers we've introduced Burst Tokens which, while active, raise your daily allowance to five times its normal size until the next daily reset. And best of all, the tokens are redeemed on your behalf automatically when you go over your daily allowance.

[Image: burst token allocation for a Pro plan]

You'll receive tokens on the 1st of every month, only a single token can be consumed each day, and the plan you're on dictates how many tokens you're granted. Free customers on our 1,000 daily query plan are given one token to use each month, while our Starter plans get 3, our Pro plans (as illustrated above) get 5, our Business plans get 6 and our Enterprise plans get 7.

As we said above, your daily allowance grows to 5 times its normal size when a token is consumed. So if you're on our first paid tier, the 10,000 daily query plan, and you happen to exceed it, a token will be automatically redeemed and your daily allowance for the remainder of that day will be 50,000 queries.
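
Expressed as code, the arithmetic is just a multiplier applied to your plan's normal daily limit. A minimal sketch, using the figures quoted in this post (the 5x multiplier and the 10,000 query tier):

BURST_MULTIPLIER = 5  # a consumed token raises the allowance to 5x normal

def effective_daily_allowance(plan_daily_queries: int, token_consumed: bool) -> int:
    # Allowance for the remainder of the day once a token has (or hasn't) been redeemed.
    return plan_daily_queries * (BURST_MULTIPLIER if token_consumed else 1)

print(effective_daily_allowance(10_000, token_consumed=False))  # 10000
print(effective_daily_allowance(10_000, token_consumed=True))   # 50000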

At this point you're probably wondering if this is a new paid feature, and actually it's not. All past, present and future customers with an account will have access to the new token feature, and in fact by the time you're reading this you'll be able to see your available tokens in your customer dashboard. We've also updated our usage dashboard API endpoint with burst token availability.

And so that's all there is to it: a free upgrade on us to help supercharge the plan you already have. But don't worry, we'll still send you the normal usage emails when you go over your plan's daily allowance; they'll now also detail whether a burst token was used so you know if it's time to upgrade your plan or if it's just a spike in usage that your tokens can handle.

With the launch of this feature we have released a new version of our API v2, dated November 17th. If you already have your version set to use the latest stable API version you will be using this version automatically; otherwise you can select it within the customer dashboard. We're not expecting any breakage to existing implementations, but some of the status code messages have changed wording to indicate whether a burst token has been consumed.

If you have any questions about the new feature, as always contact us; we love to hear from you.

Thanks for reading, stay safe and have a wonderful day!


Welcome Aura

It's hard to believe only 9 months have passed since we introduced our Eos server node, and yet here we are adding another new server node to our cluster.

This year has been filled with difficulties as the world continues to grapple with the COVID-19 pandemic. As a result, more people than ever before have turned to the internet for communication with loved ones, entertainment, education and work.

As our company helps individuals and businesses protect their infrastructure, we too have seen demand for our services grow. In fact we broke every record we held this year. Monthly, weekly and daily signup records were easily broken multiple times, as were our daily query volume records. We saw record levels of user activity on the website, and general enquiries about the service from potential customers increased enormously.

And this is why it's so important to keep investing in our infrastructure. The previous blog post explained how we had added multiple high-end servers for post-processing inference so that our proxy detection can continue to be the best available. Today we continue that focus by adding a new high-performance server node to our cluster.

Aura is the Titan goddess of the breeze and the fresh, cool air of early morning. It is also now our most powerful server node, featuring a high-performance AMD Zen 2 processor. This is the beginning of a new platform for us: this single server is the equivalent of three of our 1st generation server nodes in raw compute power, giving us enormous room to grow.

It's our intention to replace all our 1st and 2nd generation infrastructure with nodes of this capability and to keep the cluster at around 10 servers or fewer, spread out around the globe, giving us redundancy against not only individual system failure but also geographic problems such as international fibre optic cable damage. We already make use of multiple datacenters spread across Europe and we will expand on this as we add more systems to the cluster.

At the moment Aura is in the final stages of provisioning where we perform rigorous tests to make sure it's up to our standards. So far it's looking good and we're expecting Aura to answer its first customer queries starting tomorrow.

Thanks for reading, stay safe and have a great day!


Post-Processing Inference Infrastructure Update

Today we'd like to share some updates regarding our machine learning infrastructure geared towards post-processing. This is where you send us an IP Address to be checked and, after we give you an immediate answer, we place it into a large pool of addresses to be examined more thoroughly, where time is no longer a constraint.

In February 2019 we made a blog post about a new server we introduced called STYX, which was designed to do all post-processing inference, freeing up resources on our core cluster so those nodes could spend more time answering queries instead of processing data.

[Image: diagram from the February 2019 post showing the three cluster nodes feeding data into STYX]

You can see above a graphic we shared within that post illustrating how our (at the time) three cluster nodes would feed data into STYX to be processed by its many cores.

Since then the volume of addresses we process every day has increased enormously. To keep up with this growth we've increased our cluster size from 3 to 5 servers, replaced our weakest servers with stronger ones and gone to extreme levels of code optimisation, all of which has allowed us to keep pace without spending obscene amounts of money on cloud providers.

But coming back to STYX, we did hit a problem there. No amount of code optimisation can get around the fact that there are simply too many addresses to process on one system. We put in some stop-gap measures by creating a ratio system where only half of all addresses were tested, then a third, then a quarter and finally only a fifth. Had we continued in this manner, eventually only a tenth of all addresses would have been processed by the post-processing engine on STYX.
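
For illustration, a stop-gap ratio system like the one described above can be as simple as keeping one in every N addresses for post-processing. A quick Python sketch (not the actual STYX code):

import random

def should_post_process(sample_ratio: int) -> bool:
    # Keep roughly 1 in `sample_ratio` addresses: a ratio of 2 tests half of
    # all addresses, 5 tests a fifth, and so on.
    return random.randrange(sample_ratio) == 0

incoming = [f"203.0.113.{i}" for i in range(256)]
sampled = [ip for ip in incoming if should_post_process(5)]  # roughly a fifth survive
print(len(sampled), "of", len(incoming), "addresses sent for post-processing")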

And so that brings us to today's post, where we have invested in an entirely new range of infrastructure dedicated to inference. It consists of various servers with various core counts; some of the largest we've acquired for this feature dual 18-core Xeons. In fact, our inference infrastructure is now several times more powerful than the cluster that answers customer queries.

STYX is still with us, but it has been repurposed as a job scheduler. It now monitors all of the inference infrastructure, hands out jobs as needed and retrieves the results. We created a fun little visualiser for ourselves to see what STYX sees as it hands out work, which we thought would be interesting to show below.

[Image: the STYX work-distribution visualiser]
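
Conceptually, the scheduling role STYX now plays boils down to a central work queue that the inference servers pull batches from and report results back to. The sketch below is purely illustrative (the real system isn't published) but shows the shape of that hand-out-and-collect loop:

from queue import Queue
from threading import Thread

jobs: Queue = Queue()     # batches of addresses waiting for inference
results: Queue = Queue()  # finished work collected by the scheduler

def inference_worker(name: str) -> None:
    while True:
        batch = jobs.get()
        if batch is None:  # sentinel value tells this worker to shut down
            break
        # Placeholder for the real inference pass over the batch.
        results.put((name, [f"{ip}: checked" for ip in batch]))
        jobs.task_done()

# Stand-ins for the dedicated inference servers.
workers = [Thread(target=inference_worker, args=(f"node-{i}",)) for i in range(4)]
for w in workers:
    w.start()

for batch in (["192.0.2.1", "192.0.2.2"], ["198.51.100.7"]):
    jobs.put(batch)

jobs.join()        # wait until every batch has been processed
for _ in workers:
    jobs.put(None)  # release the workers

while not results.empty():
    print(results.get())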

So what is the net benefit of all this work? Well, the main thing is we can once again fully examine every single address we receive from customers within our post-processing inference engine, and we can easily add more servers to the inference infrastructure as needed in the future, which is something we will need to do as the service becomes ever more popular.

One of the quickest ways to see the results of our new infrastructure is to check out the threats page. This is where we post only addresses our post-processing inference engine found to be proxies, displaying a random assortment of the most recent few hundred. It wasn't so long ago that every entry on that page would show as last being seen 8 to 12 hours ago, but with the new engine steaming through data we're discovering more proxies per hour than we used to discover per day.

This is why you'll see that a lot of addresses on there were last seen just an hour ago or less. Being able to obtain knowledge of proxies like this that are "undiscovered" on the wider web (ones we've found that aren't yet posted publicly on message boards, blogs and websites) is important to us, as these proxies are perhaps the most dangerous and are most likely being abused by the individual(s) who set them up in the first place (often on hacked remote servers and IoT devices).

In addition to broadening our infrastructure, we also rewrote the way we synchronise information within our cluster. With so much data being updated per second we found some bottlenecks, which we were able to completely solve several days ago.

Some of this was caused by the immense volume of data changes arising from the new infrastructure's ability to process so much at once, and some of it came to light after internet problems affecting one of our cluster nodes left it with more data than usual to synchronise once it came back online. During that process we noticed it wasn't able to reach parity with the other nodes even after several hours, simply because of how much data was changing while it synchronised.

So that's what we wanted to share with you today, bigger and better infrastructure that leads to tangible improvements in proxy detection.

Thanks for reading and have a great week!


Building a better Detective

One of the challenges a service like ours faces is the existence of anonymising services that specifically go out of their way to obscure their infrastructure. So while it's easy to detect most addresses and their suppliers, there's always a small percentage that slips by.

That's why this past month we created a list of these difficult-to-index suppliers and went about building tools tailored specifically to scan and verify the addresses they offer. Traditionally, when we want to scan a provider they will offer a webpage of addresses or hostnames, which makes it easy to scan them and correlate what we find across our honeypot network and other scraped websites; this is part of our collect-and-verify strategy.

But some of these, let's say, hardened providers will mask their addresses behind signup pages, paid memberships or other means. For instance, it's becoming very common for VPN providers to only show you their server addresses once you've signed up and paid for service, and with there being hundreds of VPN suppliers, paying for all those subscriptions isn't really commercially viable.

Even the free providers are becoming more shrewd, inserting randomly generated addresses within their legitimate address pools to thwart page scraping. Some sites only show you addresses once you verify you're not a bot by solving a captcha, or require a JavaScript engine to decode the addresses before they're rendered on the webpage.

All of these are things we worked to solve this month with what we're calling our Detective. It's a new module within our custom scraping engine which allows for a lot more thought during collection and processing. The results have been quite promising, with our list of detected proxies and virtual private networks steadily increasing since it went live.

Some of its features include:

  1. Web and non-web collection for anonymising services that only offer an application for accessing their network of servers.
  2. A JavaScript engine for solving any kind of proof-of-browser anti-bot measure during address collection.
  3. Captcha solving support using image recognition, with a fallback to human-based solving.
  4. Discarding of bad, fake or generated addresses through time-based observation and frequency of appearance.
  5. Pattern recognition for indexing VPN providers' infrastructure based on a few hand-entered sample hostnames (sketched below).
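
To give a flavour of what item 5 looks like in practice, here's an illustrative Python sketch (the real Detective module isn't published): given a couple of hand-entered sample hostnames that share a numeric template, we enumerate nearby siblings as candidates for later verification.

import re

def expand_hostname_pattern(samples, limit=50):
    # If samples follow a template like "us1.example-vpn.com", enumerate
    # numeric variants (us1..usN, de1..deN, ...) as candidates to verify.
    candidates = set()
    for sample in samples:
        match = re.match(r"^([a-z\-]+)(\d+)\.(.+)$", sample)
        if not match:
            continue
        prefix, _, domain = match.groups()
        for n in range(1, limit + 1):
            candidates.add(f"{prefix}{n}.{domain}")
    return sorted(candidates)

print(expand_hostname_pattern(["us1.example-vpn.com", "de3.example-vpn.com"])[:5])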

There are some very well-known providers, constantly being abused, that have employed one or more of the above tactics to make it difficult for services like ours to get a full picture of their infrastructure, but the new system we've devised has been able to break all of these approaches.

As always, if you've come across an address, range or service provider we don't yet detect, please contact us; we really do investigate every lead sent to us by customers.

Thanks for reading and have a great week!


API Version Selector added to the Dashboard

After we updated our API yesterday it became apparent that some customers had depended on our type responses only being present for positive detections, which is understandable because we hadn't used this field for clean addresses before.

Because of this we've decided to bring forward the launch of the API version selector, which we mentioned in yesterday's post, to today. This means you can now choose which version of our API you want to be used (by the dates of major revisions), and when you select a version you'll get a neat explanation of what changed compared to the previous dated version, as in the screenshot below.

[Image: screenshot of the API version selector in the dashboard]

We are also providing a way for you to select which version gets used by adding &ver=date to the end of your queries, for example &ver=17-June-2020. When you do that you'll get a version response back from the API like so:

{
    "version": "17-June-2020",
    "status": "ok",
    "98.75.2.4": {
        "proxy": "no"
    }
}

This lets you know the version you requested was in fact the version you received. The version indicator will not be present if you're using the latest version of the API via the selection box in the dashboard (the default selection) or you haven't provided the &ver= flag with your query.
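
For example, a query pinned to the 17-June-2020 revision can confirm the server honoured the request by checking the echoed version field. A small Python sketch (assuming the standard v2 endpoint URL and the address from the example above; substitute your own API key):

import requests

API_KEY = "YOUR_API_KEY"  # placeholder

r = requests.get(
    "https://proxycheck.io/v2/98.75.2.4",
    params={"key": API_KEY, "ver": "17-June-2020"},
    timeout=10,
)
data = r.json()
# The echoed version field confirms which revision actually served the query.
assert data.get("version") == "17-June-2020", "unexpected API version"
print(data["98.75.2.4"]["proxy"])  # "no" in the example above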

We hope this will help customers to plan for future changes to the API so they can upgrade their implementations when they're able to do so instead of on our release schedule.

Thanks for reading and have a great week!


API updated with new type responses

Today we've launched an update to our v2 API which adds new type responses: Residential, Wireless, Hosting and Business. These new type responses are to help you create better custom rules that target specific connection types, beyond just the ones we've determined to be proxies or virtual private networks.

We'd like to focus on one of those new types for a moment. Identifying wireless connections has been a long-requested feature because many customers have problems with malicious users who utilise wireless access points to get around IP bans. That's because it's very easy to acquire a wireless connection and it's even easier to get a randomised IP Address from a cellular network provider.

And so we've added that type response alongside residential, hosting and business. As with all our types, you shouldn't expect a type response to always be present; we won't show any kind of generic or default response if we don't actually know what kind of connection an IP is utilising.
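
Because the type field is only present when we actually know the connection type, it's best treated as optional on the client side. A brief Python sketch of a defensive check (assuming the standard v2 endpoint URL; substitute your own API key):

import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ip = "98.75.2.4"

r = requests.get(f"https://proxycheck.io/v2/{ip}", params={"key": API_KEY}, timeout=10)
record = r.json().get(ip, {})

connection_type = record.get("type")  # e.g. "Residential", "Wireless", "Hosting", "Business"
if connection_type == "Wireless":
    # Example custom rule: treat cellular ranges more cautiously for IP bans.
    print(f"{ip} looks like a wireless/cellular connection")
elif connection_type is None:
    print(f"No connection type known for {ip}")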

In addition to this change we've also made a lot of back-end changes to how the API functions, specifically around storing and retrieving data (which now includes our new type information, even for clean addresses, as well as location data) and around how the API is initialised on our webserver. That includes API sub-versioning, which isn't yet exposed to customers; we're using it internally and it's our intention to make it available to customers in some manner in the future.

So as of right now the new types are live, which means you can build custom rules within your dashboard to utilise them immediately. We've already updated our API documentation ahead of today's launch as well, including the test console where you can try it out in your web browser.

Thanks for reading and happy querying!


New design language & forthcoming API improvements!

If you visit the homepage today you may notice it looks a lot different to the one we've been serving for almost two years. That's because we felt it was time to iterate on our design language and follow some of the newer things we've been doing, particularly in the customer dashboard.

We want things to not just look nice but be consistent, so that all of our pages and the features found on them feel part of the same product. We decided to start with the homepage as it's arguably the most important page of a website in the eyes of potential customers.

So what have we changed? Firstly, the call-to-action section that draws your eye now uses a wider but softer drop shadow and an increased radius on border edges. Both of these changes give the homepage a more pleasing and modern appearance.

In addition to that, we've done away with needless single-pixel borders around some of the information we show, instead embracing white space and using soft pastel backgrounds beneath important information such as our API URL and our Live API Result area.

But it isn't only the aesthetics we've improved: the Live API area now includes buttons for some example addresses, and the API URL now updates in real time when you change the address you're checking. We feel both changes will help to make a much better first impression.

As we mentioned, this is simply the first page of the site to receive our updated design language; it will be implemented on other pages soon too, so make sure to check back for that!

The other thing we wanted to talk about is a forthcoming improvement to the API, a huge change in fact. Later this month we'll be enabling a new range of type responses for clean addresses so you can determine whether an IP Address belongs to a residential connection, a business, a wireless operator or a hosting provider.

These new type responses have been on our roadmap for a very long time and it has taken considerable effort to provide this data with a high enough confidence level that it's not just usable but reliable. We first began writing code to implement this feature a little under a year ago and it has taken until now to reach a point where we feel it's ready.

But it's not just about the data itself. Whilst we prepared to deliver this feature we went through the API and overhauled some of the ways we store and access metadata about all addresses (ISP, Location, Type, Threat etc). This was necessary to make the new connection type information available for all addresses, not just the bad ones. The resulting improved code will help us deliver new kinds of metadata in a more timely fashion in the future, which is something customers are always asking us for.

The new clean type responses, once available through the updated v2 API, will be accessible like all our features to all customers, whether you're on a free or paid plan. That means we'll be one of the only APIs in the world offering location, provider, connection and anonymity information about IP Addresses for free. This is something we're unequivocal about: free matters, and we are committed to our full-featured free offering.

That's everything we wanted to share with you today; please check back soon for another blog post where the new clean type feature will be going live. Thanks for reading and have a great week!


Improving Account Security

Last month we made a post where we told you about a new feature within the dashboard that rewards you with extra custom rules when you secure your account with a password and a two-factor authenticator.

Today we're bolstering your account security in two major ways to help combat account takeovers which have been steadily rising over the past year.

The first change: if you have a two-factor authenticator attached to your account, you're no longer able to create an account recovery code through our automated process here.

Instead, when using the account recovery page you'll be sent an email through which a support representative from proxycheck.io will accept evidence of account ownership from you, and the recovery request will be evaluated manually. This stops situations where someone who compromises your email account could then gain access to your proxycheck account through that chain of access.

The second change we've implemented is login security alerts. From now on, when you log in to the dashboard from an address we've not seen you use before, we will send you an email detailing that login so you can quickly take action if it wasn't performed by you.

As we mentioned at the start of this post, account takeovers are on the rise; as our service becomes more popular, so do the attempts on your accounts. We've seen a large increase in credential stuffing, so it's very important that you secure your accounts. We really cannot stress that enough.

Thanks for reading and we hope everyone is having a great week.

