June Newsletter

Today we sent out the first newsletter of this year to users who have the "New features and improvements" email toggle enabled within their dashboard. It has again been our most widely distributed newsletter so far, with 50% more customers toggling the option on than for our previous newsletter in November 2019.

If you didn't receive the newsletter but would like to read it, you can do so here on our website.

We've made quite a few changes since November 2019 when we sent our last newsletter. We only publish two per year so you can expect our next one around December this year.

Thanks for reading and have a great week!

New Export Log Options, New Partner Relationships & Other Changes!

Late last month we rebuilt the download feature within the positive detection log in customer dashboards. Previously, clicking the download log button gave you a basic text file containing your most recent log entries, but we received feedback from customers that this text-based log was limited in its usefulness and that more format options were desired.

We completely agreed, and so now when you click the download button a new menu opens which lets you export your log entries as text, JSON, CSV or HTML, as seen below.

We've also added a time scale selection dropdown which lets you limit how far back in time the log goes. We know this will be a welcome addition, especially for users with very large accounts that incur a lot of positive detections.

The second thing we wanted to talk about is some new data brokerage deals we have made with other entities. Over the past year we've negotiated many agreements which have granted us access to attack information that most companies in our space don't get access to. This ranges from single-occupant firewall logs all the way up to attack logs from entire datacenters.

Last month we began utilising data provided by several new relationships, and the effect on our dataset is already being felt through increased detection rates of both proxies and compromised servers. In fact, we've seen our detection of emerging threats increase substantially just from having access to so much more data, where we can observe both individual addresses and entire ISPs performing attacks across the wider web.

In addition to the above, we also wanted to detail some behind-the-scenes fixes and quality-of-life improvements we've made over the past couple of months across both our API and website.

  • Our custom syncing system has been beefed up substantially resulting in faster and more reliable syncing between nodes.
  • We corrected an encoding issue on the v2 API endpoint for City names.
  • We corrected a decoding issue on the Threat pages for Region names.
  • We corrected a whole bunch of minor visual bugs around the dashboard.
  • Various plugin documentation was updated with new plugins, screenshots and descriptions.
  • We're now self-hosting everything on our website (fonts, js libraries, icons etc) for reliability, performance and privacy.

And that's all we wanted to fill you in on today. Thanks for reading, and we hope everyone is staying safe and healthy!

Introducing The Detection of CloudFlare Warp

On April 1st 2019 CloudFlare announced that they would be offering a VPN service called Warp. But instead of focusing on customer privacy by hiding users' IP Addresses, they focused on speed, by utilising CloudFlare's servers, and on security, by encrypting all traffic exiting the user's device on its way to CloudFlare's servers.

This differs from traditional VPN services, which usually focus their marketing on how they hide your IP Address to provide privacy. So last year, when CloudFlare announced the Warp beta for iOS and Android users, we did not detect CloudFlare as a VPN provider, because every website visited by a Warp user received that user's original and legitimate IP Address in a custom HTTP header added by CloudFlare.

Essentially, the service didn't provide anonymity, so we felt detecting Warp wasn't justified. However, over the past year things have changed somewhat: Warp has opened up to more users on more devices, Windows and Mac betas are underway, and the service can now be used to access more than just websites.

This means that if a user uses Warp to access other kinds of services (FTP, SSH, RDP, VNC, IRC, email, game servers etc.), those HTTP headers containing the Warp user's real IP Address are not sent by CloudFlare, making those users essentially anonymous.

For this reason, today we've flipped the switch: we are now detecting CloudFlare Warp as a VPN service, and our API will detect and display these IP Addresses as VPNs.

But as CloudFlare is a CDN (Content Delivery Network) and many websites use CloudFlare, you're probably wondering whether, if you use proxycheck.io and CloudFlare together, you'll be negatively impacted by this change in detection.

The answer is no. You shouldn't see any difference, and Warp users will still be able to access your website as normal if you're using CloudFlare. A properly configured CloudFlare implementation presents your website with each visitor's real IP Address, and you would only ever send those real addresses to proxycheck.io, never CloudFlare's server addresses.
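For illustration, a server behind CloudFlare typically reads the visitor's real address from the CF-Connecting-IP request header rather than from the connecting socket. A minimal sketch of that idea (the helper below is our own illustration, not part of any proxycheck.io library):

```javascript
// Sketch: pick the address to check with proxycheck.io when behind CloudFlare.
// CF-Connecting-IP carries the visitor's real address; fall back to the
// socket address when the header is absent (i.e. not behind CloudFlare).
function visitorAddress(headers, socketAddress) {
  return headers["cf-connecting-ip"] || socketAddress;
}

// Behind CloudFlare: the header wins over CloudFlare's own server address.
visitorAddress({ "cf-connecting-ip": "203.0.113.7" }, "198.51.100.1"); // → "203.0.113.7"

// Not behind CloudFlare: use the socket address directly.
visitorAddress({}, "198.51.100.1"); // → "198.51.100.1"
```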

Secondly, we've implemented a system whereby we only detect Warp, and not CloudFlare's CDN reverse proxy addresses, so even if you've misconfigured your implementation of CloudFlare or of our service, you will not be negatively impacted.

We know that you've all wanted this change because you've told us via email for some time. We've been keeping a close eye on the situation with Warp since last year, and we've decided now is the right time to enable this detection for all customers. Of course, if you want, you can still override this detection in your dashboard with the Whitelist feature or a custom rule, but as we say, there shouldn't be any negative consequences from this new detection being enabled.

Thanks for reading and we hope everyone is having a great week!

Broadening Location Data with Region Information

Although our focus has been the detection of anonymous IP Addresses, we have found through discussions with our customers that many of them use our service as an affordable way to gain access to IP location information.

We ourselves have seen the added value location data provides, and there is a lot of synergy in offering generalised IP information alongside our more targeted proxy detection information.

This is why we've continually increased the amount of location information offered by our API, at no added cost to our customers. To us, a query is a query, regardless of which information you utilise from our responses.

To bolster our location offering, today we've added two new fields to our API response based on customer feedback: Region Names and Region Codes. Below is a sample of how the new information appears when you perform a request to our v2 API with our ASN flag enabled.

{
    "status": "ok",
    "node": "EOS",
    "196.247.17.9": {
        "asn": "AS52219",
        "provider": "Router Networks LLC",
        "continent": "North America",
        "country": "United States",
        "isocode": "US",
        "region": "California",
        "regioncode": "CA",
        "city": "Los Angeles",
        "latitude": 34.0584,
        "longitude": -118.278,
        "proxy": "yes",
        "type": "VPN"
    },
    "query time": "0.006s"
}
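To illustrate reading the new fields, here's a minimal client-side sketch. The sample object mirrors the response above, and `regionOf` is our own illustrative helper, not part of the API:

```javascript
// Sketch: extract the new region fields from a v2 response object.
const sample = {
  "status": "ok",
  "196.247.17.9": {
    "country": "United States",
    "isocode": "US",
    "region": "California",
    "regioncode": "CA",
    "proxy": "yes",
    "type": "VPN"
  }
};

// Region data may be absent (anycast addresses, for example),
// so callers should be prepared for a null result.
function regionOf(response, ip) {
  const entry = response[ip];
  if (!entry || !entry.region) return null;
  return { name: entry.region, code: entry.regioncode };
}

regionOf(sample, "196.247.17.9"); // → { name: "California", code: "CA" }
```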

The region data doesn't only apply to American states; it provides information for regions all over the world. For example, checking an IP from England will now show England as the region while continuing to display United Kingdom as the country, as before.

You may also have noticed that we've moved the country isocode in the response from above proxy to below country. This means all our location information is now output from least precise to most precise, which makes more logical sense.

Support for the new region and region code fields has been added across proxycheck: the v2 API top to bottom, our custom rules feature both in the API and your dashboard, our threat pages, and our API documentation page including the API test console present there.

As with our city information, not all addresses will have region data available. Addresses used as part of IP Anycast systems, and addresses that have no lock to a specific region, simply won't show this information. An example of this may be a country-wide wireless carrier that uses a range of addresses all over the country based on network demand.

We hope this new feature will help those of you creating geofenced apps and services; we know this is the main use case for this kind of information because, as we said above, this feature was requested by customers on a number of occasions.

Thanks for reading and we hope you're having a great week!

COVID-19 and proxycheck.io

Hello everyone

As you are all likely aware by now, the world is currently gripped by a global pandemic caused by an infectious disease known as COVID-19. As of right now many countries are in lockdown, and many more are in the process of halting all non-essential travel to slow its spread.

At proxycheck.io we operate from a country that is currently in lockdown, and all non-essential travel is no longer permitted here. Please don't worry about us though; we're doing perfectly fine working from home.

Over the past few weeks you may have noticed our live support chat has been unavailable and we've only been accepting support requests sent to us via email. This is directly due to the disease as our live support staff have been told to stay home for the safety of themselves and their families.

At the same time, with so many people around the world staying home due to the disease, the volume of queries we're handling has increased quite significantly. Our daily peak traffic hasn't changed too much, but the surrounding low periods have risen to meet our peaks. We have more than enough capacity for this extra traffic, and so the service has remained completely stable.

However, this increased traffic has led to an extra burden on our reduced support presence, as many of our customers have been upgrading their plans to get access to more daily queries, and these plan alterations are currently done manually by our staff. In fact, we've seen more customers upgrade their plans in the past two weeks than in the previous several months combined.

And so that's where we are today. The service is handling its extra traffic fine, we're still continuing to work on everything, and support is still available via email as normal, although replies may be a little more delayed than usual. The live support chat isn't currently available, but feel free to use it once you see it become accessible again.

Looking to the near future, we hope this disease will be under control soon; it hurts us deeply to see so many suffering. Please do listen to your country's officials and heed all their advice, just like we're doing here at proxycheck.io.

Thanks for reading and stay safe!

v1 API has reached end of life

Today we're officially ending all support for our v1 API. We first announced we were doing this way back on March 17th 2018, and since that time we've been showing a notice within customer dashboards to anyone who has made recent v1 API calls.

Now to be clear, we're not removing the v1 API endpoint, but we are no longer guaranteeing that it will remain functional and available. In addition, all of the new features we've released over the past 9 months have only been accessible through our v2 API endpoint, and this will obviously continue to be the case as we roll out further new features.

As an example some highly requested features such as Custom Rules and CORS (Cross-Origin Resource Sharing) have only been accessible through our v2 API endpoint since they launched.

And so if you're among the 0.36% still calling our v1 API endpoint, now is the time to switch. To convey the urgency, we're changing the wording of our dashboard alert and the colour of the notice to make it more prominent. An example of this notice is included below.

We know these kinds of changes can be stressful when they're made without fair warning. This is why we spent the previous two years giving customers a lot of notice, and we're quite happy to see that most customers transitioned before today, 99.64% of you in fact. This is no doubt due to the many wonderful developers who've updated their integrations over the prior 24 months to utilise our current v2 API.

At present we do not have any plans to make another change of this type; we feel the v2 API format is very robust and extensible, allowing us to add new features without jeopardising backwards compatibility. So in short: update with confidence, we won't be doing another API format change any time soon.

Thanks for reading and have a great week.

A world of Caches

Probably the biggest obstacle to overcome when operating a popular API such as ours is having the hardware infrastructure and software architecture to handle the millions of requests per hour that our customers generate.

Careful selection of our hardware combined with our extensive software optimisations has resulted in our ability to operate one of the internet's most utilised APIs without turning to hyper-scale cloud hosting providers. That's important, as it has allowed us to remain one of the most affordable APIs in our space.

Today we wanted to talk about one very important part of our software architecture which is caching. We use caching not just for local data that is accessed often by our own servers but also at our content delivery network (CDN) to deliver repeatable responses.

During February we took an extensive look at all of our different levels of caching to see if further optimisations were possible, and we found that they were. We've also created a new feature we're calling Adaptive Cache, which by the time you read this will be enabled across all customer accounts for your benefit.

So before we get into the new feature, let's quickly detail the three key areas where we employ caching today.

Server-side Code Caching

When our own code is first interpreted and executed, the result is stored in memory on our servers as opcode, and those stored instructions are then re-run directly instead of going through our high-level code interpreter again.

This results in massive savings, both computationally and time-wise. In fact, if we didn't do this, a single request to our servers would take between 1.4 and 2 seconds instead of the 1ms to 7ms requests take currently.

Server-side Data Caching

Whenever you make a request to our servers and we have to access data from a database, or compute new information from data held in a database, we cache all of it: both the data we requested from the database and the computed answers you received.

This also dramatically increases performance, as database operations are much slower than accessing things stored in memory, and similarly it's much faster to retrieve a computed answer from memory than to compute it again from the raw elements. This is one way we're able to do real-time inference so quickly.

Remote CDN Caching

Whenever a request is made to our service, our CDN stores a copy of our response, and if the exact same request is made again (same API Key, IP Address being checked, flags etc.) the CDN simply re-serves that prior stored response, but only if both requests were made within the same 10 second window.

This is one of the most important levels of caching for our service when it comes to maximising the efficiency of our infrastructure because as you’ll see below we receive a lot of duplicate queries, mostly from customers not using client-side request caching.
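Conceptually, you can think of the CDN cache key as the request's identifying parts combined with a time bucket. A simplified sketch of the idea (our real CDN logic differs; the names below are purely illustrative):

```javascript
// Sketch: two requests share a cached response only when every identifying
// part matches AND both fall inside the same 10 second window.
function cacheKey(apiKey, ip, flags, nowMs, windowMs = 10000) {
  const bucket = Math.floor(nowMs / windowMs); // same bucket = same window
  return `${apiKey}|${ip}|${flags}|${bucket}`;
}

// Same customer, same IP, same flags, 3 seconds apart: one cache entry.
cacheKey("key-A", "203.0.113.7", "vpn=1&asn=1", 1000) ===
  cacheKey("key-A", "203.0.113.7", "vpn=1&asn=1", 4000); // → true

// Different customers never share entries, even for the same IP.
cacheKey("key-A", "203.0.113.7", "vpn=1", 1000) ===
  cacheKey("key-B", "203.0.113.7", "vpn=1", 1000); // → false
```

This also shows why the cache is customer-unique: the API key is part of the key, so one customer's cached response can never be served to another.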

So those are the three main ways in which we utilise caching: code, data and content. The main way we determine whether our caching is working well is by monitoring cache hit rates, which simply means: when the cache is asked for something, how often does it contain what we asked for?

Ideally you want the cache hit rate to be as high as possible, and now we'd like to share some numbers. First, code caching. This should be the type of caching with the highest hit rate, because our code doesn't change very often; at most a few source files are altered daily.

And as expected we have a cache hit rate of 99.66%. The 0.34% of misses are from seldom-accessed code files that execute only once every few hours or days.

For data, our hit rate is also quite high at 31.88%, as seen above. This is mostly due to us having servers with enormous pools of memory dedicated to caching. In fact, all our servers now have at minimum 32GB of memory, and we usually dedicate around a third of that to data caching (this is tuned per server to make the most of the hardware present at each node; for example, one of our nodes has 256GB of memory shared across two processors, and a larger cache is more appropriate there).

Finally, and perhaps most surprising to our readers, is our CDN cache hit rate. At 52.15% it's extremely high: for every two requests we receive, one of them is requesting data we already provided very recently (within the past 10 or 60 seconds, depending on certain factors).

The reason we say this is extremely high is that for an API like ours, which serves so many unique requests (literally millions every hour), it's odd that so many of the requests we receive are duplicates, especially when our CDN cache is customer-unique, meaning one customer will never receive a cached result generated by another customer's request.

So what causes this? It turns out many of our customers are calling the API multiple times with the exact same request due to a lack of client-side caching. The common scenario: a visitor comes to your website and you check their IP. They load a different page on your website and you check their IP again, because the first result was not saved locally and cannot be reused for the second page load. This generates two identical requests to our API, the first answered directly by our cluster and the second served from our CDN alone.

Now the good news is that the CDN we're using (CloudFlare) is built to take this kind of traffic, and since they have datacenters all over the world, getting a repeated answer from them is usually faster than getting it from our cluster. The other benefit is that it saves you queries: we do not count anything served purely from our CDN cache as a query, so they're essentially free.

And so that brings us to today's new feature, which we're calling Adaptive Cache. Prior to today we only cached requests made by registered users for a maximum of 10 seconds at our CDN. With Adaptive Cache, we now adjust the caching per customer dynamically, based on how many requests you're making per second and how many of those requests are repeatable. This will save you queries, and thus money, and will help us utilise our cluster more efficiently by answering more unique queries and spending less time handing out duplicate responses.

Essentially, if you make a lot of repeatable requests but some of them are spread too far apart to fit within the 10 second CDN cache window, we'll simply increase the window size so your cache hit rate becomes higher. But don't worry, we'll only adjust it between 10 seconds and 60 seconds.
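The clamping behaviour can be pictured with a toy heuristic. Our production logic is internal and more involved; the repeat-interval input and function name below are purely illustrative:

```javascript
// Sketch: grow the CDN cache window toward a customer's typical repeat
// interval, but never shrink below 10 seconds or grow beyond 60 seconds.
function adaptiveWindowSeconds(typicalRepeatIntervalSeconds) {
  return Math.min(60, Math.max(10, typicalRepeatIntervalSeconds));
}

adaptiveWindowSeconds(3);   // → 10 (never below the default window)
adaptiveWindowSeconds(25);  // → 25 (stretched to catch these repeats)
adaptiveWindowSeconds(300); // → 60 (capped so account changes propagate)
```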

It's completely transparent to users, and the system will always try to minimise the caching time so that changes you make in your account (whitelist/blacklist changes or rule changes, for example) are reflected by our API responses as quickly as possible.

And so that brings us to the end of what is a very long article on caching. If you want to optimise your own client that uses our API, we highly recommend adding some kind of local caching; even 30 to 60 seconds can save a considerable number of queries and make your application or service feel more responsive for your own users.
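Such a local cache can be very small. An illustrative sketch, with `check` standing in for however your client actually calls our API:

```javascript
// Sketch: memoise results client-side for a short TTL so repeat checks
// of the same address within the window never leave your application.
function cachedChecker(check, ttlMs) {
  const store = new Map();
  return (ip, nowMs) => {
    const hit = store.get(ip);
    if (hit && nowMs - hit.at < ttlMs) return hit.result; // served locally
    const result = check(ip); // only here does a real API query happen
    store.set(ip, { result, at: nowMs });
    return result;
  };
}

let apiCalls = 0;
const fakeCheck = (ip) => { apiCalls++; return { ip, proxy: "no" }; };
const check = cachedChecker(fakeCheck, 60000); // 60 second TTL

check("203.0.113.7", 0);     // real query
check("203.0.113.7", 30000); // within TTL: served from the local cache
check("203.0.113.7", 90000); // TTL expired: queries again
// apiCalls === 2
```

Passing the clock in explicitly keeps the sketch testable; in real code you would use Date.now() and an async check function.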

Thanks for reading and we hope everyone is having a great week!

Introducing Cross-Origin Resource Sharing!

Today we're introducing a new feature that we have been asked for frequently since our service began: the ability to query the API through client-side JavaScript in a web browser.

On the surface this feature may seem quite simple, just allowing the API to be queried through a web browser. But securing this system so that your API Key is never put in jeopardy, while maintaining the integrity of our service and keeping it easy to use, required some thought and engineering effort.

The way it works is simple: you go into your dashboard and click on the new CORS button. There you'll receive a new public API key intended to be used in client-side-only implementations of our API. Below that you'll find a field where you can enter the origin addresses for all your client-side requests to our API.

Client-side implementations use the same endpoint as our server-side API and just make use of your new public key. This lets our API know you're making a client-side request, which locks the API to checking only the requester's IP Address. It also tests the origin (domain name) of the request against the ones you entered into your dashboard.

All queries made this way accrue against your private API Key automatically and appear in your dashboard the same way server-side requests do. In fact, you can make both server-side and client-side requests to the API at the same time, giving you the flexibility to use the right implementation in the right place.

Since you're making requests to the same endpoint as server-side requests you get access to all the same features. You can use all our query flags like normal and gain access to all the same data such as location data, service provider information and more. It even supports your custom rules automatically.
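A client-side call along these lines might look like the sketch below. The URL shape, the key parameter name and the placeholder key are assumptions for illustration; the examples in your dashboard are authoritative. Because a public key locks the check to the requester's own address, no IP appears in the URL here:

```javascript
// Sketch: build the request URL for a client-side (public key) check.
function buildUrl(publicKey, flags) {
  return `https://proxycheck.io/v2/?key=${encodeURIComponent(publicKey)}&${flags}`;
}

// Sketch: fetch the requester's own result from the browser. The response
// format is the same as server-side calls, including custom rules.
async function checkMyself(publicKey) {
  const response = await fetch(buildUrl(publicKey, "vpn=1&asn=1"));
  return response.json();
}

buildUrl("public-111111-222222", "vpn=1");
// → "https://proxycheck.io/v2/?key=public-111111-222222&vpn=1"
```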

So what does change when using the client-side request method? The main thing is a downgrade in security: if you choose to block a website visitor using only JavaScript, it's possible for the visitor to disable JavaScript or modify the script on the page to circumvent the block.

And so if unwavering security is something you require, the server-side implementation is still the way to go and remains our recommended way to use our API. But if you have a website that doesn't make it easy to integrate a server-side call to our API, or you lack the expertise to perform such an implementation, our client-side option may be appropriate.

To make it as easy as possible to utilise the client-side method we've written some simple JavaScript examples for both blocking a visitor and redirecting a visitor to another webpage. You'll find both examples within your dashboard under the new CORS tab, an example of which is shown below.

The last thing we wanted to discuss was origin limits. Several years ago we added an FAQ to our pricing page containing a question about website use limits which we've quoted below.

Do I need to purchase a plan for each individual website I want to protect?

No, you can simply purchase one plan and then use your API Key for every website you own. This applies to both our free and paid plan account holders.

We know how frustrating it is when you sign up for a service and they apply arbitrary limits. No one wants to sign up for multiple accounts, and we've never wanted to push complex multi-key management or licensing on our customers. That is why our new CORS feature has no origin limits: simply add as many origins to your dashboard as you need.

If you visit your Dashboard right now you'll find the new CORS feature is live and ready to be used. We do consider this feature beta, so you may come across some minor bugs; we welcome you to report those to us using the contact page on our website.

Thanks for reading, we hope you'll find the new client-side way to query our API useful and have a great weekend!

