Upgrades to the v2 API we're working on

Over the past couple of weeks we've been working on upgrading the v2 API in a few specific ways.

  1. Increasing result speed by lowering query latency for both single and multi-queries.
  2. Giving you more control over the resolution of your queries.
  3. Giving you more information with your positive detection results.

1. Increasing Speed

To increase speed we've attacked the problem from several angles. Most of our queries are singular, so our goal was to dramatically decrease the time taken to process a single query. Multi-checks are already quite fast thanks to the multi-stage cache priming that happens while the first IP Address is processed, which makes every subsequent address very fast to process.

So the question became: how do we get the same performance benefits that those primed caches deliver when we're only processing a single query? We think we've accomplished that by creating a new process on our servers that holds all of our proxy data in its most efficient format.

We had already been using RAMDISKs to hold data, but the data wasn't stored in its most efficient format and the RAMDISK carried a lot of overhead for features we don't use. Our new custom process does away with all of that and is designed specifically for our use case.

So what's the result of all this? Well, in testing we've been able to reduce single query latency to around 1ms before network overhead. That's with VPN checks and the real-time inference engine turned off; improving the performance of both is something we'll be working on later.

We're still testing and benchmarking these changes, but so far it looks promising that we'll be able to switch over to the new code soon.

2. More Control

The second thing we've been working on is giving you more control over the resolution of your queries. Put simply, you can specify, in days, how close to the present time you want your results restricted to. For example, you could ask for only proxies seen operating within the past 3 days, as opposed to the past 7-15 days (the API's default maximum).

This new feature has been implemented as a flag called &day=#, making it very easy to use. We're going to allow a resolution scale from 1 day to 60 days, giving you great flexibility between very conservative detection, which may miss some proxies, and very liberal detection, which may present some false positives.
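
As a hypothetical example (YOUR_API_KEY being a placeholder for your own key), a query restricted to proxies seen operating within the past 3 days would look something like this:

https://proxycheck.io/v2/1.10.176.179?key=YOUR_API_KEY&day=3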

3. More Information

This is a combination of a few features. Firstly, because we're allowing you to specify the resolution of your detection results in days, we're also going to give you the ability to see the last time we saw an IP Address operating as a proxy server. You'll be able to activate this feature by supplying the flag &seen=1 with your queries, and we'll display the result in both a human-readable "x time ago" format and a UNIX timestamp.

The other feature we're adding is the ability to view port numbers. This has been requested more times than we can count, but it isn't something we've wanted to expose on the API because, frankly, it serves no security benefit. However, given how easy it is to scan an IP Address and discover its running servers, we've decided to implement the feature based on customer feedback. To activate it, you'll be able to supply &port=1 as a query flag.

Below we've included an example query result from the new API version with both the &seen=1 and &port=1 flags supplied.
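
For illustration, the request producing this result would look something like the following (YOUR_API_KEY again being a placeholder):

https://proxycheck.io/v2/1.10.176.179?key=YOUR_API_KEY&seen=1&port=1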

{
    "node": "PROMETHEUS",
    "1.10.176.179": {
        "proxy": "yes",
        "type": "SOCKS",
        "port": "8080",
        "last seen human": "5 hours ago",
        "last seen unix": "1520297212"
    },
    "query time": "0.001s"
}

When are these changes coming?

We're still validating the new API. The changes we've made essentially constitute a brain transplant: almost all the code that actually performs the checking associated with a query has been gutted and replaced with a new, more efficient system built around our new on-server caching process.

It's our hope to have the new v2 API updated later this month, and we may even back-port parts of it to v1, which the majority of our customers are still using. We're also going to be overhauling the API Documentation page, as these changes have grown the available flags considerably, necessitating a redesign of the page layout.

Thanks for reading. We hope everyone is excited to try the new API; we're certainly excited to get it finished for you all to enjoy.


Improved Statistics

Today we've rolled out an update to our statistics gathering and processing code with some great improvements.

The first change reduces by 50% the individual pieces of customer stat information (for your dashboard stats page) that we need to sync between our server nodes, while retaining the same information as before; in effect, we have eliminated redundant stats. This change will have a measurable effect on how quickly your stats update within your dashboard.

The second change is that we're now performing query refunds when you send us multiple addresses to be checked and the query volume goes over the allotted 1,000, or the 90 second processing time window is exhausted.

So for example, if you send us 1,500 addresses to be processed, 1,500 queries will initially be added to your account's query total. But if we were only able to process 700 of those addresses, your account will receive a refund of the remaining 800 queries that we didn't process. The refund is almost immediate.

We feel this change (refunding non-processed queries) is fairer to our customers: it doesn't cost us much computationally to not process data sent to our servers, and it allows you to re-send that data for processing without worrying about query overages.

Also, before we made this change we never charged for invalid addresses; these new changes affect only multi-check queries that go above our allowance of 1,000 addresses per query, or queries which exhaust the 90 second processing time window.

During these upgrades we also discovered and corrected a bug which under-reported the VPN, Proxy and Refused query breakdowns displayed within your dashboard when you made use of the multi-IP checking feature (both via the API and our Web Interface). This bug only ever affected our v2 API endpoint, and only when performing multiple checks in a single query.

Thanks for reading and we hope everyone had a great weekend.


Web Interface now supports ASN lookups

Today we've enhanced the Web Interface page with a new checkbox to enable ASN lookups on queries. This new feature shows you the ASN name, provider and country for the IP Addresses you're querying, displayed in a subtle hover tooltip as shown below.

We know you'll find this addition to the web interface useful; we ourselves make many manual queries to the API to get country and provider information for IP Addresses, so this is a feature we'll be making heavy use of too.


ASN Lookup Performance Improvements

Around a week ago we changed the ASN lookup code on both our v1 and v2 API endpoints with the goal of speeding up these queries.

Due to the way ASN data has to be computed before it can be searched effectively, ASN lookups have always been resource intensive to perform. With our new code we're now pre-processing all ASN data into its most easily searched representation before any queries are made by our customers.

We've also changed how we're caching ASN data, meaning repeated lookups, even hours apart, will have a much higher cache hit rate. All of our API caching functions are custom made and highly tuned, utilising system memory.

The result of all this work has been significantly faster ASN results: we're now seeing around 80% of our ASN queries answered in under 55ms, where previously 99% of ASN queries took over 300ms. We're also seeing a 30% query cache hit rate for our new level-1 ASN cache (IP to ASN) and a 70% hit rate for our level-2 ASN cache (ASN to company and country).
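
To illustrate the two levels, below is a minimal sketch in PHP. The helpers lookup_asn_for_ip() and lookup_asn_details() are hypothetical stand-ins for the expensive searches; this isn't our actual server code, just the shape of the level-1 / level-2 split.

function cached_asn_query($ip, &$l1_cache, &$l2_cache) {
    // Level 1: IP to ASN. A hit here skips the costly IP-range search.
    if (!isset($l1_cache[$ip])) {
        $l1_cache[$ip] = lookup_asn_for_ip($ip); // hypothetical, e.g. returns "AS45758"
    }
    $asn = $l1_cache[$ip];

    // Level 2: ASN to company and country. Many IPs map to one ASN,
    // which is why this cache sees the higher hit rate.
    if (!isset($l2_cache[$asn])) {
        $l2_cache[$asn] = lookup_asn_details($asn); // hypothetical, returns provider and country
    }

    return array_merge(array("asn" => $asn), $l2_cache[$asn]);
}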

Only around 4% of the queries we process every day utilise the ASN flag, likely due to either the added latency it creates or the customer simply not needing to perform ASN lookups. So whilst this is a small area of our business, it was important for us to improve its performance, which we have now done.

Another benefit of this change is lower CPU utilisation on our cluster nodes, which will allow us to answer all queries even faster than before and increase the volume of queries each node can process.

Thanks for reading and have a great day!


A quick note on Meltdown and Spectre

Recently we've had customers enquire about our operational security and how we safeguard customer data. Sharing with you our GDPR compliance earlier this month answered some of the questions about how we store and process data while respecting data privacy laws.

But another question raised has been if, and how, we are affected by the Meltdown and Spectre vulnerabilities. As a website that accepts sensitive customer information, including login credentials, payment information and IP Address data, there is potential for user information to be read from our servers' operating memory if we used shared computing resources that hadn't been patched against Meltdown and Spectre.

Whilst the processors we use are affected by these bugs (as are all modern processors to some degree), our infrastructure as a whole is not, because we do not use shared computing resources and never have for our cluster architecture. We have only ever used shared resources for honeypots, which have never and will never contain customer information or any other potentially sensitive information.

As part of our GDPR compliance we are bound to continuously evaluate potential threats to our infrastructure. So as soon as the Meltdown and Spectre news broke, we read all of the information available, starting with the Linux kernel patches and comments from AMD, right through to the disclosures from the researchers who discovered the processor flaws, Google's announcement and Intel's press release.

We came to the conclusion almost immediately that our infrastructure was not in danger, due to our use of bare metal servers that we either purchased, built and deployed ourselves or rent from major data centers. For these attacks to work, an adversary would need to be running malicious code on your server or sharing a physical server with you.

To sum up: we were aware of Meltdown and Spectre on the very first day the news broke, performed a risk assessment immediately, and determined we were not affected thanks to decisions made while building our infrastructure.

Thanks for reading and have a great weekend.


General Data Protection Regulation Compliance

On the 25th of May 2018 a new piece of positive regulation called the General Data Protection Regulation (GDPR) comes into force within the European Union, to help protect EU citizens' sensitive and non-sensitive personal information online.

Under this regulation proxycheck.io will be classed as a data-processor and our customers which send data to us to be processed will be classed as data-controllers. As we have customers ourselves we will also be classed as a data-controller regarding the data we store about our own customers.

Because there are now specifically defined ways in which we can receive, handle, process and pass on the data given to us by our customers, we felt it prudent to create a document explaining how we are GDPR compliant. We believe we meet all the requirements set forth by the GDPR and have been meeting them for over a year.

We have included links to our GDPR compliance document at the bottom of every page of our website, and also next to the registration button on both our homepage and pricing page, so that everyone can find the document easily.

I'd like to reiterate that we're proud to be compliant. We think the GDPR is a great piece of regulation that we hope will strengthen consumer privacy and stop what we see as rampant over-collection and misuse of personal data.

You can view our GDPR compliance document here.

Thank you.


Saying goodbye to Skype

When we began offering customer support 16 months ago we started with Skype, iMessage and Email. Later on we added live chat support, and since then we have seen an explosion in live web chats. In fact, most of our real-time interactions with customers have been through the live chat on our website.

Skype was once a very well utilised platform, but the numbers are telling us it's not worth our time to keep supporting it for our business; we should instead focus on our live chat, email and iMessage support, all of which can be easily accessed by our customers with no annoyances.

In fact, you receive the same level of service through all three support channels, meaning we're able to help you with payments, account related issues and service questions however you want to contact us. That's a marked difference from many other sites, where live chat is manned by sales staff who can't assist with account related matters.

Given how few customers took advantage of our Skype support, we know this won't affect many of you. We're continually evaluating the best way to offer support to our customers, and we're confident the options we currently provide will satisfy everyone.

Thanks for reading and have a great weekend.


ASN flag issue on our HELIOS node has been resolved

Earlier today we became aware of an issue with ASN data lookups on our HELIOS node affecting both v1 and v2 API endpoints. We have since corrected the problem and ASN data from HELIOS is now working correctly.

Neither our ATLAS nor our PROMETHEUS nodes were affected. Thank you.


v2 API Improvements & Web Interface Refresh

Since we debuted the v2 API on January 1st we've been collecting a lot of data about how our customers are using it. One of the obvious use cases has been multi-checking: we've seen a huge volume of customers making use of our multi-checking functionality, especially our largest customers, who have lots of prior data they wish to check or re-check.

To help facilitate this use case we have increased the volume of IP Addresses that can be checked in a single query from 100 to 1,000. We've also added a new timing feature: if your query reaches 90 seconds of processing it will stop and show you which IP Addresses have and have not been processed up to that point, ensuring the query doesn't time out.

So if you provision your software for the time-exhausted response "Not processed. Reached 90 second processing time limit." you can re-send, in a new query, only the addresses that didn't get checked the first time. Depending on the flags you're using it is possible to check close to 1,000 addresses within the 90 second window, so we know this will be a useful addition to the API.
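
As a rough sketch of that provisioning in PHP (we're assuming here that the unprocessed entries carry the message above within their value; see the API Documentation for the exact response shape):

$decoded = json_decode($response_body, true); // $response_body holds the raw API reply
$retry = array();

foreach ($decoded as $key => $value) {
    if ($key === "node" || $key === "query time") { continue; } // metadata, not addresses
    // Collect any address whose entry contains the time-limit message.
    if (strpos(json_encode($value), "Reached 90 second processing time limit") !== false) {
        $retry[] = $key;
    }
}

// $retry now holds only the unchecked addresses, ready to re-send in a new query.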

In addition to these changes we've also remade the results interface on our Web Interface page (which now also goes up to 1,000 checks per query, or 90 seconds of query time, whichever comes first). We've made these changes so that it's easier to view large data sets; the previous UI was great up to a certain number of results, but once you're dealing with hundreds of addresses it could become cumbersome to navigate. Below is an example of the new interface with collapsible results.

Under each section, if you're receiving 90 results or fewer it will automatically be expanded; if there are more than 90 results it will be collapsed, enabling you to get an overview of all the kinds of results you've received before you choose to drill down into a specific result type.

We hope you enjoy these changes; they are all live as of this blog post.


Introducing our new v2 API

Over the past couple of months we've been working hard on a new version of our Proxy Checking API, and today we're proud to launch it officially and welcome you to try it out.

You may remember that last month we shared with you a new API endpoint called v1b. At that time we weren't quite sure if we should launch the new API under the /v1/ endpoint, because it created a lot of code bloat to make the new API compatible with our old result format whilst still supporting the new features we've included.

New features such as:

  • Properly formatted URL strings, so you can supply [ipaddress]?key= instead of [ipaddress]&key=.
  • The ability to send your IP Address to be checked using the POST method instead of just GET.
  • The ability to check up to 100 IP Addresses in a single query using the GET or POST methods (see the sketch after this list).
  • The ability to disable real-time inference checking with a new inf flag.
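
For example, a multi-check sent via POST could look something like the sketch below; the "ips" field name is an assumption for illustration, with the API Documentation page holding the definitive parameter names.

$addresses = array("1.10.176.179", "8.8.8.8");

$ch = curl_init("https://proxycheck.io/v2/?key=YOUR_API_KEY");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "ips=" . implode(",", $addresses)); // field name assumed
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$response = curl_exec($ch);
curl_close($ch);

$decoded = json_decode($response); // each result is nested under its IP Address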

v2 has all of these, and to support the multi-checking feature we've had to alter our result format so that the proxy declarations are nested under the IP Addresses. With v1b we did create a compatibility layer so that checking a single IP Address presented the old format, but this created code bloat and frankly we didn't feel it was a good trade-off.

So instead we're going to maintain our /v1/ endpoint for a long while (probably until 2020-2022, depending on usage), and going forward we will be presenting /v2/ as our main API endpoint.

If you're worried about how this change will affect your code, don't be. The API is still very easy to query; in fact, in our own PHP Function (available on GitHub) we were able to upgrade with just two changes: the URL we were querying and the JSON conditional statement.

Essentially this: if ( $Decoded_JSON->proxy == "yes" && $Decoded_JSON->ip == $Visitor_IP ) {

Became this: if ( $Decoded_JSON->$Visitor_IP->proxy == "yes" ) {
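
Put together, a minimal v2 check looks something like this (a sketch rather than the full GitHub function, with YOUR_API_KEY as a placeholder):

$Visitor_IP = "1.10.176.179";

$raw = file_get_contents("https://proxycheck.io/v2/" . $Visitor_IP . "?key=YOUR_API_KEY");
$Decoded_JSON = json_decode($raw);

// In v2 the proxy declaration is nested under the address itself.
if ($Decoded_JSON->$Visitor_IP->proxy == "yes") {
    // Treat this visitor as a proxy.
}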

It's just that simple. Again, you do not need to rush to change your API over to v2; we will be supporting the v1 endpoint for many years yet. But if you want to be on the latest and greatest, you're more than welcome to do so.

You'll find that we've updated our API Documentation page to include the new /v2/ endpoint, and we've also spruced up the page's appearance. We hope you enjoy the new look.

We've got a lot of things coming in 2018 and this is the foundation on which we'll be building them. Thank you everyone for reading and we wish you a Happy New Year!

