# Privacy

As described in our POPIA guide, we are exposed to personal data in the form of bank statements. This page specifies our privacy policy, which defines how we handle these bank statements and how long they are retained.

This policy forms part of our POPIA compliance specifications. Clients may direct their end-users to this guide so that the data subject (the owner of the bank statement) can provide consent for their bank statements to be processed by Spike.

This is a full and accurate enumeration of all circumstances under which we retain personal data. All processes to adhere to this policy are automated. We do not deviate from this policy.

# Principles

The guiding principle of POPIA is that operators and responsible parties should only retain personal data where this supports the intended purpose under which the user volunteered the data. In our case this means that we can only retain bank statements where we can justify that this retention helps to improve the provision and quality of the Spike service.

Our goal is to minimize the amount of personal data that we retain in order to reduce the quantity of data that may be exposed in the event of a breach. Our objective is that no data should be retained for longer than a week. In practice a very small subset of data may be retained for longer durations. Each such case is clearly described below and in each case our only reason for doing so is in order to ensure the ongoing provision and quality of the Spike service.

# What we retain and why

The table below summarises each rule, which is described in its own section:

| Enumeration | Rule name |
| --- | --- |
| 1 | Error queue |
| 2 | Statement library |
| 3 | Most recently processed queue |

# Error queue

Occasionally, pdfs are submitted to us that do not process correctly. There can be many reasons for this (see our errors guide). When an error occurs, the pdf is retained in an error queue until we have had a chance to review and fix the error. Any pdf in the error queue is automatically deleted from the queue after 1 week.
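The one-week deletion rule can be illustrated with a minimal sketch. This is not our production code: the `QueuedPdf` record, its field names, and the `purge_expired` function are hypothetical names chosen for illustration, and the sketch assumes each queued pdf carries the time at which it entered the error queue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Retention window stated in this policy: one week.
RETENTION = timedelta(weeks=1)


@dataclass
class QueuedPdf:
    pdf_id: str          # hypothetical identifier for the stored pdf
    queued_at: datetime  # when the pdf entered the error queue


def purge_expired(queue, now):
    """Return only the entries that are still inside the retention window.

    Entries that have been queued for one week or longer are dropped.
    """
    return [item for item in queue if now - item.queued_at < RETENTION]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    queue = [
        QueuedPdf("statement-a", now - timedelta(days=8)),
        QueuedPdf("statement-b", now - timedelta(days=2)),
    ]
    remaining = purge_expired(queue, now)
    print([item.pdf_id for item in remaining])
```

Running a purge like this on a schedule means deletion depends only on the age of the pdf, not on whether the error has been reviewed yet.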

# Statement library

Our statement library documents all of the statement templates which we currently support.

In order to build this library we need to retain a representative statement. We typically use the first statement from the error queue where we first discovered the new statement template.

Occasionally a pdf for which we have already developed a statement parser may fail to process as a result of something unique to that individual pdf. Again, there can be many reasons for this, but typically it will be things like: language (Afrikaans statements when we currently only support English), footnotes / marketing information (e.g. COVID notices, notes on interest rate changes, and other information that is subject to change), or changes in layout (like column widths). We call these unusual conditions "edge-cases", and we may need to retain a statement that exhibits an edge-case for future back-testing of our code.

To summarise, the following information is retained in our statement library:

  1. a redacted image[1] which is displayed in our statement library
  2. a representative pdf which is retained in our internal software development repository for back-testing of our code
  3. an edge-case pdf which is retained in our internal software development repository for back-testing of our code

# Most recently processed queue

When modifying our code we may introduce bugs which prevent existing statement parsers from functioning correctly. This can be subtle - e.g. the parser may work for the majority of the pdfs in a statement template but break on a handful of edge-cases. To deliver a high-quality service we need to back-test each parser against a set of pdfs to confirm that no bugs have crept in, and this back-testing requires a sufficient sample of pdfs to test against.

For this reason we retain a queue of the most-recently processed pdfs in each statement template. This queue operates in a first-in-first-out manner where the oldest pdfs are continually ejected to make room for newer statements. In this manner the queue is constantly refreshed and pdfs are not retained indefinitely.
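The first-in-first-out behaviour described above can be sketched as follows. This is an illustration under stated assumptions rather than our actual implementation: the `RecentQueues` class and its method names are hypothetical, and the capacity of 3 is a placeholder, since this policy does not state the queue size.

```python
from collections import defaultdict, deque

# Hypothetical capacity: the policy does not state the real queue size.
QUEUE_SIZE = 3


class RecentQueues:
    """One bounded first-in-first-out queue per statement template."""

    def __init__(self, size=QUEUE_SIZE):
        if size < 1:
            raise ValueError("size must be at least 1")
        # deque(maxlen=...) discards the oldest entry when a new one
        # is appended to a full queue.
        self._queues = defaultdict(lambda: deque(maxlen=size))

    def add(self, template, pdf_id):
        """Record a processed pdf and return the id ejected to make room.

        Returns None when the queue was not yet full.
        """
        queue = self._queues[template]
        ejected = queue[0] if len(queue) == queue.maxlen else None
        queue.append(pdf_id)
        return ejected

    def contents(self, template):
        """Return the retained pdf ids for a template, oldest first."""
        return list(self._queues[template])


if __name__ == "__main__":
    queues = RecentQueues()
    for pdf_id in ["a", "b", "c", "d"]:
        ejected = queues.add("example-template", pdf_id)
        if ejected is not None:
            print("delete", ejected)
    print(queues.contents("example-template"))
```

Returning the ejected id lets the caller delete the underlying pdf at the moment it leaves the queue, so retention is bounded by the queue size rather than by a separate clean-up step.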

# Changes to privacy policy

If we need to change these policies, we will notify all clients timeously via email before any changes are enacted. The Updated date on this page indicates when the policy was last modified.


  1. Clients may request that we further obscure the image if any data is insufficiently hidden.

Updated: 7/21/2021, 10:50:11 AM