As a society we’re currently living through two monumental technology shifts which impact your data and privacy more than ever.
Huge amounts of data and services are being migrated to cloud environments for scalability and cost-efficiency. The key questions to ask yourself are:
- Can you trust the cloud provider to honour their data usage policy?
- Does the cloud provider or you hold the private encryption keys that secure your data?
- If the company is in charge of securing your data, will they hand over encryption keys (and therefore your data) as soon as the government tells them to?
If your data is compromised on a cloud provider by hackers (which happens every year), will you know? If hackers sell or post your username or password on the Darknet, will they have access to other accounts you own that use the same password?
A key threat of the cloud transition is the loss of data protection.
As we consume more and more online services, the amount of data, metadata and personal identifying information about us increases.
Even when companies promise not to store direct personal data about you, the metadata trail you leave behind when you access their services or browse the internet is enough to piece together exactly who you are — with trivial effort.
Machine learning pipelines coupled with database platforms can efficiently store and parse huge amounts of metadata like never before. It has never been easier for companies to not only identify who you are, but monetize your internet usage with product recommendations and advertisements tailored to you.
Metadata includes things like your IP address, timestamps of website visits, where you hover your mouse on websites, HTTP header information (OS, browser, time zone etc), your search history, cookies and much more.
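To make the linkage risk concrete, here is a minimal sketch of how a handful of the fields above could be combined into a stable identifier. The field names and values are hypothetical, and real tracking systems use many more signals; this only illustrates the idea that the same device keeps producing the same fingerprint without any name or login attached.

```python
import hashlib


def fingerprint(metadata: dict) -> str:
    """Combine metadata fields into one short, stable identifier."""
    # Sort the keys so the same fields always produce the same string.
    canonical = "|".join(f"{key}={metadata[key]}" for key in sorted(metadata))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values a server could read from a single request.
visit_monday = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "accept_language": "en-GB,en;q=0.9",
    "time_zone": "Europe/Dublin",
    "screen": "2560x1440",
}
visit_friday = dict(visit_monday)  # same device, different day

# The two visits map to the same identifier, so they can be linked.
print(fingerprint(visit_monday) == fingerprint(visit_friday))  # True
```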
But this only covers the application of machine learning to traditional metadata. These days there are far more complex forms of metadata that can be harvested.
Machine learning on social media platforms like Facebook can learn from your photos what you look like. Furthermore, did you scrub the GPS metadata from your photo before posting? Apple Photos and Google Photos also use machine learning to recognize you, your friends and objects within your pictures. Even if they promise to store the learnt models only on your private device, do you really trust that?
After the Snowden leak, it’s not too hard to imagine a time in the future where the government might silently force Apple, Facebook and Google to collect and hand over learnt photo models about you to feed mass CCTV surveillance systems like in China. Perhaps the government will just scrape the data themselves from publicly available sources. Perhaps they already have this data.
A key threat of the machine learning transition is data misuse.
“I need privacy, not because my actions are questionable, but because your judgement and intentions are.”
“Privacy is the right to choose what information I share and with whom I share it.” [1]
Machine learning is a double-edged sword in terms of privacy; it can be both a weapon against privacy and a weapon for it. As mentioned previously when I reviewed Synthetic Data Tools and Models, I think machine learning will revolutionize how we share data as well [2].
Every browser review site has a different opinion about the best overall browser. But when looking at privacy, Brave consistently scores at the top of most rankings. It’s also one of the highest performers overall thanks to being built on the same open-source Chromium codebase that Microsoft Edge and Google Chrome are built on.
A study published at Trinity College, Dublin in February 2020 [3] ranked Brave as the best browser for privacy from a backend connections perspective. The number of connections a browser makes with backend servers is a good proxy for how private your browsing experience is. The more connections the browser makes to backend servers, the more data it is transmitting about your current session. The study measured the number of connections to backend servers made by 6 different browsers: Google Chrome, Mozilla Firefox, Apple Safari, Brave Browser, Microsoft Edge and Yandex Browser.
“For Brave with its default settings we did not find any use of identifiers allowing tracking of IP address over time, and no sharing of the details of web pages visited with backend servers. Chrome, Firefox and Safari all share details of web pages visited with backend servers…Microsoft Edge and Yandex are qualitatively different from the other browsers studied. Both send persistent identifiers than can be used to link requests (and associated IP address/location) to back end servers. Edge also sends the hardware UUID of the device to Microsoft and Yandex similarly transmits a hashed hardware identifier to back end servers” [3]
An email alias hides your real email address. How it works: you provide the email alias to someone and all emails sent to the alias will be forwarded to your real email address. There are 3 privacy benefits to this:
Prevent service linking
Using the same email address for multiple services allows a company or hacker to easily link multiple services you access. With an alias you can make it harder to do this. Furthermore, some alias services allow you to create randomised strings or link custom domains you own.
E.g. 94960540-f914-42e0-9c50-6faa7a385384@<alias_provider_domain> or <anything_you_want>@<your_domain>
Prevent multiple services being compromised
If an attacker has compromised one of your passwords and you use that password on multiple sites with the same email, you’ve just made the attacker’s job a lot easier. With aliases, there’s no guarantee that an attacker who has your password will also know the email you registered with on another site.
Prevent spam
If a service you signed up to starts selling your email to third parties like advertisers, your inbox can quickly become full of spam. If you had used an alias instead, you could quickly identify which service leaked your address and, with the click of a button, disable that alias so that emails sent to it are no longer forwarded to your real email address.
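The three benefits above all come from keeping one alias per service. Here is a rough sketch of that bookkeeping; the `AliasBook` class and the `alias.example` domain are made up for illustration and are not any provider's real API.

```python
import uuid

ALIAS_DOMAIN = "alias.example"  # hypothetical alias provider domain


class AliasBook:
    """Keeps one random alias per service, all forwarding to one real inbox."""

    def __init__(self) -> None:
        self._service_by_alias = {}
        self._disabled = set()

    def create(self, service: str) -> str:
        # A random UUID reveals nothing about you or your other accounts.
        alias = f"{uuid.uuid4()}@{ALIAS_DOMAIN}"
        self._service_by_alias[alias] = service
        return alias

    def source_of(self, alias: str) -> str:
        # Spam arriving at an alias tells you which service leaked it.
        return self._service_by_alias[alias]

    def disable(self, alias: str) -> None:
        self._disabled.add(alias)

    def should_forward(self, alias: str) -> bool:
        return alias in self._service_by_alias and alias not in self._disabled


book = AliasBook()
shop_alias = book.create("online-shop")
forum_alias = book.create("forum")

print(book.source_of(shop_alias))       # online-shop
book.disable(shop_alias)                # stop the spam from that one service
print(book.should_forward(shop_alias))  # False
print(book.should_forward(forum_alias)) # True
```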
Recommendations
There are many email alias services out there.
Personally I like open-source https://anonaddy.com/ but have not tried others yet.
To see if your email address has ever been associated with a data breach, you can enter it here to find out: https://haveibeenpwned.com/
You should use a different password on every website you log in to, so that one breach cannot compromise multiple services at once. This tactic is also known as reducing your attack surface. Furthermore, you should make the password as long and complicated as the service allows. Using a good password manager helps you achieve this. As quantum computers become more stable, why risk the possibility that someone, now or in the future, could brute-force the hashed password? Not all providers even salt their password hashes. Read about password hashing and salting here [4].
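For a feel of what salting buys you, here is a minimal sketch using only Python's standard library. The iteration count, salt size and password length are my own assumptions rather than anything a particular provider uses, and the helper names are invented for this example.

```python
import hashlib
import hmac
import secrets
import string
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; higher is slower to brute-force


def generate_password(length: int = 24) -> str:
    """Random password drawn from letters, digits and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)  # constant-time comparison


password = generate_password()
salt_a, digest_a = hash_password(password)
salt_b, digest_b = hash_password(password)

# Same password, different salts: the stored hashes do not match, so an
# attacker cannot reuse one precomputed table across all accounts.
print(digest_a != digest_b)                         # True
print(verify_password(password, salt_a, digest_a))  # True
```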
Recommendations
I personally don’t recommend using a closed-source password manager. There is no way to know if its source code is well written to protect one of the most important data protection assets you have: your passwords.
Personally I like open-source https://keepass.info/ or https://bitwarden.com/. They are tried and tested the world over.
To see if your password has ever been associated with a data breach, you can enter it here to find out: https://haveibeenpwned.com/Passwords
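If typing a password into a website makes you uneasy, the same service also offers a range lookup where only the first five characters of the password's SHA-1 hash leave your machine. The sketch below reflects my understanding of that public endpoint (`api.pwnedpasswords.com/range/<prefix>`, returning `SUFFIX:COUNT` lines); check the service's own documentation before relying on it. The function names and the User-Agent string are made up for this example.

```python
import hashlib
import urllib.request
from typing import Tuple


def split_sha1(password: str) -> Tuple[str, str]:
    """Return the (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def fetch_range(prefix: str) -> str:
    """Download every known hash suffix that shares this prefix."""
    request = urllib.request.Request(
        f"https://api.pwnedpasswords.com/range/{prefix}",
        headers={"User-Agent": "privacy-article-example"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def breach_count(suffix: str, range_response: str) -> int:
    """Look for our suffix locally in the 'SUFFIX:COUNT' lines."""
    for line in range_response.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def times_seen_in_breaches(password: str) -> int:
    prefix, suffix = split_sha1(password)
    return breach_count(suffix, fetch_range(prefix))
```

Calling `times_seen_in_breaches("some password")` needs network access; the full password and its full hash never leave your computer, only the five-character prefix does.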
There are basically 3 ways (factors) that can be used to authenticate someone to a service.
- Using something you know, like a password
- Using something you have, like a phone
- Using something you are, like a fingerprint or eye scan
Two-factor authentication (2FA) combines two of these factors: in addition to giving your password, 2FA requires an additional authentication step to prove who you are, such as something you have or something you are. 2FA prevents an attacker from compromising your account even if they have your password.
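The six-digit codes that most authenticator apps show are the "something you have" factor: your phone holds a shared secret and derives a short-lived code from it. Here is a minimal sketch of that time-based one-time password scheme (TOTP, RFC 6238, built on HOTP from RFC 4226) using only the standard library. Real apps usually receive the secret base32-encoded in a QR code, which I skip here; the secret below is the RFC's published test value, not something to use for real.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    message = struct.pack(">Q", counter)  # 8-byte big-endian counter
    mac = hmac.new(secret, message, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None, step: int = 30,
         digits: int = 6) -> str:
    """Time-based one-time password: HOTP over the current 30-second window."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


secret = b"12345678901234567890"  # RFC test secret, for illustration only
print(hotp(secret, 0))        # 755224
print(totp(secret, now=59))   # 287082
```

The server runs the same calculation with its copy of the secret, so a stolen password alone is not enough to log in.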
Recommendations
There are multiple companies offering you the ability to link their 2FA system to the service you’re using. Google and Microsoft both have their own 2FA apps.
Personally I like to stay open-source, so I like using the 2FA functionality built into https://bitwarden.com/.
One of the best ways to prevent credit card fraud when shopping online (besides ensuring you’re connected using HTTPS and the domain doesn’t look suspicious) is to use a virtual credit card. It’s essentially an “alias” for your real account. Disposable virtual cards use temporary numbers that are only used for a single transaction. After that transaction the card number is destroyed or automatically changed. In the event of a data breach, a stolen virtual card number is completely useless to an attacker, since it was invalidated the moment the original transaction completed.
Recommendations
I like https://www.revolut.com/.
Other reputable services [5] are:
American Express Go
Capital One ENO
Citibank Virtual Account numbers
Don’t be fooled by WhatsApp touting end-to-end encryption. It previously stored your messages unencrypted, in plain text, both on your phone and in the cloud. That isn’t too comforting given the FBI v Apple showdown in 2016, where the FBI demonstrated it could essentially hack into iPhones without Apple’s help. The biggest reason for using Signal is that it was built from the ground up with privacy in mind. It’s also open-source and peer-reviewed by many of the world’s cyber security experts and auditors. Only last year WhatsApp was compromised [6] by spyware, created by an Israeli company, that infects phones and takes over the operating system.
Trusting your most important data to a cloud storage provider is really a game of Russian roulette, especially if you use multiple storage services. Either the cloud storage provider eventually oversteps their data use policy or there could be a data breach down the road. Take action now and, at the least, encrypt your most sensitive data before uploading to a cloud provider.
Recommendations
Again, multiple services exist, each with varying ease of use, cloud integration and cross-platform support.
For open source: https://cryptomator.org/
Other popular data encryptors [7]:
https://www.encryptedcloud.com/
It goes without saying that Google is essentially the internet these days. Even websites that are not Google are likely using a plethora of Google services in the backend to serve their content, track you or use Google’s captcha service. With a history of constantly pushing the limits of acceptable data use, invasive and aggressive business practices, and multiple international charges all related to violating privacy laws [8], Google should not be on your list for anything you want to keep private.
In addition to switching your web browser, switching your main search provider and email provider should also be a priority for anyone wanting to minimize their digital footprint.
Recommendations
Search provider: https://duckduckgo.com/
Email provider: https://protonmail.com/ or https://tutanota.com/
A DNS resolver is like a phone book. When you type an address like duckduckgo.com in your browser, your computer sends a request to the DNS service to ask what the IP address for duckduckgo.com is (since internet requests are all based on IP addresses, not names). Most people use the default DNS resolver provided by their internet service provider (ISP). This means that the name of every website you want to access is first sent to your ISP. Not great for privacy, especially if you mistrust your ISP or the country you live in.
Using encrypted DNS requests won’t make you anonymous on the internet (since you’re still susceptible to metadata collection and linkage attacks); however, it will at least hide the website request from your ISP and other attackers trying to monitor your website requests [9].
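To see why a plain DNS lookup is so revealing, here is a small sketch that builds the bytes of a classic, unencrypted DNS query by hand, following the standard message layout (a 12-byte header followed by the question). Nothing is sent over the network; the function name and query ID are my own choices for illustration.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a plain (unencrypted) DNS query asking for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


packet = build_dns_query("duckduckgo.com")
print(len(packet))               # 32
print(b"duckduckgo" in packet)   # True
```

The site name sits in the packet as readable text, which is exactly what anyone on the path between you and the resolver can see when the request is not encrypted.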
Recommendations
https://github.com/DNSCrypt/dnscrypt-proxy/wiki/Anonymized-DNS
https://nlnetlabs.nl/projects/unbound/about/
When you want the closest thing to true internet privacy and anonymity, the best thing to help you is Tor. True privacy is impossible on the internet, but you can make it exceedingly difficult for hackers to compromise or track you. Tor uses an open-source protocol supported by thousands of volunteers running “nodes” throughout the world that encrypt your data several times en route to its final destination. The cool part about the Tor protocol is that it allows server nodes to essentially communicate with one another without knowing who each other are. Tor users will always be at the mercy of timing attacks, a sophisticated cyber attack that uses differential timings between internet requests on nodes to triangulate and pinpoint where your true location might be. The basic premise is that your computer takes less time or more time to send data to a node depending on its physical distance from the node.
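The "encrypt your data several times" idea is easier to picture with a toy model. In the sketch below the sender wraps a message in one layer per node and each node peels off exactly one layer with its own key. The XOR keystream cipher here is a stand-in I made up to keep the example dependency-free; it is not secure and it is not the cryptography Tor actually uses, and real onion routing also carries routing information inside each layer.

```python
import hashlib
from typing import List


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256-derived keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip() stops at len(data), so any extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, node_keys: List[bytes]) -> bytes:
    """Sender adds one layer per node, innermost layer for the last node."""
    for key in reversed(node_keys):
        message = keystream_xor(key, message)
    return message


def peel(message: bytes, key: bytes) -> bytes:
    """A node removes only its own layer (XOR is its own inverse)."""
    return keystream_xor(key, message)


# Hypothetical keys the sender shares with three nodes, one key each.
node_keys = [b"entry-node-key", b"middle-node-key", b"exit-node-key"]
message = b"hello from somewhere on the internet"

onion = wrap(message, node_keys)
for key in node_keys:  # each node along the route peels one layer
    onion = peel(onion, key)

print(onion == message)  # True
```

No single node in this model holds all three keys, which is the intuition behind nodes passing traffic along without knowing both who sent it and what it says.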
[1] “reddit.com/r/privacy/comments/fvocq2/i_need_privacy_not_because_my_actions_are”. Reddit. Accessed 14 Dec. 2020.
[2] Pillow, Timothy. “A Review of Synthetic Tabular Data Tools and Models”. Medium, 2 Jul. 2020, towardsdatascience.com/a-review-of-synthetic-tabular-data-tools-and-models-d83b232aae25
[3] Leith, Douglas J. “Web Browser Privacy: What Do Browsers Say When They Phone Home?” Trinity College Dublin, Ireland, 24 Feb. 2020, scss.tcd.ie/Doug.Leith/pubs/browser_privacy.pdf. Accessed 14 Dec. 2020.
[4] Nohe, Patrick. “The difference between Encryption, Hashing and Salting”. Hashed Out, 19 Dec. 2018, thesslstore.com/blog/difference-encryption-hashing-salting/. Accessed 14 Dec. 2020.
[5] Duffy, Jill. “How to Protect Yourself Online With Disposable Credit Card Numbers”. PC Mag, 22 Jun. 2020, uk.pcmag.com/encryption/127489/how-to-protect-yourself-online-with-disposable-credit-card-numbers. Accessed 14 Dec. 2020.
[6] Wong, Julia Carrie. “WhatsApp urges users to update app after discovering spyware vulnerability”. The Guardian, 14 May 2019, theguardian.com/technology/2019/may/13/whatsapp-urges-users-to-upgrade-after-discovering-spyware-vulnerability. Accessed 14 Dec. 2020.
[7] “The Best (Free) Encryption Software for Your Cloud Storage”. The Sweet Bits, 19 Apr. 2019, thesweetbits.com/best-encryption-software-for-cloud-storage/. Accessed 14 Dec. 2020.
[8] Phillips, Gavin. “Stop Using Google Search: Here’s Why”. Make Use Of, 30 May 2017, makeuseof.com/tag/stop-using-google-search/. Accessed 14 Dec. 2020.
[9] “Encrypted DNS Resolvers”. Privacy Tools, privacytools.io/providers/dns/. Accessed 14 Dec. 2020.