Data leak

Do you know exactly what info you’re transmitting and to whom? Not so fast there. It may be in vogue for companies to revise — and in the best of circumstances, clarify — their privacy policies, but that doesn’t mean everything’s crystal clear for the end user.

So much has been written about and fretted over privacy policies that The Wall Street Journal, trying to sort out some of the mess, took a hard look at what is being sent and where. It rounded up the top 50 websites in the U.S. and another 20 popular sites across sensitive categories such as social, health, politics and children’s issues. Then it registered on all of them, browsed around on the sites and used a software program to find out what data was being transmitted.

The Journal followed each site’s suggested registration procedure, including email confirmation when necessary. In addition to registering, the Journal logged out of each account, logged back in, and browsed all known types of pages on the site – for instance, article pages, profile pages and setting pages. The Journal cleared its test computer of tracking files, known as cookies, between each browsing session.

During each browsing session, the Journal used mitmproxy, an open-source software program, to inspect the data being transmitted to and from the sites. This method reveals all data being passed via the Web browser.
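The Journal's own analysis code isn't shown here, so what follows is only a rough sketch of the idea: given one captured request, check whether any of the details you registered with show up in traffic headed to a different company. The function name find_leaks, the test hostnames and the sample values are all made up for illustration, and the check only catches data sent in plain or percent-encoded form.

```python
from urllib.parse import urlsplit, unquote


def find_leaks(url, body, first_party, personal):
    """Return the labels of personal values visible in a third-party request.

    url         -- full URL of the captured request
    body        -- request body as text ("" if there is none)
    first_party -- domain of the site you registered on (hypothetical here)
    personal    -- dict mapping a label to the value you signed up with
    """
    host = urlsplit(url).hostname or ""
    # Traffic to the site itself (or its subdomains) is not a third-party send.
    if host == first_party or host.endswith("." + first_party):
        return []
    # Undo percent-encoding so "jane%40example.com" matches "jane@example.com".
    haystack = unquote(url + " " + body).lower()
    return [label for label, value in personal.items()
            if value.lower() in haystack]


# Hypothetical sample data, not taken from the Journal's findings.
me = {"email": "jane@example.com", "username": "jane_doe"}
print(find_leaks("http://ads.tracker.test/pixel?e=jane%40example.com",
                 "", "dating.test", me))
```

A tool like mitmproxy could feed each captured request into a check like this. Data that has been hashed or otherwise encoded before sending would slip past it.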

End result? Of its sample, sites like Google, Facebook, Amazon and Craigslist walked away clean, but The Journal discovered that others — 25 of them — transmit personal data to other companies using your browsing sessions. And most of them do it without benefit of encryption or security encoding. The Journal’s own site, WSJ.com, even made the list.

[Chart: sites transmitting identifiable data, main list]

(*Sites not in comScore’s top 1,000)

Looks like one of the worst offenders on the list is dating site OKCupid. Here’s what happens when you click on its entry on the list.

[Image: OKCupid’s entry on the list]

It’s pretty common now for sites to share user data. There are usually terms explaining that info may be used for ad purposes, often requiring a user to check a box to acknowledge that or agree. But sending data out in the open like this? That’s a huge no-no that can lead to all sorts of problems, from security issues to a never-ending cavalcade of Viagra spam. Hopefully now that they’ve been singled out, they’ll take some action on this.

In the meantime, here’s a tip that can help.

Unless it’s crucial that you input your authentic info, use a secondary email or one of those temp addresses (like Guerrilla Mail). Or if you want to stick with, say, one Gmail account, then try this formula:

username+ANYWORD@gmail.com

This is a classic Gmail strategy for organizing/filtering purposes. Add a plus sign and a codeword or identifier after your username, and the email will still be sent to your inbox. And if you start getting weird messages, the tag in the “To” address makes it easy to identify the originating site that leaked or sold your data. From here, it’s simple to filter out those messages.

Not all sites accept special characters like the plus sign, but it works like a charm on the ones that do.

username+CNN@gmail.com

username+RottenSpammers@gmail.com

username+dont.leak.my.data.bro@gmail.com

(Likewise, Gmail ignores dots in the username, so you can put a “.” anywhere in it, like so: u.sername@gmail.com, and the mail still reaches the same inbox.)
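If you want to automate the plus-sign trick, here is a minimal sketch. The helper names tagged_address and leak_source are invented for this example; the first builds a tagged address for a given site, and the second pulls the tag back out of the address a suspicious message was sent to.

```python
def tagged_address(address, tag):
    """Insert +tag before the @, e.g. username@gmail.com -> username+CNN@gmail.com."""
    local, domain = address.rsplit("@", 1)
    return f"{local}+{tag}@{domain}"


def leak_source(recipient):
    """Return the tag from a plus-tagged recipient address, or None if untagged."""
    local = recipient.rsplit("@", 1)[0]
    _, plus, tag = local.partition("+")
    return tag if plus else None


print(tagged_address("username@gmail.com", "CNN"))
print(leak_source("username+RottenSpammers@gmail.com"))
```

A tag only tells you which address the sender used. It can't tell you whether the site sold the address, leaked it or was breached.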

As for the other data, think twice before giving out your real name, address, phone number or other confidential info. If it’s necessary, say for shipping, that’s one thing. But there’s no reason why some of these sites need that level of detail on you — especially if they’re sloppy about securing it. So use your best judgment about where or whether you offer up your genuine info.

Got other tips on how to minimize third-party data risks? Weigh in below.