Episode 100 - Spell-Jacking: Addressing a threat to personal data privacy

Spell-jacking: a new word emerging from the tech world. Learn its meaning and what can be done to protect personal data privacy. We use convenient third-party features on websites that can expose highly sensitive information about us without our even suspecting this is happening. When we use spellcheck on a website, this can send the entire form we are working on to “the cloud.” The information is in flight and can be shared (or hacked) in unexpected ways. A September 2022 study by otto-js, a JavaScript security firm, found that the vast majority of enterprise websites send data with Personal Identifying Information (PII) back to Google or Microsoft when users access Chrome Enhanced Spellcheck or Microsoft Edge Editor. This can release passwords, Social Security numbers, and other personal information users would not approve. Through enabled features that are convenient for users (such as spellcheck or “show my password”), personal data is being shared in ways individuals did not expressly approve and would avoid if they could. Otto-js co-founders Maggie Louie and Josh Summitt tell how this problem was discovered and share how risks can be mitigated. While legitimate enterprises have no interest in releasing PII to mal-actors, spell-jacking as such is currently unregulated or under-regulated. Learn how industry and regulators are addressing this issue – and what consumers can do about it to protect their own personal privacy. Helpful guides for developers and consumers are available on the otto-js website. If you have ideas for more interviews or stories, please email info@thedataprivacydetective.com.

00;00;06;26 - 00;00;33;26
Speaker 1
This is the Data Privacy Detective. And today we're going to talk about spell-jacking, a growing threat to our data privacy. It could be a new word for you. And with us today are two experts in this area. Maggie Louie is the CEO of otto-js; the JS is for JavaScript, by the way. For a decade, she worked for large media companies as they were developing mobile products.

00;00;34;17 - 00;00;41;17
Speaker 1
And she founded otto-js after an experience with a hacker. Well, Maggie, thank you for joining us.

00;00;41;22 - 00;00;42;27
Speaker 2
Thank you for having us.

00;00;43;10 - 00;00;49;17
Speaker 1
Well, it's a pleasure. Now, tell us the hacker story that prompted you to co-found otto-js.

00;00;49;22 - 00;01;29;29
Speaker 2
Sure. Yeah. I had been working in the industry for publishers for close to a decade, and all of those products really were revenue dependent on programmatic advertising, which is how I became very aware of this problem. And so, looking into, for some friends in the publishing industry that were not making any money on their programmatic ads, what could be the problem, I discovered that a hacker had actually exploited the third-party JavaScript that was enabling all of the programmatic ads and was not only siphoning off their revenue, but injecting his own ads and other malicious stuff and stealing different sorts of gross stuff and hitting on.

00;01;30;00 - 00;01;58;06
Speaker 1
A common thing. But that put you on to this thing that we now call spell-jacking. It's more than that. We'll talk about it. But also with us today is the company's co-founder and chief technology officer, Josh Summitt. Josh, you've been in cybersecurity since 2005. You've been responsible for finding and mitigating security weaknesses in many large organizations like Bank of America, FedEx and the Internal Revenue Service.

00;01;58;06 - 00;02;12;26
Speaker 1
I hope somebody hasn't found my tax return. I don't know. But Josh, one publication credits you with discovering and even coining the phrase spell-jacking. So help us understand: what is spell-jacking?

00;02;13;11 - 00;02;39;27
Speaker 3
Sure. Yeah. So spell-jacking is when you have different scripts or different features within your site and it happens to be using maybe the spell check features, and whatever you type, it sends it all to that spell checking feature. And so a lot of times that data is being siphoned off and sent to a third-party server that you might not, you know, know is happening.

00;02;40;07 - 00;03;08;03
Speaker 3
And, you know, these features are common and useful features that are built into the browser when you're, you know, typing emails and things like that. But you don't really expect them to be enabled on things like, you know, username and password fields. You kind of expect that those are fields that would not be susceptible to, you know, needing to be spell-checked because they wouldn't really match anything in a dictionary. Especially your passwords shouldn't match anything in a dictionary.

00;03;08;03 - 00;03;17;26
Speaker 3
Right. We found different features within the browser. Sometimes extensions, sometimes native features that are sending this data off to, you know, companies like Microsoft and Google.

00;03;18;00 - 00;03;39;00
Speaker 1
Well, what we call this is the cloud, isn't it? And the cloud isn't God sitting up there with Saint Peter in the clouds. This is simply when one entity that you're dealing with through data is sending information to another, for example, to use its spell check feature, or maybe show my password so I make sure I put it in.

00;03;39;00 - 00;03;41;24
Speaker 1
Right. This is kind of the problem you're talking about, right?

00;03;41;27 - 00;03;59;21
Speaker 3
Exactly right. It's like a combination of two very useful features. You know, one, you want to be able to show your password so you can make sure you typed it correctly. Right. And then you also want the spell checking features. But when those two things are combined, it actually creates this security weakness and this data exposure.
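[Editor's note: The combination Josh describes can be sketched as a small rule. This is a simplified model based only on the conversation, not otto-js's detection logic; the `InputState` shape and the `mayLeakToSpellcheck` name are hypothetical. It assumes that a masked password field is not spell-checked, that "show password" switches the field to ordinary text, and that the standard HTML `spellcheck="false"` attribute opts a field out.]

```typescript
// Hypothetical description of one form input as the browser sees it right now.
interface InputState {
  // "show password" typically switches a field from "password" to "text".
  type: "password" | "text";
  // Value of the standard HTML spellcheck attribute, if the site set one.
  spellcheckAttr?: "true" | "false";
}

// Simplified model (an assumption, not a browser specification):
// returns true when typed text could be sent to a cloud spell checker.
function mayLeakToSpellcheck(input: InputState, enhancedSpellcheckOn: boolean): boolean {
  if (!enhancedSpellcheckOn) return false;     // no cloud-based checking enabled
  if (input.type === "password") return false; // masked fields are not spell-checked
  return input.spellcheckAttr !== "false";     // the site can opt the field out
}
```

[Under this model, a password field is exposed only at the moment the user reveals it, which is why each feature looks harmless on its own.]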

00;03;59;21 - 00;04;25;13
Speaker 1
That's the old story. It's convenient, it's helpful to us. And yet we have to think about what this has done to our privacy. Well, Maggie, let's go back to you now. Your company, otto-js, released a September 2022 study that I found fascinating. And it talks about how Google and Microsoft and other companies end up receiving our personally identifiable information,

00;04;25;13 - 00;04;32;28
Speaker 1
PII. That includes things like passwords and even Social Security numbers. So tell us about what your study found.

00;04;32;28 - 00;04;59;06
Speaker 2
Sure. Well, this is all, you know, what Josh discovered while he was testing some of our new behavior script monitoring, which is the spell-jacking. You know, if you have Chrome's enhanced spell check features enabled, as well as the enhanced spell check features in Edge, it's actually reading all of the input fields on the pages that you're visiting. So if you log in to any site, it'll send your username and any information you type into those form fields.

00;04;59;06 - 00;05;20;24
Speaker 2
And if you have show password, it'll even send your password to their servers in clear text. And when you look at the request, what it will send back is "Google recommends," and then they'll have a corrected spelling. So when we did this, we made it kind of a joke. We made the password "shared password," and Google's response with the corrected spelling was "share password."

00;05;20;24 - 00;05;21;09
Speaker 2
We suggest.

00;05;21;15 - 00;05;22;09
Speaker 1
Your password.

00;05;22;19 - 00;05;44;14
Speaker 2
We were just being cheeky. But when Josh discovered this, we all as a team decided that we needed to research this and figure out how big the scale of the problem was. And that's the thesis behind the report that we did. We took six key industries and then we looked at five sites across each of these industries, the top visited sites.

00;05;44;14 - 00;05;47;10
Speaker 1
So you had 30 control group websites.

00;05;47;11 - 00;05;54;22
Speaker 2
Exactly. Yeah. And that was banking, online banking, government sites, health care, e-commerce and even pornography.

00;05;55;03 - 00;06;00;16
Speaker 1
And so I've read that study, but if I hadn't, I'd guess, oh, it's a 10% problem, right?

00;06;01;01 - 00;06;03;27
Speaker 2
Right. So 97%.

00;06;04;14 - 00;06;05;13
Speaker 1
97%.

00;06;05;27 - 00;06;11;25
Speaker 2
Yeah, were sharing PII. And a shocking 73% actually shared your password.

00;06;11;27 - 00;06;16;05
Speaker 1
Would share your password. And of course the user has no idea this is happening.

00;06;16;15 - 00;06;29;15
Speaker 2
And what's important is that for the ones that weren't sharing your password, it wasn't that they had mitigated the problem. They just didn't have a show password option within their password field. In fact, the only company that had mitigated it was Google.

00;06;29;15 - 00;06;46;28
Speaker 1
Well, Josh, let me turn to you. Help us understand the vulnerabilities of systems that use JavaScript, and, you know, who doesn't? That's what your company addresses. Maggie and the report have shown the problem. So what are the underlying vulnerabilities that cause this?

00;06;47;00 - 00;07;13;05
Speaker 3
So what's interesting is we were actually, you know, part of the main thing that we're looking for is how different scripts and how different third parties are sharing data for different sites that you go to, how they're tracking you, what they send to different parties, how they track you. And it's kind of how we stumbled upon this. We were looking at how you know, any third party script that's running on a website actually has the ability to scrape all of this data and send it off to other third parties.

00;07;13;15 - 00;07;37;22
Speaker 3
So, you know, while we were looking for that, we happened upon, you know, the fact that Google is actually doing it within the features in the Chrome browser, and Microsoft as well. Right. We noticed, you know, in addition to these kinds of things, there are a lot of these third parties that are also basically form-jacking, skimming your data without your consent or without you, like, clicking submit on a button.

00;07;38;02 - 00;07;39;13
Speaker 3
So for instance, in my.

00;07;39;13 - 00;07;42;10
Speaker 1
Data broker problem, relatively unregulated.

00;07;43;05 - 00;07;43;29
Speaker 3
Exactly right.

00;07;43;29 - 00;07;46;03
Speaker 1
Great information is not that hard to do.

00;07;46;15 - 00;08;07;02
Speaker 3
Yeah, yeah, yeah. But I think from a user perspective, you sort of expect that maybe I'm entering data into forms, but I'm not expecting it to go anywhere until I do an action, like I click submit or, you know, something like that. And we found on many sites that the scripts on the page, different, you know, features,

00;08;07;10 - 00;08;11;12
Speaker 3
were actually scraping that data before you actually gave your consent to send it.

00;08;12;10 - 00;08;20;29
Speaker 1
Which is really interesting. So a major vulnerability. Well, Maggie, what can companies do about this, and how does your company, you know, help them deal with it?

00;08;21;09 - 00;08;49;07
Speaker 2
Yeah, I mean, so if you're a company, the question is, you know, do you have risk and liability from sharing consumers' data with third parties? And so I think it's important to mitigate that risk because, you know, you don't necessarily know all of your third-party partners. Google and Microsoft might be ones that you trust, but, you know, every vendor that you have has their own third-party dependencies and so on.

00;08;49;17 - 00;09;16;01
Speaker 2
So our software makes it really easy if you're a company, and there is other software out there, to make those decisions per script: what they're allowed to read, what kind of functions they're allowed to do, and if they're allowed to send data to external servers. And you can handle some of that manually too, like, you know, if you've got an in-house development team or security team. And Josh could give more detail in a follow-up document for you guys on how to do that.

00;09;16;01 - 00;09;37;04
Speaker 2
But, you know, you can put some rules around reading form fields. It's a bit manual and it could cause some user experience issues. The hardest part, though, is actually being able to find out what's happening and to to actually surface the problem. And that's the big challenge that I think we're going to see over the next few years that PCI compliance is really trying to head off.

00;09;37;04 - 00;09;47;29
Speaker 2
Now, I will say this, for consumers who want to protect themselves, that's probably one of the bigger challenges. And we do have a free extension that consumers can use. It doesn't cost anything, but I'll go.
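[Editor's note: The manual "rules around reading form fields" that Maggie mentions could look something like the sketch below. It is an illustration under stated assumptions, not the otto-js product or its published guidance: the `FieldInfo` shape, the `SENSITIVE_HINTS` keyword list, and both function names are hypothetical. The only standard piece is the HTML `spellcheck="false"` attribute, which asks the browser not to spell-check a field.]

```typescript
// Hypothetical description of a form field; not taken from any otto-js tool.
interface FieldInfo {
  name: string;          // e.g. "password", "comment"
  type: string;          // e.g. "text", "password", "email"
  autocomplete?: string; // e.g. "username", "current-password"
}

// Illustrative keyword list (an assumption); adjust it for your own forms.
const SENSITIVE_HINTS: string[] = ["pass", "ssn", "social", "card", "cvv", "user", "email", "dob"];

// Returns true when the field looks likely to hold PII or credentials.
function isSensitive(field: FieldInfo): boolean {
  const haystack = (field.name + " " + field.type + " " + (field.autocomplete || "")).toLowerCase();
  return SENSITIVE_HINTS.some((hint) => haystack.indexOf(hint) !== -1);
}

// Attributes to set on the element. spellcheck="false" is the standard HTML
// attribute; which fields to apply it to is the judgment call sketched above.
function spellcheckAttrs(field: FieldInfo): Record<string, string> {
  return isSensitive(field) ? { spellcheck: "false" } : {};
}
```

[In a page, the returned attributes would be applied to each input, for example with `element.setAttribute("spellcheck", "false")`. A keyword list like this is deliberately crude, which reflects Maggie's caution that the manual route can cause user experience issues: a field wrongly marked sensitive loses spell check, and one that is missed stays exposed.]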

00;09;47;29 - 00;10;02;21
Speaker 1
Back to that a little later. But companies, so your company and companies like yours can help really any size website owner deal with this, first by understanding what's actually happening and then, secondly, mitigating.

00;10;02;28 - 00;10;29;27
Speaker 2
That's right. Yeah. And, you know, we built our tech specifically to help the non-technical. You know, there's a shortage of security experts. So we built it really to be developer-centric, but also for small businesses who don't have any coding skills. It's so important that people feel they can protect themselves in this online environment as everyone's moving through digital transformation. You know, so many mom-and-pop shops just don't have the technical expertise to know how to begin to think about these problems.

00;10;30;12 - 00;10;58;14
Speaker 1
That's right. Josh, let's turn to you with a little different question. I mean, the giants have enormous amounts of data about all of this. And, you know, whether they say it or not, they're tempted, let's just put it that way, to at least sell or share the data, because that's part of what their business model is based on. So in general, are you finding that companies large and small are aware of what we're talking about today and are dealing with it responsibly?

00;10;58;14 - 00;10;59;15
Speaker 1
What are you finding?

00;10;59;22 - 00;11;29;28
Speaker 3
I think a lot of companies probably aren't fully aware of the extent that this is happening. You know, you do sort of expect companies like Google to be harvesting all your data. And it's sort of an expectation that, you know, they have all of it anyway, but that still doesn't necessarily make it right. You know, one thing we're doing with the software is actually providing a lot of visibility into what is, you know, taking your data and where it's being sent, to really provide that

00;11;29;28 - 00;11;31;07
Speaker 3
transparency for companies to know.

00;11;31;10 - 00;11;34;09
Speaker 1
We started to up what do you say.

00;11;34;09 - 00;11;34;28
Speaker 3
Exactly.

00;11;34;28 - 00;11;55;18
Speaker 1
So yeah, good point, but nobody's out to get in trouble. Let's turn to the compliance issues now. Maggie, let me start with you. You know, we have five states of the United States that have relatively comprehensive laws. I did a little exercise to see if any of them say spell-jacking in the statutes. I couldn't find it.

00;11;55;28 - 00;12;03;00
Speaker 1
How do you see the the compliance requirements? Let's put it that way right now for this. What do you see in the future?

00;12;03;00 - 00;12;33;07
Speaker 2
Well, so I think you just made an excellent point. And I want to give our VP of engineering kudos. Walter Hoey actually coined spell-jacking for the company as we were trying to brainstorm what would be a clever name that would get people talking. But it really is a great example of how just by naming something you might eliminate, or seemingly eliminate, its existence within compliance, either by not understanding the underlying thing that's happening or because of the nuance.

00;12;33;07 - 00;13;17;24
Speaker 2
Like in this particular case, unlike a regular third-party JavaScript, you know, visibility or exposure or vulnerability even, this is something that is a real quirky thing that's being caused by Google and Microsoft trying to spell-check everything that you're typing in and sending it to their servers in clear text to then, you know, have their advanced grammar tools send you back an idea of what would be the best way to spell that. And so they don't necessarily intend to capture that information, and yet it's happening. I would classify this under something we've called crimeless victims, which is, there are victims of this

00;13;17;24 - 00;14;01;13
Speaker 2
kind of data exposure and many other things, and there haven't been crimes yet defined, laws yet defined, that make it a crime explicitly. And so there are all these crimeless victims. And even when you look at GDPR and you look at PCI DSS v4.0, I think that one, the PCI compliance, is really probably the most advanced one we've seen, because it talks specifically and explicitly about third-party JavaScript and runtime and monitoring those scripts' activities and behavior to make sure you are mitigating any kind of unauthorized activity with your customers as they're in the client, in the browser runtime.

00;14;02;22 - 00;14;09;05
Speaker 1
And so so just under that, when companies have a period, do they to comply with that one and see what. Yes, it will go.

00;14;09;12 - 00;14;34;23
Speaker 2
Yeah, so they published that version in, I think it was March of this year, and they've given till next year. The new requirements they're rolling out, you have to get compliant for v4, but there'll be overlap in 2024 to 2025 where you can be both, as long as you're working towards the new v4, but then after that you have to be v4 compliant.

00;14;34;23 - 00;14;40;18
Speaker 2
So they've given maybe an unreasonable amount of time. Well, a couple of years. I mean in.

00;14;41;02 - 00;14;53;17
Speaker 1
2022, and it doesn't have to be until 2025, so you have two or three years here where there would be this gray area. Of course, that's how the law works; it usually follows practice. We see something we don't like, and then the legislators are saying.

00;14;55;06 - 00;15;19;11
Speaker 2
Here's something else, though. Having been, you know, at the L.A. Times and American Public Media, you know, big media companies, it can take two years to get a project like that really rolling. So I think it's actually a very narrow window for companies, but for consumers who are having their PII and their credentials potentially exposed and not knowing the depth of that, it's a really long period, you know.

00;15;19;11 - 00;15;22;01
Speaker 2
So if you could look at this from perspectives.

00;15;22;12 - 00;15;40;13
Speaker 1
And there will be regulators, certainly PR regulators, I think looking at this certainly creates an enormous amount of exposure of PII. And Josh, just briefly, regulators, do you think they're aware of this problem by now and thinking about how to deal with it.

00;15;40;26 - 00;16;11;02
Speaker 3
So specifically, you know, with the PCI type stuff that's coming out, I mean, they're aware that, you know, there have been some major data breaches that have been the result of third-party scripts or vendors or supply chain risk. Those have gotten, I think, you know, quite a bit of media attention lately. I think these are things that companies are having to comply with, and they're going to have to, you know, make sure that they're audited and go through the same process that they usually would for a normal type of application.

00;16;11;02 - 00;16;15;08
Speaker 3
But now they're going to have to do it for all of their vendors and supply chain risks.

00;16;15;22 - 00;16;34;25
Speaker 1
So this is on the agenda and we'll see where it leads. Maggie, one more question for you, and then we'll turn to consumer advice in a minute. What about sites like, oh, how about Planned Parenthood, where women are exploring their health issues and what they can do about a pregnancy, or banks, or take porn sites. What about those?

00;16;35;07 - 00;17;03;06
Speaker 2
That is such an important topic, Joe. And that's really what the crux of it is: your privacy. We all know what's happening in politics and, you know, abortion law. Porn sites are what we focused on because I think we knew everyone could really relate to that. But when we talked about this and you brought up Planned Parenthood, that's exactly the kind of thing where, in the wrong hands, a user's name, their password, where they live, that they're trying to find a location to have an abortion or that they're seeking information.

00;17;03;26 - 00;17;31;04
Speaker 2
This information could be weaponized very easily by groups who have political agendas against that. And likewise, it could be weaponized by nation-state actors and people who have an agenda, you know, to sway voters, to sway government officials and so on. And so it's really broad, the potential weaponization of being able to get access to this stuff.

00;17;31;10 - 00;17;45;25
Speaker 2
So it's a very important burning issue that we really need to get our hands around, and education is probably the best thing we can do right now in light of the legislation being, you know, still roughly a year and a half away from taking this on full steam.

00;17;46;12 - 00;18;11;20
Speaker 1
With a lot at stake. Thank you for that. Well, a final topic. What can individuals do about this? They can't hire their own companies. They're worried about all these things. And I know on the otto-js website there is a good explanation for consumers and developers as well about how to, for example, deal with the spell check features of Google Chrome and Microsoft Edge.

00;18;12;13 - 00;18;28;10
Speaker 1
But let me ask you each in turn, what is the top advice you would have for individuals about protecting their personal privacy because of this spell jacking phenomenon that exists? Josh, let's start with you. What would be your top tips?

00;18;28;16 - 00;18;53;21
Speaker 3
So, one, I would caution against enabling features in the browser that maybe you don't understand, and also limit the extensions that you run in the browser to ones from trusted vendors. You know, because any one of these can also be used to steal your data. There are also some extensions like Privacy Badger, which is done by the EFF, and it can do things like block tracking scripts and stuff like that that could be used to harvest your data.

00;18;54;02 - 00;19;03;10
Speaker 1
They're not terribly expensive for individuals, right? Right. Maggie, what would be your top advice to people who do care about their personal privacy?

00;19;03;10 - 00;19;22;15
Speaker 2
Well, I would echo what Josh said. I'd also say to check and see what browser extensions you've added, because all of them have access to get this kind of information from any site that you go to. So be aware that browser extensions have a lot of power. Make sure it's a trusted developer or company that's put that out.

00;19;22;26 - 00;19;41;19
Speaker 2
But just as you mentioned, we've got a free solution, so people can use our Chrome extension. When you go to a site, it'll let you know if it's reading sensitive input fields. So if you're like me, I'm dyslexic, so I often have to use enhanced spell check or show password. And if you're like that, this is a good way to keep it enabled.

00;19;41;19 - 00;20;04;27
Speaker 2
But get a reminder if you're putting your password at risk. And for developers, they can use our same free tools to look at their site and see if they have issues. And it's the exact same kind of enterprise-level detection that we use in our core software that works for companies to mitigate this stuff. But even if, you know, you can look around on different browsers. Ghostery is a great app.

00;20;04;27 - 00;20;25;00
Speaker 2
It's been around for years. And actually the former CEO, Scott Meyer, is an advisor for us, and that's a great similar kind of app. Theirs is not really focused on security as much as trackers, but there are a number of them out there that you can install that will at least give you the visibility if you're a consumer, and be aware.

00;20;25;00 - 00;20;40;21
Speaker 2
So you will want to take, you know, take precautions, turn off spell check if you're on a site that could be exposing you. And just remember that the first question you've got to ask yourself is, do you do you trust this app? Do you trust this site? Exercise some caution.

00;20;41;09 - 00;20;57;26
Speaker 1
All right. Very good. But one last question. Maybe I'll throw it to you, Josh. How about other browsers like Firefox and DuckDuckGo, the ones that claim to be more privacy-centric, you know, than Google or the giants? I mean, what about those? Are they different?

00;20;58;05 - 00;21;04;24
Speaker 3
Some of them. Like, for instance, when we looked at this, we noticed both Safari and...

00;21;05;07 - 00;21;06;09
Speaker 1
Apple, right, Apple.

00;21;06;10 - 00;21;25;06
Speaker 3
Right, right. Apple Safari and Firefox didn't have any of these sort of enhanced spell check type features. Firefox, I don't know if this was intentional, but I thought it was kind of clever how they solved the problem: they didn't even enable spell check unless you entered more than one line.

00;21;25;14 - 00;21;29;16
Speaker 3
So that effectively stopped spell checking on sensitive form fields and things like that.

00;21;29;17 - 00;21;33;10
Speaker 1
A clever way to do it is you can say I'm sandboxing the problem.

00;21;33;10 - 00;21;34;04
Speaker 3
Yeah, exactly.

00;21;34;05 - 00;21;56;02
Speaker 1
Right. Very good. Well, I can't thank you enough today for introducing all of us to this emerging issue. Any time our data is in flight, and it is when things are sent to the cloud, there are these issues, and it can expose very sensitive information that can lead to identity theft and loss of money and certainly loss of privacy.

00;21;56;11 - 00;22;16;01
Speaker 1
Maggie, Josh, thank you so much for exposing us to this this continuing world of data and how it travels and how it affects our privacy. And as always, I will remind our listeners, as I close every time, protecting your personal privacy begins with you.

This podcast was created for general informational purposes only as of the time of its creation and does not constitute legal advice, the formation of an attorney client relationship, or a solicitation to provide legal services. The laws governing legal advertising in some states require the following statement in any publication of this kind: “THIS IS AN ADVERTISEMENT.” All rights reserved